### **Q1: What is the difference between a t-test and a z-test? Provide an example scenario where you would use each type of test.**

### ***ANSWER :***

Both t-tests and z-tests are statistical hypothesis tests used to make inferences about population parameters based on sample data. We use a t-test when the population standard deviation is unknown or when dealing with small sample sizes, and use a z-test when the population standard deviation is known or when dealing with large sample sizes. The choice between the two tests depends on the specific characteristics of the data and the information available about the population.

1. **T-test:**
>
The t-test is used when the population standard deviation is unknown and must be estimated from the sample. It is typically employed when dealing with small sample sizes (usually less than 30) or when the population standard deviation is not available.

- ***Example Scenario: Let's say a pharmaceutical company has developed a new drug to reduce blood pressure. They want to test whether the drug is effective in reducing blood pressure in a specific population. They randomly select 20 individuals with hypertension and administer the drug to them for a certain period. After the treatment, they measure their blood pressure and want to determine if the drug has a significant effect. Because the sample is small and the population standard deviation is unknown, a t-test is appropriate.***

2. **Z-test:**
>
The z-test is used when the population standard deviation is known or when dealing with large sample sizes (typically more than 30). It is more suitable for cases where there is enough data available to estimate the population standard deviation accurately.

- ***Example Scenario: An education researcher wants to investigate whether there is a significant difference in exam scores between students from two different schools. They collect data on exam scores from a random sample of 100 students from each school. The population standard deviation of exam scores at each school is known from historical data. Because the standard deviations are known and the samples are large, they can use a two-sample z-test to compare the means of the two groups.***
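
As a rough sketch of how the two statistics differ in practice, the snippet below computes a one-sample z statistic (σ assumed known) and a one-sample t statistic (σ estimated by `s`) from summary numbers. All values (`sample_mean`, `mu0`, `sigma`, `s`, `n`) are made up for illustration and do not come from the scenarios above.

```python
import math
from scipy import stats

# Hypothetical summary statistics (illustrative only)
sample_mean = 128.0   # observed sample mean
mu0 = 130.0           # mean under the null hypothesis
n = 20                # sample size

# z-test: the population standard deviation is assumed known
sigma = 6.0
z_stat = (sample_mean - mu0) / (sigma / math.sqrt(n))
z_p_value = 2 * stats.norm.sf(abs(z_stat))          # two-sided p-value

# t-test: the standard deviation is estimated from the sample
s = 6.0
t_stat = (sample_mean - mu0) / (s / math.sqrt(n))
t_p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)   # two-sided p-value

print(f"z = {z_stat:.3f}, p = {z_p_value:.4f}")
print(f"t = {t_stat:.3f}, p = {t_p_value:.4f}")
```

With the same standard deviation the two statistics coincide, but the heavier tails of the t-distribution give a larger p-value when n is small.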


### **Q2: Differentiate between one-tailed and two-tailed tests.**

### ***ANSWER :***

One-tailed and two-tailed tests are types of hypothesis tests used in statistics to evaluate whether there is a significant difference or relationship between variables in a given population. The main difference lies in the directionality of the hypothesis being tested.

1. **One-tailed test:**
>
Also known as a directional test, a one-tailed test examines whether a parameter is significantly greater than or less than a specific value. It is used when there is a clear direction of the effect that is expected based on prior knowledge or a well-defined theory. The critical region for the test is concentrated on one side of the distribution.

 - ***Example: A researcher wants to determine if a new teaching method improves students' test scores. They hypothesize that the new method will lead to higher scores. The one-tailed test will assess whether there is a statistically significant increase in test scores due to the new teaching method.***

2. **Two-tailed test:**
>
Also known as a non-directional test, a two-tailed test examines whether a parameter is significantly different from a specific value, regardless of the direction of the difference. It is used when there is no specific expectation about the direction of the effect or when researchers want to be more conservative in their conclusions.

 - ***Example: A manufacturer wants to test if the mean weight of a product matches the specified value of 500 grams. The two-tailed test will assess whether there is a statistically significant difference in the mean weight from the specified value, either heavier or lighter.***

- In both types of tests, the null hypothesis (H0) represents the assumption that there is no significant difference or effect, while the alternative hypothesis (H1) represents the claim that there is a significant difference or effect.

- When conducting a hypothesis test, the choice between a one-tailed or two-tailed test should be based on the specific research question, prior knowledge, and the directional expectations of the effect being investigated. One-tailed tests are generally more powerful (sensitive) than two-tailed tests, but they should only be used when there is a strong theoretical basis for expecting an effect in a specific direction. Otherwise, a two-tailed test is more appropriate as it allows for the detection of differences in either direction.
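
To make the difference concrete, here is a minimal sketch that turns one test statistic into a one-tailed and a two-tailed p-value. The statistic `t_stat` and the degrees of freedom `df` are hypothetical values chosen for illustration.

```python
from scipy import stats

# Hypothetical test statistic and degrees of freedom (illustrative only)
t_stat = 1.80
df = 24

# One-tailed (H1: parameter is greater): area in the upper tail only
p_one_tailed = stats.t.sf(t_stat, df)

# Two-tailed (H1: parameter is different): area in both tails
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df)

print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

For these numbers the one-tailed p-value falls below 0.05 while the two-tailed p-value does not, which is why the direction must be fixed before looking at the data.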

### **Q3: Explain the concept of Type 1 and Type 2 errors in hypothesis testing. Provide an example scenario for each type of error.**

### ***ANSWER :***

Type 1 and Type 2 errors are potential mistakes that can occur in hypothesis testing. They are associated with rejecting or failing to reject the null hypothesis, respectively. In hypothesis testing, the null hypothesis (H0) represents the assumption of no effect or no difference, while the alternative hypothesis (H1) represents the claim that there is a significant effect or difference.

1. **Type 1 Error (False Positive):**
>
A Type 1 error occurs when we mistakenly reject a true null hypothesis. In other words, we conclude that there is a significant effect or difference when, in reality, there is none. The probability of committing a Type 1 error is denoted by α (alpha), and it is typically set as the significance level, which is the probability of making a Type 1 error under the assumption that the null hypothesis is true.

***Example Scenario:***
 - *Suppose a clinical trial is conducted to test the effectiveness of a new drug in treating a particular disease. The null hypothesis (H0) states that the drug has no effect on the disease, while the alternative hypothesis (H1) claims that the drug is effective. A Type 1 error would occur if the researchers incorrectly reject H0 and conclude that the drug is effective when, in reality, it does not have any therapeutic effect.*

2. **Type 2 Error (False Negative):**
>
A Type 2 error occurs when we fail to reject a false null hypothesis. In this case, there is a real effect or difference in the population, but the statistical test fails to detect it, so the null hypothesis is incorrectly retained. The probability of committing a Type 2 error is denoted by β (beta), and 1 − β is the power of the test.

***Example Scenario:***
 - *Continuing with the previous example, a Type 2 error would occur if the clinical trial fails to detect the effectiveness of the drug even though it actually has a positive effect on the disease. In this situation, the researchers fail to reject the null hypothesis (H0) that the drug has no effect, even though the alternative hypothesis (H1) is true.*

In hypothesis testing, there is often a trade-off between Type 1 and Type 2 errors. By adjusting the significance level (α), researchers can control the risk of committing a Type 1 error, but this may increase the risk of committing a Type 2 error. The goal is to strike an appropriate balance between the two types of errors based on the context and consequences of the specific hypothesis test.
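
The trade-off can be sketched numerically for a one-sided z-test. The setup below (`mu0`, `mu1`, `sigma`, `n`) is entirely hypothetical; it only illustrates that lowering α raises β when everything else is held fixed.

```python
import math
from scipy import stats

# Hypothetical one-sided z-test setup (illustrative only)
mu0 = 0.0        # mean under H0
mu1 = 0.5        # assumed true mean under H1
sigma = 1.0      # known population standard deviation
n = 25           # sample size

def type2_error(alpha):
    """Return beta for the one-sided z-test (H1: mean > mu0) at level alpha."""
    z_crit = stats.norm.ppf(1 - alpha)
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))   # standardized true effect
    # beta = P(fail to reject H0 | true mean is mu1)
    return stats.norm.cdf(z_crit - shift)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> beta = {type2_error(alpha):.3f}")
```

Increasing the sample size is the usual way to reduce β without loosening α.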

### **Q4: Explain Bayes's theorem with an example.**

### ***ANSWER :***

Bayes's theorem is a fundamental concept in probability theory and statistics that allows us to update the probability of an event based on new evidence. It helps us revise our beliefs about the likelihood of an event occurring after considering new information.

The theorem is named after Thomas Bayes, an 18th-century mathematician and Presbyterian minister who first formulated the principle. Bayes's theorem can be mathematically expressed as:

\begin{align}P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \end{align}

***Where:***
>
- \(P(A|B)\) is the conditional probability of event A occurring given that event B has occurred (the posterior probability).
- \(P(B|A)\) is the conditional probability of event B occurring given that event A has occurred (the likelihood).
- \(P(A)\) is the prior probability of event A occurring (before considering any new evidence).
- \(P(B)\) is the marginal (total) probability of event B occurring, whether or not A has occurred.

Now let's see an example to understand Bayes's theorem better:

- **Example:**

Suppose there is a rare disease that affects 1% of the population. You have a test that can detect the disease, and the test is accurate 95% of the time when the disease is present (true positive rate) and 90% of the time when the disease is not present (true negative rate).

Let's calculate the probability that a person has the disease (event A) given that the test result is positive (event B). We want to find \begin{align} P(A|B) .\end{align}

***Step 1: Define the probabilities:***
>
- \(P(A) = 0.01\): prior probability of having the disease (1% of the population has the disease).
- \(P(B|A) = 0.95\): probability of a positive test result given that the person has the disease (true positive rate).
- \(P(\neg A) = 1 - P(A) = 0.99\): prior probability of not having the disease (complement of \(P(A)\)).
- \(P(B|\neg A) = 1 - 0.90 = 0.10\): probability of a positive test result given that the person does not have the disease (false positive rate).

***Step 2: Calculate the denominator P(B) using the law of total probability:***

\begin{align} P(B) &= P(B|A) \cdot P(A) + P(B|\neg A) \cdot P(\neg A) \\ &= (0.95 \cdot 0.01) + (0.10 \cdot 0.99) = 0.0095 + 0.099 = 0.1085 \end{align}

***Step 3: Apply Bayes's theorem to find P(A|B):***

\begin{align}P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} = \frac{0.95 \cdot 0.01}{0.1085} \approx 0.0876 \end{align}

So, given a positive test result, the probability that a person actually has the disease is approximately 8.8% (0.0876). Because the disease is rare, most positive results come from the much larger healthy group, so the posterior stays low despite the test's high true positive rate.

Bayes's theorem is a powerful tool for updating probabilities in light of new evidence and is widely used in various fields, including medicine, machine learning, and decision-making under uncertainty.
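
The calculation in the example can be checked with a few lines of code. This is a minimal sketch; the helper name `posterior` and its parameter names are illustrative and not part of any library.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A) via Bayes's theorem."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Values from the disease-testing example
p_disease_given_positive = posterior(
    prior=0.01, sensitivity=0.95, false_positive_rate=0.10
)
print(f"P(disease | positive test) = {p_disease_given_positive:.4f}")
```

Changing `prior` shows how strongly the result depends on how common the disease is.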

### **Q5: What is a confidence interval? How to calculate the confidence interval, explain with an example.**

### ***ANSWER :***

A confidence interval is a range of values that provides an estimate of the true value of a population parameter, such as the mean or proportion, along with a level of confidence in that estimate. It is a statistical measure used to quantify the uncertainty associated with sample estimates.

In simpler terms, a confidence interval tells us how much we can trust our sample-based estimate to be representative of the entire population. The interval is constructed in such a way that it contains the true population parameter with a specified level of confidence, often denoted by a percentage.

For example, if we have a 95% confidence interval for the mean height of a population, it means that we are 95% confident that the true population mean falls within that interval. More precisely, if we repeated the sampling many times and built an interval each time, about 95% of those intervals would contain the true mean; the remaining 5% would miss it due to sampling variability.

Calculating a Confidence Interval:
The formula for calculating a confidence interval depends on the type of data and the parameter of interest (e.g., mean, proportion). For a population mean with a known standard deviation (σ) or a large sample size (n), the formula is based on the standard normal distribution (z-distribution) and is as follows:

\begin{align} \text{Confidence Interval for the Mean (large sample)}: \end{align}

\begin{align} \text{Lower Limit} = \bar{x} - Z \cdot \frac{\sigma}{\sqrt{n}} \end{align}

\begin{align} \text{Upper Limit} = \bar{x} + Z \cdot \frac{\sigma}{\sqrt{n}} \end{align}

Where:
- \begin{align}\bar{x}\end{align} is the sample mean.
- \begin{align}Z\end{align} is the critical value from the standard normal distribution corresponding to the desired confidence level (e.g., 1.96 for a 95% confidence level).
- \begin{align}\sigma\end{align} is the known population standard deviation (when available).
- \(n\) is the sample size.

For a population mean with an unknown standard deviation (when \(n\) is small), the t-distribution is used, and the formula is similar, but the critical value comes from the t-distribution:

\begin{align}\text{Confidence Interval for the Mean (small sample)}: \end{align}

\begin{align} \text{Lower Limit} = \bar{x} - t \cdot \frac{s}{\sqrt{n}} \end{align}

\begin{align} \text{Upper Limit} = \bar{x} + t \cdot \frac{s}{\sqrt{n}} \end{align}

Where:
- \(s\) is the sample standard deviation.
- \(t\) is the critical value from the t-distribution corresponding to the desired confidence level and the degrees of freedom (df), which is \(n - 1\).

Example:
Suppose we want to estimate the average weight of apples in a shipment. We take a random sample of 50 apples and find that their average weight is 150 grams. Assume that the population standard deviation of apple weights is 10 grams. We want to calculate a 95% confidence interval for the true average weight of apples in the shipment.

Using the formula for a large sample:
\begin{align} Z_{\alpha/2} = Z_{0.025} \approx 1.96 \end{align} (from the standard normal distribution table for a 95% confidence level)
\begin{align}\text{Lower Limit} = 150 - 1.96 \cdot \frac{10}{\sqrt{50}} \approx 147.23 \end{align}
\begin{align} \text{Upper Limit} = 150 + 1.96 \cdot \frac{10}{\sqrt{50}} \approx 152.77 \end{align}

The 95% confidence interval for the true average weight of apples in the shipment is approximately 147.23 grams to 152.77 grams. This means we are 95% confident that the true average weight falls within this interval.
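
The same interval can be computed programmatically. The sketch below wraps the large-sample formula in a small helper; the function name `z_confidence_interval` is illustrative, not a library function.

```python
import math
from scipy import stats

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for a mean when the population sigma is known."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Apple-weight example: mean 150 g, sigma 10 g, n = 50
lower, upper = z_confidence_interval(sample_mean=150, sigma=10, n=50)
print(f"95% CI: {lower:.2f} g to {upper:.2f} g")
```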

### **Q6. Use Bayes' Theorem to calculate the probability of an event occurring given prior knowledge of the event's probability and new evidence. Provide a sample problem and solution.**

### ***ANSWER :***

Let's use Bayes' Theorem to calculate the probability of an event occurring given prior knowledge of the event's probability and new evidence.

Sample Problem:
Suppose you have a bag containing red and blue balls. You know that 30% of the balls in the bag are red (prior probability, \(P(R) = 0.30\)) and the rest are blue (prior probability, \(P(B) = 0.70\)). Alice draws a ball without showing it to you and announces its color.

Alice's calls are not perfectly reliable. Let \(A\) be the event that she announces "red": she does so 60% of the time when the ball is actually red (conditional probability, \(P(A|R) = 0.60\)) and 20% of the time when the ball is actually blue (conditional probability, \(P(A|B) = 0.20\)).

Now Alice announces that the ball is red. Given this new evidence, you want to calculate the probability that the ball is indeed red (posterior probability, \(P(R|A)\)).

Solution using Bayes' Theorem:
We can use Bayes' Theorem to calculate the posterior probability of the ball being red given that Alice announces red:

\begin{align} P(R|A) = \frac{P(A|R) \cdot P(R)}{P(A)} \end{align}

We need to find \(P(A)\), the overall probability that Alice announces red, taking into account both possibilities of the ball being red or blue.

\begin{align} P(A) = P(A|R) \cdot P(R) + P(A|B) \cdot P(B) \end{align}
\begin{align} P(A) = 0.60 \cdot 0.30 + 0.20 \cdot 0.70 \end{align}
\begin{align} P(A) = 0.18 + 0.14 \end{align}
\begin{align} P(A) = 0.32 \end{align}

Now, we can calculate the posterior probability of the ball being red:

\begin{align} P(R|A) = \frac{P(A|R) \cdot P(R)}{P(A)} \end{align}
\begin{align} P(R|A) = \frac{0.60 \cdot 0.30}{0.32} \end{align}
\begin{align} P(R|A) = \frac{0.18}{0.32} \end{align}
\begin{align} P(R|A) = 0.5625 \end{align}

So, given that Alice announces red, the probability that the ball is indeed red is 56.25%, up from the prior of 30%.

Bayes' Theorem allows us to update our beliefs about the probability of an event based on new evidence, combining prior knowledge and conditional probabilities to obtain more accurate and informed posterior probabilities.
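
As a quick cross-check of the arithmetic, the sketch below repeats the calculation with exact fractions from Python's standard library, which avoids floating-point rounding. The variable names are illustrative.

```python
from fractions import Fraction

# Probabilities from the sample problem, kept as exact fractions
p_red = Fraction(30, 100)            # P(R)
p_blue = 1 - p_red                   # P(B)
p_a_given_red = Fraction(60, 100)    # P(A|R)
p_a_given_blue = Fraction(20, 100)   # P(A|B)

# Law of total probability, then Bayes' Theorem
p_a = p_a_given_red * p_red + p_a_given_blue * p_blue
p_red_given_a = p_a_given_red * p_red / p_a

print(p_a, p_red_given_a, float(p_red_given_a))
```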

### **Q7. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation of 5. Interpret the results.**

### ***ANSWER :***

To calculate the 95% confidence interval for the sample data with a mean of 50 and a standard deviation of 5, we can use the formula for a confidence interval for a population mean when the population standard deviation is unknown (using the t-distribution). The formula is:

\begin{align} \text{Confidence Interval} = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} \end{align}

where:
- \(\bar{x}\) is the sample mean,
- \(t\) is the critical value from the t-distribution corresponding to the desired confidence level and the degrees of freedom (df),
- \(s\) is the sample standard deviation, and
- \(n\) is the sample size.

Given:
- Sample Mean \(\bar{x} = 50\)
- Sample Standard Deviation \(s = 5\)
- Sample Size \(n\): not provided in the question
- Confidence Level = 95% (so \(\alpha = 0.05\))

Because the sample size \(n\) is not given, the interval cannot be computed numerically: \(n\) determines both the standard error \(s/\sqrt{n}\) and the degrees of freedom (\(df = n - 1\)) that fix the critical t-value. In terms of \(n\), the interval is:

\begin{align} 50 \pm t_{0.025,\, n-1} \cdot \frac{5}{\sqrt{n}} \end{align}

Interpretation: once \(n\) is known, we can be 95% confident that the true population mean lies within this interval, and the interval narrows as \(n\) grows.
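
To show how the interval behaves once a sample size is supplied, the sketch below evaluates it for a few sample sizes. The values of `n` are assumptions chosen for illustration; the question does not provide one.

```python
import math
from scipy import stats

sample_mean = 50
sample_std = 5
confidence = 0.95

def t_confidence_interval(n):
    """Return the (lower, upper) t-based interval for an assumed sample size n."""
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    margin = t_crit * sample_std / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# The sample sizes below are assumptions; the question does not supply n.
for n in (10, 30, 100):
    lower, upper = t_confidence_interval(n)
    print(f"n = {n:3d}: ({lower:.2f}, {upper:.2f})")
```

The interval stays centered on 50 and shrinks as the assumed sample size increases.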

### **Q8. What is the margin of error in a confidence interval? How does sample size affect the margin of error? Provide an example of a scenario where a larger sample size would result in a smaller margin of error.**

### ***ANSWER :***

The margin of error (MOE) in a confidence interval is a measure of the uncertainty associated with the estimate of a population parameter (such as the mean or proportion) based on a sample. It represents the maximum amount by which the sample estimate is expected to differ from the true population parameter, with a certain level of confidence.

A confidence interval is typically expressed as:

\[ \text{Estimate} \pm \text{Margin of Error} \]

For example, if the sample mean is 50, and the margin of error is ±3, the 95% confidence interval for the population mean would be 50 ± 3, or from 47 to 53.

The margin of error is influenced by two main factors:

1. Sample Size:
Larger sample sizes generally result in smaller margins of error. As the sample size increases, the variability of the sample estimates decreases, leading to a more precise estimate of the population parameter. A larger sample size provides more information about the population, which reduces the uncertainty associated with the estimate.

2. Confidence Level:
The chosen level of confidence also affects the margin of error. Higher confidence levels, such as 99%, will result in larger margins of error because the interval needs to be wider to accommodate the increased level of certainty.

Example Scenario:

Suppose you want to estimate the proportion of people in a city who support a certain political candidate. You take two different random samples, one with 1000 people and another with 5000 people, and ask them whether they support the candidate or not. The results are as follows:

Sample 1 (n = 1000): 600 people support the candidate.
Sample 2 (n = 5000): 3000 people support the candidate.

Both samples give the same estimated proportion of people supporting the candidate: 600/1000 = 0.60 (60%) for Sample 1 and 3000/5000 = 0.60 (60%) for Sample 2.

Let's calculate the margin of error for each sample using a 95% confidence level. The formula for calculating the margin of error for a proportion is:

\begin{align} \text{Margin of Error} = Z \cdot \sqrt{\frac{p(1-p)}{n}} \end{align}

where:
- Z is the critical value from the standard normal distribution corresponding to the desired confidence level (e.g., 1.96 for a 95% confidence level).
- p is the estimated proportion (in this case, 0.60).
- n is the sample size.

For Sample 1 (n = 1000):
\begin{align} \text{Margin of Error} = 1.96 \cdot \sqrt{\frac{0.60(1-0.60)}{1000}} \approx 0.0304 \end{align}

For Sample 2 (n = 5000):
\begin{align} \text{Margin of Error} = 1.96 \cdot \sqrt{\frac{0.60(1-0.60)}{5000}} \approx 0.0136 \end{align}

As we can see, the margin of error is smaller for Sample 2 (about ±1.4 percentage points) than for Sample 1 (about ±3.0 percentage points). This illustrates how a larger sample size leads to a smaller margin of error, resulting in a more precise estimate of the population proportion.
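
The two margins of error can be reproduced with a short helper. This is a minimal sketch using the normal approximation from the formula above; the function name `margin_of_error` is illustrative.

```python
import math
from scipy import stats

def margin_of_error(p, n, confidence=0.95):
    """Return the normal-approximation margin of error for a proportion."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)   # two-sided critical value
    return z * math.sqrt(p * (1 - p) / n)

for n in (1000, 5000):
    print(f"n = {n}: MOE = {margin_of_error(0.60, n):.4f}")
```

Because the margin scales with 1/√n, making the sample five times larger shrinks it by a factor of √5 (about 2.24), not by a factor of 5.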

### **Q9. Calculate the z-score for a data point with a value of 75, a population mean of 70, and a population standard deviation of 5. Interpret the results.**

### ***ANSWER :***

To calculate the z-score for a data point, you can use the formula:

\begin{align} \text{Z-score} = \frac{\text{Data Point} - \text{Population Mean}}{\text{Population Standard Deviation}} \end{align}

Given:
Data Point (X) = 75
Population Mean (μ) = 70
Population Standard Deviation (σ) = 5

Now, let's calculate the z-score:

\begin{align} \text{Z-score} = \frac{75 - 70}{5} = \frac{5}{5} = 1 \end{align}

Interpretation:
The z-score measures how many population standard deviations (5 here) the data point (75) lies from the population mean (70).

A z-score of 1 indicates that the data point is 1 standard deviation above the population mean. In other words, the value of 75 is higher than the average (mean) by 1 standard deviation. A positive z-score means the data point is above the mean, while a negative z-score would indicate it is below the mean.

Z-scores are useful for standardizing data and comparing values from different distributions. A z-score of 1 is considered a moderate deviation from the mean, while larger z-scores would indicate more significant deviations.
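
The calculation, plus one common follow-up, can be sketched in code. The percentile step assumes the population is normally distributed, which the question does not state.

```python
from scipy import stats

x, mu, sigma = 75, 70, 5
z = (x - mu) / sigma

# Assumption: the population is normally distributed
percentile = stats.norm.cdf(z)   # share of values at or below x

print(f"z = {z:.2f}, percentile = {percentile:.4f}")
```

Under that normality assumption, a z-score of 1 places the value at roughly the 84th percentile.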

### **Q10. In a study of the effectiveness of a new weight loss drug, a sample of 50 participants lost an average of 6 pounds with a standard deviation of 2.5 pounds. Conduct a hypothesis test to determine if the drug is significantly effective at a 95% confidence level using a t-test.**

### ***ANSWER :***

```python
import scipy.stats as stats

# Given data
sample_mean = 6
sample_std_dev = 2.5
sample_size = 50
population_mean = 0  # Null hypothesis: the population mean weight loss is zero
alpha = 0.05  # Significance level (1 - 0.95 confidence level)

# Step 1: State the null and alternative hypotheses
# H0: population mean weight loss = 0 (the drug has no effect)
# H1: population mean weight loss != 0 (two-tailed alternative)

# Step 2: Compute the one-sample t-statistic
t_statistic = (sample_mean - population_mean) / (sample_std_dev / (sample_size ** 0.5))

# Step 3: Calculate degrees of freedom
degrees_of_freedom = sample_size - 1

# Step 4: Calculate the critical t-value for a two-tailed test
critical_t_value = stats.t.ppf(1 - alpha / 2, degrees_of_freedom)

# Step 5: Determine the two-tailed p-value (sf = 1 - cdf, more accurate in the tails)
p_value = 2 * stats.t.sf(abs(t_statistic), degrees_of_freedom)

# Step 6: Make a decision based on the p-value
# (equivalent to checking abs(t_statistic) > critical_t_value)
if p_value < alpha:
    print("Reject the null hypothesis. The drug is significantly effective.")
else:
    print("Fail to reject the null hypothesis. There is insufficient evidence that the drug is effective.")
```

Reject the null hypothesis. The drug is significantly effective.


### **Q11. In a survey of 500 people, 65% reported being satisfied with their current job. Calculate the 95% confidence interval for the true proportion of people who are satisfied with their job.**

### ***ANSWER :***

To calculate the 95% confidence interval for the true proportion of people who are satisfied with their job, we can use the formula for a confidence interval for a proportion. The formula is:

\begin{align}\text{Confidence Interval} = \text{Sample Proportion} \pm \text{Margin of Error} \end{align}

where the sample proportion is the proportion from the survey, and the margin of error depends on the desired confidence level and sample size.

Given:
Sample Proportion (p) = 65% = 0.65
Sample Size (n) = 500

Step 1: Calculate the standard error of the proportion (SE):
\begin{align} SE = \sqrt{\frac{p(1-p)}{n}} \end{align}

Step 2: Calculate the margin of error (MOE) using the critical value for a 95% confidence level (Z = 1.96 for a two-tailed test):
\begin{align} MOE = Z \cdot SE \end{align}

Step 3: Calculate the confidence interval:
\begin{align} \text{Lower Limit} = p - MOE \end{align}
\begin{align} \text{Upper Limit} = p + MOE \end{align}

Let's perform the calculations in Python:




```python
import scipy.stats as stats

# Given data
sample_proportion = 0.65
sample_size = 500
confidence_level = 0.95

# Calculate the standard error of the proportion
standard_error = (sample_proportion * (1 - sample_proportion) / sample_size) ** 0.5

# Calculate the critical value for the given confidence level
z_critical = stats.norm.ppf(1 - (1 - confidence_level) / 2)

# Calculate the margin of error
margin_of_error = z_critical * standard_error

# Calculate the confidence interval
lower_limit = sample_proportion - margin_of_error
upper_limit = sample_proportion + margin_of_error

# Display the results
print("95% Confidence Interval for the Proportion of People Satisfied with their Job:")
print("Lower Limit:", lower_limit)
print("Upper Limit:", upper_limit)
```


95% Confidence Interval for the Proportion of People Satisfied with their Job:
Lower Limit: 0.6081925393809212
Upper Limit: 0.6918074606190788


***The 95% confidence interval for the true proportion of people who are satisfied with their job is approximately 0.6082 to 0.6918. This means that we are 95% confident that the true proportion of people satisfied with their job lies within this interval.***

### **Q12. A researcher is testing the effectiveness of two different teaching methods on student performance. Sample A has a mean score of 85 with a standard deviation of 6, while sample B has a mean score of 82 with a standard deviation of 5. Conduct a hypothesis test to determine if the two teaching methods have a significant difference in student performance using a t-test with a significance level of 0.01.**

### ***ANSWER :***

To conduct a hypothesis test to determine if the two teaching methods have a significant difference in student performance, we can use a two-sample independent t-test. The null hypothesis (H0) assumes that there is no significant difference between the two teaching methods, while the alternative hypothesis (H1) claims that there is a significant difference.

Given:

Sample A Mean (X̄A) = 85

Sample A Standard Deviation (sA) = 6

**Sample A Size (nA) = ? (not provided in the question)**

Sample B Mean (X̄B) = 82

Sample B Standard Deviation (sB) = 5

**Sample B Size (nB) = ? (not provided in the question)**

Significance Level (α) = 0.01

Since the sample sizes (nA and nB) are not provided, the t-test cannot be carried out from the given information alone: the sample sizes determine the standard error, the degrees of freedom, and the critical t-value. For illustration, the code below assumes nA = 50 and nB = 60; the conclusion may change with other sample sizes.



```python
import scipy.stats as stats

# Given data for Sample A
sample_mean_a = 85
sample_std_dev_a = 6
sample_size_a = 50  # Not provided in the question; assumed for illustration

# Given data for Sample B
sample_mean_b = 82
sample_std_dev_b = 5
sample_size_b = 60  # Not provided in the question; assumed for illustration

# Significance level
alpha = 0.01

# Conduct the two-sample t-test (pooled variance, i.e. equal variances assumed)
t_statistic, p_value = stats.ttest_ind_from_stats(
    mean1=sample_mean_a, std1=sample_std_dev_a, nobs1=sample_size_a,
    mean2=sample_mean_b, std2=sample_std_dev_b, nobs2=sample_size_b,
    equal_var=True
)

# Determine the critical t-value for a two-tailed test
degrees_of_freedom = sample_size_a + sample_size_b - 2
critical_t_value = stats.t.ppf(1 - alpha / 2, degrees_of_freedom)

# Make a decision based on the p-value
# (equivalent to checking abs(t_statistic) > critical_t_value)
if p_value < alpha:
    print("Reject the null hypothesis. There is a significant difference in student performance between the two teaching methods.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference in student performance between the two teaching methods.")
```


Reject the null hypothesis. There is a significant difference in student performance between the two teaching methods.


### **Q13. A population has a mean of 60 and a standard deviation of 8. A sample of 50 observations has a mean of 65. Calculate the 90% confidence interval for the true population mean.**

### ***ANSWER :***

To calculate the 90% confidence interval for the true population mean, we can use the formula for a confidence interval for a population mean when the population standard deviation is known. The formula is:

\begin{align} \text{Confidence Interval} = \bar{x} \pm Z \cdot \frac{\sigma}{\sqrt{n}} \end{align}

where:
- \(\bar{x}\) is the sample mean,
- \(Z\) is the critical value from the standard normal distribution corresponding to the desired confidence level,
- \(\sigma\) is the population standard deviation, and
- \(n\) is the sample size.

Given:
- Population Mean \(\mu = 60\) (not needed to compute the interval)
- Population Standard Deviation \(\sigma = 8\)
- Sample Mean \(\bar{x} = 65\)
- Sample Size \(n = 50\)
- Confidence Level = 90% (so \(\alpha = 0.10\))

Step 1: Find the critical value (Z) for a 90% confidence level.
For a two-sided 90% interval, 5% of the probability lies in each tail, so the critical value is the 95th percentile of the standard normal distribution, found from a table or a calculator. In this case, Z is approximately 1.645.

Step 2: Calculate the confidence interval:
\begin{align}\text{Lower Limit} = \bar{x} - Z \cdot \frac{\sigma}{\sqrt{n}} \end{align}
\begin{align} \text{Upper Limit} = \bar{x} + Z \cdot \frac{\sigma}{\sqrt{n}} \end{align}

Let's perform the calculations in Python:


```python
import scipy.stats as stats
import math

# Given data
population_mean = 60  # Given in the question but not used in the interval
population_std_dev = 8
sample_mean = 65
sample_size = 50
confidence_level = 0.90

# Calculate the standard error of the mean
standard_error = population_std_dev / math.sqrt(sample_size)

# Calculate the critical value for the given confidence level
z_critical = stats.norm.ppf(1 - (1 - confidence_level) / 2)

# Calculate the confidence interval
lower_limit = sample_mean - z_critical * standard_error
upper_limit = sample_mean + z_critical * standard_error

# Display the results
print("90% Confidence Interval for the True Population Mean:")
print("Lower Limit:", lower_limit)
print("Upper Limit:", upper_limit)
```

90% Confidence Interval for the True Population Mean:
Lower Limit: 63.13906055411732
Upper Limit: 66.86093944588268


***The 90% confidence interval for the true population mean is approximately 63.14 to 66.86. This means that we are 90% confident that the true population mean lies within this interval.***

### **Q14. In a study of the effects of caffeine on reaction time, a sample of 30 participants had an average reaction time of 0.25 seconds with a standard deviation of 0.05 seconds. Conduct a hypothesis test to determine if the caffeine has a significant effect on reaction time at a 90% confidence level using a t-test.**

### ***ANSWER :***

```python
import math
import scipy.stats as stats

# Given data
sample_mean = 0.25
sample_std_dev = 0.05
sample_size = 30
confidence_level = 0.90

# Step 1: State the hypotheses
# H0: population mean reaction time = population_mean
# H1: population mean reaction time != population_mean (two-tailed)
# The question gives no baseline (no-caffeine) mean, so 0 is only a placeholder.
population_mean = 0

# Step 2: Compute the one-sample t-statistic
t_statistic = (sample_mean - population_mean) / (sample_std_dev / math.sqrt(sample_size))

# Step 3: Calculate degrees of freedom
degrees_of_freedom = sample_size - 1

# Step 4: Determine the critical t-value for a two-tailed test
critical_t_value = stats.t.ppf(1 - (1 - confidence_level) / 2, degrees_of_freedom)

# Step 5: Compare the t-statistic and the critical t-value
if abs(t_statistic) > critical_t_value:
    print("Reject the null hypothesis. Caffeine has a significant effect on reaction time.")
else:
    print("Fail to reject the null hypothesis. Caffeine does not have a significant effect on reaction time.")
```


Reject the null hypothesis. Caffeine has a significant effect on reaction time.


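***The test rejects the null hypothesis at the 90% confidence level because the t-statistic (about 27.4) far exceeds the critical t-value (about 1.70). Note, however, that the question gives no baseline reaction time, so the code tests against a mean of 0 seconds; since reaction times are always positive, that null hypothesis is rejected almost automatically. A meaningful test of caffeine's effect would compare 0.25 seconds against a known no-caffeine (control) mean.***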