### Q1: What is the difference between a t-test and a z-test? Provide an example scenario where you would use each type of test.

A t-test and a z-test are both statistical tests used to make inferences about population parameters based on sample data, but they are used under different circumstances due to differences in assumptions and conditions.

1. **T-test:**
   - A t-test is used when the population standard deviation is unknown and must be estimated from the sample. This matters most when the sample size is small (typically less than 30).
   - It assumes that the data are approximately normally distributed, or that the sample is large enough for the sample mean to be approximately normal.
   - Example scenario: Suppose you want to compare the mean heights of two groups of students, one group from School A and the other from School B. You collect a sample of 20 students from each school and measure their heights. Since the sample sizes are relatively small and the population standard deviation is unknown, you would use a t-test to determine if there is a significant difference between the mean heights of the two groups.

2. **Z-test:**
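   - A z-test is used when the population standard deviation is known, or when the sample size is large (typically greater than 30) so that the sample standard deviation is a close estimate of it.
   - It assumes that the sample mean is normally distributed, which holds if the population is normal or, approximately, if the sample is large (central limit theorem).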
   - Example scenario: Consider a scenario where a pharmaceutical company develops a new drug to reduce blood pressure. The company wants to test if the mean reduction in blood pressure for patients taking the new drug is significantly different from zero. They conduct a clinical trial with a large sample size (more than 30) and measure the reduction in blood pressure for each patient. Since the sample size is large and the population standard deviation is known (or can be assumed to be known), a z-test would be appropriate to determine if there is a significant difference in mean reduction in blood pressure.

In summary, the choice between a t-test and a z-test depends on the sample size, whether the population standard deviation is known, and the distribution of the data. If the population standard deviation is unknown, especially with a small sample, a t-test is typically used. If the population standard deviation is known, a z-test is used. With a large sample the two tests give nearly identical results.
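As a minimal sketch of this rule, the function below computes a z statistic when the population standard deviation is supplied and a t statistic otherwise. The function name `one_sample_statistic` and the height data are illustrative assumptions, not part of the original answer.

```python
import math
import statistics

def one_sample_statistic(sample, mu0, sigma=None):
    """Return (test name, statistic, degrees of freedom or None)."""
    n = len(sample)
    mean = statistics.mean(sample)
    if sigma is not None:
        # Population SD known: z statistic, compared against the standard normal.
        return "z", (mean - mu0) / (sigma / math.sqrt(n)), None
    # Population SD unknown: estimate it with the sample SD (n - 1 denominator)
    # and compare against a t-distribution with n - 1 degrees of freedom.
    s = statistics.stdev(sample)
    return "t", (mean - mu0) / (s / math.sqrt(n)), n - 1

heights = [160.0, 165.0, 170.0, 175.0, 180.0]  # hypothetical heights in cm
print(one_sample_statistic(heights, mu0=165.0))             # t statistic, df = 4
print(one_sample_statistic(heights, mu0=165.0, sigma=8.0))  # z statistic
```

The only difference between the two branches is which standard deviation goes into the standard error and which reference distribution the statistic is compared against.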

### Q2: Differentiate between one-tailed and two-tailed tests.

One-tailed and two-tailed tests are two types of hypothesis tests used in statistics, differing in the directionality of the hypothesis being tested.

1. **One-tailed test:**
   - In a one-tailed test, the null hypothesis states that there is no effect or that the effect lies in the opposite direction of the one predicted.
   - The alternative hypothesis is directional, specifying that the effect or difference is either greater than or less than a certain value, but not both.
   - The critical region for rejection of the null hypothesis is located entirely in one tail of the probability distribution.
   - One-tailed tests are typically used when there is a specific hypothesis or directional prediction about the relationship between variables.
   - Example: Testing whether a new drug increases average test scores. The null hypothesis could be that the drug has no effect or decreases test scores, while the alternative hypothesis could be that the drug increases test scores. Here, we're only interested in whether the drug improves scores, not if it reduces them.

2. **Two-tailed test:**
   - In a two-tailed test, the null hypothesis does not specify a direction of the effect or difference; it only states that there is no difference or no effect.
   - The alternative hypothesis is non-directional, stating that the effect or difference is simply not equal to a certain value.
   - The critical region for rejection of the null hypothesis is split between both tails of the probability distribution.
   - Two-tailed tests are used when there is no specific hypothesis about the direction of the effect, and you want to determine if there is a difference or effect, regardless of its direction.
   - Example: Testing whether a coin is fair (i.e., has an equal probability of landing heads or tails). The null hypothesis could be that the coin is fair (p = 0.5), while the alternative hypothesis could be that the coin is not fair (p ≠ 0.5). Here, we're interested in detecting any deviation from fairness, whether the coin tends to land more on heads or tails.

In summary, one-tailed tests are used when there is a specific directional hypothesis, while two-tailed tests are used when there is no directional hypothesis or when you want to test for the possibility of effects in both directions.
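To make the difference concrete, here is a small sketch that turns one z statistic into one-tailed and two-tailed p-values using the standard normal distribution. The helper name `p_values` and the statistic 1.8 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def p_values(z):
    """One- and two-tailed p-values for a z statistic under the standard normal."""
    std_normal = NormalDist()
    upper = 1.0 - std_normal.cdf(z)      # H1: parameter is greater than the null value
    lower = std_normal.cdf(z)            # H1: parameter is less than the null value
    two_sided = 2.0 * min(upper, lower)  # H1: parameter differs from the null value
    return {"upper": upper, "lower": lower, "two_sided": two_sided}

result = p_values(1.8)  # hypothetical test statistic
print(result)
```

With this statistic the upper-tailed p-value falls below 0.05 while the two-tailed p-value does not, so the same data can lead to different decisions depending on which test was chosen before looking at the data.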

### Q3: Explain the concept of Type 1 and Type 2 errors in hypothesis testing. Provide an example scenario for each type of error.

In hypothesis testing, Type I and Type II errors are two potential mistakes that can occur when interpreting the results of a statistical test.

1. **Type I Error (False Positive):**
   - A Type I error occurs when the null hypothesis is incorrectly rejected when it is actually true.
   - In other words, it is the mistake of concluding that there is a significant effect or difference when there is actually no effect or difference in the population.
   - The probability of committing a Type I error is denoted by α (alpha), and it represents the significance level of the test.
   - Example scenario: Suppose a pharmaceutical company is testing a new drug for effectiveness in treating a certain condition. The null hypothesis (H0) states that the drug has no effect. If the company incorrectly rejects the null hypothesis based on the data from a clinical trial, claiming that the drug is effective, when in reality, it's not, this would be a Type I error.

2. **Type II Error (False Negative):**
   - A Type II error occurs when the null hypothesis is incorrectly not rejected when it is actually false.
   - In other words, it is the mistake of failing to detect a significant effect or difference when one truly exists in the population.
   - The probability of committing a Type II error is denoted by β (beta); the quantity 1 − β is the power of the test.
   - Example scenario: Continuing with the pharmaceutical company example, suppose the drug being tested is genuinely effective in treating the condition, but the clinical trial fails to detect this effect. In this case, the null hypothesis (H0) stating that the drug has no effect is not rejected, leading to a Type II error. Patients who could benefit from the drug may not receive it due to the failure to detect its effectiveness.

In summary, Type I error involves incorrectly concluding that there is an effect or difference when there isn't (false positive), while Type II error involves failing to detect an effect or difference when there actually is one (false negative). Both types of errors are important considerations in hypothesis testing and can have real-world consequences, especially in fields like medicine, where incorrect decisions based on statistical tests can affect people's health and well-being.
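Both error rates can be estimated by simulation. The sketch below repeatedly runs a two-sided z-test of H0: mean = 0 on simulated data. The function name `rejection_rate`, the sample size, the effect size of 0.3, and the number of trials are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mean, n=30, sigma=1.0, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated two-sided z-tests of H0: mean = 0 that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true here, so every rejection is a Type I error (rate should be near alpha).
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.3)
print(type_1_rate, type_2_rate)
```

Lowering `alpha` in this sketch reduces the Type I rate but raises the Type II rate, while increasing `n` reduces the Type II rate without changing the Type I rate.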

### Q4: Explain Bayes's theorem with an example.

![image.png](attachment:image.png)
![image-2.png](attachment:image-2.png)
![image-3.png](attachment:image-3.png)

### Q5: What is a confidence interval? How to calculate the confidence interval, explain with an example.

A confidence interval is a range of values constructed from sample data so that, across repeated samples, a stated proportion of such intervals (the confidence level) would contain the true population parameter. It provides a measure of the uncertainty associated with estimating the population parameter from a sample.

To calculate a confidence interval, you typically follow these steps:

1. **Select a Confidence Level:** Determine the desired level of confidence, often denoted by \(1 - \alpha\), where \(\alpha\) is the significance level (the probability of making a Type I error). Common confidence levels include 90%, 95%, and 99%.

2. **Choose a Statistical Distribution:** Based on the sample size and assumptions about the population, select an appropriate probability distribution. For large sample sizes, the normal distribution is commonly used. For smaller sample sizes or when the population standard deviation is unknown, the t-distribution is used.

3. **Calculate the Sample Statistic:** Collect sample data and calculate the sample statistic of interest, such as the sample mean (\(\bar{x}\)), sample proportion (\(p\)), or sample standard deviation (\(s\)).

4. **Determine the Standard Error:** Calculate the standard error of the sample statistic. The standard error quantifies the variability of the sample statistic and is typically calculated using the formula:

![image-2.png](attachment:image-2.png)

5. **Find the Critical Value:** Determine the critical value corresponding to the selected confidence level and the chosen distribution. For example, if using the normal distribution, find the z-value; if using the t-distribution, find the t-value with appropriate degrees of freedom.

6. **Calculate the Margin of Error:** Multiply the standard error by the critical value to find the margin of error. The margin of error represents the maximum likely difference between the sample statistic and the population parameter.

7. **Compute the Confidence Interval:** Use the sample statistic and the margin of error to construct the confidence interval. This involves adding and subtracting the margin of error from the sample statistic.

Now, let's illustrate this process with an example:

Suppose you want to estimate the average score on a standardized test for all students in a school. You collect a random sample of 100 students and find that the sample mean score is 85 with a sample standard deviation of 10.

![image.png](attachment:image.png)


- Lower Bound = Sample Mean - Margin of Error = 85 - 1.96 = 83.04
- Upper Bound = Sample Mean + Margin of Error = 85 + 1.96 = 86.96

Therefore, the 95% confidence interval for the true average test score for all students in the school is approximately 83.04 to 86.96. This means that we are 95% confident that the true population mean lies within this range.
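The same calculation can be reproduced in a few lines of Python. This is a sketch that uses the normal critical value, as in the example above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, sample_mean, sample_sd = 100, 85.0, 10.0
confidence = 0.95

standard_error = sample_sd / math.sqrt(n)                # 10 / sqrt(100) = 1.0
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
margin = z_crit * standard_error
lower, upper = sample_mean - margin, sample_mean + margin
print(f"{lower:.2f} to {upper:.2f}")  # 83.04 to 86.96
```

Because the standard deviation here is estimated from the sample, a t critical value with 99 degrees of freedom (about 1.98) would give a slightly wider interval; with n = 100 the difference is small.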

### Q6. Use Bayes' Theorem to calculate the probability of an event occurring given prior knowledge of the event's probability and new evidence. Provide a sample problem and solution.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)
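For readers who want to check a Bayes' theorem calculation numerically, here is a minimal sketch. The numbers (1% prior, 95% true positive rate, 5% false positive rate) and the function name `posterior` are hypothetical and are not necessarily the ones used in the worked problem above.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem for a binary hypothesis H and evidence E."""
    # P(E) by the law of total probability: P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical screening test: P(H) = 0.01, P(E|H) = 0.95, P(E|not H) = 0.05
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, the posterior stays low in this hypothetical case because the prior probability is small.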

### Q7. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation of 5. Interpret the results.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

### Q8. What is the margin of error in a confidence interval? How does sample size affect the margin of error? Provide an example of a scenario where a larger sample size would result in a smaller margin of error.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

As we can see, the margin of error is smaller for the sample with the larger sample size, because the standard error shrinks in proportion to 1/√n. The estimate of the population parameter (average commute time) is therefore more precise with the larger sample: the margin of error is smaller and the confidence interval is narrower.
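The effect of sample size can be checked with a short sketch. The standard deviation of 10 minutes and the sample sizes 50 and 500 are hypothetical values, and `margin_of_error` is an illustrative helper that uses the normal critical value.

```python
import math
from statistics import NormalDist

def margin_of_error(sd, n, confidence=0.95):
    """Margin of error for a mean, using the normal critical value."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_crit * sd / math.sqrt(n)

small_sample = margin_of_error(sd=10.0, n=50)   # hypothetical: 50 commuters
large_sample = margin_of_error(sd=10.0, n=500)  # hypothetical: 500 commuters
print(small_sample, large_sample)
```

Multiplying the sample size by 10 divides the margin of error by √10 (about 3.16), not by 10, which is why halving the margin of error requires roughly four times as many observations.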

### Q9. Calculate the z-score for a data point with a value of 75, a population mean of 70, and a population standard deviation of 5. Interpret the results.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)
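The z-score for these values can be verified directly. The sketch below also converts it to a percentile, which assumes the population is normally distributed (an assumption not stated in the question).

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mu) / sigma

z = z_score(x=75, mu=70, sigma=5)
print(z)  # 1.0

# Under a normal population, the share of values below the data point:
share_below = NormalDist().cdf(z)
print(round(share_below, 4))  # 0.8413
```

A z-score of 1.0 means the data point is one standard deviation above the population mean.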

### Q10. In a study of the effectiveness of a new weight loss drug, a sample of 50 participants lost an average of 6 pounds with a standard deviation of 2.5 pounds. Conduct a hypothesis test to determine if the drug is significantly effective at a 95% confidence level using a t-test.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)
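The test statistic can be recomputed with the sketch below. It assumes a null hypothesis of no average weight loss (μ = 0) and a one-sided alternative (μ > 0), which the question does not state explicitly. The critical value is hard-coded from a t-table for 49 degrees of freedom rather than computed.

```python
import math

n, sample_mean, sample_sd = 50, 6.0, 2.5
mu0 = 0.0  # assumed null value: no average weight loss

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / standard_error
df = n - 1

# Approximate t-table value for df = 49, one-sided alpha = 0.05 (hard-coded assumption).
t_crit = 1.677
reject_h0 = t_stat > t_crit
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is far above the critical value, so the conclusion would be the same with the two-sided critical value (about 2.01).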


### Q11. In a survey of 500 people, 65% reported being satisfied with their current job. Calculate the 95% confidence interval for the true proportion of people who are satisfied with their job.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)
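As a check, the interval can be computed with the normal approximation for a proportion. This is a sketch; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p_hat = 500, 0.65

z_crit = NormalDist().inv_cdf(0.975)                 # 95% two-sided, about 1.96
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)  # sqrt(0.65 * 0.35 / 500)
margin = z_crit * standard_error
lower, upper = p_hat - margin, p_hat + margin
print(f"{lower:.3f} to {upper:.3f}")  # 0.608 to 0.692
```

The normal approximation is reasonable here because both n·p̂ = 325 and n·(1 − p̂) = 175 are large.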

### Q12. A researcher is testing the effectiveness of two different teaching methods on student performance. Sample A has a mean score of 85 with a standard deviation of 6, while sample B has a mean score of 82 with a standard deviation of 5. Conduct a hypothesis test to determine if the two teaching methods have a significant difference in student performance using a t-test with a significance level of 0.01.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

### Q13. A population has a mean of 60 and a standard deviation of 8. A sample of 50 observations has a mean of 65. Calculate the 90% confidence interval for the true population mean.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)
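The interval can be reproduced with the sketch below, which uses the z critical value because the population standard deviation (8) is given. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, sample_mean, sigma = 50, 65.0, 8.0
confidence = 0.90

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.645
standard_error = sigma / math.sqrt(n)
margin = z_crit * standard_error
lower, upper = sample_mean - margin, sample_mean + margin
print(f"{lower:.2f} to {upper:.2f}")  # 63.14 to 66.86
```

The interval is centered on the sample mean of 65, and the stated population mean of 60 falls outside it.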

### Q14. In a study of the effects of caffeine on reaction time, a sample of 30 participants had an average reaction time of 0.25 seconds with a standard deviation of 0.05 seconds. Conduct a hypothesis test to determine if the caffeine has a significant effect on reaction time at a 90% confidence level using a t-test.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)