## Q1: What is the difference between a t-test and a z-test? Provide an example scenario where you would use each type of test.


- A t-test is used when the population standard deviation is unknown and must be estimated from the sample data. It matters most for small samples, where the heavier tails of the t-distribution account for the extra uncertainty in that estimate. For example, if you have 20 students split into an experimental group and a control group and want to determine whether their mean scores differ significantly, you would use a two-sample t-test.

- On the other hand, a z-test is used when the population standard deviation is known, or as an approximation when the sample size is large (usually greater than 30), because the sampling distribution of the mean is then approximately normal. For instance, if you have a large dataset of 500 observations and want to test whether the mean height of a population differs significantly from a known average, you would use a z-test.

***

## Q2: Differentiate between one-tailed and two-tailed tests.


- In a one-tailed test, the hypothesis specifies the direction of the effect. It tests if a parameter is either significantly greater or significantly smaller than the hypothesized value. For example, if you hypothesize that a new drug will increase reaction time, a one-tailed test would determine if the reaction time is significantly greater after administering the drug.

- In contrast, a two-tailed test does not specify the direction of the effect and tests if a parameter is significantly different from the hypothesized value. Using the same example, a two-tailed test would determine if the reaction time is significantly different, either greater or smaller, after administering the drug.
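
The practical consequence of the two choices can be sketched with p-values. This is a minimal illustration, assuming a z statistic of 1.8 (a hypothetical value, not taken from the text) and using only Python's standard library; the function name `p_values` is made up for this example.

```python
from statistics import NormalDist

def p_values(z: float) -> dict:
    """Return one-tailed and two-tailed p-values for a z statistic."""
    std_normal = NormalDist()                # mean 0, standard deviation 1
    greater = 1.0 - std_normal.cdf(z)        # Ha: parameter is greater
    less = std_normal.cdf(z)                 # Ha: parameter is smaller
    two_sided = 2.0 * min(greater, less)     # Ha: parameter is different
    return {"greater": greater, "less": less, "two_sided": two_sided}

result = p_values(1.8)  # hypothetical test statistic
print({name: round(p, 4) for name, p in result.items()})
```

With this statistic the one-tailed ("greater") p-value falls below 0.05 while the two-tailed p-value does not, which is why the direction of the hypothesis has to be fixed before looking at the data.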

***

## Q3: Explain the concept of Type 1 and Type 2 errors in hypothesis testing. Provide an example scenario for each type of error.


- Type 1 error, also known as a false positive, occurs when you reject a null hypothesis that is actually true. For example, taking "the patient is healthy" as the null hypothesis, a medical test that indicates a disease in a patient who is actually healthy commits a Type 1 error.

- Type 2 error, also known as a false negative, occurs when you fail to reject a null hypothesis that is actually false. For example, with "the patient is healthy" as the null hypothesis, a medical test that fails to identify a disease in a patient who is actually sick commits a Type 2 error.
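
Both error rates can be estimated by simulation. The sketch below is illustrative only: it assumes a two-tailed z-test of H0: mean = 0 with known standard deviation 1, samples of size 30, and a true mean of 0.3 for the "H0 is false" case. All of these numbers are assumptions chosen for the example.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mean, trials=2000, n=30, sigma=1.0, alpha=0.05, seed=0):
    """Fraction of simulated two-tailed z-tests of H0: mean = 0 that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type 1 error (rate should be near alpha).
type1_rate = rejection_rate(true_mean=0.0)
# H0 is false: every failure to reject is a Type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.3)
print(type1_rate, type2_rate)
```

The Type 1 rate is controlled by the chosen significance level, while the Type 2 rate depends on the true effect size and the sample size.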

***

## Q4: Explain Bayes's theorem with an example.


- Bayes's theorem describes how to update the probability of an event based on prior knowledge and new evidence. It is defined as:
```
P(A|B) = (P(B|A) * P(A)) / P(B)
```
- For example, let's consider a scenario where you want to calculate the probability of a person having a certain medical condition given a positive test result. Suppose the prior probability of a person having the condition is 0.05, and the probability of a positive test result given that the person has the condition is 0.95. If the overall probability of a positive test result is 0.10, Bayes's theorem gives the updated probability: P(condition | positive) = (0.95 * 0.05) / 0.10 = 0.475.
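
The calculation in this example can be checked with a few lines of Python. The function name `posterior` is a hypothetical helper introduced here, not part of any library.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes's theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Values from the example: P(A) = 0.05, P(B|A) = 0.95, P(B) = 0.10
p = posterior(prior=0.05, likelihood=0.95, evidence=0.10)
print(round(p, 3))  # 0.475
```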

***

## Q5: What is a confidence interval? How do you calculate one? Explain with an example.


- A confidence interval is a range of values that is likely to contain the true population parameter with a specified level of confidence.
```
Confidence Interval = sample mean ± (critical value * standard error)
```
- For example, suppose you have a sample of 100 students, and their average test score is 75 with a standard deviation of 10. You want to calculate a 95% confidence interval for the population mean test score. You would use the appropriate critical value (from the t-distribution for small sample sizes or the z-distribution for large sample sizes). With n = 100 the z-distribution is reasonable, so the critical value is about 1.96 and the standard error is 10 / sqrt(100) = 1, giving 75 ± 1.96, or roughly (73.04, 76.96).
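
A minimal sketch of this calculation, assuming the z-distribution is appropriate for the sample of 100. The helper name `mean_confidence_interval` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """z-based confidence interval for a mean: mean ± z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = mean_confidence_interval(sample_mean=75, sd=10, n=100)
print(f"({low:.2f}, {high:.2f})")  # (73.04, 76.96)
```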

***

## Q6. Use Bayes' Theorem to calculate the probability of an event occurring given prior knowledge of the event's probability and new evidence. Provide a sample problem and solution.


- Suppose a certain medical test is 95% accurate in detecting a disease when it is present. However, the test also has a false positive rate of 3%, meaning that it incorrectly identifies 3% of healthy individuals as having the disease. If the prevalence of the disease in the population is 2%, what is the probability that a person who tests positive actually has the disease?

```
Let's define:
A = Having the disease
B = Testing positive

Given:
P(A) = 0.02 (prevalence of the disease)
P(B|A) = 0.95 (sensitivity: probability of a positive test when the disease is present)
P(B|¬A) = 0.03 (false positive rate)
```

```
We need to calculate P(A|B), the probability of having the disease given a positive test result, using Bayes's theorem:

P(A|B) = (P(B|A) * P(A)) / [P(B|A) * P(A) + P(B|¬A) * P(¬A)]

P(¬A) = 1 - P(A) = 1 - 0.02 = 0.98

P(A|B) = (0.95 * 0.02) / [(0.95 * 0.02) + (0.03 * 0.98)]

P(A|B) = 0.019 / (0.019 + 0.0294) = 0.019 / 0.0484 ≈ 0.3926

So a person who tests positive has only about a 39% chance of actually having the disease: because the disease is rare, false positives among healthy people outnumber true positives.
```
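
The same computation as a short Python sketch. The function and parameter names are illustrative choices, not from the original text.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive) via Bayes's theorem with the law of total probability."""
    true_positives = sensitivity * prevalence                   # P(B|A) * P(A)
    false_positives = false_positive_rate * (1 - prevalence)    # P(B|not A) * P(not A)
    return true_positives / (true_positives + false_positives)

p_disease = posterior_given_positive(prevalence=0.02, sensitivity=0.95,
                                     false_positive_rate=0.03)
print(round(p_disease, 4))  # 0.3926
```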

***

## Q7. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation of 5. Interpret the results.


- To calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation of 5, you need the sample size and the appropriate critical value.
- Assuming a large sample size, you can use the z-distribution and the formula for the confidence interval:
    ```
    Confidence Interval = sample mean ± (critical value * (standard deviation / sqrt(sample size)))
    ```
    
For a 95% confidence level, the critical value is approximately 1.96 (based on the standard normal distribution). Plugging in the values, the confidence interval would be:
```
Confidence Interval = 50 ± (1.96 * (5 / sqrt(sample size)))
```
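Because the problem does not state the sample size, the interval cannot be reduced to numbers. For any given sample size, the interpretation is that you are 95% confident that the true population mean falls within the calculated interval, in the sense that intervals built this way capture the true mean in about 95% of repeated samples.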
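
Since the sample size is missing from the problem, the sketch below evaluates the formula for a few hypothetical sample sizes (100, 400 and 2500 are assumptions made only for illustration), using the approximate critical value 1.96 from the text.

```python
from math import sqrt

Z_95 = 1.96  # approximate 95% critical value from the standard normal distribution

def interval(mean, sd, n, z=Z_95):
    """Confidence interval: mean ± z * (sd / sqrt(n))."""
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

for n in (100, 400, 2500):  # hypothetical sample sizes, not given in the problem
    low, high = interval(50, 5, n)
    print(n, round(low, 3), round(high, 3))
```

For n = 100 this gives 50 ± 0.98, and the interval narrows as n grows.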

***

## Q8. What is the margin of error in a confidence interval? How does sample size affect the margin of error? Provide an example of a scenario where a larger sample size would result in a smaller margin of error.


- The margin of error in a confidence interval represents the maximum expected difference between the sample estimate and the true population parameter. It quantifies the uncertainty associated with the estimate.
- The margin of error is influenced by the sample size. As the sample size increases, the margin of error decreases in proportion to 1 / sqrt(sample size), because the estimate becomes more precise; quadrupling the sample size halves the margin of error.
- For example, let's consider a survey where you want to estimate the proportion of people who support a certain policy with a 95% confidence level. If you have a small sample size, say 100 participants, the margin of error would be larger compared to a survey with a larger sample size, such as 1000 participants. The larger sample size allows for a more accurate estimation and, consequently, a smaller margin of error.
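
To put numbers on the survey example, the sketch below computes the 95% margin of error for both sample sizes. It assumes a sample proportion of 0.5, which is not given in the text and is chosen here because it yields the largest margin of error.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)

moe_100 = margin_of_error(0.5, 100)    # assumed proportion 0.5, n = 100
moe_1000 = margin_of_error(0.5, 1000)  # assumed proportion 0.5, n = 1000
print(round(moe_100, 3), round(moe_1000, 3))  # 0.098 0.031
```

Under this assumption the margin of error shrinks from about ±9.8 percentage points to about ±3.1 percentage points.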

***

## Q9. Calculate the z-score for a data point with a value of 75, a population mean of 70, and a population standard deviation of 5. Interpret the results.


```
z = (data point - population mean) / population standard deviation

In this case:
Data point = 75
Population mean = 70
Population standard deviation = 5

Plugging in the values:
z = (75 - 70) / 5 = 1
```

- The z-score of 1 means the data point lies one standard deviation above the population mean. In general, a positive z-score implies that the data point is above the mean, while a negative z-score implies that it is below the mean.
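
A short sketch of the calculation. The percentile line goes one step beyond the question and assumes the population is normally distributed, which the problem does not state.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

z = z_score(75, 70, 5)
# Assumption: normal population. Fraction of values expected below the data point.
percentile = NormalDist().cdf(z)
print(z, round(percentile, 4))  # 1.0 0.8413
```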

***

## Q10. In a study of the effectiveness of a new weight loss drug, a sample of 50 participants lost an average of 6 pounds with a standard deviation of 2.5 pounds. Conduct a hypothesis test to determine if the drug is significantly effective at a 95% confidence level using a t-test.


To conduct a hypothesis test to determine if the weight loss drug is significantly effective at a 95% confidence level, you would perform the following steps using a t-test:
        
- Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
 - H0: The mean weight loss with the drug is zero (μ = 0).
 - Ha: The mean weight loss with the drug is different from zero (μ ≠ 0).
    
- Step 2: Set the significance level (α) to 0.05.

- Step 3: Calculate the test statistic. In this case, the test statistic is the t-value, given by:
 - t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(sample size))
 - t = (6 - 0) / (2.5 / sqrt(50)) = 6 / 0.3536 ≈ 16.97

- Step 4: Determine the degrees of freedom (df) for the t-distribution. In this case, df = sample size - 1 = 50 - 1 = 49.

- Step 5: Find the critical t-value(s) from the t-distribution table or use statistical software. For a two-tailed test at α = 0.05 and df = 49, the critical t-values are approximately ±2.010.

- Step 6: Compare the calculated t-value with the critical t-value(s) to make a decision. If the calculated t-value falls within the rejection region (i.e., beyond the critical t-value(s)), you reject the null hypothesis. Otherwise, you fail to reject the null hypothesis. Here 16.97 is far beyond 2.010, so you reject the null hypothesis and conclude that the mean weight loss is significantly different from zero.
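
The steps above can be sketched in Python with the numbers from the question. The critical value is hard-coded from a t-table (two-tailed, α = 0.05, df = 49) because the standard library has no t-distribution; the helper name `one_sample_t` is made up for this example.

```python
from math import sqrt

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return the one-sample t statistic and its degrees of freedom."""
    standard_error = sample_sd / sqrt(n)
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1

T_CRIT_DF49 = 2.0096  # t-table value: two-tailed, alpha = 0.05, df = 49

t_stat, df = one_sample_t(sample_mean=6, hypothesized_mean=0, sample_sd=2.5, n=50)
reject_h0 = abs(t_stat) > T_CRIT_DF49
print(round(t_stat, 2), df, reject_h0)  # 16.97 49 True
```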

***

## Q11. In a survey of 500 people, 65% reported being satisfied with their current job. Calculate the 95% confidence interval for the true proportion of people who are satisfied with their job.


```
Confidence Interval = sample proportion ± (critical value * sqrt((sample proportion * (1 - sample proportion)) / sample size))

In this case:
Sample size = 500
Sample proportion = 0.65
```

- The critical value for a 95% confidence level, taken from the standard normal distribution, is approximately 1.96. Plugging in the values, the confidence interval would be:
```
Confidence Interval = 0.65 ± (1.96 * sqrt((0.65 * (1 - 0.65)) / 500))
                    = 0.65 ± (1.96 * 0.0213)
                    ≈ 0.65 ± 0.0418
                    ≈ (0.608, 0.692)
```
- Interpreting the results, you can say that you are 95% confident that the true proportion of people satisfied with their job falls between about 60.8% and 69.2%.
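
A minimal sketch of this interval in Python, using the normal approximation from the formula above. The function name `proportion_ci` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(p_hat=0.65, n=500)
print(f"({low:.3f}, {high:.3f})")  # (0.608, 0.692)
```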

***

## Q12. A researcher is testing the effectiveness of two different teaching methods on student performance. Sample A has a mean score of 85 with a standard deviation of 6, while sample B has a mean score of 82 with a standard deviation of 5. Conduct a hypothesis test to determine if the two teaching methods have a significant difference in student performance using a t-test with a significance level of 0.01.

To conduct a hypothesis test to determine if the two teaching methods have a significant difference in student performance, you would perform the following steps using a t-test with a significance level of 0.01:

- Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).

 - H0: The two teaching methods produce the same mean student performance (μA = μB).
 - Ha: The two teaching methods produce different mean student performance (μA ≠ μB).
- Step 2: Set the significance level (α) to 0.01.

- Step 3: Calculate the test statistic. In this case, the test statistic is the t-value, given by:
 - t = (sample mean A - sample mean B) / sqrt((sample variance A / sample size A) + (sample variance B / sample size B))

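- Step 4: Determine the degrees of freedom (df) for the t-distribution. For the pooled (equal-variance) t-test, df = (sample size A + sample size B) - 2. With the unpooled standard error shown in Step 3 (Welch's t-test), df is instead given by the Welch-Satterthwaite approximation, which is never larger than this value.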

- Step 5: Find the critical t-value from the t-distribution table or use statistical software, for a two-tailed test at α = 0.01 with the degrees of freedom from Step 4. Because the problem does not give the sample sizes, neither the t-value nor the critical value can be evaluated numerically here.

- Step 6: Compare the calculated t-value with the critical t-value to make a decision. If the calculated t-value falls within the rejection region (i.e., beyond the critical t-value), you reject the null hypothesis. Otherwise, you fail to reject the null hypothesis.
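
The question does not state the sample sizes, so the sketch below assumes 31 students per group purely for illustration (this gives df = 60 for the pooled test, a value found in most t-tables). The critical value 2.660 is a t-table entry for a two-tailed test at α = 0.01 and df = 60. With different sample sizes the conclusion could change.

```python
from math import sqrt

def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic with unpooled standard error, plus pooled and Welch df."""
    var_a, var_b = sd_a ** 2 / n_a, sd_b ** 2 / n_b
    t = (mean_a - mean_b) / sqrt(var_a + var_b)
    df_pooled = n_a + n_b - 2
    df_welch = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df_pooled, df_welch

N_A = N_B = 31        # hypothetical sample sizes, not given in the question
T_CRIT_DF60 = 2.660   # t-table value: two-tailed, alpha = 0.01, df = 60

t_stat, df_pooled, df_welch = two_sample_t(85, 6, N_A, 82, 5, N_B)
reject_h0 = abs(t_stat) > T_CRIT_DF60
print(round(t_stat, 2), df_pooled, round(df_welch, 1), reject_h0)
```

Under these assumed sample sizes the t-value is about 2.14, which does not exceed the critical value, so the null hypothesis would not be rejected at the 0.01 level.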

***

## Q13. A population has a mean of 60 and a standard deviation of 8. A sample of 50 observations has a mean of 65. Calculate the 90% confidence interval for the true population mean.


```
Confidence Interval = sample mean ± (critical value * (population standard deviation / sqrt(sample size)))
```
```
In this case:
Sample mean = 65
Sample size = 50
Population standard deviation = 8
Population mean = 60 (given, but not used: the interval is centered on the sample mean)
```
- Because the population standard deviation is known, the critical value comes from the standard normal distribution; for a 90% confidence level it is approximately 1.645. Plugging in the values, the confidence interval would be:
```
Confidence Interval = 65 ± (1.645 * (8 / sqrt(50)))
                    ≈ 65 ± 1.86
                    ≈ (63.14, 66.86)
```
- Interpreting the results, you can say that you are 90% confident that the true population mean falls between about 63.14 and 66.86.
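
A minimal sketch of this calculation. It uses a z critical value, on the grounds that the question states the population standard deviation (8); the helper name `z_interval` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a mean when the population sd (sigma) is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.645 for 90%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_interval(sample_mean=65, sigma=8, n=50, confidence=0.90)
print(f"({low:.2f}, {high:.2f})")  # (63.14, 66.86)
```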

***

## Q14. In a study of the effects of caffeine on reaction time, a sample of 30 participants had an average reaction time of 0.25 seconds with a standard deviation of 0.05 seconds. Conduct a hypothesis test to determine if the caffeine has a significant effect on reaction time at a 90% confidence level using a t-test.

- Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).

 - H0: Caffeine has no significant effect on reaction time. (μ = μ0)
 - Ha: Caffeine has a significant effect on reaction time. (μ ≠ μ0) (Two-tailed test)
- Step 2: Set the significance level (α) to 0.10 (90% confidence level).

- Step 3: Calculate the test statistic. In this case, the test statistic is the t-value, given by:
```
t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(sample size))
```
```
Given:
Sample mean (x̄) = 0.25 seconds
Hypothesized mean (μ0) = 0 (assuming no effect of caffeine)
Sample standard deviation (s) = 0.05 seconds
Sample size (n) = 30
```
```
t = (0.25 - 0) / (0.05 / sqrt(30))
t = 0.25 / (0.05 / 5.477)
t = 0.25 / 0.009129
t ≈ 27.39
```
- Step 4: Determine the degrees of freedom (df) for the t-distribution. In this case, df = sample size - 1 = 30 - 1 = 29.

- Step 5: Find the critical t-value from the t-distribution table or use statistical software. For a two-tailed test at a 90% confidence level and df = 29, the critical t-value is approximately ±1.6991.

- Step 6: Compare the calculated t-value with the critical t-value to make a decision. If the calculated t-value falls within the rejection region (i.e., beyond the critical t-value), you reject the null hypothesis. Otherwise, you fail to reject the null hypothesis.

- Since the calculated t-value (27.39) is far beyond the critical t-value (±1.6991) for a two-tailed test at a 90% confidence level, we reject the null hypothesis. This suggests that caffeine has a significant effect on reaction time based on the given data. The result depends on the hypothesized mean of 0 seconds assumed above; a test against a measured baseline reaction time without caffeine would be more informative.
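
The test can be reproduced in a few lines of Python. Evaluating the statistic without intermediate rounding gives about 27.39. The critical value is hard-coded from the t-table (two-tailed, α = 0.10, df = 29), and the hypothesized mean of 0 follows the assumption made in the steps above.

```python
from math import sqrt

sample_mean = 0.25   # seconds
mu0 = 0.0            # hypothesized mean, as assumed in the worked steps
sample_sd = 0.05     # seconds
n = 30

T_CRIT_DF29 = 1.6991  # t-table value: two-tailed, alpha = 0.10, df = 29

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - mu0) / standard_error
reject_h0 = abs(t_stat) > T_CRIT_DF29
print(round(t_stat, 2), n - 1, reject_h0)  # 27.39 29 True
```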