Q1: What is the difference between a t-test and a z-test? Provide an example scenario where you would
use each type of test.

A1: Both t-tests and z-tests are used in statistical hypothesis testing to determine whether there is a significant difference between a sample mean and a population mean. The main difference between the two lies in their underlying assumptions and in when each is used:

Z-test:
* A z-test is used when the population standard deviation is known.
* It assumes that the sample size is large (typically n ≥ 30) or that the population is normally distributed, so that the sample mean is approximately normal.
* The critical value for the test statistic is obtained from the standard normal distribution (Z-distribution).


Example Scenario: Suppose you want to test if the average weight of a sample of dogs of a certain breed is significantly different from the national average weight for that breed, and both the national average and the population standard deviation are known. If you have a large enough sample, or the weights are approximately normally distributed, you can use a z-test.

T-test:
* A t-test is used when the population standard deviation is unknown.
* It is most useful for small sample sizes, where it assumes the data are approximately normally distributed. For large samples it gives nearly the same result as a z-test.
* The critical value for the test statistic is obtained from the t-distribution with n - 1 degrees of freedom, which has fatter tails than the standard normal distribution.


Example Scenario: Let's say you want to test if a new teaching method has improved the test scores of a group of students. You have limited data and don't know the population's standard deviation. In this case, you would use a t-test to compare the average test scores before and after implementing the new teaching method.
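The two test statistics differ only in which standard deviation goes into the denominator. Here is a minimal sketch in Python, assuming made-up dog weights and hypothetical values for `mu0` and `sigma` (none of these numbers come from the scenarios above):

```python
import math
import statistics

def z_statistic(sample, mu0, sigma):
    """One-sample z statistic: the population standard deviation sigma is known."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))

def t_statistic(sample, mu0):
    """One-sample t statistic: sigma is estimated by the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(n))

# Hypothetical data: weights in kg, with an assumed null mean and known sigma
weights = [29.5, 31.0, 30.2, 32.1, 28.8, 30.9, 31.4, 29.9]
z = z_statistic(weights, mu0=30.0, sigma=1.2)
t = t_statistic(weights, mu0=30.0)

print(f"z = {z:.3f} (compare with the standard normal distribution)")
print(f"t = {t:.3f} (compare with the t-distribution, df = {len(weights) - 1})")
```

The z statistic is compared with a standard normal critical value, while the t statistic is compared with a t critical value for n - 1 degrees of freedom.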

Q2: Differentiate between one-tailed and two-tailed tests.

A2: 
    
One-tailed test:
* Also known as a directional test.
* It tests for a specific direction of difference between the sample and the population mean.
* The critical region is on one side of the distribution (either the upper tail or the lower tail).


Example: A pharmaceutical company develops a new drug and believes it will shorten participants' reaction times. A one-tailed test is used to check if the reaction times are significantly faster with the new drug.

Two-tailed test:
* Also known as a non-directional test.
* It tests for any significant difference between the sample and the population mean, regardless of the direction.
* The critical region is divided into two sides of the distribution (both the upper and lower tails).

Example: An electronics manufacturer wants to test if a new manufacturing process has changed the mean lifespan of their products. A two-tailed test is used to determine if there is a significant difference in either direction (longer or shorter lifespan).
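The practical difference shows up in the p-value. The sketch below uses Python's standard-library `statistics.NormalDist` and a hypothetical z statistic of 1.8 (an assumed value, not taken from either example) to compare the one-tailed and two-tailed p-values:

```python
from statistics import NormalDist

def p_values(z):
    """Return (upper-tail, lower-tail, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()          # mean 0, standard deviation 1
    upper = 1 - std_normal.cdf(z)      # Ha: mean is greater than the null value
    lower = std_normal.cdf(z)          # Ha: mean is less than the null value
    two_tailed = 2 * min(upper, lower) # Ha: mean differs in either direction
    return upper, lower, two_tailed

upper, lower, two_tailed = p_values(1.8)  # hypothetical test statistic
print(f"upper-tail p = {upper:.4f}")
print(f"two-tailed p = {two_tailed:.4f}")
```

With this assumed statistic, the upper-tail p-value falls below 0.05 while the two-tailed p-value does not, which is why the choice of test has to be made before looking at the data.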

Q3: Explain the concept of Type 1 and Type 2 errors in hypothesis testing. Provide an example scenario for
each type of error.

A3:

Type 1 error (False Positive):
* Occurs when we reject a true null hypothesis.
* In other words, we conclude that there is a significant effect when there isn't one in the population.
* The probability of making a Type 1 error is denoted by the symbol "α" (alpha) and is typically set as the significance level (e.g., 0.05).


Example Scenario: In a criminal trial, the null hypothesis is that the defendant is innocent. Making a Type 1 error would mean wrongly convicting an innocent person.

Type 2 error (False Negative):
* Occurs when we fail to reject a false null hypothesis.
* In other words, we conclude that there is no significant effect when there is one in the population.
* The probability of making a Type 2 error is denoted by the symbol "β" (beta).


Example Scenario: In a medical test for a disease, the null hypothesis is that the person is healthy. Making a Type 2 error would mean failing to diagnose a person who actually has the disease.
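Both error rates can be estimated by simulation. The following is a rough sketch, assuming a two-tailed z-test with known sigma = 1, n = 25, α = 0.05, and a hypothetical true effect of 0.5 for the Type 2 case; all of these settings are illustrative choices, not part of the question.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25, alpha=0.05,
                   trials=20000, seed=0):
    """Fraction of simulated samples in which a two-tailed z-test rejects H0: mean = mu0."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type 1 error, so the rate should be near alpha
type1_rate = rejection_rate(true_mean=0.0)

# H0 is false (assumed true mean 0.5): every failure to reject is a Type 2 error
type2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated Type 1 error rate: {type1_rate:.3f}")
print(f"estimated Type 2 error rate: {type2_rate:.3f}")
```

Lowering α reduces the Type 1 error rate but, for a fixed sample size, raises the Type 2 error rate.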

Q4: Explain Bayes's theorem with an example.

A4: Bayes's Theorem is a fundamental concept in probability theory that allows us to update the probability of an event occurring based on new evidence. The formula is as follows:

P(A|B) = [P(B|A) * P(A)] / P(B)

Where:

* P(A|B) is the posterior probability of event A given evidence B.
* P(B|A) is the likelihood of evidence B given event A.
* P(A) is the prior probability of event A (before considering any evidence).
* P(B) is the probability of evidence B.

Example: Let's say we have a medical test for a rare disease, and we know the test's accuracy:

* The probability of having the disease (prior probability) P(A) is 0.01 (1% of the population has the disease).
* The probability of the test correctly detecting the disease (true positive rate) P(B|A) is 0.95 (95%).
* The probability of the test giving a false positive result (detecting the disease when the person is healthy) is P(B|~A) = 0.05 (5%).

Now, suppose a person receives a positive test result (evidence B). We want to calculate the probability that the person actually has the disease (P(A|B)).

Using Bayes's Theorem:

P(A|B) = [P(B|A) * P(A)] / P(B)

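The denominator is the total probability of a positive result: P(B) = P(B|A) * P(A) + P(B|~A) * P(~A), with P(~A) = 1 - 0.01 = 0.99.

P(A|B) = [0.95 * 0.01] / (0.95 * 0.01 + 0.05 * 0.99)

P(A|B) = 0.0095 / 0.059

P(A|B) ≈ 0.161

So, given a positive test result, the probability that the person actually has the disease is only about 16.1%. Because the disease is rare, most positive results come from the much larger group of healthy people.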
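The calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative choices:

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(A|B) from Bayes's theorem for a binary event A and a positive test B."""
    # Total probability of a positive result: P(B|A)P(A) + P(B|~A)P(~A)
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence

p = posterior(prior=0.01, true_positive_rate=0.95, false_positive_rate=0.05)
print(f"P(disease | positive test) = {p:.3f}")
```

Evaluating the formula with the values stated in the example gives a posterior of about 0.161.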

Q5: What is a confidence interval? How to calculate the confidence interval, explain with an example.

A5: A confidence interval (CI) is a range of values within which we are reasonably confident the true population parameter lies. It provides a measure of the uncertainty associated with estimating the population parameter from a sample.

To calculate a confidence interval for the population mean (μ), we need four pieces of information:

* Sample mean (x̄) - the average of the sample data.
* Sample standard deviation (s) - the measure of variability within the sample.
* Sample size (n) - the number of observations in the sample.
* Confidence level (CL) - the level of confidence we want in our estimate, typically expressed as a percentage.

The formula for the confidence interval is:

CI = x̄ ± Z * (s / √n)

Where:
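* Z is the critical value from the standard normal distribution corresponding to the chosen confidence level. (For small samples with an unknown population standard deviation, the critical value from the t-distribution with n - 1 degrees of freedom is used instead.)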
* n is the sample size.

Example: Let's say we have a sample of 100 students' test scores, and we want to calculate a 95% confidence interval for the population mean test score. The sample mean is 85, and the sample standard deviation is 10.

1. Find the critical value (Z) for a 95% confidence level. For a 95% confidence level, the critical value is approximately 1.96 (you can find this value from a standard normal distribution table).

2. Calculate the confidence interval:

CI = 85 ± 1.96 * (10 / √100) = 85 ± 1.96

Interpretation: The 95% confidence interval for the population mean test score is (83.04, 86.96). This means we are 95% confident that the true population mean test score lies within this range.
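The same interval can be computed in Python. This sketch uses the standard-library `statistics.NormalDist` to look up the critical value instead of a table; the function name `mean_confidence_interval` is an illustrative choice:

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Large-sample (z-based) confidence interval for a population mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

low, high = mean_confidence_interval(mean=85, sd=10, n=100)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

For the example values (x̄ = 85, s = 10, n = 100) the margin is about 1.96, so the interval runs from roughly 83.04 to 86.96.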

Q6. Use Bayes' Theorem to calculate the probability of an event occurring given prior knowledge of the
event's probability and new evidence. Provide a sample problem and solution.

A6: Suppose we have prior knowledge about the probability of an event (A) occurring, and we receive new evidence (B). Bayes's Theorem allows us to update our probability estimate based on the evidence. The formula is the same as mentioned earlier:

P(A|B) = [P(B|A) * P(A)] / P(B)

Example: Let's consider a scenario of a rare disease and a medical test:

* Prior Probability: The probability of a randomly selected person having the disease (A) is 0.01 (1%).
* Test Accuracy: The probability of the test correctly detecting the disease (B|A) is 0.95 (95%).
* False Positive Rate: The probability of the test giving a false positive result (B|~A) is 0.05 (5%).

Now, let's say an individual tests positive for the disease (evidence B). We want to calculate the updated probability that the person actually has the disease (P(A|B)).

Using Bayes's Theorem:

P(A|B) = [P(B|A) * P(A)] / P(B)

Here P(B) = P(B|A) * P(A) + P(B|~A) * P(~A), with P(~A) = 1 - 0.01 = 0.99.

P(A|B) = [0.95 * 0.01] / (0.95 * 0.01 + 0.05 * 0.99)

P(A|B) = 0.0095 / 0.059

P(A|B) ≈ 0.161

So, given a positive test result, the updated probability that the person actually has the disease is approximately 16.1%, up from the prior of 1%.
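Updating can be repeated: the posterior after one piece of evidence becomes the prior for the next. The sketch below applies the formula twice, under the added assumption (not part of the original problem) that a second positive result is independent of the first given the person's true disease status:

```python
def update(prior, true_positive_rate, false_positive_rate):
    """One Bayes update of P(disease) after a positive test result."""
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence

belief = 0.01   # prior probability from the example
history = []
for _ in range(2):  # two positive results in a row (hypothetical scenario)
    belief = update(belief, true_positive_rate=0.95, false_positive_rate=0.05)
    history.append(belief)

print([round(b, 3) for b in history])
```

Under that independence assumption, the probability rises from 1% to about 16.1% after the first positive result and to about 78.5% after the second.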


Q7. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation
of 5. Interpret the results.

A7: Given a sample of data with a mean of 50 and a standard deviation of 5, we can calculate the 95% confidence interval using the formula mentioned earlier:

CI = x̄ ± Z * (s / √n)

Where:

x̄ = sample mean (50)

s = sample standard deviation (5)

n = sample size (unknown)

The critical value for a 95% confidence level (Z) can be obtained from the standard normal distribution table, and it is approximately 1.96.

CI = 50 ± 1.96 * (5 / √n)

Interpretation: The 95% confidence interval for the population mean is (50 - 1.96 * (5 / √n), 50 + 1.96 * (5 / √n)).

To interpret the results, you would need to know the actual sample size (n) to compute the specific interval bounds.
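Since the question does not give n, the sketch below evaluates the formula for a few hypothetical sample sizes (25, 100, and 400 are assumptions chosen for illustration) using the critical value 1.96 quoted above:

```python
import math

Z_95 = 1.96  # critical value for 95% confidence, as quoted in the text

def ci_for_n(n, mean=50.0, sd=5.0, z=Z_95):
    """95% confidence interval for the mean, given a sample size n."""
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

for n in (25, 100, 400):  # hypothetical sample sizes
    low, high = ci_for_n(n)
    print(f"n = {n:3d}: ({low:.2f}, {high:.2f})")
```

For example, if n were 100 the interval would be (49.02, 50.98), and it narrows as n grows.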

Q8. What is the margin of error in a confidence interval? How does sample size affect the margin of error?
Provide an example of a scenario where a larger sample size would result in a smaller margin of error.

A8: The margin of error (MOE) is a measure of the uncertainty in the estimate provided by a confidence interval. It is the half-width of the interval: the amount added to and subtracted from the sample estimate to obtain the upper and lower bounds. The larger the margin of error, the less precise the estimate.

The formula for calculating the margin of error is:

MOE = Z * (s / √n)

Where:

Z is the critical value from the standard normal distribution corresponding to the chosen confidence level.

s is the sample standard deviation.

n is the sample size.

Relationship with Sample Size:

The margin of error is inversely proportional to the square root of the sample size. As the sample size increases, the margin of error decreases, resulting in a more precise estimate.

Example: Suppose you want to estimate the average height of students in a school with a 95% confidence level. You take two samples: one with 100 students and another with 400 students.

For a 95% confidence level, the critical value (Z) is approximately 1.96.

Sample 1 (n = 100):

MOE = 1.96 * (s / √100)

Sample 2 (n = 400):

MOE = 1.96 * (s / √400)

Because √400 = 20 is twice √100 = 10, the margin of error for Sample 2 (n = 400) is half the margin of error for Sample 1 (n = 100), assuming the same sample standard deviation. As a result, the estimate based on Sample 2 will be more precise than the estimate based on Sample 1.
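To put numbers on the comparison, the sketch below assumes a sample standard deviation of 10 cm for both samples. This value is hypothetical, since the example does not state s:

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Margin of error for a mean at the confidence level implied by z."""
    return z * sd / math.sqrt(n)

ASSUMED_SD_CM = 10.0  # hypothetical standard deviation of student heights

moe_100 = margin_of_error(ASSUMED_SD_CM, 100)
moe_400 = margin_of_error(ASSUMED_SD_CM, 400)

print(f"n = 100: MOE = {moe_100:.2f} cm")
print(f"n = 400: MOE = {moe_400:.2f} cm")
print(f"ratio   = {moe_100 / moe_400:.1f}")
```

Whatever value of s is used, quadrupling the sample size halves the margin of error.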

Q9. Calculate the z-score for a data point with a value of 75, a population mean of 70, and a population
standard deviation of 5. Interpret the results.

A9: The z-score measures how many standard deviations a data point is away from the population mean. The formula for calculating the z-score is:

z = (x - μ) / σ

Where:

x = 75 (data point)

μ = 70 (population mean)

σ = 5 (population standard deviation)

Now, let's calculate the z-score:

z = (75 - 70) / 5

z = 1

Interpretation: The calculated z-score is 1. This means that the data point with a value of 75 is one standard deviation above the population mean of 70.
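A short sketch of the same calculation in Python. The percentile line goes one step beyond the question and assumes the population is normally distributed, which the question does not state:

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the population mean."""
    return (x - mu) / sigma

z = z_score(x=75, mu=70, sigma=5)

# Assumption: if the population is normal, the z-score maps to a percentile
percentile = NormalDist().cdf(z)

print(f"z = {z}")
print(f"share of a normal population below x: {percentile:.4f}")
```

Under that normality assumption, a z-score of 1 places the data point above roughly 84% of the population.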

Q10. In a study of the effectiveness of a new weight loss drug, a sample of 50 participants lost an average
of 6 pounds with a standard deviation of 2.5 pounds. Conduct a hypothesis test to determine if the drug is
significantly effective at a 95% confidence level using a t-test.

A10: 

Null hypothesis (H0): The new weight loss drug has no significant effect (μ = 0).
Alternative hypothesis (Ha): The new weight loss drug is significantly effective (μ ≠ 0).

Given:

Sample mean (x̄) = 6 pounds

Sample standard deviation (s) = 2.5 pounds

Sample size (n) = 50

Confidence level = 95%

We will use a two-tailed t-test, matching the alternative hypothesis (μ ≠ 0), at a significance level of α = 0.05 (corresponding to the 95% confidence level).

Using the t-distribution table, the critical t-value for a 95% confidence level and 49 degrees of freedom (n - 1) is approximately ±2.009.

Calculate the t-score:

t = (x̄ - μ) / (s / √n)

t = (6 - 0) / (2.5 / √50)

t = 6 / 0.3536

t = 16.97

Since the absolute value of the t-score (16.97) is greater than the critical t-value (2.009), we reject the null hypothesis.

Interpretation: At a 95% confidence level, there is enough evidence to suggest that the new weight loss drug is significantly effective in helping participants lose weight.
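The test can be reproduced from the summary statistics alone. In this sketch the critical value 2.009 is hard-coded from the t-table figure quoted above rather than computed, because Python's standard library has no t-distribution:

```python
import math

def one_sample_t(sample_mean, mu0, sample_sd, n):
    """t statistic for a one-sample t-test computed from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

T_CRIT = 2.009  # two-tailed critical value for df = 49, alpha = 0.05 (from the table)

t = one_sample_t(sample_mean=6.0, mu0=0.0, sample_sd=2.5, n=50)
reject_h0 = abs(t) > T_CRIT

print(f"t = {t:.2f}, reject H0: {reject_h0}")
```

The decision rule compares |t| with the critical value, so the same code works for negative t statistics as well.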

Q11. In a survey of 500 people, 65% reported being satisfied with their current job. Calculate the 95%
confidence interval for the true proportion of people who are satisfied with their job.

A11: Given:

Sample proportion (p) = 65% or 0.65

Sample size (n) = 500

The formula for calculating the confidence interval for a proportion is:

CI = p ± Z * √(p * (1 - p) / n)

Where:

Z is the critical value from the standard normal distribution corresponding to the chosen confidence level (for 95% confidence, Z = 1.96).

Calculate the confidence interval:

CI = 0.65 ± 1.96 * √(0.65 * (1 - 0.65) / 500)

CI = 0.65 ± 0.0418

Interpretation: The 95% confidence interval for the true proportion of people satisfied with their job is (0.6082, 0.6918). This means we are 95% confident that the true proportion of people satisfied with their job falls within this range.
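A minimal sketch of the proportion interval in Python, using the normal approximation from the formula above. The function name `proportion_confidence_interval` is an illustrative choice:

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

low, high = proportion_confidence_interval(p_hat=0.65, n=500)
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

The normal approximation is reasonable here because both n * p = 325 and n * (1 - p) = 175 are large.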



Q12. A researcher is testing the effectiveness of two different teaching methods on student performance.
Sample A has a mean score of 85 with a standard deviation of 6, while sample B has a mean score of 82
with a standard deviation of 5. Conduct a hypothesis test to determine if the two teaching methods have a
significant difference in student performance using a t-test with a significance level of 0.01.

A12: 

Null hypothesis (H0): There is no significant difference between the two teaching methods (μA - μB = 0).
Alternative hypothesis (Ha): There is a significant difference between the two teaching methods (μA - μB ≠ 0).

Given:

Sample A mean (x̄A) = 85

Sample A standard deviation (sA) = 6

Sample A size (nA) = Unknown

Sample B mean (x̄B) = 82

Sample B standard deviation (sB) = 5

Sample B size (nB) = Unknown

Significance level = 0.01 (α = 0.01)

To conduct the t-test, we need to calculate the pooled standard deviation and the t-statistic. This form of the test assumes the two groups have equal population variances:

1. Calculate the pooled standard deviation (sp):
sp = √[( (nA - 1) * sA^2 + (nB - 1) * sB^2 ) / (nA + nB - 2)]
We need to know the sample sizes (nA and nB) to compute the pooled standard deviation.

2. Calculate the t-statistic:
t = (x̄A - x̄B) / (sp * √(1/nA + 1/nB))

Once we have the t-statistic, we compare it with the critical t-value for a two-tailed test at a significance level of 0.01 with nA + nB - 2 degrees of freedom to determine whether to reject the null hypothesis. Because the sample sizes are not given, the test cannot be completed numerically here.
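To show how the two steps fit together, the sketch below assumes 30 students in each group. These sample sizes are hypothetical, since the question does not provide them, so the resulting t value is only an illustration:

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n_a + n_b - 2
    # Pooled standard deviation: weighted average of the two sample variances
    sp = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    standard_error = sp * math.sqrt(1 / n_a + 1 / n_b)
    return (mean_a - mean_b) / standard_error, df

# Means and standard deviations come from the question; n = 30 per group is assumed
t, df = pooled_t(mean_a=85, sd_a=6, n_a=30, mean_b=82, sd_b=5, n_b=30)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting |t| would then be compared with the two-tailed critical value for α = 0.01 and the computed degrees of freedom, taken from a t-table or statistical software.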

Q13. A population has a mean of 60 and a standard deviation of 8. A sample of 50 observations has a mean
of 65. Calculate the 90% confidence interval for the true population mean.

A13: Given:

Population mean (μ) = 60

Population standard deviation (σ) = 8

Sample mean (x̄) = 65

Sample size (n) = 50

The formula for calculating the confidence interval for the population mean is the same as before:

CI = x̄ ± Z * (σ / √n)

Where:

Z is the critical value from the standard normal distribution corresponding to the chosen confidence level (for 90% confidence, Z = 1.645).

Calculate the confidence interval:

CI = 65 ± 1.645 * (8 / √50) = 65 ± 1.861

Interpretation: The 90% confidence interval for the true population mean is (63.139, 66.861). This means we are 90% confident that the true population mean falls within this range.

Q14. In a study of the effects of caffeine on reaction time, a sample of 30 participants had an average
reaction time of 0.25 seconds with a standard deviation of 0.05 seconds. Conduct a hypothesis test to
determine if the caffeine has a significant effect on reaction time at a 90% confidence level using a t-test.

A14: 

Null hypothesis (H0): Caffeine has no significant effect on reaction time (μ = 0).
Alternative hypothesis (Ha): Caffeine has a significant effect on reaction time (μ ≠ 0).

Given:

Sample mean (x̄) = 0.25 seconds

Sample standard deviation (s) = 0.05 seconds

Sample size (n) = 30

Confidence level = 90%

To conduct the t-test, calculate the t-score:

t = (x̄ - μ) / (s / √n)

t = (0.25 - 0) / (0.05 / √30)

t ≈ 27.39

Compare the t-score with the critical t-value for a two-tailed test at a significance level of 0.10 with n - 1 = 29 degrees of freedom:

critical value = ±1.699

If the absolute value of the t-score is greater than the critical t-value, we reject the null hypothesis and conclude that caffeine has a significant effect on reaction time at a 90% confidence level. Otherwise, we fail to reject the null hypothesis.

Here |t| ≈ 27.39 > 1.699, so we reject the null hypothesis.