Q1: What is the difference between a t-test and a z-test? Provide an example scenario where you would
use each type of test.

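T-test: A t-test is used when the population standard deviation is unknown and must be estimated from the sample, which matters most when the sample is small (typically fewer than 30 observations). The test statistic follows the t-distribution with n - 1 degrees of freedom.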

Example: Testing whether the average score of 20 students on a test is different from the population mean score when the population standard deviation is unknown.

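Z-test: A z-test is used when the population standard deviation is known, or when the sample is large enough (typically more than 30 observations) that the sample standard deviation is a reliable stand-in for it. The test statistic follows the standard normal (z) distribution.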

Example: Testing whether the mean weight of a sample of 1000 people is different from the known population mean weight when the population standard deviation is known.

Q2: Differentiate between one-tailed and two-tailed tests.

One-tailed Test: A hypothesis test where the alternative hypothesis is directional (e.g., greater than or less than). It tests for the possibility of an effect in one direction only.

Example: Testing if a drug decreases blood pressure (i.e., looking for a decrease only).

Two-tailed Test: A hypothesis test where the alternative hypothesis is non-directional (e.g., different from). It tests for the possibility of an effect in both directions.

Example: Testing if a drug has an effect on blood pressure (either increase or decrease).
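
The choice of tail changes the p-value obtained from the same test statistic. A minimal sketch, assuming a z-statistic of -1.8 (a hypothetical value, not derived from any data in this notebook) and using only the standard library:

```python
from statistics import NormalDist

z_stat = -1.8  # hypothetical test statistic, chosen for illustration
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-tailed (lower tail): H1 says the drug decreases blood pressure
p_one_tailed = std_normal.cdf(z_stat)

# Two-tailed: H1 says the drug changes blood pressure in either direction
p_two_tailed = 2 * std_normal.cdf(-abs(z_stat))

print(f"one-tailed p-value: {p_one_tailed:.4f}")
print(f"two-tailed p-value: {p_two_tailed:.4f}")
```

With these assumed numbers the one-tailed p-value falls below 0.05 while the two-tailed p-value does not, so the direction of the alternative hypothesis has to be fixed before looking at the data.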

Q3: Explain the concept of Type 1 and Type 2 errors in hypothesis testing. Provide an example scenario for
each type of error.

Type 1 Error (False Positive): Rejecting the null hypothesis when it is actually true.

Example: Concluding that a new drug works (it decreases blood pressure) when in reality, it does not.

Type 2 Error (False Negative): Failing to reject the null hypothesis when it is actually false.

Example: Concluding that a new drug has no effect when in reality, it does.
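
Both error rates can be computed for a concrete test. The following is a sketch only, assuming a one-sided z-test with a known standard deviation; the effect size, standard deviation, and sample size are hypothetical values picked for illustration:

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05        # Type 1 error rate, fixed in advance by the analyst
true_effect = 2.0   # hypothetical true drop in blood pressure
sigma = 10.0        # hypothetical known population standard deviation
n = 100             # hypothetical sample size

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)  # reject H0 when z > z_crit

# Under H1 the z-statistic is centered at true_effect / (sigma / sqrt(n))
shift = true_effect / (sigma / sqrt(n))

# Type 2 error rate: probability the statistic stays below z_crit under H1
beta = std_normal.cdf(z_crit - shift)
power = 1 - beta

print(f"Type 1 error rate (alpha): {alpha:.2f}")
print(f"Type 2 error rate (beta):  {beta:.4f}")
print(f"Power (1 - beta):          {power:.4f}")
```

Lowering alpha raises z_crit and therefore raises beta, which is the usual trade-off between the two error types at a fixed sample size.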


Q4: Explain Bayes's theorem with an example.

In [16]:

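# Bayes's theorem updates a prior probability P(A) after observing evidence B:
#     P(A|B) = P(B|A) * P(A) / P(B)
# Example: A = "person has the disease", B = "test result is positive".
P_A = 0.01               # prior: 1% of the population has the disease
P_B_given_A = 0.95       # sensitivity: P(positive | disease)
P_B_given_not_A = 0.10   # false positive rate: P(positive | no disease)
P_not_A = 1 - P_A

# Law of total probability: overall chance of a positive test
P_B = (P_B_given_A * P_A) + (P_B_given_not_A * P_not_A)

# Applying Bayes's Theorem
P_A_given_B = (P_B_given_A * P_A) / P_B

print(f"Probability of having the disease given a positive test: {P_A_given_B:.4f}")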


Probability of having the disease given a positive test: 0.0876
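
The same calculation can be wrapped in a function to see how strongly the result depends on the prior. This is a sketch: `posterior` is a hypothetical helper name, and the priors 0.10 and 0.50 are made-up values for comparison.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes's theorem."""
    # Total probability of a positive test
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# 0.01 matches the cell above; 0.10 and 0.50 are illustrative priors
for prior in (0.01, 0.10, 0.50):
    p = posterior(prior, sensitivity=0.95, false_positive_rate=0.10)
    print(f"prior = {prior:.2f} -> posterior = {p:.4f}")
```

The rarer the disease, the more a positive result is dominated by false positives, which is why the posterior stays below 9% at a 1% prior even with a 95% sensitive test.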


Q5: What is a confidence interval? How to calculate the confidence interval, explain with an example.

In [17]:
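import numpy as np
from scipy import stats

# A confidence interval is a range, computed from sample data, that would
# contain the true population parameter in a stated share (here 95%) of
# repeated samples. All values below are example inputs.
sample_mean = 50
sample_std = 5
sample_size = 30
confidence_level = 0.95

# Standard error of the mean
se = sample_std / np.sqrt(sample_size)

# Two-sided critical value from the standard normal distribution (about 1.96)
z_score = stats.norm.ppf(1 - (1 - confidence_level) / 2)

# Margin of error and interval bounds: mean +/- z * se
margin_of_error = z_score * se
lower_bound = sample_mean - margin_of_error
upper_bound = sample_mean + margin_of_error

print(f"95% Confidence Interval: ({lower_bound}, {upper_bound})")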


95% Confidence Interval: (48.210805856282846, 51.789194143717154)


Q7. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation
of 5. Interpret the results.

In [18]:
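sample_mean = 50
sample_std = 5
sample_size = 30  # assumed: the question does not give a sample size
confidence_level = 0.95

# Standard error of the mean
se = sample_std / np.sqrt(sample_size)

# Two-sided critical value from the standard normal distribution
z_score = stats.norm.ppf(1 - (1 - confidence_level) / 2)

# Margin of error and interval bounds
margin_of_error = z_score * se
lower_bound = sample_mean - margin_of_error
upper_bound = sample_mean + margin_of_error

print(f"95% Confidence Interval: ({lower_bound}, {upper_bound})")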


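95% Confidence Interval: (48.210805856282846, 51.789194143717154)

Interpretation: We are 95% confident that the true population mean lies between about 48.21 and 51.79. If the sampling were repeated many times, about 95% of the intervals built this way would contain the true mean. The width of the interval depends on the sample size, which the question does not give; the code assumes 30.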
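
The "95%" describes how often the procedure captures the true mean over repeated samples. A small simulation can illustrate this. It is a sketch under stated assumptions: the population is taken to be normal with mean 50 and standard deviation 5, the standard deviation is treated as known, and the seed and number of trials are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

true_mean, sigma, n = 50, 5, 30   # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)   # critical value for a 95% interval
margin = z * sigma / sqrt(n)

trials = 2000
covered = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    m = mean(sample)
    # Does this sample's interval contain the true mean?
    if m - margin <= true_mean <= m + margin:
        covered += 1

coverage = covered / trials
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, with some run-to-run variation if the seed is changed.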


Q8. What is the margin of error in a confidence interval? How does sample size affect the margin of error?
Provide an example of a scenario where a larger sample size would result in a smaller margin of error.

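Margin of Error: The half-width of a confidence interval, i.e. the amount added to and subtracted from the sample estimate to get the upper and lower bounds. For a mean it equals the critical value times the standard error: z * (sigma / sqrt(n)).

Effect of Sample Size: The margin of error is proportional to 1 / sqrt(n), so increasing the sample size reduces it and makes the estimate more precise. Quadrupling the sample size halves the margin of error.

Example: Estimating a population mean from a sample of 1000 instead of 100 shrinks the margin of error by a factor of sqrt(10), roughly 3.16, giving a more precise estimate of the population mean.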
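
The effect can be checked numerically. A minimal sketch, assuming a population standard deviation of 5 (a hypothetical value) and a 95% confidence level; `margin_of_error` is an illustrative helper name:

```python
from math import sqrt
from statistics import NormalDist

sigma = 5  # hypothetical population standard deviation
z = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

def margin_of_error(n):
    """Margin of error for a mean with known sigma and sample size n."""
    return z * sigma / sqrt(n)

for n in (100, 1000):
    print(f"n = {n:4d}: margin of error = {margin_of_error(n):.3f}")
```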

Q9. Calculate the z-score for a data point with a value of 75, a population mean of 70, and a population
standard deviation of 5. Interpret the results.

In [19]:
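value = 75
mean = 70
std_dev = 5

# z = (x - mu) / sigma: distance from the mean in standard deviations
z_score = (value - mean) / std_dev
print(f"Z-score: {z_score}")


Z-score: 1.0

Interpretation: The data point lies exactly one standard deviation above the population mean.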
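
A z-score can also be converted to a percentile. This sketch assumes the population is normally distributed, which the question does not state:

```python
from statistics import NormalDist

z_score = (75 - 70) / 5

# Share of a normal population lying below this z-score
share_below = NormalDist().cdf(z_score)
print(f"Share of values below: {share_below:.4f}")
```

Under that normality assumption, a value of 75 sits at roughly the 84th percentile.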


Q10. In a study of the effectiveness of a new weight loss drug, a sample of 50 participants lost an average
of 6 pounds with a standard deviation of 2.5 pounds. Conduct a hypothesis test to determine if the drug is
significantly effective at a 95% confidence level using a t-test.

In [20]:

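sample_mean = 6
population_mean = 0  # H0: mean weight loss is 0 (the drug has no effect)
sample_std = 2.5
sample_size = 50
alpha = 0.05

# Standard error and t-statistic
se = sample_std / np.sqrt(sample_size)
t_stat = (sample_mean - population_mean) / se

df = sample_size - 1

# One-tailed (upper-tail) p-value, because H1 is mean weight loss > 0.
# stats.t.sf is the survival function, 1 - cdf. Using cdf alone would
# return the lower-tail probability and reverse the conclusion.
p_value = stats.t.sf(t_stat, df)

if p_value < alpha:
    result = "Reject the null hypothesis"
else:
    result = "Fail to reject the null hypothesis"

print(f"t-statistic: {t_stat:.4f}, p-value: {p_value:.4f}")
print(result)


t-statistic: 16.9706, p-value: 0.0000
Reject the null hypothesis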


Q11. In a survey of 500 people, 65% reported being satisfied with their current job. Calculate the 95%
confidence interval for the true proportion of people who are satisfied with their job.

In [21]:

sample_proportion = 0.65
sample_size = 500
confidence_level = 0.95

# Standard error of a sample proportion: sqrt(p * (1 - p) / n)
se_proportion = np.sqrt((sample_proportion * (1 - sample_proportion)) / sample_size)

# Two-sided critical value from the standard normal distribution
z_score = stats.norm.ppf(1 - (1 - confidence_level) / 2)

# Margin of error and interval bounds
margin_of_error = z_score * se_proportion
lower_bound = sample_proportion - margin_of_error
upper_bound = sample_proportion + margin_of_error

print(f"95% Confidence Interval for the true proportion: ({lower_bound:.4f}, {upper_bound:.4f})")


95% Confidence Interval for the true proportion: (0.6082, 0.6918)


Q12. A researcher is testing the effectiveness of two different teaching methods on student performance.
Sample A has a mean score of 85 with a standard deviation of 6, while sample B has a mean score of 82
with a standard deviation of 5. Conduct a hypothesis test to determine if the two teaching methods have a
significant difference in student performance using a t-test with a significance level of 0.01.

In [22]:

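mean_A = 85
std_A = 6
n_A = 30  # assumed: the question does not state the sample sizes

mean_B = 82
std_B = 5
n_B = 40  # assumed: the question does not state the sample sizes

alpha = 0.01

# Welch's t-test: does not assume the two groups share a variance
se = np.sqrt((std_A**2 / n_A) + (std_B**2 / n_B))
t_stat = (mean_A - mean_B) / se

# Welch-Satterthwaite approximation for the degrees of freedom
df = ((std_A**2 / n_A + std_B**2 / n_B)**2) / (((std_A**2 / n_A)**2 / (n_A - 1)) + ((std_B**2 / n_B)**2 / (n_B - 1)))

# Two-tailed p-value: H1 is that the two means differ in either direction
p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df))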


if p_value < alpha:
    result = "Reject the null hypothesis"
else:
    result = "Fail to reject the null hypothesis"

print(f"t-statistic: {t_stat:.4f}, p-value: {p_value:.4f}")
print(result)


t-statistic: 2.2207, p-value: 0.0304
Fail to reject the null hypothesis


Q13. A population has a mean of 60 and a standard deviation of 8. A sample of 50 observations has a mean
of 65. Calculate the 90% confidence interval for the true population mean.

In [23]:

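sample_mean = 65
pop_mean = 60  # given in the question but not needed to build the interval
pop_std = 8
sample_size = 50
confidence_level = 0.90

# Population standard deviation is known, so a z-interval applies
se = pop_std / np.sqrt(sample_size)

# Two-sided critical value for 90% confidence (about 1.645)
z_score = stats.norm.ppf(1 - (1 - confidence_level) / 2)

margin_of_error = z_score * se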


lower_bound = sample_mean - margin_of_error
upper_bound = sample_mean + margin_of_error

print(f"90% Confidence Interval: ({lower_bound:.2f}, {upper_bound:.2f})")


90% Confidence Interval: (63.14, 66.86)


Q14. In a study of the effects of caffeine on reaction time, a sample of 30 participants had an average
reaction time of 0.25 seconds with a standard deviation of 0.05 seconds. Conduct a hypothesis test to
determine if the caffeine has a significant effect on reaction time at a 90% confidence level using a t-test.

In [24]:

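sample_mean = 0.25
population_mean = 0.30  # assumed baseline reaction time; not given in the question
sample_std = 0.05
sample_size = 30
alpha = 0.10

# Standard error and t-statistic
se = sample_std / np.sqrt(sample_size)
t_stat = (sample_mean - population_mean) / se

df = sample_size - 1

# Two-tailed p-value: H1 is that caffeine changes reaction time in either direction
p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df))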


if p_value < alpha:
    result = "Reject the null hypothesis"
else:
    result = "Fail to reject the null hypothesis"

print(f"t-statistic: {t_stat:.4f}, p-value: {p_value:.4f}")
print(result)


t-statistic: -5.4772, p-value: 0.0000
Reject the null hypothesis
