# **Lab: Hypothesis Testing (Z-Test) with Proportions**

In [9]:
import math
from scipy import stats

# Question 1: One-Proportion z-test (Left-tailed Test)
## Null Hypothesis: (H0):p >= 0.65
## Alternative Hypothesis: (Ha):p < 0.65

In [124]:
# Parameters
n = 200
x1 = 120
p0_1 = 0.65

# Sample proportion
p_hat = x1 / n

z = (p_hat - p0_1) / math.sqrt((p0_1 * (1 - p0_1)) / n)
p_value = stats.norm.cdf(z)

print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: -1.4824986333222037
p_value: 0.06910383348701273


## Conclusion:
Since the p-value (0.069) is greater than 0.05, we fail to reject the null hypothesis. There is not sufficient evidence to refute the university's claim that at least 65% of its graduates secure a job within six months of graduation.
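The one-proportion calculation can be wrapped in a small reusable helper so the tail is chosen explicitly instead of being rewritten by hand each time. This is a minimal sketch, not part of the original lab: the function name `one_proportion_ztest` and its `alternative` argument are hypothetical, and the standard-library `statistics.NormalDist` stands in for `stats.norm` so the block has no third-party dependency.

```python
import math
from statistics import NormalDist


def one_proportion_ztest(x, n, p0, alternative="two-sided"):
    """Return (z, p_value) for a one-proportion z-test.

    `alternative` is "less", "greater", or "two-sided"
    (hypothetical names chosen for this sketch).
    """
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    norm = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "less":
        p_value = norm.cdf(z)  # area to the left of z
    elif alternative == "greater":
        p_value = norm.cdf(-z)  # area to the right of z, by symmetry
    elif alternative == "two-sided":
        p_value = 2 * norm.cdf(-abs(z))  # both tails beyond |z|
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


# Question 1 data, with Ha: p < 0.65 (left-tailed)
z, p_value = one_proportion_ztest(120, 200, 0.65, alternative="less")
print(f"z_score: {z:.4f}")
print(f"p_value: {p_value:.4f}")
```

With the Question 1 inputs this should agree with the z-score and p-value printed above, up to rounding.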

# Question 2: Two-Proportion z-test (Two-tailed Test)
## Null Hypothesis: (H0):p1 = p2 
## Alternative Hypothesis: (Ha):p1 != p2

In [107]:
# Parameters
n1_2 = 300 # sample size group 1
x1_2 = 180 # successes group 1
n2_2 = 250 # sample size group 2
x2_2 = 140 # successes group 2

# Sample proportion
p1_2 = x1_2 / n1_2
p2_2 = x2_2 / n2_2

# Pooled proportion
p_pooled = (x1_2 + x2_2) / (n1_2 + n2_2)

z = (p1_2 - p2_2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1_2 + 1 / n2_2))
p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # two-tailed: use |z| so the result also holds when z is negative
print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: 0.9469631093314982
p_value: 0.3436575774939177


## Conclusion:
Since the p-value (0.344) is greater than 0.05, we fail to reject the null hypothesis. There is no statistically significant difference between the proportions of people who attend home games and away games.
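A two-tailed p-value should not depend on which group is labeled "group 1". The sketch below illustrates this with a hypothetical helper, `two_proportion_ztest`, that uses the same pooled standard error as the cell above and takes the tail area beyond |z|. It uses the standard-library `statistics.NormalDist` in place of `stats.norm`, which is a choice made for this example only.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2):
    """Return (z, two-tailed p_value) for a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # area in both tails beyond |z|
    return z, p_value


# Question 2 data, then the same data with the groups swapped
z, p_value = two_proportion_ztest(180, 300, 140, 250)
z_swapped, p_swapped = two_proportion_ztest(140, 250, 180, 300)

print(f"z_score: {z:.4f}, p_value: {p_value:.4f}")
print(f"z_score: {z_swapped:.4f}, p_value: {p_swapped:.4f}")
```

Swapping the groups flips the sign of z but leaves the p-value unchanged, which is the behavior a two-tailed test should have.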

# Question 3: One-Proportion z-test (Left-tailed Test)
## Null Hypothesis: (H0):p >= 0.75
## Alternative Hypothesis: (Ha):p < 0.75

In [99]:
# Parameters
n = 150
x1 = 100
p0_1 = 0.75

# Sample proportion
p_hat = x1 / n

z = (p_hat - p0_1) / math.sqrt((p0_1 * (1 - p0_1)) / n)
p_value = stats.norm.cdf(z)  # left-tailed: area to the left of z

print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: -2.3570226039551594
p_value: 0.009211062727049


## Conclusion:
Since the p-value (0.009) is less than 0.05, we reject the null hypothesis. There is sufficient evidence to refute the school's claim that at least 75% of its students pass a standardized exam.

# Question 4: Two-Proportion z-test (Left-tailed Test)
## Null Hypothesis: (H0):p1 >= p2 
## Alternative Hypothesis: (Ha):p1 < p2 

In [119]:
# Parameters
n1_2 = 80 # sample size group 1
x1_2 = 45 # successes group 1
n2_2 = 70 # sample size group 2
x2_2 = 35 # successes group 2

# Sample proportion
p1_2 = x1_2 / n1_2
p2_2 = x2_2 / n2_2

# Pooled proportion
p_pooled = (x1_2 + x2_2) / (n1_2 + n2_2)

z = (p1_2 - p2_2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1_2 + 1 / n2_2))
p_value = stats.norm.cdf(z)
print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: 0.7654655446197431
p_value: 0.7780028119884952


## Conclusion:
Since the p-value (0.778) is greater than 0.05, we fail to reject the null hypothesis. There is not sufficient evidence that p1 is less than p2, so the data do not support the hypothesized difference between the proportions of male and female employees.
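The same decision can be reached by comparing the z-score with a critical value instead of comparing the p-value with 0.05. The sketch below assumes the left-tailed alternative stated above (Ha: p1 < p2) and a significance level of 0.05; the variable names `alpha` and `z_crit` are introduced here for illustration, and `statistics.NormalDist().inv_cdf` plays the role that `stats.norm.ppf` would play in SciPy.

```python
from statistics import NormalDist

alpha = 0.05                # significance level used in the conclusions
z = 0.7654655446197431      # z_score printed for Question 4

# For a left-tailed test, reject H0 when z falls below the alpha-quantile
z_crit = NormalDist().inv_cdf(alpha)
reject = z < z_crit

print(f"critical value: {z_crit:.3f}")
print(f"reject H0: {reject}")
```

Because z lies above the critical value (about -1.645), it is outside the rejection region, which matches the p-value decision.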

# Question 5: One-Proportion z-test (Two-tailed Test)
## Null Hypothesis: (H0):p = 0.4 
## Alternative Hypothesis: (Ha):p != 0.4  

In [136]:
# Parameters
n = 100
x1 = 30
p0_1 = 0.4

# Sample proportion
p_hat = x1 / n

z = (p_hat - p0_1) / math.sqrt((p0_1 * (1 - p0_1)) / n)
p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # two-tailed: use |z| so the p-value stays in [0, 1]

print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: -2.041241452319316
p_value: 0.041226833337163


## Conclusion:
Since the p-value (0.041) is less than 0.05, we reject the null hypothesis. There is sufficient evidence to refute the car dealership's claim that 40% of their sales come from repeat customers.
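Each one-proportion test in this lab relies on the normal approximation to the binomial, which is usually considered reasonable when both n·p0 and n·(1 − p0) are large enough. The sketch below checks that condition for Questions 1, 3, and 5. The threshold of 10 is a common rule of thumb and an assumption here (some texts use 5), and the helper name `normal_approx_ok` is hypothetical.

```python
def normal_approx_ok(n, p0, min_count=10):
    """Check that expected successes and failures under H0 both reach min_count."""
    return n * p0 >= min_count and n * (1 - p0) >= min_count


# (label, sample size, hypothesized proportion) from the one-proportion questions
cases = [
    ("Question 1", 200, 0.65),
    ("Question 3", 150, 0.75),
    ("Question 5", 100, 0.4),
]

for label, n, p0 in cases:
    print(f"{label}: n*p0 = {n * p0:.1f}, n*(1-p0) = {n * (1 - p0):.1f}, "
          f"ok = {normal_approx_ok(n, p0)}")
```

All three cases satisfy the condition under this threshold, so the z-test is a reasonable choice for these sample sizes.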