# **Lab: Hypothesis Testing (Z-Test) with Proportions**

In [9]:
import math
from scipy import stats

# Question 1: One-Proportion z-test (Right-tailed Test)
## Null Hypothesis (H0): p <= 0.65
## Alternative Hypothesis (Ha): p > 0.65

In [172]:
# Parameters
n = 200
x1 = 120
p0_1 = 0.65

# Sample proportion
p_hat = x1 / n

# z statistic, using the standard error under H0
z = (p_hat - p0_1) / math.sqrt((p0_1 * (1 - p0_1)) / n)
# Right-tailed p-value: P(Z >= z)
p_value = 1 - stats.norm.cdf(z)

print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: -1.4824986333222037
p_value: 0.9308961665129872


## Conclusion:
Since the p-value (0.93) is greater than 0.05, we fail to reject the null hypothesis. There is insufficient evidence that more than 65% of graduates secure jobs within six months.
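
The one-proportion calculation above can be wrapped in a small helper so that the tail direction is an explicit argument. This is a minimal sketch, not part of the lab: `one_proportion_ztest` and its `alternative` values are illustrative names, and it uses `statistics.NormalDist` from the standard library for the standard normal CDF in place of `scipy.stats.norm`.

```python
import math
from statistics import NormalDist


def one_proportion_ztest(x, n, p0, alternative="two-sided"):
    """Return (z, p_value) for a one-proportion z-test.

    Hypothetical helper: alternative is "larger" (Ha: p > p0),
    "smaller" (Ha: p < p0), or "two-sided" (Ha: p != p0).
    """
    p_hat = x / n
    # The standard error is computed under H0, so it uses p0, not p_hat
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)  # P(Z <= z) for a standard normal Z
    if alternative == "larger":
        p_value = 1 - cdf
    elif alternative == "smaller":
        p_value = cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value


# Question 1 data: 120 successes out of 200, hypothesized proportion 0.65
z_q1, p_right_q1 = one_proportion_ztest(120, 200, 0.65, alternative="larger")
_, p_left_q1 = one_proportion_ztest(120, 200, 0.65, alternative="smaller")
print(f"z = {z_q1:.4f}")
print(f"right-tailed p = {p_right_q1:.4f}, left-tailed p = {p_left_q1:.4f}")
```

For these data the right-tailed p-value is about 0.93 and the left-tailed one about 0.07, so neither direction rejects H0 at the 5% level.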

# Question 2: Two-Proportion z-test (Two-tailed Test)
## Null Hypothesis (H0): p1 = p2
## Alternative Hypothesis (Ha): p1 != p2

In [107]:
# Parameters
n1_2 = 300 # sample size group 1
x1_2 = 180 # successes group 1
n2_2 = 250 # sample size group 2
x2_2 = 140 # successes group 2

# Sample proportion
p1_2 = x1_2 / n1_2
p2_2 = x2_2 / n2_2

# Pooled proportion
p_pooled = (x1_2 + x2_2) / (n1_2 + n2_2)

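# z statistic, using the pooled standard error under H0
z = (p1_2 - p2_2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1_2 + 1 / n2_2))
# Two-tailed p-value: abs(z) keeps the result valid when z is negative
p_value = 2 * (1 - stats.norm.cdf(abs(z)))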
print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: 0.9469631093314982
p_value: 0.3436575774939177


## Conclusion:
Since the p-value (0.34) is greater than 0.05, we fail to reject the null hypothesis. There is no significant difference between the proportions of people who attend home games and away games.
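
A two-tailed p-value should not depend on which group is labelled group 1. The sketch below reuses the Question 2 counts to illustrate that `2 * (1 - cdf(z))` is only valid when z is non-negative, and that taking `abs(z)` makes the result independent of group order. The helper `pooled_z` is an illustrative name, and `statistics.NormalDist` stands in for `scipy.stats.norm` so the block runs with the standard library alone.

```python
import math
from statistics import NormalDist


def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


phi = NormalDist().cdf  # standard normal CDF

z_ab = pooled_z(180, 300, 140, 250)  # groups in the order used above
z_ba = pooled_z(140, 250, 180, 300)  # same data, groups swapped

naive_ba = 2 * (1 - phi(z_ba))       # not a valid probability when z < 0
p_ab = 2 * (1 - phi(abs(z_ab)))
p_ba = 2 * (1 - phi(abs(z_ba)))

print(f"z = {z_ab:.4f} (original order), {z_ba:.4f} (swapped)")
print(f"without abs, swapped order: {naive_ba:.4f}")
print(f"with abs: {p_ab:.4f} and {p_ba:.4f}")
```

Swapping the groups only flips the sign of z, so the two-tailed p-value with `abs(z)` is the same in both orders, while the version without `abs` exceeds 1 for the swapped order.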

# Question 3: One-Proportion z-test (Left-tailed Test)
## Null Hypothesis (H0): p >= 0.75
## Alternative Hypothesis (Ha): p < 0.75

In [164]:
# Parameters
n = 150
x1 = 100
p0_1 = 0.75

# Sample proportion
p_hat = x1 / n

z = (p_hat - p0_1) / math.sqrt((p0_1 * (1 - p0_1)) / n)
p_value = stats.norm.cdf(z)

print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: -2.3570226039551594
p_value: 0.009211062727049475


## Conclusion:
Since the p-value (0.009) is less than 0.05, we reject the null hypothesis. The proportion of students passing is significantly less than 75%.
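
The same decision can be reached by comparing z with a critical value instead of comparing the p-value with 0.05. The following is a minimal sketch for the left-tailed case, assuming a significance level of `alpha = 0.05` and using `statistics.NormalDist().inv_cdf` from the standard library as the inverse normal CDF (the counterpart of `stats.norm.ppf`).

```python
import math
from statistics import NormalDist

alpha = 0.05              # significance level used in the conclusion above
n, x, p0 = 150, 100, 0.75  # Question 3 data

z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)

# Left-tailed test: reject H0 when z <= z_crit, where P(Z <= z_crit) = alpha
z_crit = NormalDist().inv_cdf(alpha)
reject = z <= z_crit

print(f"z = {z:.4f}, critical value = {z_crit:.4f}, reject H0: {reject}")
```

Because z (about -2.36) lies below the critical value (about -1.645), H0 is rejected, which matches the p-value approach.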

# Question 4: Two-Proportion z-test (Right-tailed Test)
## Null Hypothesis (H0): p1 <= p2
## Alternative Hypothesis (Ha): p1 > p2

In [183]:
# Parameters
n1_2 = 80 # sample size group 1
x1_2 = 45 # successes group 1
n2_2 = 70 # sample size group 2
x2_2 = 35 # successes group 2

# Sample proportion
p1_2 = x1_2 / n1_2
p2_2 = x2_2 / n2_2

# Pooled proportion
p_pooled = (x1_2 + x2_2) / (n1_2 + n2_2)

z = (p1_2 - p2_2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1_2 + 1 / n2_2))
# Right-tailed p-value: P(Z >= z)
p_value = 1 - stats.norm.cdf(z)
print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: 0.7654655446197431
p_value: 0.2219971880115048


## Conclusion:
Since the p-value (0.22) is greater than 0.05, we fail to reject the null hypothesis. There is no significant evidence to suggest that males are promoted at a higher rate than females.
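
As a hand check of the pooled statistic, the Question 4 counts can be substituted into the same formula the code uses. The symbols \hat{p}_1, \hat{p}_2 and \hat{p} correspond to `p1_2`, `p2_2` and `p_pooled`; the intermediate values are rounded.

```latex
\[
\hat{p}_1 = \frac{45}{80} = 0.5625, \qquad
\hat{p}_2 = \frac{35}{70} = 0.5, \qquad
\hat{p} = \frac{45 + 35}{80 + 70} \approx 0.5333
\]
\[
z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}
  = \frac{0.0625}
         {\sqrt{0.5333 \times 0.4667 \times \left(\frac{1}{80} + \frac{1}{70}\right)}}
  \approx \frac{0.0625}{0.08165}
  \approx 0.765
\]
```

This agrees with the printed z_score of about 0.7655.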

# Question 5: One-Proportion z-test (Two-tailed Test)
## Null Hypothesis (H0): p = 0.4
## Alternative Hypothesis (Ha): p != 0.4

In [178]:
# Parameters
n = 100
x1 = 30
p0_1 = 0.4

# Sample proportion
p_hat = x1 / n

z = (p_hat - p0_1) / math.sqrt((p0_1 * (1 - p0_1)) / n)
p_value = 2 * (1 - stats.norm.cdf(abs(z)))

print(f"z_score: {z}")
print(f"p_value: {p_value}")

z_score: -2.041241452319316
p_value: 0.04122683333716348


## Conclusion:
Since the p-value (0.041) is less than 0.05, we reject the null hypothesis. The proportion of repeat customers is significantly different from 40%.
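
A p-value of about 0.041 is close to the 0.05 threshold, so the decision depends on the significance level chosen. The sketch below recomputes the Question 5 p-value and compares it with a few levels; the levels 0.10 and 0.01 are illustrative assumptions and do not come from the lab, and `statistics.NormalDist` again stands in for `scipy.stats.norm`.

```python
import math
from statistics import NormalDist

n, x, p0 = 100, 30, 0.4  # Question 5 data

z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed

# Hypothetical significance levels, for illustration only
decisions = {alpha: p_value < alpha for alpha in (0.10, 0.05, 0.01)}

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
for alpha, reject in decisions.items():
    outcome = "reject H0" if reject else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: {outcome}")
```

With these data H0 is rejected at the 10% and 5% levels but not at the 1% level.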