# Lab: Hypothesis Testing (Z-Test) with Proportions

## Introduction

### Definition of commonly used functions

In [31]:
import math
from scipy import stats


def isNullHypothesisRejected(alpha, p_value):
    """Return True when the p-value falls below the significance level."""
    return p_value < alpha


def printConclusion(is_rejected):
    """Print the test decision for the given rejection flag."""
    if is_rejected:
        print("Reject the null hypothesis.")
    else:
        print("Fail to reject the null hypothesis.")

## Question 1: One-Proportion z-test (Left-tailed Test)

**Scenario**: A university claims that at least 65% of its graduates secure a job within six months of graduation. You survey 200 graduates and find that 120 of them are employed within six months. At a 5% significance level, test if the university's claim holds.

### State the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
- Null Hypothesis (H₀): The proportion is at least 65%.  

       H₀: p ≥ 0.65

- Alternative Hypothesis (H₁): The proportion is less than 65%.  

       H₁: p < 0.65
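
For reference, the cell below computes the usual one-proportion z statistic under H₀ and takes the left-tailed p-value from the standard normal CDF Φ. Here x is the number of successes and n the sample size, as in the code:

$$
\hat{p} = \frac{x}{n}, \qquad
z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}, \qquad
p\text{-value} = \Phi(z)
$$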

In [32]:
# Parameters
alpha = 0.05
n = 200  # sample size
x = 120  # number of successes
p0 = 0.65  # hypothesized proportion

# Sample proportion
p_hat = x / n

# Standard error
se = math.sqrt((p0 * (1 - p0)) / n)

# z-test statistic
z = (p_hat - p0) / se

# p-value (left-tailed test)
p_value = stats.norm.cdf(z)

# Output the results
print(f"Z-score: {z}")
print(f"P-value: {p_value}")

# Output the conclusion
isRejected = isNullHypothesisRejected(alpha, p_value)
printConclusion(isRejected)

Z-score: -1.4824986333222037
P-value: 0.06910383348701271
Fail to reject the null hypothesis.
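
The same steps recur in every one-proportion question of this lab, so they could be wrapped in a small helper. The sketch below is only an illustration: the function name `one_proportion_ztest` and its `alternative` parameter are assumptions, not part of the lab, and it uses `statistics.NormalDist` from the standard library instead of `scipy.stats` so that it runs without extra dependencies.

```python
import math
from statistics import NormalDist


def one_proportion_ztest(x, n, p0, alternative="two-sided"):
    """Hypothetical helper: z statistic and p-value for a one-proportion z-test."""
    p_hat = x / n                      # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    std_normal = NormalDist()          # standard normal, mean 0 and sd 1
    if alternative == "less":          # left-tailed
        p_value = std_normal.cdf(z)
    elif alternative == "greater":     # right-tailed
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value


# Question 1 data: 120 of 200 graduates employed, H0: p >= 0.65
z, p_value = one_proportion_ztest(120, 200, 0.65, alternative="less")
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
# prints: z = -1.4825, p-value = 0.0691
```

The rounded values agree with the output above, which suggests the helper mirrors the cell's calculation.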


## Question 2: Two-Proportion z-test (Two-tailed Test)

**Scenario**: A sports team wants to compare the proportion of people attending their home games to those attending away games. Out of 300 home games, 180 attendees showed up. Out of 250 away games, 140 attendees showed up. At a 5% significance level, is there a significant difference between the proportions of attendees at home and away games?

### State the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
- Null Hypothesis (H₀): The proportions are the same.  

       H₀: p1 = p2

- Alternative Hypothesis (H₁): The proportions are different.  

       H₁: p1 ≠ p2
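
For reference, the cell below uses the pooled two-proportion z statistic, where the pooled proportion estimates the common value of p1 and p2 under H₀, and Φ is the standard normal CDF:

$$
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad
p\text{-value} = 2 \left( 1 - \Phi(|z|) \right)
$$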

In [33]:
# Parameters
alpha = 0.05
n1 = 300  # sample size group 1
x1 = 180  # successes group 1
n2 = 250  # sample size group 2
x2 = 140  # successes group 2

# Sample proportions
p1 = x1 / n1
p2 = x2 / n2

# Pooled proportion
p_hat = (x1 + x2) / (n1 + n2)

# Standard error
se = math.sqrt(p_hat * (1 - p_hat) * ((1 / n1) + (1 / n2)))

# z-test statistic
z = (p1 - p2) / se

# p-value (two-tailed test)
p_value = 2 * (1 - stats.norm.cdf(abs(z)))

# Output the results
print(f"Z-score: {z}")
print(f"P-value: {p_value}")

# Output the conclusion
isRejected = isNullHypothesisRejected(alpha, p_value)
printConclusion(isRejected)

Z-score: 0.9469631093314982
P-value: 0.3436575774939179
Fail to reject the null hypothesis.
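
The pooled two-proportion calculation can likewise be packaged as a function. This is a minimal sketch under stated assumptions: the name `two_proportion_ztest` and the `alternative` parameter are hypothetical, and `statistics.NormalDist` from the standard library stands in for `scipy.stats.norm`.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2, alternative="two-sided"):
    """Hypothetical helper: pooled two-proportion z-test (z statistic, p-value)."""
    p1, p2 = x1 / n1, x2 / n2        # sample proportions
    pooled = (x1 + x2) / (n1 + n2)   # pooled proportion under H0: p1 = p2
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    std_normal = NormalDist()        # standard normal, mean 0 and sd 1
    if alternative == "less":        # left-tailed
        p_value = std_normal.cdf(z)
    elif alternative == "greater":   # right-tailed
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value


# Question 2 data: 180 of 300 (home) versus 140 of 250 (away)
z, p_value = two_proportion_ztest(180, 300, 140, 250)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
# prints: z = 0.9470, p-value = 0.3437
```

Passing `alternative="greater"` would give the right-tailed variant of the same test.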


## Question 3: One-Proportion z-test (Left-tailed Test)

**Scenario**: A school claims that at least 75% of its students pass a standardized exam. You survey 150 students and find that 100 of them passed. Is the school's claim valid at the 5% significance level?

### State the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
- Null Hypothesis (H₀): The proportion is at least 75%.  

       H₀: p ≥ 0.75

- Alternative Hypothesis (H₁): The proportion is less than 75%.  

       H₁: p < 0.75

In [34]:
# Parameters
alpha = 0.05
n = 150  # sample size
x = 100  # number of successes
p0 = 0.75  # hypothesized proportion

# Sample proportion
p_hat = x / n

# Standard error
se = math.sqrt((p0 * (1 - p0)) / n)

# z-test statistic
z = (p_hat - p0) / se

# p-value (left-tailed test)
p_value = stats.norm.cdf(z)

# Output the results
print(f"Z-score: {z}")
print(f"P-value: {p_value}")

# Output the conclusion
isRejected = isNullHypothesisRejected(alpha, p_value)
printConclusion(isRejected)

Z-score: -2.3570226039551594
P-value: 0.00921106272704948
Reject the null hypothesis.
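
An equivalent way to reach this decision is to compare the z statistic with a critical value instead of comparing the p-value with alpha. The sketch below assumes the z value printed above and uses `statistics.NormalDist` from the standard library; for a left-tailed test the critical value is the alpha quantile of the standard normal distribution.

```python
from statistics import NormalDist

alpha = 0.05
z = -2.3570226039551594  # z statistic printed above for Question 3

# Left-tailed critical value: the alpha quantile of the standard normal
z_crit = NormalDist().inv_cdf(alpha)

reject = z < z_crit  # reject H0 when z falls in the left rejection region
print(f"Critical value: {z_crit:.4f}")
print("Reject the null hypothesis." if reject else "Fail to reject the null hypothesis.")
# prints: Critical value: -1.6449
# prints: Reject the null hypothesis.
```

Both rules give the same decision here because z lies below the critical value exactly when the left-tailed p-value is below alpha.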


## Question 4: Two-Proportion z-test (Right-tailed Test)

**Scenario**: A company is comparing the promotion rates between male and female employees. The company claims that males are promoted at a higher rate than females. Out of 80 male employees, 45 have been promoted, and out of 70 female employees, 35 have been promoted. Test if males are promoted at a higher rate than females at the 5% significance level.

### State the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
- Null Hypothesis (H₀): The proportion of promoted males is less than or equal to that of females.  

       H₀: p1 ≤ p2

- Alternative Hypothesis (H₁): The proportion of promoted males is higher than that of females.  

       H₁: p1 > p2

In [35]:
# Parameters
alpha = 0.05
n1 = 80  # sample size group 1 (males)
x1 = 45  # successes group 1
n2 = 70  # sample size group 2 (females)
x2 = 35  # successes group 2

# Sample proportions
p1 = x1 / n1
p2 = x2 / n2

# Pooled proportion
p_hat = (x1 + x2) / (n1 + n2)

# Standard error
se = math.sqrt(p_hat * (1 - p_hat) * ((1 / n1) + (1 / n2)))

# z-test statistic
z = (p1 - p2) / se

# p-value (right-tailed test)
p_value = 1 - stats.norm.cdf(z)

# Output the results
print(f"Z-score: {z}")
print(f"P-value: {p_value}")

# Output the conclusion
isRejected = isNullHypothesisRejected(alpha, p_value)
printConclusion(isRejected)

Z-score: 0.7654655446197431
P-value: 0.2219971880115048
Fail to reject the null hypothesis.


## Question 5: One-Proportion z-test (Two-tailed Test)

**Scenario**: A car dealership claims that 40% of their sales come from repeat customers. You sample 100 sales records and find that 30 of them are from repeat customers. Test whether the dealership's claim is accurate at a 5% significance level.

### State the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
- Null Hypothesis (H₀): The proportion is 40%.  

       H₀: p = 0.40

- Alternative Hypothesis (H₁): The proportion is different from 40%.  

       H₁: p ≠ 0.40

In [36]:
# Parameters
alpha = 0.05
n = 100  # sample size
x = 30  # number of successes
p0 = 0.40  # hypothesized proportion

# Sample proportion
p_hat = x / n

# Standard error
se = math.sqrt((p0 * (1 - p0)) / n)

# z-test statistic
z = (p_hat - p0) / se

# p-value (two-tailed test)
p_value = 2 * (1 - stats.norm.cdf(abs(z)))

# Output the results
print(f"Z-score: {z}")
print(f"P-value: {p_value}")

# Output the conclusion
isRejected = isNullHypothesisRejected(alpha, p_value)
printConclusion(isRejected)

Z-score: -2.041241452319316
P-value: 0.04122683333716348
Reject the null hypothesis.
