# Week 10 Lab: Hypothesis Testing

## Making Decisions Under Uncertainty

**SCIE1500 - Analytical Methods for Scientists**

---

### Learning Objectives

By the end of this lab, you will be able to:

1. Calculate expected values and variances from probability distributions
2. Formulate null and alternative hypotheses correctly
3. Distinguish between one-tailed and two-tailed tests (**CRITICAL for exam Q35**)
4. Calculate p-values using `binom.pmf` and `binomtest`
5. Make statistical decisions at the 5% significance level
6. Visualize rejection regions and p-values

---

### What to Submit

1. **During Lab:** Complete **Exercise A** and show your results to your lab demonstrator
2. **By Due Date:** Upload screenshots of completed **Exercises A, B**, and **C** (practice quiz answers)

---

In [None]:
# === SETUP: Run this cell first ===
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import binom, binomtest

# Display settings
pd.set_option('display.max_rows', None)
np.set_printoptions(precision=5)

print("✓ All packages loaded successfully!")
print("\nThis week: Hypothesis Testing - Making Decisions Under Uncertainty")

---

## Part A: Review of Random Variables

### A.1 Expected Value

The **expected value** (mean) of a discrete random variable is:

$$\boxed{E[X] = \sum_{x} x \cdot P(X = x)}$$

This is the probability-weighted average of all possible values.
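
As a minimal sketch, the same weighted-sum idea can be written in a few lines of NumPy. The distribution below is the one used in the example cell of this section. The variance line relies on the standard identity $\text{Var}(X) = E[X^2] - (E[X])^2$, which this section does not state explicitly, so treat it as an assumption added for illustration.

```python
import numpy as np

# Distribution used in this section's example
x_values = np.array([0, 3, 4, 6])
probabilities = np.array([0.2, 0.4, 0.3, 0.1])

# E[X]: probability-weighted average of the possible values
mean = np.sum(x_values * probabilities)

# Var(X) = E[X^2] - (E[X])^2 (standard identity, assumed here)
mean_of_squares = np.sum(x_values**2 * probabilities)
variance = mean_of_squares - mean**2

print(f"E[X]   = {mean:.2f}")
print(f"Var(X) = {variance:.2f}")
```

The variance reuses the same pattern as the mean: replace each value $x$ by $x^2$ in the weighted sum, then subtract the squared mean.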

In [None]:
# Example: Expected Value Calculation (Q33 Style)
# Given probability distribution:
# X:    0    3    4    6
# P(X): 0.2  0.4  0.3  0.1

x_values = np.array([0, 3, 4, 6])
probabilities = np.array([0.2, 0.4, 0.3, 0.1])

# Verify probabilities sum to 1
print(f"Sum of probabilities: {probabilities.sum()} (should be 1.0)")

# Calculate E[X]
expected_value = np.sum(x_values * probabilities)

# Products are formatted to one decimal place to hide floating-point noise
print(f"\nE[X] = {x_values[0]}×{probabilities[0]} + {x_values[1]}×{probabilities[1]} + {x_values[2]}×{probabilities[2]} + {x_values[3]}×{probabilities[3]}")
print(f"E[X] = {x_values[0]*probabilities[0]:.1f} + {x_values[1]*probabilities[1]:.1f} + {x_values[2]*probabilities[2]:.1f} + {x_values[3]*probabilities[3]:.1f}")
print(f"E[X] = {expected_value:.1f}")

### A.2 Binomial Distribution Review

For $X \sim \text{Binomial}(n, p)$:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

**Key properties:**
- $E[X] = np$
- $\text{Var}(X) = np(1-p)$
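
One way to convince yourself of these properties is to build the PMF directly from the formula and compute the mean and variance as weighted sums. The sketch below does this for $n = 12$, $p = 0.5$ (the coin example used in this lab) and compares the results with `binom.pmf` and with $np$ and $np(1-p)$. The variable names are illustrative only.

```python
from math import comb

import numpy as np
from scipy.stats import binom

n, p = 12, 0.5
k = np.arange(n + 1)

# PMF built directly from C(n, k) * p^k * (1-p)^(n-k)
pmf_by_hand = np.array([comb(n, int(i)) * p**int(i) * (1 - p)**(n - int(i)) for i in k])

# Mean and variance computed from the PMF as probability-weighted sums
mean_from_pmf = np.sum(k * pmf_by_hand)
var_from_pmf = np.sum((k - mean_from_pmf)**2 * pmf_by_hand)

print("Formula matches binom.pmf:", np.allclose(pmf_by_hand, binom.pmf(k, n, p)))
print(f"Mean from PMF     = {mean_from_pmf:.4f}, np      = {n * p:.4f}")
print(f"Variance from PMF = {var_from_pmf:.4f}, np(1-p) = {n * p * (1 - p):.4f}")
```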

In [None]:
# Binomial distribution example: 12 coin tosses
n = 12
p = 0.5

k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p)

# Create a table
df = pd.DataFrame({
    'k (successes)': k_values,
    'P(X = k)': np.round(probabilities, 5)
})
print(f"Binomial Distribution: X ~ Binomial({n}, {p})")
print(df.to_string(index=False))

# Mean and Variance
print(f"\nE[X] = np = {n}×{p} = {n*p}")
print(f"Var(X) = np(1-p) = {n}×{p}×{1-p} = {n*p*(1-p)}")

In [None]:
# Visualize the binomial distribution
plt.figure(figsize=(10, 5))
plt.bar(k_values, probabilities, color='steelblue', edgecolor='black', alpha=0.7)
plt.axvline(x=n*p, color='red', linestyle='--', linewidth=2, label=f'Mean = {n*p}')
plt.xlabel('k (Number of Successes)', fontsize=12)
plt.ylabel('P(X = k)', fontsize=12)
plt.title(f'Binomial Distribution: X ~ Binomial({n}, {p})', fontsize=14)
plt.xticks(k_values)
plt.legend()
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

---

## Part B: Introduction to Hypothesis Testing

### B.1 The Big Picture

In Week 9, we calculated probabilities. Now we ask: **Can we use data to test claims?**

**Example:** You toss a coin 12 times and get 10 heads. Is the coin fair ($p = 0.5$)?

### B.2 The p-Value

The **p-value** answers: *How likely is it to observe data this extreme (or more extreme) if the null hypothesis ($H_0$) is true?*

$$\boxed{\text{p-value} = P(\text{data this extreme or more} \;|\; H_0 \text{ is true})}$$
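
For the coin example (10 heads in 12 tosses), "this extreme or more" in the upper direction means $X \ge 10$. A minimal sketch of that tail probability is shown below. It uses `binom.sf` (the survival function, $P(X > k)$), which this lab does not otherwise use, so it is offered as an optional cross-check rather than the lab's method.

```python
from scipy.stats import binom

n, p_null, observed_k = 12, 0.5, 10

# Upper-tail p-value: P(X >= k) = P(X > k - 1), the survival function at k - 1
p_upper = binom.sf(observed_k - 1, n, p_null)

# The same quantity as an explicit sum of PMF values
p_upper_sum = sum(binom.pmf(k, n, p_null) for k in range(observed_k, n + 1))

print(f"P(X >= {observed_k}) via sf:  {p_upper:.5f}")
print(f"P(X >= {observed_k}) via sum: {p_upper_sum:.5f}")
```

Note the `observed_k - 1`: passing `observed_k` itself would exclude the observed value from the tail.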

### B.3 The Decision Rule

- If **p-value < 0.05**: Reject $H_0$ (statistically significant)
- If **p-value ≥ 0.05**: Fail to reject $H_0$ (not enough evidence)
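
The rule above can be sketched as a small function. The name `decide` is hypothetical and not part of any library. The point of the example is the boundary case: a p-value exactly equal to 0.05 does not lead to rejection, because the rule uses a strict inequality.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a given p-value (hypothetical helper)."""
    if p_value < alpha:
        return "Reject H0"
    return "Fail to reject H0"


# A clearly significant value, the boundary case, and a clearly non-significant value
for p_value in (0.019, 0.05, 0.37):
    print(f"p-value = {p_value:<5} -> {decide(p_value)}")
```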

---

## Part C: Calculating p-Values

### C.1 Using `binom.pmf` (Manual Method)

**Scenario:** You observed 10 heads out of 12 tosses. Is the coin fair?

If $p = 0.5$ (fair coin), how likely is getting 10+ heads?
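
Before computing this exactly, it can help to estimate it by simulation: repeat the 12-toss experiment many times with a fair coin and count how often 10 or more heads appear. The sketch below is an optional illustration. The seed and the number of simulated experiments are arbitrary choices, and the estimate will differ slightly from the exact value.

```python
import numpy as np

rng = np.random.default_rng(seed=1500)  # arbitrary fixed seed for reproducibility
n, p_null, observed_k = 12, 0.5, 10
n_experiments = 200_000                 # arbitrary; more experiments give a tighter estimate

# Each entry is the number of heads in one simulated run of 12 fair tosses
heads = rng.binomial(n, p_null, size=n_experiments)

# Fraction of simulated experiments at least as extreme as the observation
estimate = np.mean(heads >= observed_k)
print(f"Simulated P(X >= {observed_k}) is approximately {estimate:.4f}")
```

The estimate should land close to the exact tail probability computed with `binom.pmf` in the next cell.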

In [None]:
# Calculate p-value manually using binom.pmf
n = 12
p_null = 0.5  # Null hypothesis: fair coin
observed_k = 10

# Calculate all probabilities
k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p_null)

# Create table
df = pd.DataFrame({'k': k_values, 'P(X=k)': np.round(probabilities, 5)})
print("Binomial probabilities under H₀: p = 0.5")
print(df.to_string(index=False))

# p-value for ONE-TAILED test (upper): P(X >= 10)
p_value_upper = sum(probabilities[k_values >= observed_k])
print(f"\nP(X ≥ {observed_k}) = {p_value_upper:.5f}")

In [None]:
# For TWO-TAILED test, include equally extreme values from BOTH tails
# If 10 heads is extreme (deviation of 4 from mean of 6), 
# then 2 heads is equally extreme (also deviation of 4)

p_value_lower = sum(probabilities[k_values <= 2])  # P(X <= 2)
p_value_two_tailed = p_value_upper + p_value_lower

print(f"One-tailed (upper): P(X ≥ {observed_k}) = {p_value_upper:.5f}")
print(f"One-tailed (lower): P(X ≤ 2) = {p_value_lower:.5f}")
print(f"Two-tailed: {p_value_upper:.5f} + {p_value_lower:.5f} = {p_value_two_tailed:.5f}")

### C.2 Using `binomtest` (Recommended Method)

The `binomtest` function calculates p-values directly.

**Syntax:** `binomtest(k, n, p, alternative)`

| Alternative | Hypotheses | Use When |
|-------------|------------|----------|
| `'greater'` | $H_0: p \le p_0$ vs $H_a: p > p_0$ | Testing if parameter is **greater** |
| `'less'` | $H_0: p \ge p_0$ vs $H_a: p < p_0$ | Testing if parameter is **less** |
| `'two-sided'` | $H_0: p = p_0$ vs $H_a: p \ne p_0$ | Testing if parameter is **different** |
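
The worked examples in this lab use `'greater'` and `'two-sided'`. As a sketch of the remaining option, the block below runs a lower-tailed test on hypothetical data (2 heads in 12 tosses) and checks that the `'less'` p-value equals the lower-tail probability $P(X \le k)$ from `binom.cdf`.

```python
from scipy.stats import binom, binomtest

n, p_null, observed_k = 12, 0.5, 2  # hypothetical data: only 2 heads in 12 tosses

# Lower-tailed test: Ha is p < 0.5
result_less = binomtest(k=observed_k, n=n, p=p_null, alternative='less')

# For 'less', the p-value is the lower tail P(X <= k), i.e. the CDF at k
lower_tail = binom.cdf(observed_k, n, p_null)

print(f"binomtest p-value: {result_less.pvalue:.5f}")
print(f"binom.cdf:         {lower_tail:.5f}")
```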

In [None]:
# Using binomtest
n = 12
observed_k = 10
p_null = 0.5

print("Using scipy.stats.binomtest()")
print("="*50)

# One-tailed test (upper): Is coin biased toward heads?
result_greater = binomtest(k=observed_k, n=n, p=p_null, alternative='greater')
print(f"\nOne-tailed test (Hₐ: p > 0.5):")
print(f"  p-value = {result_greater.pvalue:.5f}")

# Two-tailed test: Is coin biased (either direction)?
result_two_sided = binomtest(k=observed_k, n=n, p=p_null, alternative='two-sided')
print(f"\nTwo-tailed test (Hₐ: p ≠ 0.5):")
print(f"  p-value = {result_two_sided.pvalue:.5f}")

# Decision
alpha = 0.05
print(f"\nDecision at α = {alpha}:")
print(f"  One-tailed: {'REJECT H₀' if result_greater.pvalue < alpha else 'Fail to reject H₀'}")
print(f"  Two-tailed: {'REJECT H₀' if result_two_sided.pvalue < alpha else 'Fail to reject H₀'}")

### C.3 Visualizing the p-Value

In [None]:
# Visualize p-value for one-tailed test (upper)
n = 12
p_null = 0.5
observed_k = 10

k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p_null)

# Color bars: coral for p-value region (k >= observed)
colors = ['coral' if k >= observed_k else 'steelblue' for k in k_values]

plt.figure(figsize=(10, 6))
plt.bar(k_values, probabilities, color=colors, edgecolor='black', alpha=0.7)

# Add annotations
p_value = sum(probabilities[k_values >= observed_k])
plt.axvline(x=observed_k - 0.5, color='red', linestyle='--', alpha=0.7)

plt.text(0.02, 0.95, f'p-value = P(X ≥ {observed_k}) = {p_value:.4f}',
         transform=plt.gca().transAxes, fontsize=12,
         bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8))

plt.xlabel('k (Number of Heads)', fontsize=12)
plt.ylabel('P(X = k)', fontsize=12)
plt.title('One-Tailed Test: p-value = P(X ≥ 10) under H₀: p = 0.5\n(Coral bars = p-value region)', fontsize=14)
plt.xticks(k_values)
plt.grid(axis='y', alpha=0.3)

# Legend
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor='coral', label='p-value region'),
                   Patch(facecolor='steelblue', label='Not in p-value')]
plt.legend(handles=legend_elements, loc='upper right')

plt.tight_layout()
plt.show()

---

## Part D: The Four Steps of Hypothesis Testing

### D.1 The Framework

Every hypothesis test has four components:

1. **Null Hypothesis ($H_0$):** The "status quo" claim we want to test
2. **Alternative Hypothesis ($H_a$):** What we suspect might be true
3. **Test Statistic:** A numerical summary of the sample data
4. **Decision Rule:** When to reject $H_0$ (based on significance level $\alpha$)
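
The four components can be bundled into one small function. This is a sketch only: `run_binomial_test` is a hypothetical helper name, and it assumes a binomial setting tested with `scipy.stats.binomtest` as in the rest of this lab.

```python
from scipy.stats import binomtest


def run_binomial_test(k, n, p0, alternative, alpha=0.05):
    """Hypothetical helper bundling the components of a binomial test."""
    p_hat = k / n  # test statistic: the sample proportion
    # The hypotheses are encoded by p0 (null value) and `alternative` (direction of Ha)
    p_value = binomtest(k=k, n=n, p=p0, alternative=alternative).pvalue
    reject = bool(p_value < alpha)  # decision rule at significance level alpha
    return {"p_hat": p_hat, "p_value": p_value, "reject_H0": reject}


# Coin example from Part C: 10 heads in 12 tosses, Ha: p > 0.5
summary = run_binomial_test(k=10, n=12, p0=0.5, alternative='greater')
print(summary)
```

Returning a dictionary keeps the statistic, the p-value and the decision together, which makes it harder to report one without the others.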

---

## Part E: One-Tailed vs Two-Tailed Tests

### ⚠️ CRITICAL FOR EXAM Q35 ⚠️

The choice between one-tailed and two-tailed tests depends on the **research question**.

| Test Type | Hypotheses | Use When |
|-----------|------------|----------|
| **One-tailed (upper)** | $H_0: p \le p_0$ vs $H_a: p > p_0$ | "Is it **greater** than...?" |
| **One-tailed (lower)** | $H_0: p \ge p_0$ vs $H_a: p < p_0$ | "Is it **less** than...?" |
| **Two-tailed** | $H_0: p = p_0$ vs $H_a: p \ne p_0$ | "Is it **different** from...?" |
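
The choice matters numerically. When $p_0 = 0.5$ the null distribution is symmetric, so the two-tailed p-value is twice the one-tailed p-value (as long as that does not exceed 1). The sketch below checks this for the coin example. For other values of $p_0$ the distribution is not symmetric and the factor of two should not be assumed.

```python
from scipy.stats import binomtest

n, observed_k = 12, 10  # coin example: 10 heads in 12 tosses

one_tailed = binomtest(k=observed_k, n=n, p=0.5, alternative='greater').pvalue
two_tailed = binomtest(k=observed_k, n=n, p=0.5, alternative='two-sided').pvalue

print(f"One-tailed: {one_tailed:.5f}")
print(f"Two-tailed: {two_tailed:.5f}")
print(f"Ratio:      {two_tailed / one_tailed:.2f}")
```

This is why the same data can be significant under a one-tailed test and not significant under a two-tailed test.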

### E.1 Example: Party Support (One-Tailed Test)

**Problem:** A party leader claims support has **increased** from the previous election's 40%. A survey of 20 people finds 10 supporters. Test this claim.

In [None]:
# One-tailed test: Has support INCREASED from 40%?
print("="*60)
print("ONE-TAILED TEST: Has party support INCREASED from 40%?")
print("="*60)

n = 20
observed_k = 10
p_null = 0.40

print(f"\nData: {observed_k} supporters out of {n} surveyed")
print(f"Sample proportion: p̂ = {observed_k}/{n} = {observed_k/n}")

# Step 1: Hypotheses
print(f"\n--- Step 1: Hypotheses ---")
print(f"H₀: p ≤ 0.40 (support has NOT increased)")
print(f"Hₐ: p > 0.40 (support HAS increased)")

# Step 2: Test statistic
print(f"\n--- Step 2: Test Statistic ---")
p_hat = observed_k / n
print(f"p̂ = {observed_k}/{n} = {p_hat}")

# Step 3: p-value
print(f"\n--- Step 3: Calculate p-value ---")
result = binomtest(k=observed_k, n=n, p=p_null, alternative='greater')
print(f"p-value = P(X ≥ {observed_k} | p = 0.40) = {result.pvalue:.5f}")

# Step 4: Decision
print(f"\n--- Step 4: Decision (α = 0.05) ---")
if result.pvalue < 0.05:
    print(f"Since {result.pvalue:.4f} < 0.05, REJECT H₀")
    print("Conclusion: Evidence supports that support has increased.")
else:
    print(f"Since {result.pvalue:.4f} ≥ 0.05, FAIL TO REJECT H₀")
    print("Conclusion: Insufficient evidence that support has increased.")

### E.2 Finding the Rejection Region

Instead of calculating a p-value for each observation, we can find the **critical value**: the boundary of the rejection region.
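
As a compact sketch of the idea, the critical value for an upper-tailed test is the smallest $k$ whose tail probability $P(X \ge k)$ under $H_0$ falls below $\alpha$. The block below computes all tail probabilities at once with `binom.sf` (an alternative to looping over `binomtest`, assumed here for brevity) using the party-support setup of $n = 20$, $p_0 = 0.40$.

```python
import numpy as np
from scipy.stats import binom

n, p_null, alpha = 20, 0.40, 0.05

k_values = np.arange(n + 1)
# Upper-tail probability P(X >= k) for every k, using P(X >= k) = sf(k - 1)
upper_tail = binom.sf(k_values - 1, n, p_null)

# Smallest k whose upper-tail probability is below alpha
critical_value = int(k_values[upper_tail < alpha].min())

print(f"Critical value: k* = {critical_value}")
print(f"P(X >= k*)     = {upper_tail[critical_value]:.5f}")
print(f"P(X >= k* - 1) = {upper_tail[critical_value - 1]:.5f}")
```

Because the binomial distribution is discrete, the tail probability at the critical value is usually strictly less than $\alpha$ rather than equal to it.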

In [None]:
# Find the critical value (rejection region boundary)
n = 20
p_null = 0.40
alpha = 0.05

print(f"Finding Critical Value for One-Tailed Test (Upper)")
print(f"H₀: p ≤ {p_null} vs Hₐ: p > {p_null}")
print(f"n = {n}, α = {alpha}")
print("="*50)

print("\n  k    P(X ≥ k)    Reject H₀?")
print("-"*35)

critical_value = None
for k in range(8, 16):  # Check relevant range
    result = binomtest(k=k, n=n, p=p_null, alternative='greater')
    reject = "YES" if result.pvalue < alpha else "NO"
    marker = " ← Critical value" if result.pvalue < alpha and critical_value is None else ""
    if result.pvalue < alpha and critical_value is None:
        critical_value = k
    print(f"  {k}    {result.pvalue:.5f}     {reject}{marker}")

print(f"\nCritical value: k* = {critical_value}")
print(f"Rule: Reject H₀ if observed successes ≥ {critical_value}")
print(f"      (equivalently, if p̂ ≥ {critical_value/n})")

In [None]:
# Visualize rejection region
n = 20
p_null = 0.40
observed_k = 10
critical_k = 13

k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p_null)

# Color: coral for rejection region, gold for the observed value
colors = []
for k in k_values:
    if k >= critical_k:
        colors.append('coral')  # Rejection region
    elif k == observed_k:
        colors.append('gold')  # Observed value
    else:
        colors.append('steelblue')

plt.figure(figsize=(12, 5))
plt.bar(k_values, probabilities, color=colors, edgecolor='black', alpha=0.7)
plt.axvline(x=critical_k - 0.5, color='red', linestyle='--', linewidth=2, label=f'Critical value k* = {critical_k}')
plt.axvline(x=n*p_null, color='green', linestyle='-', linewidth=2, label=f'Expected = {n*p_null}')

plt.xlabel('k (Number of Supporters)', fontsize=12)
plt.ylabel('P(X = k) under H₀: p = 0.40', fontsize=12)
plt.title(f'Rejection Region for One-Tailed Test\nReject H₀ if k ≥ {critical_k}; Observed k = {observed_k} (NOT in rejection region)', fontsize=13)
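plt.xticks(k_values)
plt.grid(axis='y', alpha=0.3)

# Single legend: keep the line labels and add patches for the bar colors
# (calling plt.legend() twice would replace the first legend and drop the line labels)
from matplotlib.patches import Patch
line_handles, _ = plt.gca().get_legend_handles_labels()
legend_elements = [Patch(facecolor='coral', label='Rejection region'),
                   Patch(facecolor='gold', label=f'Observed (k={observed_k})'),
                   Patch(facecolor='steelblue', label='Non-rejection region')]
plt.legend(handles=line_handles + legend_elements, loc='upper right')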

plt.tight_layout()
plt.show()

### E.3 Two-Tailed Test Example

**Problem:** We want to test if support has **changed** (up OR down) from 40%. Same data: 10 supporters out of 20.
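
When $p_0 \ne 0.5$ the null distribution is not symmetric, so "equally extreme in the other tail" needs a definition. A common convention, and to the best of my understanding the one `binomtest` follows for `'two-sided'`, is to add up the probabilities of all outcomes that are no more likely than the observed one. The sketch below applies that rule by hand and compares it with `binomtest`. The relative tolerance `1e-7` is an assumption used to treat floating-point ties as equal.

```python
import numpy as np
from scipy.stats import binom, binomtest

n, p_null, observed_k = 20, 0.40, 10

k_values = np.arange(n + 1)
pmf = binom.pmf(k_values, n, p_null)
pmf_observed = binom.pmf(observed_k, n, p_null)

# Outcomes counted as "at least as extreme": no more probable than the observed one
# (small relative tolerance so floating-point ties are included)
as_extreme = pmf <= pmf_observed * (1 + 1e-7)
manual_p_value = pmf[as_extreme].sum()

scipy_p_value = binomtest(k=observed_k, n=n, p=p_null, alternative='two-sided').pvalue

print(f"Extreme outcomes: {k_values[as_extreme].tolist()}")
print(f"Manual p-value:   {manual_p_value:.5f}")
print(f"binomtest:        {scipy_p_value:.5f}")
```

Note that the lower tail here stops at a different distance from the mean than the upper tail, which is what the asymmetry of the distribution implies.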

In [None]:
# Two-tailed test: Has support CHANGED from 40%?
print("="*60)
print("TWO-TAILED TEST: Has party support CHANGED from 40%?")
print("="*60)

n = 20
observed_k = 10
p_null = 0.40

# Hypotheses
print(f"\nH₀: p = 0.40 (support is unchanged)")
print(f"Hₐ: p ≠ 0.40 (support has changed)")

# p-value
result = binomtest(k=observed_k, n=n, p=p_null, alternative='two-sided')
print(f"\np-value = {result.pvalue:.5f}")

# Decision
if result.pvalue < 0.05:
    print(f"Since {result.pvalue:.4f} < 0.05, REJECT H₀")
    print("Conclusion: Evidence supports that support has changed.")
else:
    print(f"Since {result.pvalue:.4f} ≥ 0.05, FAIL TO REJECT H₀")
    print("Conclusion: Insufficient evidence that support has changed.")

In [None]:
# Visualize two-tailed test
n = 20
p_null = 0.40
observed_k = 10

k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p_null)
prob_observed = binom.pmf(observed_k, n, p_null)

# For two-tailed: color bars with P(X=k) <= P(X=observed) as extreme
eps = 1e-10
colors = ['coral' if prob <= prob_observed + eps else 'steelblue' for prob in probabilities]

plt.figure(figsize=(12, 5))
plt.bar(k_values, probabilities, color=colors, edgecolor='black', alpha=0.7)
plt.axhline(y=prob_observed, color='red', linestyle='--', alpha=0.7, label=f'P(X={observed_k})')

# p-value
p_value = sum(p for p in probabilities if p <= prob_observed + eps)
plt.text(0.02, 0.95, f'p-value = {p_value:.4f}',
         transform=plt.gca().transAxes, fontsize=12,
         bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8))

plt.xlabel('k (Number of Supporters)', fontsize=12)
plt.ylabel('P(X = k)', fontsize=12)
plt.title('Two-Tailed Test: p-value includes both tails\n(Coral bars = values as extreme or more extreme than observed)', fontsize=13)
plt.xticks(k_values)
plt.legend()
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

### E.4 Comparing One-Tailed vs Two-Tailed

In [None]:
# Side-by-side comparison
n = 20
p_null = 0.5
observed = 14

k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p_null)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# One-tailed test (upper)
colors_one = ['coral' if k >= observed else 'steelblue' for k in k_values]
axes[0].bar(k_values, probabilities, color=colors_one, edgecolor='black', alpha=0.7)
p_value_one = binomtest(k=observed, n=n, p=p_null, alternative='greater').pvalue
axes[0].set_title(f'One-Tailed Test (Hₐ: p > 0.5)\np-value = {p_value_one:.4f}', fontsize=12)
axes[0].set_xlabel('k')
axes[0].set_ylabel('P(X = k)')
axes[0].axvline(x=n*p_null, color='green', linestyle='--', label=f'Mean = {n*p_null}')
axes[0].legend()  # without this call the 'Mean' label is never displayed

# Two-tailed test
deviation = abs(observed - n*p_null)
lower_extreme = int(n*p_null - deviation)
colors_two = ['coral' if (k >= observed or k <= lower_extreme) else 'steelblue' for k in k_values]
axes[1].bar(k_values, probabilities, color=colors_two, edgecolor='black', alpha=0.7)
p_value_two = binomtest(k=observed, n=n, p=p_null, alternative='two-sided').pvalue
axes[1].set_title(f'Two-Tailed Test (Hₐ: p ≠ 0.5)\np-value = {p_value_two:.4f}', fontsize=12)
axes[1].set_xlabel('k')
axes[1].set_ylabel('P(X = k)')
axes[1].axvline(x=n*p_null, color='green', linestyle='--', label=f'Mean = {n*p_null}')
axes[1].legend()  # without this call the 'Mean' label is never displayed

fig.suptitle(f'Observed: {observed} successes out of {n}', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print(f"Summary for observed = {observed}, n = {n}, p₀ = {p_null}:")
print(f"  One-tailed p-value: {p_value_one:.4f} → {'REJECT' if p_value_one < 0.05 else 'Fail to reject'} at α = 0.05")
print(f"  Two-tailed p-value: {p_value_two:.4f} → {'REJECT' if p_value_two < 0.05 else 'Fail to reject'} at α = 0.05")

---

## Part F: Q35 Exam Question - New Virus Variant

### ⚠️ THIS IS CRITICAL FOR THE EXAM ⚠️

**Scenario:** Health experts suspect a new virus variant is **more infectious** than the old one (which had $p = 0.50$ transmission rate). Out of 11 contacts, 9 resulted in infections.

**Question:** Is this evidence that the new variant is more infectious?
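
Because $p_0 = 0.5$, the one-tailed p-value can be worked out exactly by counting: every sequence of 11 outcomes has probability $(1/2)^{11}$, so $P(X \ge 9)$ is the number of sequences with 9, 10 or 11 infections divided by $2^{11}$. The sketch below does this with exact fractions, which is a useful check for a hand calculation.

```python
from fractions import Fraction
from math import comb

n, observed_k = 11, 9

# Under H0: p = 0.5, every sequence of 11 outcomes has probability (1/2)^11,
# so P(X >= 9) is a count of favourable sequences divided by 2^11
favourable = sum(comb(n, k) for k in range(observed_k, n + 1))
p_value = Fraction(favourable, 2**n)

print(f"Favourable sequences: {favourable}")
print(f"P(X >= {observed_k}) = {p_value} = {float(p_value):.5f}")
```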

In [None]:
# Q35: Is the new variant MORE infectious?
print("="*70)
print("Q35 EXAM QUESTION: Is the new variant MORE infectious?")
print("="*70)

# Given data
n = 11
observed_infections = 9
p_old = 0.50

print(f"\nData: {observed_infections} infections out of {n} contacts")
print(f"Old variant rate: p = {p_old}")
print(f"Question: Is the new variant MORE infectious?")

# The CORRECT approach: One-tailed test
print("\n" + "-"*70)
print("✓ CORRECT APPROACH: One-tailed test (because we ask 'MORE infectious')")
print("-"*70)
print("H₀: p ≤ 0.50 (new variant is NOT more infectious)")
print("Hₐ: p > 0.50 (new variant IS more infectious)")

result_one_tailed = binomtest(k=observed_infections, n=n, p=p_old, alternative='greater')
print(f"\np-value = P(X ≥ {observed_infections} | p = 0.5) = {result_one_tailed.pvalue:.5f}")
print(f"         = 67/2048 ≈ 0.033")

print(f"\nSince {result_one_tailed.pvalue:.3f} < 0.05, REJECT H₀")
print("→ Conclusion: Evidence supports that new variant IS more infectious")

# The WRONG approach: Two-tailed test
print("\n" + "-"*70)
print("✗ INCORRECT APPROACH: Two-tailed test")
print("-"*70)
print("H₀: p = 0.50")
print("Hₐ: p ≠ 0.50")

result_two_tailed = binomtest(k=observed_infections, n=n, p=p_old, alternative='two-sided')
print(f"\np-value = {result_two_tailed.pvalue:.5f}")

print(f"\nSince {result_two_tailed.pvalue:.3f} > 0.05, would FAIL TO REJECT H₀")
print("→ This leads to the WRONG conclusion for this question!")

print("\n" + "="*70)
print("KEY INSIGHT: Match the test type to the research question!")
print("  'More infectious' → One-tailed (upper) test")
print("  'Different' → Two-tailed test")
print("="*70)

In [None]:
# Visualize Q35
n = 11
p_null = 0.5
observed_k = 9

k_values = np.arange(0, n + 1)
probabilities = binom.pmf(k_values, n, p_null)

colors = ['coral' if k >= observed_k else 'steelblue' for k in k_values]

plt.figure(figsize=(10, 6))
plt.bar(k_values, probabilities, color=colors, edgecolor='black', alpha=0.7)

# Annotations
p_value = sum(probabilities[k_values >= observed_k])
plt.annotate(f'Observed: k = {observed_k}', xy=(observed_k, probabilities[observed_k]),
            xytext=(observed_k - 2, probabilities[observed_k] + 0.08),
            arrowprops=dict(arrowstyle='->', color='red'),
            fontsize=11, color='red', fontweight='bold')

plt.text(0.02, 0.95, f'p-value = P(X ≥ {observed_k}) = {p_value:.4f}',
         transform=plt.gca().transAxes, fontsize=12,
         bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8))

plt.text(0.02, 0.85, 'Since 0.033 < 0.05: REJECT H₀',
         transform=plt.gca().transAxes, fontsize=12,
         bbox=dict(boxstyle='round', facecolor='lightcoral', alpha=0.8))

plt.xlabel('k (Number of Infections)', fontsize=12)
plt.ylabel('P(X = k) under H₀: p = 0.5', fontsize=12)
plt.title('Q35: Testing if New Variant is MORE Infectious\nOne-Tailed Test (Hₐ: p > 0.5)', fontsize=14)
plt.xticks(k_values)
plt.grid(axis='y', alpha=0.3)

from matplotlib.patches import Patch
legend_elements = [Patch(facecolor='coral', label='p-value region (k ≥ 9)'),
                   Patch(facecolor='steelblue', label='Not in p-value')]
# Legend goes in the upper right so it does not cover the text boxes in the upper left
plt.legend(handles=legend_elements, loc='upper right')

plt.tight_layout()
plt.show()

---

## 📝 STUDENT EXERCISE A (Show Demonstrator)

### Two-Tailed Test with Larger Sample

**Problem:** Repeat the party support test with a larger sample:
- 200 people surveyed, 100 say they support the party
- Test if support has **changed** from 40% (two-tailed)

**Tasks:**
1. State the null and alternative hypotheses
2. Calculate the p-value using `binomtest`
3. Make a decision at α = 0.05
4. Visualize the distribution and p-value region

In [None]:
# EXERCISE A: Two-tailed test with larger sample
n = 200
observed_k = 100
p_null = 0.40

# YOUR CODE HERE
# 1. State hypotheses (write in comments or print statements)
# print("H₀: ...")
# print("Hₐ: ...")

# 2. Calculate p-value
# result = binomtest(k=..., n=..., p=..., alternative='two-sided')
# print(f"p-value = {result.pvalue}")

# 3. Make decision
# if result.pvalue < 0.05:
#     print("REJECT H₀")
# else:
#     print("Fail to reject H₀")

In [None]:
# EXERCISE A: Visualization
# Create a bar plot showing the p-value region

# YOUR CODE HERE
# Hint: Use k_values from around 60 to 120 for better visualization
# k_values = np.arange(60, 121)
# probabilities = binom.pmf(k_values, n, p_null)
# ...

---

## 📝 STUDENT EXERCISE B (Upload)

### Virus Transmissibility Test

**Problem:** A virus is known to have a transmission rate of 60% (p = 0.6). A new variant is claimed to be **more transmissible**. Out of 300 contact cases, 190 resulted in infections.

**Conduct a formal hypothesis test:**
1. What are your null and alternative hypotheses?
2. What is your test statistic?
3. What is the p-value?
4. What is your conclusion/decision?

### EXERCISE B: Your Answers

**Null Hypothesis:**

$H_0:$ _______________

**Alternative Hypothesis:**

$H_a:$ _______________

**Test Statistic:**

$\hat{p} =$ _______________

**p-value:**

_______________

**Rejection Region:**

_______________

**DECISION:**

_______________

In [None]:
# EXERCISE B: Calculate p-value
n = 300
observed_k = 190
p_null = 0.60

# YOUR CODE HERE
# result = binomtest(k=..., n=..., p=..., alternative='...')
# print(f"p-value = {result.pvalue}")

In [None]:
# EXERCISE B: Visualization
# Create a plot showing the distribution and p-value region

# YOUR CODE HERE

---

## 📝 STUDENT EXERCISE C (Upload)

### Practice Quiz Answers

Complete the **Probability practice quiz 2 (binomial distribution)(Week 10)** on LMS, then record your answers below.

**EXERCISE C: Practice Quiz Answers**

Q1. Answer: _________________

Q2. Answer: _________________

Q3. Answer: _________________

Q4. Answer: _________________

Q5. Answer: _________________

---

## Summary: Key Concepts

### The Four Steps of Hypothesis Testing

1. **State hypotheses** ($H_0$ and $H_a$)
2. **Calculate test statistic** (e.g., $\hat{p} = k/n$)
3. **Compute p-value**
4. **Make decision** (reject or fail to reject $H_0$)

### One-Tailed vs Two-Tailed Tests

| Question Type | Test Type | `alternative` |
|---------------|-----------|---------------|
| "Is it **greater** than...?" | One-tailed (upper) | `'greater'` |
| "Is it **less** than...?" | One-tailed (lower) | `'less'` |
| "Is it **different** from...?" | Two-tailed | `'two-sided'` |

### Key Python Functions

```python
from scipy.stats import binom, binomtest

# Calculate P(X = k)
binom.pmf(k, n, p)

# Hypothesis test with p-value
result = binomtest(k=observed, n=n, p=p_null, alternative='greater')
result.pvalue  # The p-value
```

### Decision Rule

At significance level $\alpha = 0.05$:
- **p-value < 0.05:** Reject $H_0$ (statistically significant)
- **p-value ≥ 0.05:** Fail to reject $H_0$ (not enough evidence)

### Exam Tips (Q33, Q35)

1. **Q33 (Expected Value):** $E[X] = \sum x \cdot P(X=x)$
2. **Q35 (Hypothesis Testing):**
   - "More infectious" → **One-tailed test** (upper)
   - "Different" → Two-tailed test
   - Match the test type to the research question!

---

## What's Next?

**Week 11** moves to **Trigonometric Functions**:
- Modeling periodic phenomena (circadian rhythms, seasonal patterns)
- Sine and cosine functions
- Amplitude, period, and phase shift

---

*Hypothesis testing gives science its teeth: it allows us to make rigorous decisions based on evidence, distinguishing real effects from statistical noise.*