# Chapter 2 — Simple Comparative Experiments (Deep-Dive Notes + Python)

> These notes give you the **big picture** and the *mechanics* you’ll actually use: definitions, formulas (with `$…$` or `$$…$$` LaTeX), assumptions, diagnostic plots, and **Python** examples you can run to reproduce results (including the multi-choice problems you shared).

---

## 0) The Big Picture

**Goal:** Compare two conditions (treatments) using data with noise. Decide if observed differences are real or just random.

**Workflow:**
1. **Design** the experiment → randomize; if possible, **pair/block** to reduce noise.
2. **Plot & check assumptions** → histograms/boxplots/QQ plots; think independence.
3. **Choose the right model/test** → Z, t (pooled/Welch), paired t, or F for variances.
4. **Compute test statistic & P-value** → interpret with your $\alpha$.
5. **Report a CI** → magnitude + uncertainty beats “significant/not”.
6. **Think power & sample size** → do we have enough $n$ to see effects we care about?

---

## 1) Random Variables, Expectation, Variance (Why so many “definitions”?)

We’ll always distinguish **population** (unknown, Greek letters) vs **sample** (observed, Latin):

- **Population mean (expectation)**  
  $$
  \mu = E[Y] =
  \begin{cases}
  \displaystyle \sum_y y\,p(y), & \text{discrete}\\[6pt]
  \displaystyle \int_{-\infty}^\infty y\,f(y)\,dy, & \text{continuous}
  \end{cases}
  $$
  *Why two forms?* Discrete uses a **sum** over masses $p(y)$; continuous uses an **integral** over density $f(y)$.

- **Population variance**  
  $$
  \sigma^2 = \operatorname{Var}(Y) = E\big[(Y-\mu)^2\big] =
  \begin{cases}
  \displaystyle \sum_y (y-\mu)^2\,p(y), & \text{discrete}\\[6pt]
  \displaystyle \int_{-\infty}^\infty (y-\mu)^2 f(y)\,dy, & \text{continuous}
  \end{cases}
  $$
  **Standard deviation** is just the scale: $\sigma=\sqrt{\sigma^2}$.

- **Sample statistics (point estimators)**  
  $$
  \bar{y}=\frac{1}{n}\sum_{i=1}^n y_i,\qquad
  s^2=\frac{1}{n-1}\sum_{i=1}^n (y_i-\bar{y})^2,\qquad
  s=\sqrt{s^2}.
  $$
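  *Why $n-1$?* **Bessel’s correction** makes $s^2$ **unbiased** for $\sigma^2$ under i.i.d. sampling from any population with finite variance (normality is not required); note that $s$ itself is still slightly biased for $\sigma$.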

- **Covariance & Independence**  
  $$
  \operatorname{Cov}(Y_1,Y_2)=E\!\left[(Y_1-\mu_1)(Y_2-\mu_2)\right],\quad
  Y_1\perp Y_2\Rightarrow \operatorname{Cov}(Y_1,Y_2)=0.
  $$
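
A quick simulation can make Bessel's correction concrete. The sketch below assumes a normal population with a hypothetical variance $\sigma^2=4$ and small samples of size $n=5$; the seed, sizes, and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, for reproducibility
sigma2, n, reps = 4.0, 5, 50_000      # hypothetical population variance and sizes

# Each row is one sample of size n from N(10, sigma2)
samples = rng.normal(loc=10.0, scale=np.sqrt(sigma2), size=(reps, n))

var_biased = samples.var(axis=1, ddof=0).mean()    # divides by n
var_unbiased = samples.var(axis=1, ddof=1).mean()  # divides by n-1 (Bessel)

print(f"divide by n:   {var_biased:.3f}  (theory: {(n - 1) / n * sigma2:.3f})")
print(f"divide by n-1: {var_unbiased:.3f}  (theory: {sigma2:.3f})")
```

Averaged over many samples, the $n$ divisor should settle near $\frac{n-1}{n}\sigma^2$ and the $n-1$ divisor near $\sigma^2$.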

---

## 2) Sampling Distributions & Why t/χ²/F Show Up

- **Normal**  
  $$
  Y\sim \mathcal{N}(\mu,\sigma^2),\quad
  Z=\frac{Y-\mu}{\sigma}\sim \mathcal{N}(0,1).
  $$

- **Central Limit Theorem (CLT)** (i.i.d. $Y_i$ with mean $\mu$ and finite variance $\sigma^2$)  
  $$
  \frac{\sum_{i=1}^n Y_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0,1).
  $$

- **Variance & χ²** (normality critical)  
  $$
  \chi_0^2=\frac{(n-1)s^2}{\sigma_0^2}\sim \chi^2_{n-1}\quad \text{under }H_0:\sigma^2=\sigma_0^2.
  $$

- **t-distribution** (mean with unknown $\sigma$)  
  $$
  T=\frac{\bar{Y}-\mu_0}{S/\sqrt{n}}\sim t_{n-1}\quad(\text{normal data, under }H_0:\mu=\mu_0).
  $$

- **F-distribution** (ratio of two independent χ²/df)  
  $$
  F=\frac{(X/u)}{(Y/v)}\sim F_{u,v},\quad X\sim \chi^2_u,\ Y\sim \chi^2_v,\ X\perp Y.
  $$
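
The sampling distributions above can be checked by simulation. This is a minimal sketch, assuming normal data with hypothetical values $\mu=50$, $\sigma=3$, $n=8$: if the theory holds, about 5% of simulated statistics should fall beyond the theoretical 5% cut-offs.

```python
import numpy as np
from scipy import stats as st

rng = np.random.default_rng(1)               # arbitrary seed
mu, sigma, n, reps = 50.0, 3.0, 8, 40_000    # hypothetical population and sizes

y = rng.normal(mu, sigma, size=(reps, n))
ybar = y.mean(axis=1)
s = y.std(axis=1, ddof=1)

t_stats = (ybar - mu) / (s / np.sqrt(n))     # should follow t with n-1 df
chi2_stats = (n - 1) * s**2 / sigma**2       # should follow chi-square with n-1 df

t_crit = st.t.ppf(0.975, df=n - 1)           # two-sided 5% cut-off
chi2_crit = st.chi2.ppf(0.95, df=n - 1)      # upper 5% cut-off

t_tail = np.mean(np.abs(t_stats) > t_crit)
chi2_tail = np.mean(chi2_stats > chi2_crit)
print(f"t tail fraction:    {t_tail:.4f}")
print(f"chi2 tail fraction: {chi2_tail:.4f}")
```

Swapping `rng.normal` for a skewed generator is a simple way to see how much more the $\chi^2$ result depends on normality than the $t$ result.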

---

## 3) Confidence Intervals (CIs) & Hypothesis Tests

**General CI template:**  
$$
\text{Estimator} \ \pm\ (\text{critical value})\times(\text{SE}).
$$

**Hypothesis testing flow:**
$$
\begin{aligned}
&\text{1) State } H_0 \text{ and } H_1 \\
&\text{2) Pick } \alpha \\
&\text{3) Compute statistic } (Z, t, F, \chi^2) \\
&\text{4) } P\text{-value } = P(\text{at least as extreme} \mid H_0) \\
&\text{5) Reject } H_0 \text{ if } P\text{-value} \le \alpha.
\end{aligned}
$$


**Interpretation:** The P-value is computed *assuming $H_0$ is true*: it is the probability of data at least as extreme as what was observed, not the probability that $H_0$ itself is true.
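
The CI template and the testing flow are two views of the same calculation. The sketch below uses made-up measurements and a hypothetical null value `mu0`; it shows that a two-sided test at level $\alpha$ rejects exactly when the $100(1-\alpha)\%$ CI excludes `mu0`, and cross-checks the hand computation against `scipy.stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats as st

y = np.array([9.8, 10.4, 10.1, 9.6, 10.9, 10.3, 10.0, 10.6])  # hypothetical data
mu0, alpha = 10.0, 0.05                                        # hypothetical H0 value and level

n = y.size
ybar, s = y.mean(), y.std(ddof=1)
se = s / np.sqrt(n)

# Test: statistic and two-sided P-value
t0 = (ybar - mu0) / se
p_two = 2 * st.t.sf(abs(t0), df=n - 1)

# CI: estimator +/- critical value * SE
tcrit = st.t.ppf(1 - alpha / 2, df=n - 1)
ci = (ybar - tcrit * se, ybar + tcrit * se)

reject = bool(p_two <= alpha)
mu0_outside_ci = not (ci[0] <= mu0 <= ci[1])
check = st.ttest_1samp(y, popmean=mu0)       # library cross-check

print(f"t0={t0:.3f}, p={p_two:.3f}, CI=({ci[0]:.3f}, {ci[1]:.3f})")
print("reject H0:", reject, "| mu0 outside CI:", mu0_outside_ci)
```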

---

## 4) Two-Sample Comparisons (Means)

**Setup:** Two **independent** samples of sizes $n_1,n_2$ with means $\bar{y}_1,\bar{y}_2$ and variances $s_1^2,s_2^2$.

### 4.1 Equal variances assumed (pooled t)
- Pooled variance:
  $$
  s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}.
  $$
- Test statistic for $H_0:\mu_1=\mu_2$ (two-sided alternative $H_1:\mu_1\neq\mu_2$):
  $$
  t_0=\frac{\bar{y}_1-\bar{y}_2}{s_p\sqrt{1/n_1+1/n_2}}\sim t_{n_1+n_2-2}\quad\text{under }H_0.
  $$
- CI for $\mu_1-\mu_2$:
  $$
  (\bar{y}_1-\bar{y}_2)\pm t_{\alpha/2,\ n_1+n_2-2}\ s_p\sqrt{\tfrac1{n_1}+\tfrac1{n_2}}.
  $$

### 4.2 Unequal variances (Welch t)
- Statistic:
  $$
  t_0=\frac{\bar{y}_1-\bar{y}_2}{\sqrt{s_1^2/n_1+s_2^2/n_2}}.
  $$
- df (Welch–Satterthwaite):
  $$
  \mathrm{df}\approx \frac{(s_1^2/n_1+s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1-1}+\frac{(s_2^2/n_2)^2}{n_2-1}}.
  $$
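
To see how much the choice between pooled and Welch can matter, the sketch below runs both on hypothetical summary statistics with unequal spreads and unequal sample sizes. It uses `scipy.stats.ttest_ind_from_stats` and also recomputes the Welch statistic and Welch–Satterthwaite df by hand from the formulas above.

```python
from math import sqrt
from scipy import stats as st

# Hypothetical summary statistics (unequal spreads, unequal n)
y1bar, s1, n1 = 20.0, 4.0, 8
y2bar, s2, n2 = 17.0, 1.5, 20

pooled = st.ttest_ind_from_stats(y1bar, s1, n1, y2bar, s2, n2, equal_var=True)
welch = st.ttest_ind_from_stats(y1bar, s1, n1, y2bar, s2, n2, equal_var=False)

# Welch by hand
v1, v2 = s1**2 / n1, s2**2 / n2
t_welch = (y1bar - y2bar) / sqrt(v1 + v2)
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
p_welch = 2 * st.t.sf(abs(t_welch), df_welch)

print(f"pooled: t={pooled.statistic:.3f}, p={pooled.pvalue:.4f}, df={n1 + n2 - 2}")
print(f"Welch:  t={t_welch:.3f}, p={p_welch:.4f}, df={df_welch:.2f}")
```

In this made-up case the smaller group has the larger variance, which is the situation where the pooled test tends to overstate significance.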

### 4.3 Known variances (Z test)
- Statistic:
  $$
  Z_0=\frac{\bar{y}_1-\bar{y}_2}{\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}}\sim \mathcal{N}(0,1)\ \text{under }H_0.
  $$
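
The Z test has no helper in the recipes section, so here is a minimal sketch. The function name `two_sample_z` and the input numbers are hypothetical; the population SDs are treated as known, which is rarely true in practice.

```python
from math import sqrt
from scipy import stats as st

def two_sample_z(y1bar, sigma1, n1, y2bar, sigma2, n2, alpha=0.05):
    """Two-sided Z test and CI for mu1 - mu2 with known population SDs."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = y1bar - y2bar
    z0 = diff / se
    p = 2 * st.norm.sf(abs(z0))
    zcrit = st.norm.ppf(1 - alpha / 2)
    return z0, p, (diff - zcrit * se, diff + zcrit * se)

# Hypothetical inputs
z0, p, ci = two_sample_z(12.5, 2.0, 25, 11.4, 2.5, 30)
print(f"z0={z0:.3f}, p={p:.4f}, CI=({ci[0]:.3f}, {ci[1]:.3f})")
```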

---

## 5) Comparing Variances (Normality is important)

### 5.1 One variance
- Test:
  $$
  \chi_0^2=\frac{(n-1)s^2}{\sigma_0^2}\sim \chi^2_{n-1}.
  $$

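- CI (here $\chi^2_{p,\,n-1}$ is the quantile with **lower-tail** probability $p$, unlike the upper-tail convention used for $t_{\alpha/2}$ in these notes):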
  $$
  \left[\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}},\ \frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}} \right].
  $$
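
The one-variance interval is easy to compute with `scipy.stats.chi2.ppf`, which returns lower-tail quantiles. The helper name `var_ci` is hypothetical; the example input ($s^2=0.100$, $n=10$) is the first group of the Portland cement example in §9.

```python
from scipy import stats as st

def var_ci(s2, n, alpha=0.05):
    """Two-sided CI for sigma^2 from a sample variance s2 (assumes normal data)."""
    df = n - 1
    lo = df * s2 / st.chi2.ppf(1 - alpha / 2, df)  # divide by the larger quantile
    hi = df * s2 / st.chi2.ppf(alpha / 2, df)      # divide by the smaller quantile
    return lo, hi

lo, hi = var_ci(s2=0.100, n=10)
print(f"95% CI for sigma^2: ({lo:.4f}, {hi:.4f})")
```

The interval is not symmetric around $s^2$ because the $\chi^2$ distribution is skewed.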

### 5.2 Two variances (F test)
- Statistic:
  $$
  F_0=\frac{s_1^2}{s_2^2}\sim F_{n_1-1,\ n_2-1}\quad (\text{if data normal \& independent}).
  $$
- Two-sided CI for $\sigma_1^2/\sigma_2^2$ (with $F_{p,\,u,\,v}$ the quantile with **lower-tail** probability $p$, matching `st.f.ppf`):
  $$
  \left[\frac{s_1^2/s_2^2}{F_{1-\alpha/2,\,n_1-1,\,n_2-1}},\ \frac{s_1^2/s_2^2}{F_{\alpha/2,\,n_1-1,\,n_2-1}}\right].
  $$

> **Assumption note:** Tests/intervals on **variances** are **very sensitive** to non-normality. Check QQ plots.

---

## 6) Paired Designs (Blocking Principle)

**Idea:** Measure both treatments on the **same** (or matched) unit → analyze **differences** $d_j=y_{1j}-y_{2j}$.

- Paired t:
  $$
  \bar{d}=\frac{1}{n}\sum_j d_j,\quad s_d^2=\frac{1}{n-1}\sum_j (d_j-\bar{d})^2,\quad
  t_0=\frac{\bar{d}}{s_d/\sqrt{n}}\sim t_{n-1}.
  $$
- CI:
  $$
  \bar{d}\pm t_{\alpha/2,n-1}\frac{s_d}{\sqrt{n}}.
  $$

**Why it helps:** If within-pair outcomes are positively correlated, **variance of $d$ shrinks**, giving **higher power** (narrower CI) for the same $n$.
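
The variance reduction follows from $\operatorname{Var}(d)=\operatorname{Var}(Y_1)+\operatorname{Var}(Y_2)-2\operatorname{Cov}(Y_1,Y_2)$. The sketch below checks this identity on the paired hand-crème data from the Cosmetics problem in §10 and compares the paired analysis with a (deliberately wrong for this design) independent-samples analysis of the same numbers.

```python
import numpy as np
from scipy import stats as st

# Paired data from the Cosmetics problem (one hand per formula)
new = np.array([21.7, 22.4, 22.7, 20.8, 18.2, 16.5, 18.8, 26.2])
old = np.array([20.3, 20.8, 23.0, 18.8, 18.6, 14.7, 18.0, 25.1])
d = new - old

# Var(d) = Var(new) + Var(old) - 2 Cov(new, old)
var_d = d.var(ddof=1)
var_identity = new.var(ddof=1) + old.var(ddof=1) - 2 * np.cov(new, old)[0, 1]
r = np.corrcoef(new, old)[0, 1]

paired = st.ttest_rel(new, old)                  # correct analysis for this design
indep = st.ttest_ind(new, old, equal_var=True)   # ignores the pairing

print(f"within-pair correlation r = {r:.3f}")
print(f"Var(d) = {var_d:.3f} (identity gives {var_identity:.3f})")
print(f"paired:      t={paired.statistic:.3f}, p={paired.pvalue:.4f}")
print(f"independent: t={indep.statistic:.3f}, p={indep.pvalue:.4f}")
```

The mean difference is the same in both analyses; only the standard error changes, because the subject-to-subject variation cancels in the differences.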

---

## 7) Power, Effect Size, and Sample Size

- **Standardized effect** (two-sample):
  $$
  \delta=\frac{|\mu_1-\mu_2|}{\sigma}.
  $$
- **Power** (probability to reject $H_0$ when a specific alternative is true) increases with larger $n$, larger $\delta$, and larger $\alpha$ (but increasing $\alpha$ increases Type I error).

**Approximate sample size per group (balanced design, two-sided test at level $\alpha$, target power $1-\beta$, normal approximation to the pooled t):**
$$
n\ \approx\ 2\left(\frac{z_{1-\alpha/2}+z_{1-\beta}}{\Delta/\sigma}\right)^2,
$$
where $\Delta$ is the **minimum meaningful difference** you want to detect, and $\sigma$ a planning SD.
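
As a planning sketch, the formula can be evaluated directly. The function name `n_per_group` and the inputs ($\Delta=\sigma$, 80% power) are hypothetical choices for illustration.

```python
from math import ceil
from scipy import stats as st

def n_per_group(Delta, sigma, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided two-sample test."""
    z_a = st.norm.ppf(1 - alpha / 2)
    z_b = st.norm.ppf(power)
    return 2 * ((z_a + z_b) / (Delta / sigma)) ** 2

n_raw = n_per_group(Delta=1.0, sigma=1.0)   # detect a one-SD difference
n_needed = ceil(n_raw)                      # always round up
print(f"n per group: {n_raw:.2f} -> {n_needed}")
```

Because this uses $z$ rather than $t$ quantiles, it tends to slightly underestimate $n$ for small samples; checking the result with the noncentral-t power function in §8 is a reasonable follow-up.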

---

## 8) Python “Recipes” (Drop-in Code)

> You can paste these into a notebook. They rely on `scipy` and `numpy`. Replace numbers with your data.

```python
import numpy as np
from math import sqrt
from scipy import stats as st

# ---------- Helpers ----------
def pooled_sd(s1, n1, s2, n2):
    sp2 = ((n1-1)*s1**2 + (n2-1)*s2**2) / (n1+n2-2)
    return sqrt(sp2)

def two_sample_t_equal(y1bar, s1, n1, y2bar, s2, n2, alternative="two-sided"):
    sp = pooled_sd(s1,n1,s2,n2)
    df = n1+n2-2
    t0 = (y1bar - y2bar) / (sp*sqrt(1/n1 + 1/n2))
    if alternative == "two-sided":
        p = 2*(1 - st.t.cdf(abs(t0), df))
    elif alternative == "greater":
        p = 1 - st.t.cdf(t0, df)
    else:
        p = st.t.cdf(t0, df)
    return t0, df, p

def two_sample_t_welch(y1bar, s1, n1, y2bar, s2, n2, alternative="two-sided"):
    se = sqrt(s1*s1/n1 + s2*s2/n2)
    t0 = (y1bar - y2bar) / se
    df = (s1*s1/n1 + s2*s2/n2)**2 / ((s1*s1/n1)**2/(n1-1) + (s2*s2/n2)**2/(n2-1))
    if alternative == "two-sided":
        p = 2*(1 - st.t.cdf(abs(t0), df))
    elif alternative == "greater":
        p = 1 - st.t.cdf(t0, df)
    else:
        p = st.t.cdf(t0, df)
    return t0, df, p

def paired_t(d, alternative="two-sided"):
    d = np.asarray(d)
    n = d.size
    dbar = d.mean()
    sd = d.std(ddof=1)
    t0 = dbar / (sd/sqrt(n))
    df = n-1
    if alternative == "two-sided":
        p = 2*(1 - st.t.cdf(abs(t0), df))
    elif alternative == "greater":
        p = 1 - st.t.cdf(t0, df)
    else:
        p = st.t.cdf(t0, df)
    return dbar, sd, t0, df, p

def f_test_variances(s1, n1, s2, n2, alternative="two-sided"):
    F = (s1**2)/(s2**2)
    df1, df2 = n1-1, n2-1
    if alternative == "two-sided":
        p = 2*min(1-st.f.cdf(F, df1, df2), st.f.cdf(F, df1, df2))
    elif alternative == "greater":  # H1: sigma1^2 > sigma2^2
        p = 1 - st.f.cdf(F, df1, df2)
    else:  # H1: sigma1^2 < sigma2^2
        p = st.f.cdf(F, df1, df2)
    return F, df1, df2, p

def ci_var_ratio(s1, n1, s2, n2, alpha=0.05):
    F_low  = st.f.ppf(alpha/2, n1-1, n2-1)
    F_high = st.f.ppf(1-alpha/2, n1-1, n2-1)
    ratio = (s1**2)/(s2**2)
    return ratio/F_high, ratio/F_low

def ci_diff_means_equal(y1bar, s1, n1, y2bar, s2, n2, alpha=0.05):
    sp = pooled_sd(s1, n1, s2, n2)
    df = n1+n2-2
    me = st.t.ppf(1-alpha/2, df) * sp * sqrt(1/n1 + 1/n2)
    diff = y1bar - y2bar
    return diff-me, diff+me, df

def power_two_sample_equal(delta, sigma, n1, n2, alpha=0.05):
    # Two-sided power using noncentral t with equal variances
    df = n1+n2-2
    tcrit = st.t.ppf(1-alpha/2, df)
    ncp = delta / (sigma*sqrt(1/n1 + 1/n2))
    # Power = P(|T|>tcrit) where T ~ nct(df, ncp)
    return (1 - st.nct.cdf(tcrit, df, ncp)) + st.nct.cdf(-tcrit, df, ncp)
```

---

# 9) Worked Example (Portland Cement, equal-variance t)

Two independent samples $(n_1=n_2=10)$ with  
$\bar{y}_1=16.76,\ \bar{y}_2=17.04,\ s_1^2=0.100,\ s_2^2=0.061$.

**Pooled variance and SD**
$$
s_p^2=\frac{9(0.100)+9(0.061)}{18}=0.081,
\qquad
s_p=\sqrt{0.081}=0.284.
$$

**Test statistic**
$$
t_0=\frac{16.76-17.04}{\,0.284\,\sqrt{1/10+1/10}\,}=-2.20,
\qquad
\mathrm{df}=18.
$$

**95% CI**
$$
(16.76-17.04)\ \pm\ t_{0.025,\,18}\cdot 0.284\sqrt{1/10+1/10}
\;=\;-0.28\pm 0.27\;\Rightarrow\;[-0.55,\,-0.01].
$$
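
The hand calculation can be reproduced from the summary statistics alone. This sketch uses `scipy.stats.ttest_ind_from_stats`; because it carries full precision instead of the rounded $s_p=0.284$, the statistic comes out near $-2.21$ rather than $-2.20$, while the rounded CI is unchanged.

```python
from math import sqrt
from scipy import stats as st

n1 = n2 = 10
y1bar, y2bar = 16.76, 17.04
s1, s2 = sqrt(0.100), sqrt(0.061)   # SDs from the given variances

res = st.ttest_ind_from_stats(y1bar, s1, n1, y2bar, s2, n2, equal_var=True)

# 95% CI for mu1 - mu2 using the pooled SD
df = n1 + n2 - 2
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
se = sp * sqrt(1 / n1 + 1 / n2)
tcrit = st.t.ppf(0.975, df)
diff = y1bar - y2bar
ci = (diff - tcrit * se, diff + tcrit * se)

print(f"t0={res.statistic:.3f}, p={res.pvalue:.4f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```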

---

# 10) Fully Worked Multi-Choice Problems (with answers & Python)

## Problem I — Vaccines

Given: $n_1=10,\ n_2=16,\ \bar{y}_1=11.8,\ \bar{y}_2=13.9,\ s_1=3.6,\ s_2=2.1$.

**(I.1) Equality of variances (two-sided)**  
Use $F_0=s_1^2/s_2^2$ with df $(9,15)$.
- Ratio: $s_1^2/s_2^2=12.96/4.41=2.94$.
- At $\alpha=0.05$ two-sided, the **upper** critical value is $F_{0.975;\,9,15}\approx 3.12$.
- **Answer:** 3.12 (Option 5) is a critical value.

**(I.2) 95% CI for $\sigma_1^2/\sigma_2^2$**
$$
\left[
\frac{2.94}{F_{0.975;\,9,15}},
\ \frac{2.94}{F_{0.025;\,9,15}}
\right]\ \approx\ [0.94,\ 11.08].
$$
**Answer:** $[0.94,\ 11.08]$ (Option 3).

**(I.3) Means (two-sided, equal variances assumed)**  
Pooled $s_p\approx 2.760,\ \mathrm{df}=24$, $t_0\approx -1.888\Rightarrow p\approx 0.071$.  
**Smallest $\alpha$ to reject:** 10% (Option 4).

**(I.4) 95% CI for $\mu_1-\mu_2$ (pooled)**
$$
[-4.40,\ 0.20]\ \text{(to 2 decimals).}
$$
**Answer:** $[-4.40, 0.20]$ (Option 1).

**Reproducible Python (assumes helper functions defined earlier)**
~~~python
y1bar, y2bar, s1, s2, n1, n2 = 11.8, 13.9, 3.6, 2.1, 10, 16

F, df1, df2, pF_two = f_test_variances(s1, n1, s2, n2, alternative="two-sided")
Fcrit_up = st.f.ppf(0.975, df1, df2)   # ~ 3.12

ci_ratio = ci_var_ratio(s1, n1, s2, n2, alpha=0.05)  # ~ (0.94, 11.08)

t0, df, p = two_sample_t_equal(y1bar, s1, n1, y2bar, s2, n2)
ci = ci_diff_means_equal(y1bar, s1, n1, y2bar, s2, n2)  # ends ~ (-4.40, 0.20)
~~~

---

## Problem I — Sporting Goods (Soles)

Given: $n_1=n_2=10,\ \bar{y}_1=11.9,\ \bar{y}_2=17.8,\ s_1=6.2,\ s_2=3.2$.

**(I.1) Equality of variances (two-sided)**  
$F_0=6.2^2/3.2^2\approx 3.754$, df $(9,9)$ → two-sided $p\approx 0.062$ → reject at **10%** but not **5%**.  
**Answer:** 10% (Option 4).

**(I.2) Means equal (two-sided, pooled t)**  
$t_0\approx -2.674,\ \mathrm{df}=18\Rightarrow p\approx 0.0155$ (**significant**).

**(I.3) Power for $\mu_1=\mu_2+8$ at $\alpha=5\%$, true $\sigma^2=16$ (both), $n_1=n_2=10$**  
Noncentral t with $\text{ncp}=\dfrac{8}{4\sqrt{1/10+1/10}}=4.472$; power $\approx \mathbf{98.8\%}$.  
**Answer category:** $P_{\text{detect}}\ge 75\%$ (Option 1).

**(I.4) Paired scheme (each subject wears both soles)**  
Statements 1–3 are **false** (pairing typically **increases** power; decreasing $\alpha$ to 1% **reduces** power).  
**Answer:** None of the above (Option 5).

**Reproducible Python (assumes helper functions defined earlier)**
~~~python
y1bar, y2bar, s1, s2, n1, n2 = 11.9, 17.8, 6.2, 3.2, 10, 10

F, df1, df2, pF_two = f_test_variances(s1, n1, s2, n2)  # ~ p=0.062
t0, df, p = two_sample_t_equal(y1bar, s1, n1, y2bar, s2, n2)  # p ~ 0.0155

power = power_two_sample_equal(delta=8, sigma=4, n1=10, n2=10, alpha=0.05)  # ~ 0.988
~~~

---

## Problem I — Cosmetics (Hand Crème)

Independent groups: $n_{\text{New}}=n_{\text{Old}}=8,\ \bar{y}_{\text{Old}}=19.3,\ \bar{y}_{\text{New}}=22.1,\ s_{\text{Old}}=2.4,\ s_{\text{New}}=4.2$.

**(I.1)** $H_0:\sigma_{\text{New}}^2=\sigma_{\text{Old}}^2$ vs $H_1:\sigma_{\text{New}}^2>\sigma_{\text{Old}}^2$  
$F_0=(4.2^2)/(2.4^2)=3.0625$, df $(7,7)$ → one-sided $p\approx 0.0815$.  
**Minimum $\alpha$ to reject:** 10% (Option 4).

**(I.2)** Means: $H_0:\mu_{\text{New}}=\mu_{\text{Old}}$ vs $H_1:\mu_{\text{New}}>\mu_{\text{Old}}$ (pooled t)  
$t_0\approx 1.64,\ \mathrm{df}=14\Rightarrow p_{\text{one-sided}}\approx 0.062$.

**Switch to paired design (one hand each formula)**  
Differences (New–Old): $[1.4,\,1.6,\,-0.3,\,2.0,\,-0.4,\,1.8,\,0.8,\,1.1]$.

**(I.3) Which statement is correct?**  
$\mathrm{df}=n-1=7$ (not 6); pairing doesn’t **guarantee** significance; fewer subjects ≠ impossible.  
**Answer:** None of the above (Option 4).

**(I.4) Two-sided paired test**  
$\bar{d}=1.00,\ s_d\approx 0.915,\ t_0\approx 3.09,\ \mathrm{df}=7\Rightarrow p\approx 0.0175$ (**significant**).

**Reproducible Python (assumes helper functions defined earlier)**
~~~python
# Independent-phase answers
s_old, s_new, n_old, n_new = 2.4, 4.2, 8, 8
F, df1, df2, p_one = f_test_variances(s_new, n_new, s_old, n_old, alternative="greater")  # ~ 0.0815

y_old, y_new = 19.3, 22.1
t0, df, p_one = two_sample_t_equal(y_new, s_new, n_new, y_old, s_old, n_old, alternative="greater")  # ~ 0.062

# Paired-phase answers
new = np.array([21.7,22.4,22.7,20.8,18.2,16.5,18.8,26.2])
old = np.array([20.3,20.8,23.0,18.8,18.6,14.7,18.0,25.1])
dbar, sd, t0, df, p_two = paired_t(new-old)  # t ~ 3.09, df=7, p ~ 0.0175
~~~

---

# 11) Factorial ANOVA Quick Example (Your Problem II)

Design: $A\times B\times C$ with levels $a=2,\ b=4,\ c=5$, replicated $r=2$ times → $N=80$.  
Given sums of squares (SS):  
$\text{SSA}=24,\ \text{SSB}=14,\ \text{SSC}=42,\ \text{SS}_{AB}=6,\ \text{SS}_{AC}=16,\ \text{SS}_{BC}=12,\ \text{SS}_{ABC}=16,\ \text{SS}_{\text{Total}}=165$.

Assume the **model includes only main effects and two-factor interactions** (so the three-way SS is pooled into error with within-cell replication error).

**Degrees of freedom**
$$
\begin{aligned}
&\mathrm{df}_A=a-1=1,\quad \mathrm{df}_B=b-1=3,\quad \mathrm{df}_C=c-1=4,\\
&\mathrm{df}_{AB}=(a-1)(b-1)=3,\quad \mathrm{df}_{AC}=4,\quad \mathrm{df}_{BC}=12,\\
&\mathrm{df}_{ABC}=(a-1)(b-1)(c-1)=12,\\
&\mathrm{df}_{\text{pure err}}=abc(r-1)=2\cdot4\cdot5\cdot1=40,\\
&\mathrm{df}_{\text{error (pooled)}}=40+12=52,\\
&\mathrm{df}_{\text{total}}=N-1=79.
\end{aligned}
$$

**Error SS (pooled) and MSE**
$$
\mathrm{SSE}=165-(24+14+42+6+16+12)=51,\qquad
\mathrm{MSE}=\frac{51}{52}\approx 0.9808.
$$

**Mean squares & F**
$$
\begin{aligned}
&\text{MSA}=24/1=24 \Rightarrow F_A=\frac{24}{0.9808}\approx 24.47\ (p\approx 8.3\times 10^{-6})\\
&\text{MSB}=14/3 \Rightarrow F_B\approx 4.76\ (p\approx 0.0053)\\
&\text{MSC}=42/4 \Rightarrow F_C\approx 10.71\ (p\approx 2.1\times 10^{-6})\\
&\text{MS}_{AB}=6/3 \Rightarrow F_{AB}\approx 2.04\ (p\approx 0.120)\\
&\text{MS}_{AC}=16/4 \Rightarrow F_{AC}\approx 4.08\ (p\approx 0.0060)\\
&\text{MS}_{BC}=12/12 \Rightarrow F_{BC}\approx 1.02\ (p\approx 0.445).
\end{aligned}
$$

**Interpretation (at $\alpha=0.05$):** A, B, C, and AC are **significant**; AB and BC are **not**.

**Python to verify**
~~~python
from scipy import stats as st
SSA, SSB, SSC, SSAB, SSAC, SSBC, SSABC, SST = 24, 14, 42, 6, 16, 12, 16, 165
a,b,c,r = 2,4,5,2
dfA, dfB, dfC = a-1, b-1, c-1
dfAB, dfAC, dfBC = (a-1)*(b-1), (a-1)*(c-1), (b-1)*(c-1)
dfABC = (a-1)*(b-1)*(c-1)
df_total = a*b*c*r - 1
df_error = df_total - (dfA+dfB+dfC+dfAB+dfAC+dfBC)
SSE = SST - (SSA+SSB+SSC+SSAB+SSAC+SSBC)
MSE = SSE/df_error

def F_p(F, dfn, dfd): return 1-st.f.cdf(F, dfn, dfd)

effects = {
 "A": (SSA/dfA, dfA),
 "B": (SSB/dfB, dfB),
 "C": (SSC/dfC, dfC),
 "AB": (SSAB/dfAB, dfAB),
 "AC": (SSAC/dfAC, dfAC),
 "BC": (SSBC/dfBC, dfBC),
}
for k, (MS, dfn) in effects.items():
    F = MS/MSE
    print(k, F, F_p(F, dfn, df_error))
~~~

---

# 12) Assumptions Checklist & Diagnostics

- **Independence:** design & data collection (randomization, no hidden pairing).  
- **Normality (for $t/\chi^2/F$):** QQ plot on residuals (or within groups). With $n\gtrsim 30$ per group, t-methods for means are fairly robust to moderate non-normality; variance tests ($\chi^2$, F) are not.  
- **Equal variances:** boxplots, residual‐vs‐fitted, F-test (sensitive), or Levene’s test (more robust).  
- **No severe outliers:** check plots; consider robust methods/transformations if needed.
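
Formal checks can complement the plots. The sketch below runs the Shapiro-Wilk normality test and Levene's test (with `center="median"`, the Brown-Forsythe variant) on simulated groups; the data and group sizes are hypothetical, and with small samples these tests have little power, so they should not replace looking at the plots.

```python
import numpy as np
from scipy import stats as st

rng = np.random.default_rng(2)            # arbitrary seed
g1 = rng.normal(10.0, 1.0, size=25)       # hypothetical group 1
g2 = rng.normal(10.5, 1.0, size=25)       # hypothetical group 2

sw1 = st.shapiro(g1)                      # H0: group 1 is normal
sw2 = st.shapiro(g2)                      # H0: group 2 is normal
lev = st.levene(g1, g2, center="median")  # H0: equal variances

print(f"Shapiro-Wilk g1: W={sw1.statistic:.3f}, p={sw1.pvalue:.3f}")
print(f"Shapiro-Wilk g2: W={sw2.statistic:.3f}, p={sw2.pvalue:.3f}")
print(f"Levene:          W={lev.statistic:.3f}, p={lev.pvalue:.3f}")
```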

---

# 13) Quick “Cheat Sheets”

**Choosing a test (two means)**  
- Independent? **Yes** → Equal variances plausible?  
  – Yes → **pooled** two-sample t  
  – No → **Welch** t  
- Paired/blocked? **Yes** → **Paired** t

**Two-sided P-values**  
- $t$ or $Z$: $2\big[1-G(|\text{stat}|)\big]$, where $G$ is the CDF of the reference distribution ($t_{\mathrm{df}}$ or standard normal).  
- $F$ or $\chi^2$: use appropriate tail(s) per $H_1$.

**Margin of error (two-sample, equal $n$)**
$$
\text{ME}=t_{\alpha/2,\,2n-2}\;s_p\sqrt{\frac{2}{n}}.
$$

---

# 14) Extra: Discrete Distributions — Mean & Variance

For a discrete rv $X$ with pmf $p(x)$:
$$
E[X]=\sum_x x\,p(x),\qquad
\operatorname{Var}(X)=\sum_x (x-E[X])^2\,p(x).
$$

**Example (Python)**
~~~python
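import numpy as np

# Discrete pmf on {0,1,2} with p = {0.2, 0.5, 0.3}
x = np.array([0, 1, 2], float)
p = np.array([0.2, 0.5, 0.3], float)
assert np.isclose(p.sum(), 1.0)   # a pmf must sum to 1
mu = (x * p).sum()                # E[X] = 1.1
var = ((x - mu)**2 * p).sum()     # Var(X) = 0.49
mu, var                           # displayed when run in a notebook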
~~~

---

# 15) Extra: CIs & Hypothesis Tests (One-Sample Refresh)

**Mean, $\sigma$ unknown (normal or large $n$)**
$$
\bar{y}\ \pm\ t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},
\qquad
t_0=\frac{\bar{y}-\mu_0}{s/\sqrt{n}}\sim t_{n-1}.
$$

**Variance**
$$
\chi_0^2=\frac{(n-1)s^2}{\sigma_0^2}\sim \chi^2_{n-1},\qquad
\text{CI as in §5.1}.
$$

---

# 16) Sanity-Check Plots (Python)

~~~python
import matplotlib.pyplot as plt
import numpy as np

def quick_plots(sample1, sample2, labels=("Group 1","Group 2")):
    fig, ax = plt.subplots()
    ax.boxplot([sample1, sample2], showmeans=True)
    ax.set_xticklabels(labels)  # avoids boxplot's `labels` keyword, renamed `tick_labels` in newer Matplotlib
    ax.set_title("Boxplots with means")
    plt.show()

def qq_plot(sample, label="Sample"):
    import statsmodels.api as sm
    sm.qqplot(np.asarray(sample), line='s')
    plt.title(f"QQ plot - {label}")
    plt.show()
~~~

*Tip:* Don’t overthink colors/styles for study notes. Focus on **shape**, **spread**, and **outliers**.

---

# 17) Summary (What to remember)

- Use **paired** designs when units can be matched (block what you can, randomize what you cannot).  
- Plot first; check **normality/variance** assumptions before relying on $t/F/\chi^2$.  
- Prefer **Welch** when variances look unequal.  
- Report **effect size + CI** (interpretation!) alongside P-values.  
- For planning, connect **effect size**, **power**, and **sample size**.
