# Content

[The Sampling Distribution of the Difference in Sample Means]()

[Two-Sample t-interval for the Difference Between Means]()

[Two-Sample t-test for the Difference Between Means]()

[Step-by-Step Example: Two-Sample t-test & Interval]()


### The Sampling Distribution of the Difference in Sample Means

#### Theory
Imagine we have two distinct populations. Population 1 has a true mean `μ₁` and standard deviation `σ₁`. Population 2 has a true mean `μ₂` and standard deviation `σ₂`.

We take a random sample of size `n₁` from Population 1 and calculate its mean `x̄₁`. We independently take another random sample of size `n₂` from Population 2 and find its mean `x̄₂`. Our statistic of interest is the **difference between these two sample means: `x̄₁ - x̄₂`**.

If we were to repeat this process many times, the distribution of all possible values of `x̄₁ - x̄₂` is called the **sampling distribution of the difference in sample means**. It has the following properties:

*   **Center (Mean):** The mean of this sampling distribution is the true difference between the population means.
    *   **μ_{x̄₁-x̄₂} = μ₁ - μ₂**
*   **Spread (Standard Deviation):** When we add or subtract two independent random variables, their variances **add**, so the true standard deviation of the difference is `√[ (σ₁²/n₁) + (σ₂²/n₂) ]`. Because `σ₁` and `σ₂` are usually unknown, we substitute the sample standard deviations `s₁` and `s₂`. The resulting estimate is called the **standard error of the difference**.
    *   **SE_{x̄₁-x̄₂} = √[ (s₁²/n₁) + (s₂²/n₂) ]**
*   **Shape (Conditions):** For the sampling distribution to be approximately Normal, a set of conditions must be met for **both** samples.
    1.  **Random:** Both samples must be drawn randomly from their respective populations.
    2.  **Independent (10% Rule):** Both `n₁` and `n₂` must be less than 10% of their respective population sizes.
    3.  **Independent Groups:** The two samples must be independent of each other. The selection of one sample cannot influence the selection of the other. This is the opposite of paired data.
    4.  **Normal/Large Sample:** Both samples must satisfy the Normal/Large Sample condition (either the population is stated to be Normal, or `n ≥ 30`, or a plot of the sample data for a small sample shows no strong skew or outliers).
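
The center and spread properties above can be checked with a small simulation. The sketch below is illustrative only: the population parameters (`mu1`, `sigma1`, `mu2`, `sigma2`), the sample sizes, and the number of repetitions are hypothetical values chosen for the demonstration, and both populations are assumed to be Normal.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded generator for reproducibility

# Hypothetical population parameters and sample sizes (assumptions for illustration)
mu1, sigma1, n1 = 70.0, 8.0, 40
mu2, sigma2, n2 = 65.0, 6.0, 50
n_repeats = 20_000

# Each row is one repetition of "draw a sample from each population"
samples1 = rng.normal(mu1, sigma1, size=(n_repeats, n1))
samples2 = rng.normal(mu2, sigma2, size=(n_repeats, n2))

# One difference in sample means per repetition
diffs = samples1.mean(axis=1) - samples2.mean(axis=1)

# Values predicted by the theory
theoretical_mean = mu1 - mu2
theoretical_sd = np.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print(f"Simulated mean of differences: {diffs.mean():.3f} (theory: {theoretical_mean:.3f})")
print(f"Simulated SD of differences:   {diffs.std(ddof=1):.3f} (theory: {theoretical_sd:.3f})")
```

With this many repetitions, the simulated mean and standard deviation should land very close to the theoretical values. A histogram of `diffs` would also look approximately Normal.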

---

### Two-Sample t-interval for the Difference Between Means

#### Theory
This interval provides a range of plausible values for the true difference between two population means, `μ₁ - μ₂`.

*   **Formula:** **(x̄₁ - x̄₂) ± t* * √[ (s₁²/n₁) + (s₂²/n₂) ]**
*   `(x̄₁ - x̄₂)` is our point estimate for the difference.
*   `t*` is the critical value from a t-distribution.
*   The square root term is the standard error.

#### Degrees of Freedom (df) for Two Samples
This is the trickiest part. The exact degrees of freedom are calculated with a complex formula (the Welch-Satterthwaite equation), which is what statistical software uses.

For calculations by hand, we use a simpler, more conservative approach:
**Conservative df = the smaller of (n₁ - 1) or (n₂ - 1)**
This approach gives a slightly wider (less precise) interval but is accepted as a valid method.
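
To see how the two choices of `df` compare, the sketch below computes the Welch-Satterthwaite degrees of freedom next to the conservative value. The helper names `welch_df` and `conservative_df` and the summary statistics are hypothetical, introduced only for this illustration.

```python
from scipy import stats


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def conservative_df(n1, n2):
    """Conservative hand-calculation rule: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Hypothetical summary statistics (assumptions for illustration)
s1, n1 = 6.0, 15
s2, n2 = 9.0, 25

df_w = welch_df(s1, n1, s2, n2)
df_c = conservative_df(n1, n2)

# Critical values for a 95% interval under each choice of df
t_w = stats.t.ppf(0.975, df=df_w)
t_c = stats.t.ppf(0.975, df=df_c)

print(f"Welch df: {df_w:.2f}, t* = {t_w:.3f}")
print(f"Conservative df: {df_c}, t* = {t_c:.3f}")
```

The Welch value always falls between the conservative `df` and `n₁ + n₂ - 2`, so the conservative critical value is never smaller than the Welch one. This is why the hand-calculated interval is never narrower.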

#### Interpretation
The most important value to look for in the final interval is **zero**.
*   If the interval **contains zero**, it means that a difference of zero is a plausible value. Therefore, we do not have convincing evidence of a difference between the two population means.
*   If the interval is **entirely positive or entirely negative** (it does not contain zero), we have strong evidence that a real difference exists between the two population means.
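
A simulation can make this interpretation concrete. The sketch below repeatedly draws two samples from hypothetical Normal populations whose true means differ by 3, builds a 95% interval with the conservative `df` each time, and counts how often the interval captures the true difference and how often it contains zero. All parameter values are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # seeded generator for reproducibility

# Hypothetical populations (assumptions for illustration)
mu1, sigma1, n1 = 50.0, 10.0, 12
mu2, sigma2, n2 = 47.0, 6.0, 20
true_diff = mu1 - mu2
n_repeats = 5_000

# Critical value using the conservative degrees of freedom
t_star = stats.t.ppf(0.975, df=min(n1 - 1, n2 - 1))

# Each row is one repetition of the two-sample study
x1 = rng.normal(mu1, sigma1, size=(n_repeats, n1))
x2 = rng.normal(mu2, sigma2, size=(n_repeats, n2))

diff = x1.mean(axis=1) - x2.mean(axis=1)
se = np.sqrt(x1.var(axis=1, ddof=1) / n1 + x2.var(axis=1, ddof=1) / n2)
lower = diff - t_star * se
upper = diff + t_star * se

captured = (lower <= true_diff) & (true_diff <= upper)
contains_zero = (lower <= 0) & (0 <= upper)

print(f"Share of intervals capturing the true difference: {captured.mean():.3f}")
print(f"Share of intervals containing zero: {contains_zero.mean():.3f}")
```

The first share should be close to or slightly above 0.95, reflecting the conservative `df`. The second share shows how often a study of this size would fail to rule out "no difference" even though a real difference exists.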

---

### Two-Sample t-test for the Difference Between Means

#### Theory
This test assesses the evidence for a claim that there is a difference between two population means.

*   **Hypotheses:** The null hypothesis is typically that there is no difference.
    *   **H₀: μ₁ - μ₂ = 0**  (or `μ₁ = μ₂`)
    *   **Hₐ: μ₁ - μ₂ > 0, < 0, or ≠ 0**
*   **Test Statistic (`t`):** The formula follows the standard pattern: (Statistic - Parameter) / (Standard Error).
    *   **t = [ (x̄₁ - x̄₂) - 0 ] / √[ (s₁²/n₁) + (s₂²/n₂) ]**
*   **P-value:** Found from the area in the tail(s) of a t-distribution, using either the complex software-calculated `df` or the conservative `smaller of (n₁-1, n₂-1)` `df`.
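
The test statistic and P-value can be computed directly from summary statistics. The function below is a minimal sketch using the conservative `df`; the name `two_sample_t_test`, its parameters, and the numbers in the usage example are hypothetical.

```python
import math

from scipy import stats


def two_sample_t_test(x_bar1, s1, n1, x_bar2, s2, n2, alternative="two-sided"):
    """Two-sample t-test of H0: mu1 - mu2 = 0 using the conservative df."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of the difference
    t = ((x_bar1 - x_bar2) - 0) / se          # (statistic - parameter) / SE
    df = min(n1 - 1, n2 - 1)                  # conservative degrees of freedom

    if alternative == "greater":      # Ha: mu1 - mu2 > 0, area in the upper tail
        p = stats.t.sf(t, df)
    elif alternative == "less":       # Ha: mu1 - mu2 < 0, area in the lower tail
        p = stats.t.cdf(t, df)
    elif alternative == "two-sided":  # Ha: mu1 - mu2 != 0, area in both tails
        p = 2 * stats.t.sf(abs(t), df)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return t, df, p


# Hypothetical summary statistics (assumptions for illustration)
t, df, p = two_sample_t_test(12.4, 3.1, 18, 10.9, 2.7, 22)
print(f"t = {t:.3f}, df = {df}, two-sided P-value = {p:.4f}")
```

Passing `alternative="greater"` or `alternative="less"` switches to the corresponding one-tailed area.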

---

### Step-by-Step Example: Two-Sample t-test & Interval

**Scenario:** A school district wants to know if there's a difference in the mean end-of-year math scores between two different teaching methods. A group of 40 students uses Method A, and an independent group of 50 students uses Method B. The results are below. First, perform a significance test (`α = 0.05`), then construct a 95% confidence interval.

*   **Method A:** `n₁ = 40`, `x̄₁ = 83.5`, `s₁ = 5.2`
*   **Method B:** `n₂ = 50`, `x̄₂ = 81.2`, `s₂ = 4.8`

#### **Significance Test**
**STATE:**
*   `μ₁` = true mean score for Method A. `μ₂` = true mean score for Method B.
*   **H₀: μ₁ - μ₂ = 0** (There is no difference in mean scores).
*   **Hₐ: μ₁ - μ₂ ≠ 0** (There is a difference in mean scores).
*   `α = 0.05`.

**PLAN:**
*   **Test:** Two-sample t-test for the difference between means.
*   **Conditions:**
    *   Random/Independent Groups: We assume students were randomly assigned to each method.
    *   10% Rule: 40 and 50 students are less than 10% of all potential students.
    *   Normal/Large Sample: `n₁=40` and `n₂=50` are both `≥ 30`. The CLT applies. Conditions are met.

**DO:**
1.  **t-statistic:**
    *   `t = (83.5 - 81.2) / √[ (5.2²/40) + (4.8²/50) ]`
    *   `t = 2.3 / √[ (27.04/40) + (23.04/50) ]`
    *   `t = 2.3 / √[ 0.676 + 0.4608 ] = 2.3 / √1.1368 ≈ 2.3 / 1.066`
    *   **t ≈ 2.157**
2.  **P-value:**
    *   Using the conservative approach, `df = smaller of (40-1, 50-1) = 39`.
    *   We need the two-tailed probability: `2 * P(t₃₉ ≥ 2.157)`.
    *   Using a calculator, `P(t₃₉ ≥ 2.157) ≈ 0.018`.
    *   **P-value ≈ 2 * 0.018 = 0.036**.

**CONCLUDE:**
*   **Decision:** Our P-value (0.036) is less than `α` (0.05). We **reject H₀**.
*   **Context:** There is convincing evidence of a difference between the true mean math scores of students using Method A and those using Method B.
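
As a cross-check on the hand calculation, SciPy's `ttest_ind_from_stats` runs the test from summary statistics alone. The sketch below uses `equal_var=False` for Welch's test, so its P-value is based on the software-calculated `df` rather than the conservative `df = 39`. The conservative P-value is recomputed alongside it for comparison.

```python
from scipy import stats

# Summary statistics from the scenario
n1, x_bar1, s1 = 40, 83.5, 5.2
n2, x_bar2, s2 = 50, 81.2, 4.8

# Welch's t-test from summary statistics (df from the Welch-Satterthwaite equation)
result = stats.ttest_ind_from_stats(
    mean1=x_bar1, std1=s1, nobs1=n1,
    mean2=x_bar2, std2=s2, nobs2=n2,
    equal_var=False,
)

# Two-tailed P-value for the same t-statistic under the conservative df
df_conservative = min(n1 - 1, n2 - 1)
p_conservative = 2 * stats.t.sf(abs(result.statistic), df=df_conservative)

print(f"t-statistic: {result.statistic:.3f}")
print(f"P-value (Welch df): {result.pvalue:.4f}")
print(f"P-value (conservative df = {df_conservative}): {p_conservative:.4f}")
```

The t-statistic matches the hand-calculated 2.157. The Welch P-value is smaller than the conservative one because the software `df` is larger, and both lead to the same decision at `α = 0.05`.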

#### **Confidence Interval (95%)**
**DO:**
1.  **Find `t*`:** Using `df=39` and 95% confidence, `t* ≈ 2.023`.
2.  **Calculate Interval:**
    *   `(x̄₁ - x̄₂) ± t* * SE`
    *   `2.3 ± 2.023 * (1.066)`
    *   `2.3 ± 2.157`
    *   Interval: **(0.143, 4.457)**

**CONCLUDE:**
*   We are 95% confident that the interval from 0.143 to 4.457 captures the true difference in mean scores (Method A - Method B). Since this interval is entirely positive and does not contain 0, it agrees with the conclusion of the significance test that there is a real difference, and it suggests that Method A yields a higher mean score.

---

### Python Code Illustration

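The `scipy.stats.ttest_ind` function is the primary tool for two-sample t-tests on raw data in Python. With `equal_var=False` it runs Welch's t-test and computes the Welch-Satterthwaite `df` automatically; the default, `equal_var=True`, runs a pooled-variance test instead. Because the function needs raw data rather than summary statistics, the code below simulates samples from Normal distributions with the example's means and standard deviations, so its t-statistic and P-value differ from the hand calculation. The confidence interval is computed from the original summary statistics and matches it.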

```python
import numpy as np
from scipy import stats

# --- Two-Sample t-test & Interval: Teaching Method Example ---
print("--- Two-Sample t-test & Interval Example ---")

# Summary statistics
n1, x_bar1, s1 = 40, 83.5, 5.2
n2, x_bar2, s2 = 50, 81.2, 4.8

# For a live test, we'd have the raw data. Here we simulate Normal samples
# centered on these statistics so we can call the scipy function. The simulated
# sample means and standard deviations only approximate the values above.
np.random.seed(0) # for reproducibility
data1 = stats.norm.rvs(loc=x_bar1, scale=s1, size=n1)
data2 = stats.norm.rvs(loc=x_bar2, scale=s2, size=n2)

# --- Perform the Significance Test ---
# We set equal_var=False to perform Welch's t-test, which does not
# assume equal population variances. It computes the Welch-Satterthwaite df automatically.
t_stat, p_value = stats.ttest_ind(data1, data2, equal_var=False, alternative='two-sided')

print("\n--- Significance Test Results ---")
print(f"t-statistic: {t_stat:.4f}")
# These values differ from the hand calculation (t ≈ 2.157) because the
# simulated samples do not reproduce the summary statistics exactly.
print(f"P-value: {p_value:.4f}\n")

alpha = 0.05
if p_value < alpha:
    print(f"Decision: Reject H₀, as P-value ({p_value:.4f}) < alpha ({alpha}).")
    print("Conclusion: There is evidence of a significant difference between the two teaching methods.")
else:
    print(f"Decision: Fail to reject H₀, as P-value ({p_value:.4f}) >= alpha ({alpha}).")
    print("Conclusion: There is not enough evidence to claim a difference between the methods.")

print("-" * 50)

# --- Construct the Confidence Interval Manually ---
print("\n--- Confidence Interval Calculation ---")
# Calculate the point estimate and standard error from the original summary stats
point_estimate = x_bar1 - x_bar2
se_diff = np.sqrt( (s1**2 / n1) + (s2**2 / n2) )

# Use the conservative degrees of freedom to find t*
df_conservative = min(n1 - 1, n2 - 1)
confidence = 0.95
t_star = stats.t.ppf((1 + confidence) / 2, df=df_conservative)

# Calculate margin of error and the interval
margin_of_error = t_star * se_diff
lower_bound = point_estimate - margin_of_error
upper_bound = point_estimate + margin_of_error

print(f"Point Estimate (x̄₁ - x̄₂): {point_estimate:.3f}")
print(f"Standard Error: {se_diff:.3f}")
print(f"Conservative df: {df_conservative}")
print(f"Critical t* for 95% confidence: {t_star:.3f}\n")
print(f"95% Confidence Interval: ({lower_bound:.3f}, {upper_bound:.3f})")
print("Conclusion: We are 95% confident that this interval captures the true difference in mean scores.")
```


```text
--- Two-Sample t-test & Interval Example ---

--- Significance Test Results ---
t-statistic: 4.6428
P-value: 0.0000

Decision: Reject H₀, as P-value (0.0000) < alpha (0.05).
Conclusion: There is evidence of a significant difference between the two teaching methods.
--------------------------------------------------

--- Confidence Interval Calculation ---
Point Estimate (x̄₁ - x̄₂): 2.300
Standard Error: 1.066
Conservative df: 39
Critical t* for 95% confidence: 2.023

95% Confidence Interval: (0.143, 4.457)
Conclusion: We are 95% confident that this interval captures the true difference in mean scores.
```
