# E06: CI for a Mean with Unknown σ (t-interval)

This notebook solves Exercise 6: constructing a confidence interval for a population mean when the population standard deviation ($\sigma$) is unknown. This is the usual situation in practice, and it calls for the Student's t-distribution.

## Theoretical Background

When we need to estimate the population mean ($\mu$) without knowing the population standard deviation ($\sigma$), we use the sample standard deviation ($s$) as an estimate. This introduces additional uncertainty, which the Student's t-distribution is designed to account for. It resembles the Normal distribution but has heavier tails, so for the same confidence level the t-interval is wider than the corresponding z-interval. The difference shrinks as the degrees of freedom grow.
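To make the "heavier tails" point concrete, the short sketch below compares two-sided 95% critical values from the Normal and t-distributions. The degrees of freedom listed are arbitrary illustrative choices, and the names `z_crit` and `t_crits` are introduced here only for this comparison.

```python
from scipy import stats

# Two-sided 95% critical values: Normal vs. Student's t.
z_crit = stats.norm.ppf(0.975)

# Illustrative degrees of freedom (arbitrary choices, not from the exercise).
t_crits = {dof: stats.t.ppf(0.975, df=dof) for dof in (5, 10, 30, 59, 1000)}

print(f"z critical value: {z_crit:.4f}")
for dof, t_crit in t_crits.items():
    print(f"df = {dof:>4}: t critical value = {t_crit:.4f}")
```

Every t critical value exceeds the z value, and the gap narrows as the degrees of freedom increase.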

The appropriateness of the t-distribution rests on two conditions:
1.  The population standard deviation $\sigma$ is unknown.
2.  The sample comes from a normally distributed population, OR the sample size ($n$) is large enough (a common rule of thumb is $n \ge 30$) for the Central Limit Theorem to make the sampling distribution of the mean approximately normal. In both cases the observations are assumed to be an independent random sample.

The formula for the t-interval is:
$$ \text{CI} = \overline{x} \pm t_{\alpha/2, n-1} \cdot \frac{s}{\sqrt{n}} $$

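Here $\overline{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is the sample size, $\alpha = 1 - \text{confidence level}$, and $t_{\alpha/2, n-1}$ is the critical value from the t-distribution with $n-1$ degrees of freedom.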
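The formula can be wrapped in a small helper that works directly from summary statistics. This is a minimal sketch: the function name `t_interval` and its parameter names are hypothetical, and the numbers in the usage line are illustrative rather than taken from the exercise.

```python
import math
from scipy import stats

def t_interval(mean, sd, size, confidence=0.95):
    """Return (lower, upper) of a two-sided t-interval from summary statistics."""
    if size < 2:
        raise ValueError("At least two observations are needed to estimate s.")
    alpha = 1 - confidence
    # Critical value with n - 1 degrees of freedom
    t_crit = stats.t.ppf(1 - alpha / 2, df=size - 1)
    # Margin of error: t * s / sqrt(n)
    margin = t_crit * sd / math.sqrt(size)
    return mean - margin, mean + margin

# Illustrative numbers only (not from the exercise)
example_lower, example_upper = t_interval(mean=50.0, sd=10.0, size=25)
print(f"({example_lower:.2f}, {example_upper:.2f})")
```

Working from summary statistics mirrors how the exercise is posed, where only $n$, $\overline{x}$, and $s$ are given rather than the raw observations.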

## Step 1: Define Given Parameters

We begin by defining the parameters provided for the network round-trip time (RTT) data.

In [1]:
# --- Problem Parameters ---
n = 60          # Sample size (number of packets)
x_bar = 118     # Sample mean (in ms)
s = 32          # Sample standard deviation (in ms)
confidence_level = 0.95

## Step 2: Justify the Use of the t-Distribution

The problem asks why the t-distribution is appropriate here. There are two key reasons:

1.  **Unknown Population Standard Deviation ($\sigma$):** The problem provides the *sample* standard deviation ($s = 32$ ms), not the population standard deviation. We are using an estimate based on our data, which is the primary condition for using the t-distribution instead of the z-distribution.
2.  **Sufficiently Large Sample Size:** The sample size is $n=60$. Since $60 \ge 30$, the Central Limit Theorem suggests that the sampling distribution of the sample mean ($\overline{x}$) is approximately Normal, even though we don't know the distribution of the underlying round-trip times. The approximation is weaker for heavily skewed data, but with 60 observations the t-interval is generally considered reasonable. This satisfies the second condition for the valid application of the t-interval.
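The Central Limit Theorem argument can be illustrated with a quick simulation. The exercise does not give the raw RTT data, so the sketch below assumes a hypothetical right-skewed population (exponential with mean 118 ms) purely for illustration, and compares the skewness of individual observations with the skewness of means of samples of size 60.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# ASSUMPTION: a hypothetical right-skewed population (exponential, mean 118 ms).
# The real RTT distribution is unknown; this shape is for illustration only.
draws = rng.exponential(scale=118, size=(20_000, 60))

# One sample mean per simulated sample of 60 packets
sample_means = draws.mean(axis=1)

skew_raw = stats.skew(draws.ravel())      # skewness of individual observations
skew_means = stats.skew(sample_means)     # skewness of the sample means

print(f"Skewness of individual observations: {skew_raw:.2f}")
print(f"Skewness of sample means (n = 60):   {skew_means:.2f}")
```

Under this assumed population, the sample means are far less skewed than the individual observations, which is what the rule of thumb relies on.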

## Step 3: Determine Degrees of Freedom and Critical Value (t)

The shape of the t-distribution depends on the degrees of freedom (df), calculated as $n-1$. We then find the critical t-value for our 95% confidence level.

In [2]:
import scipy.stats as stats

# Calculate Degrees of Freedom (df)
df = n - 1

# Determine the critical t-value
alpha = 1 - confidence_level
t_critical = stats.t.ppf(1 - alpha / 2, df=df)

print(f"Degrees of Freedom (df): {df}")
print(f"Confidence Level: {confidence_level:.0%}")
print(f"Critical t-value (t_α/2, {df}): {t_critical:.4f}")

Degrees of Freedom (df): 59
Confidence Level: 95%
Critical t-value (t_α/2, 59): 2.0010


## Step 4: Calculate the Margin of Error and Confidence Interval

Using the formula, we can now compute the interval for the true mean RTT.

In [3]:
import numpy as np

# Calculate the Standard Error of the Mean (SEM)
sem = s / np.sqrt(n)

# Calculate the Margin of Error (E)
margin_of_error = t_critical * sem

# Calculate the Confidence Interval bounds
lower_bound = x_bar - margin_of_error
upper_bound = x_bar + margin_of_error

print("--- Calculation Results ---")
print(f"Standard Error of the Mean (SEM): {sem:.4f} ms")
print(f"Margin of Error (E): {margin_of_error:.4f} ms")
print(f"\n{confidence_level:.0%} Confidence Interval for the true mean RTT (μ):")
print(f"({lower_bound:.4f} ms, {upper_bound:.4f} ms)")

--- Calculation Results ---
Standard Error of the Mean (SEM): 4.1312 ms
Margin of Error (E): 8.2665 ms

95% Confidence Interval for the true mean RTT (μ):
(109.7335 ms, 126.2665 ms)
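As a cross-check, SciPy can compute the same interval in one call with `stats.t.interval`, using the sample mean as `loc` and the standard error as `scale`. The values are restated so the cell runs on its own; the `check_*` names are introduced here only for the comparison. The confidence level is passed positionally because the keyword for that argument has changed between SciPy versions.

```python
import numpy as np
from scipy import stats

# Values restated from Step 1 so this cell is self-contained
check_n, check_mean, check_s = 60, 118, 32

# loc = sample mean, scale = standard error of the mean
check_lower, check_upper = stats.t.interval(
    0.95,
    check_n - 1,
    loc=check_mean,
    scale=check_s / np.sqrt(check_n),
)

print(f"({check_lower:.4f} ms, {check_upper:.4f} ms)")
```

The bounds should agree with the manually computed interval above.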


## Step 5: Interpretation

**Computed 95% Confidence Interval: (109.7335 ms, 126.2665 ms)**

**Interpretation:**

We are 95% confident that the true mean round-trip time for network packets lies between approximately 109.7 ms and 126.3 ms. This interval gives a range of plausible values for the true mean, based on our sample of 60 packets. The 95% describes the method rather than this particular interval: if we repeated the sampling process many times, about 95% of the intervals constructed this way would capture the true population mean.
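The repeated-sampling interpretation can be checked empirically. The sketch below assumes a hypothetical Normal population with $\mu = 118$ ms and $\sigma = 32$ ms (values borrowed from the sample statistics for illustration, since the true parameters are unknown), draws many samples of 60, and counts how often the t-interval captures the assumed mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# ASSUMPTION: hypothetical "true" population parameters, for illustration only
true_mu, true_sigma = 118.0, 32.0
size, reps = 60, 10_000

# Each row is one simulated sample of 60 round-trip times
samples = rng.normal(true_mu, true_sigma, size=(reps, size))
means = samples.mean(axis=1)
sds = samples.std(axis=1, ddof=1)   # ddof=1 gives the sample standard deviation

t_crit = stats.t.ppf(0.975, df=size - 1)
margins = t_crit * sds / np.sqrt(size)

# Fraction of intervals that contain the assumed true mean
covered = (means - margins <= true_mu) & (true_mu <= means + margins)
coverage = covered.mean()

print(f"Empirical coverage over {reps} samples: {coverage:.3f}")
```

The empirical coverage should land close to 0.95, with small run-to-run variation if the seed is changed.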