# E08: CI for Population Standard Deviation (σ)

This notebook solves Exercise 8, which requires constructing a 95% confidence interval for the true population standard deviation ($\sigma$), given a sample standard deviation ($s$) and sample size ($n$).

## Theoretical Background

To estimate the population variance ($\sigma^2$) or standard deviation ($\sigma$), we use the Chi-squared ($\chi^2$) distribution.

### Key Distributional Assumption
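The primary assumption for this method is that the **sample is drawn from a normally distributed population**. Unlike intervals for the mean, this interval does not become robust as $n$ grows: for skewed or heavy-tailed populations the actual coverage can differ substantially from the nominal level, even with large samples.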
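To illustrate why the normality assumption matters, the following is a minimal Monte Carlo sketch, not part of the exercise itself. It estimates how often the chi-squared interval for the variance contains the true variance when samples of size 100 come from a normal population versus an exponential one. The helper name `variance_ci_coverage`, the choice of the exponential distribution, the number of repetitions, and the seed are all illustrative assumptions.

```python
import numpy as np
from scipy import stats


def variance_ci_coverage(sampler, true_var, n=100, reps=4000, confidence=0.95, seed=0):
    """Fraction of simulated samples whose chi-squared CI contains true_var."""
    rng = np.random.default_rng(seed)
    alpha = 1 - confidence
    df = n - 1
    chi2_lo = stats.chi2.ppf(alpha / 2, df)
    chi2_hi = stats.chi2.ppf(1 - alpha / 2, df)

    samples = sampler(rng, (reps, n))   # one simulated sample per row
    s2 = samples.var(axis=1, ddof=1)    # unbiased sample variances
    lower = df * s2 / chi2_hi
    upper = df * s2 / chi2_lo
    return np.mean((lower <= true_var) & (true_var <= upper))


# Normal population with sigma = 0.25 (variance 0.0625)
cov_normal = variance_ci_coverage(
    lambda rng, size: rng.normal(0.0, 0.25, size), 0.25**2
)

# Exponential population with scale 0.25 (variance also 0.0625), strongly skewed
cov_expon = variance_ci_coverage(
    lambda rng, size: rng.exponential(0.25, size), 0.25**2
)

print(f"Coverage, normal population:      {cov_normal:.3f}")
print(f"Coverage, exponential population: {cov_expon:.3f}")
```

Under these assumptions, the estimated coverage for the normal population should land close to the nominal 95%, while the exponential population should fall well short of it.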

### The Chi-Squared Statistic and Interval Formula
Under the normality assumption, the statistic $\frac{(n-1)s^2}{\sigma^2}$ follows a Chi-squared distribution with $n-1$ degrees of freedom. Because the $\chi^2$ distribution is asymmetric, the confidence interval for the population variance $\sigma^2$ is not symmetric about $s^2$ and requires two different critical values:

$$ \frac{(n-1)s^2}{\chi^2_{\text{upper}}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{\text{lower}}} $$

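where the first subscript denotes the area to the *right* of the critical value:
- $\chi^2_{\text{lower}} = \chi^2_{1-\alpha/2, n-1}$ (the $\alpha/2$ quantile)
- $\chi^2_{\text{upper}} = \chi^2_{\alpha/2, n-1}$ (the $1-\alpha/2$ quantile)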

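The confidence interval for the standard deviation $\sigma$ is obtained by taking the square root of each bound of the variance interval. Because the square root is monotonically increasing, the ordering of the bounds and the confidence level are preserved.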
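The placement of the two critical values follows from inverting a probability statement about the statistic. As a short derivation sketch using the same symbols as above, start from:

$$ P\left( \chi^2_{\text{lower}} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\text{upper}} \right) = 1 - \alpha $$

Taking reciprocals reverses the inequalities, so the larger critical value ends up in the lower bound. Multiplying through by $(n-1)s^2$ gives the variance interval, and taking square roots gives:

$$ \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{upper}}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{lower}}}} $$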

## Step 1: Define Given Parameters
Let's define the parameters from the problem statement.

In [1]:
# --- Problem Parameters ---
n = 100         # Sample size
s = 0.25        # Sample standard deviation (in seconds)
confidence_level = 0.95

## Step 2: Determine Degrees of Freedom and Critical Values (χ²)

We need to find the two critical values from the $\chi^2$ distribution for a 95% confidence level and $n-1$ degrees of freedom.

In [2]:
import scipy.stats as stats

# Calculate Degrees of Freedom (df)
df = n - 1

# Determine the critical values
alpha = 1 - confidence_level

# Lower critical value (leaves 1 - alpha/2 area to the right, so we use alpha/2 for the left-tail ppf)
chi2_lower = stats.chi2.ppf(alpha / 2, df=df)

# Upper critical value (leaves alpha/2 area to the right, so we use 1 - alpha/2 for the left-tail ppf)
chi2_upper = stats.chi2.ppf(1 - alpha / 2, df=df)

print(f"Degrees of Freedom (df): {df}")
print(f"Confidence Level: {confidence_level:.0%}")
print(f"Lower Critical Value (χ²_0.975,99): {chi2_lower:.4f}")
print(f"Upper Critical Value (χ²_0.025,99): {chi2_upper:.4f}")

Degrees of Freedom (df): 99
Confidence Level: 95%
Lower Critical Value (χ²_0.975,99): 73.3611
Upper Critical Value (χ²_0.025,99): 128.4220
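The subscripts in the labels above are right-tail areas, while `ppf` takes left-tail probabilities, which is easy to mix up. As a sanity check, this sketch recomputes both critical values with `stats.chi2.isf` (the inverse survival function, which takes right-tail areas directly) and compares them with the `ppf` results. The variable names are illustrative.

```python
from scipy import stats

df = 99
alpha = 0.05

# Left-tail quantiles, as in the cell above
lower_ppf = stats.chi2.ppf(alpha / 2, df)
upper_ppf = stats.chi2.ppf(1 - alpha / 2, df)

# The same values from right-tail areas, matching the subscript notation
lower_isf = stats.chi2.isf(1 - alpha / 2, df)   # chi2_{0.975, 99}
upper_isf = stats.chi2.isf(alpha / 2, df)       # chi2_{0.025, 99}

print(f"lower: ppf={lower_ppf:.4f}, isf={lower_isf:.4f}")
print(f"upper: ppf={upper_ppf:.4f}, isf={upper_isf:.4f}")
```

Both routes should agree to within floating-point precision; using `isf` lets the code mirror the textbook notation directly.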


## Step 3: Calculate the Confidence Interval

Now, we apply the formulas to find the interval for the variance ($\sigma^2$) and then for the standard deviation ($\sigma$).

In [4]:
import numpy as np

# Calculate the numerator term (n-1)s^2
numerator = (n - 1) * s**2

# --- Calculate the CI for the Variance (σ²) ---
lower_bound_var = numerator / chi2_upper
upper_bound_var = numerator / chi2_lower

# --- Calculate the CI for the Standard Deviation (σ) ---
lower_bound_std = np.sqrt(lower_bound_var)
upper_bound_std = np.sqrt(upper_bound_var)


print("--- Calculation Results ---")
print(f"Numerator term (n-1)s²: {numerator:.4f}")
print(f"\n{confidence_level:.0%} Confidence Interval for the Variance (σ²):")
print(f"({lower_bound_var:.4f}, {upper_bound_var:.4f})")
print(f"\n{confidence_level:.0%} Confidence Interval for the Standard Deviation (σ):")
print(f"({lower_bound_std:.4f}s, {upper_bound_std:.4f}s)")

--- Calculation Results ---
Numerator term (n-1)s²: 6.1875

95% Confidence Interval for the Variance (σ²):
(0.0482, 0.0843)

95% Confidence Interval for the Standard Deviation (σ):
(0.2195s, 0.2904s)
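For reuse with other values of $n$, $s$, or the confidence level, Steps 1 to 3 could be wrapped in a single function. This is a minimal sketch; the function name `sigma_confidence_interval` and its input checks are illustrative additions, not part of the exercise.

```python
import math

from scipy import stats


def sigma_confidence_interval(s, n, confidence=0.95):
    """Chi-squared CI for sigma from a sample SD; assumes a normal population."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")

    alpha = 1 - confidence
    df = n - 1
    chi2_lower = stats.chi2.ppf(alpha / 2, df)       # left-tail area alpha/2
    chi2_upper = stats.chi2.ppf(1 - alpha / 2, df)   # left-tail area 1 - alpha/2

    scaled = df * s**2                               # (n-1)s^2
    # The larger critical value produces the lower bound, and vice versa
    return math.sqrt(scaled / chi2_upper), math.sqrt(scaled / chi2_lower)


low, high = sigma_confidence_interval(s=0.25, n=100)
print(f"({low:.4f}s, {high:.4f}s)")
```

Called with the parameters of this exercise, it should reproduce the interval printed above.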


## Step 4: Final Conclusion and Assumption

**Required Distributional Assumption:**
For this confidence interval to be valid, we must assume that the population from which the 100 runs were sampled is **normally distributed**.

**Computed 95% Confidence Interval for σ: (0.2195s, 0.2904s)**

**Interpretation:**
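We are 95% confident that the true, unknown standard deviation of the process being measured is between 0.2195s and 0.2904s. This interval provides a range of plausible values for the true process variability. It is not symmetric about the sample value $s = 0.25$s (it extends further above than below), which reflects the skewness of the $\chi^2$ distribution.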