# Confidence Intervals vs. Tolerance Intervals

Confidence intervals and tolerance intervals answer different questions: a confidence interval bounds a population parameter, while a tolerance interval bounds a proportion of the population itself. Here’s a concise breakdown:

## Confidence Intervals
- Definition: A confidence interval is a range, computed from a sample, that is constructed to contain a population parameter (such as the mean or a proportion) with a stated level of confidence.
- Purpose: To draw inferences about a population parameter.
- Interpretation: If you were to take many samples and construct a confidence interval from each one, a certain percentage (e.g., 95%) of those intervals would contain the true population parameter.
- Focus: On the precision of the sample estimate of a population parameter.
- Common use cases:
  - Estimating the mean height of a population from a sample.
  - Determining the proportion of voters favoring a candidate.
  - Calculating the average test score of students.
- Calculation: Based on the sample statistic (e.g., the sample mean), its standard error (which depends on the sample standard deviation and the sample size), and the desired confidence level (e.g., 95%, 99%).
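
The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, standard deviation 10), the sample size of 30, and the number of trials are assumptions chosen for the demonstration, not values from a real study.

```python
import numpy as np
from scipy import stats

# Assumed "true" population, known here only because we simulate it
rng = np.random.default_rng(0)
true_mean, true_sd = 50.0, 10.0
n, n_trials, confidence = 30, 5000, 0.95

# Draw many independent samples, one per row
samples = rng.normal(true_mean, true_sd, size=(n_trials, n))
means = samples.mean(axis=1)
std_errs = samples.std(axis=1, ddof=1) / np.sqrt(n)

# Build a t-based confidence interval for the mean from each sample
t_critical = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
lower = means - t_critical * std_errs
upper = means + t_critical * std_errs

# Fraction of intervals that contain the true mean
coverage = np.mean((lower <= true_mean) & (true_mean <= upper))
print(f"Fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land close to 0.95. That is a statement about the procedure over many samples, not about any single interval.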

## Tolerance Intervals
- Definition: A tolerance interval is a range, computed from a sample, that is expected to contain at least a specified proportion of the population, with a given level of confidence.
- Purpose: To specify a range that captures a specified proportion of the population.
- Interpretation: With a certain confidence level (e.g., 99%), at least a specified proportion (e.g., 95%) of the population values fall within this interval.
- Focus: On the coverage of the population's distribution.
- Common use cases:
  - Ensuring product quality in manufacturing (e.g., 95% of items meet the size specification with 99% confidence).
  - Determining environmental safety limits (e.g., pollutant levels within safe limits for 95% of samples with 99% confidence).
  - Setting medical benchmarks (e.g., 95% of patients' blood pressure within a certain range with 99% confidence).
- Calculation: Based on sample statistics (mean and standard deviation), sample size, desired proportion of the population, and the desired confidence level.
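
For normally distributed data, a two-sided tolerance interval is commonly written as the sample mean plus or minus a factor k times the sample standard deviation. The sketch below uses Howe's approximation for k. The simulated data, the normality assumption, and the helper name `tolerance_factor` are illustrative choices rather than anything prescribed above, and exact factors from published tables or specialized software may differ slightly.

```python
import numpy as np
from scipy import stats

def tolerance_factor(n, proportion=0.95, confidence=0.99):
    """Approximate two-sided normal tolerance factor k (Howe's approximation)."""
    # Normal quantile that brackets the desired proportion of the population
    z = stats.norm.ppf((1 + proportion) / 2)
    # Lower-tail chi-square quantile: allows for uncertainty in the sample std dev
    chi2_low = stats.chi2.ppf(1 - confidence, df=n - 1)
    return z * np.sqrt((n - 1) * (1 + 1 / n) / chi2_low)

# Hypothetical measurements, assumed to come from a normal population
rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=100)

k = tolerance_factor(len(sample), proportion=0.95, confidence=0.99)
mean = sample.mean()
std_dev = sample.std(ddof=1)
lower, upper = mean - k * std_dev, mean + k * std_dev

print(f"Tolerance factor k: {k:.3f}")
print(f"95%/99% tolerance interval: ({lower:.2f}, {upper:.2f})")
```

If the data are clearly non-normal, this formula does not apply as written; a transformation or a distribution-free method would be needed instead.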

## Key Points to Memorize
### Confidence Intervals
- Purpose: Estimate population parameters.
- Focus: Mean or proportion of a population.
- Interpretation: Long-run coverage. The stated percentage of intervals built this way will contain the population parameter; the parameter itself is fixed, not random.
- Common Example: "We are 95% confident that the population mean lies between X and Y."

### Tolerance Intervals
- Purpose: Capture a proportion of the population.
- Focus: Range covering a proportion of the population.
- Interpretation: Proportion of the population within a range with a specified confidence.
- Common Example: "We are 99% confident that 95% of the population falls between X and Y."

### Visualization
- Confidence Interval: Narrower. It targets a single parameter (like the mean) with a given level of confidence, and it shrinks as the sample size grows.
- Tolerance Interval: Broader. It must cover a specified proportion of individual population values with a given level of confidence, so it stays wide however large the sample is.
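
The difference in width can be made concrete by comparing half-width multipliers of the sample standard deviation s at several sample sizes. This is a rough sketch under assumptions not stated above: normally distributed data, a t-based confidence interval for the mean, and Howe's approximation for the two-sided tolerance factor. The sample sizes are arbitrary.

```python
import numpy as np
from scipy import stats

confidence_ci = 0.95                      # confidence level for the CI
proportion, confidence_ti = 0.95, 0.99    # coverage and confidence for the TI

rows = []
for n in (10, 100, 1000, 10000):
    # CI half-width for the mean is t * s / sqrt(n)
    ci_mult = stats.t.ppf(1 - (1 - confidence_ci) / 2, df=n - 1) / np.sqrt(n)

    # TI half-width is k * s, with k from Howe's approximation
    z = stats.norm.ppf((1 + proportion) / 2)
    chi2_low = stats.chi2.ppf(1 - confidence_ti, df=n - 1)
    ti_mult = z * np.sqrt((n - 1) * (1 + 1 / n) / chi2_low)

    rows.append((n, ci_mult, ti_mult))
    print(f"n={n:>5}: CI half-width = {ci_mult:.3f} * s, TI half-width = {ti_mult:.3f} * s")
```

Under these assumptions the confidence-interval multiplier shrinks toward zero as n grows, while the tolerance-interval multiplier levels off near 1.96, the normal quantile that brackets 95% of a population.
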

### Example Comparison
Confidence Interval: If you sample the heights of 100 students and calculate a 95% confidence interval for the mean height, you're saying "I'm 95% confident that the mean height of all students is between X and Y inches."

Tolerance Interval: If you use the same sample to calculate a tolerance interval that covers 95% of all student heights with 99% confidence, you're saying "I'm 99% confident that the interval from X to Y inches covers the heights of at least 95% of all students."

By focusing on these key distinctions and purposes, you can better understand and remember the differences between confidence intervals and tolerance intervals.

## 1. Confidence Interval Example
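We’ll use a simulated dataset to calculate a 95% confidence interval for the mean. Because the population standard deviation is estimated from the sample, the critical value comes from the t-distribution. The resulting interval gives a range within which we expect the true population mean to fall with 95% confidence.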

```python
import numpy as np
import scipy.stats as stats

# Generate a sample dataset
np.random.seed(42)
data = np.random.normal(loc=50, scale=10, size=100)  # Mean = 50, Std Dev = 10, Sample Size = 100

# Calculate sample statistics
mean = np.mean(data)
std_dev = np.std(data, ddof=1)
n = len(data)

# Confidence level
confidence = 0.95

# Compute the critical value for the t-distribution
alpha = 1 - confidence
t_critical = stats.t.ppf(1 - alpha/2, df=n-1)

# Calculate the margin of error
margin_of_error = t_critical * (std_dev / np.sqrt(n))

# Calculate the confidence interval
lower_bound = mean - margin_of_error
upper_bound = mean + margin_of_error

print(f"95% Confidence Interval for the mean: ({lower_bound:.2f}, {upper_bound:.2f})")
```

```
95% Confidence Interval for the mean: (47.16, 50.76)
```
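
As a cross-check, the same interval can be obtained in one call with SciPy's `stats.t.interval`, using `stats.sem` for the standard error. This is an alternative sketch, not a replacement for the step-by-step calculation. The confidence level is passed positionally here because its keyword name has differed between SciPy versions.

```python
import numpy as np
from scipy import stats

# Recreate the same simulated dataset as in the example above
np.random.seed(42)
data = np.random.normal(loc=50, scale=10, size=100)

# t-based interval centred on the sample mean, scaled by the standard error
ci_low, ci_high = stats.t.interval(
    0.95,                   # confidence level
    len(data) - 1,          # degrees of freedom
    loc=np.mean(data),
    scale=stats.sem(data),  # sample std dev (ddof=1) / sqrt(n)
)

print(f"95% Confidence Interval for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

With the same seed, this should reproduce the bounds printed by the manual calculation.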
