# Tolerance Intervals in Statistics
A tolerance interval is an interval that, with a stated confidence level, contains at least a specified proportion of a population.

Unlike confidence intervals, which estimate a population parameter (such as the mean or standard deviation), tolerance intervals describe the range of the population's individual values.
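
The difference shows up clearly as the sample grows. The sketch below is illustrative only: it assumes normally distributed data, uses a common approximation for the two-sided tolerance factor, and the helper names `mean_ci_halfwidth` and `tolerance_halfwidth` are hypothetical. The confidence interval for the mean keeps shrinking with more data, while the tolerance interval settles near the spread of the population itself.

```python
import numpy as np
from scipy import stats

def mean_ci_halfwidth(x, confidence=0.95):
    """Half-width of the t-based confidence interval for the mean."""
    n = len(x)
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return t_crit * np.std(x, ddof=1) / np.sqrt(n)

def tolerance_halfwidth(x, proportion=0.95, confidence=0.95):
    """Half-width of an approximate two-sided normal tolerance interval."""
    n = len(x)
    dof = n - 1
    z = stats.norm.ppf((1 + proportion) / 2)
    chi2_low = stats.chi2.ppf(1 - confidence, dof)  # lower-tail quantile
    k = np.sqrt(dof * (1 + 1 / n) * z**2 / chi2_low)
    return k * np.std(x, ddof=1)

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
for n in (20, 200, 2000):
    x = rng.normal(loc=50, scale=10, size=n)
    print(n, round(mean_ci_halfwidth(x), 2), round(tolerance_halfwidth(x), 2))
```

The first printed column of half-widths tends toward zero as `n` increases; the second stays close to about two population standard deviations, because it has to cover individual values rather than a single parameter.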

### There are two types of tolerance intervals:

- Two-sided tolerance interval: This interval gives a lower and an upper limit between which the specified proportion of the population is expected to fall.

- One-sided tolerance interval: This interval gives a single limit, so that the specified proportion of the population falls either above a lower limit or below an upper limit.
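
For normally distributed data, a one-sided limit can be computed from the noncentral t distribution. The sketch below is a minimal illustration under that normality assumption; the function name `one_sided_tolerance_factor` and the simulated sample are hypothetical, not part of the example that follows.

```python
import numpy as np
from scipy import stats

def one_sided_tolerance_factor(n, proportion=0.95, confidence=0.99):
    """One-sided tolerance factor k1 for normally distributed data."""
    z_p = stats.norm.ppf(proportion)
    # Quantile of the noncentral t distribution with n - 1 degrees of
    # freedom and noncentrality parameter z_p * sqrt(n), scaled by sqrt(n).
    return stats.nct.ppf(confidence, df=n - 1, nc=z_p * np.sqrt(n)) / np.sqrt(n)

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
sample = rng.normal(loc=50, scale=10, size=30)
k1 = one_sided_tolerance_factor(len(sample))

# Upper limit: at least 95% of the population is expected to lie below it.
upper_limit = sample.mean() + k1 * sample.std(ddof=1)
# Lower limit: at least 95% of the population is expected to lie above it.
lower_limit = sample.mean() - k1 * sample.std(ddof=1)
print(k1, lower_limit, upper_limit)
```

The two limits are separate one-sided statements; taken together they do not form a two-sided 95% interval.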

### Use Cases of Tolerance Intervals in Real Life
- Manufacturing Quality Control: Ensuring that a certain proportion of products meet the specified quality standards. For example, in the production of resistors, a tolerance interval can ensure that a specified percentage of resistors have resistance within a certain range.

- Environmental Regulations: Determining the range within which pollutant levels fall for a certain percentage of time with a given confidence level.

- Medical Studies: Establishing ranges for vital signs (like blood pressure or cholesterol levels) that cover a certain proportion of a healthy population.

- Engineering: Ensuring that materials or components will function correctly within certain ranges of stress or strain.

## Comprehensive Example with Python Code
Suppose we have a sample of 1000 measurements of a chemical concentration in a water supply. We want a two-sided tolerance interval that covers 95% of the population with 99% confidence. The calculation assumes the measurements are normally distributed.

```python
import numpy as np
import scipy.stats as stats

# Generate a sample dataset
np.random.seed(42)
sample_size = 1000
data = np.random.normal(loc=50, scale=10, size=sample_size)  # Mean = 50, Std Dev = 10

# Define the proportion of the population and the confidence level
proportion = 0.95
confidence_level = 0.99

# Sample statistics
n = len(data)
mean = np.mean(data)
std_dev = np.std(data, ddof=1)  # ddof=1 gives the sample standard deviation

# Degrees of freedom
dof = n - 1

# Chi-squared critical value: the lower-tail quantile at probability
# alpha = 1 - confidence_level, i.e. the value exceeded with
# probability confidence_level
alpha = 1 - confidence_level
chi2_critical = stats.chi2.ppf(alpha, dof)

# Tolerance factor (two-sided approximation for normally distributed data)
z = stats.norm.ppf((1 + proportion) / 2)
k = np.sqrt(dof * (1 + 1 / n) * z**2 / chi2_critical)

# Calculate the tolerance interval
lower_bound = mean - k * std_dev
upper_bound = mean + k * std_dev

(lower_bound, upper_bound)
```

The bounds come out at approximately (29.9, 70.4). The tolerance factor `k` is slightly larger than the normal quantile 1.96, so the interval is a little wider than mean ± 1.96 standard deviations.
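
One way to sanity-check a tolerance interval procedure is to simulate it. The sketch below is an illustrative check, not part of the original analysis: it draws many samples from a normal population with known parameters, builds the two-sided interval for each, and counts how often the interval really contains at least the target proportion of the population. The function names, the sample size of 50, and the number of trials are all assumptions chosen for the illustration.

```python
import numpy as np
from scipy import stats

def two_sided_k(n, proportion, confidence):
    """Approximate two-sided tolerance factor for normal data."""
    dof = n - 1
    z = stats.norm.ppf((1 + proportion) / 2)
    chi2_low = stats.chi2.ppf(1 - confidence, dof)  # lower-tail quantile
    return np.sqrt(dof * (1 + 1 / n) * z**2 / chi2_low)

def fraction_of_successes(n=50, proportion=0.95, confidence=0.99,
                          trials=5000, seed=0):
    """Share of simulated intervals that contain >= `proportion` of the population."""
    rng = np.random.default_rng(seed)
    k = two_sided_k(n, proportion, confidence)
    true_mean, true_sd = 50.0, 10.0  # known only because the data are simulated
    samples = rng.normal(true_mean, true_sd, size=(trials, n))
    means = samples.mean(axis=1)
    sds = samples.std(axis=1, ddof=1)
    lower, upper = means - k * sds, means + k * sds
    # True population content of each interval, from the known normal CDF
    content = (stats.norm.cdf(upper, true_mean, true_sd)
               - stats.norm.cdf(lower, true_mean, true_sd))
    return np.mean(content >= proportion)

print(fraction_of_successes())
```

If the procedure behaves as intended, the printed fraction should land close to the requested confidence level of 0.99.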

## Step-by-Step Explanation

### 1. Generating the Sample Dataset
We generate a dataset with 1000 samples from a normal distribution with a mean of 50 and a standard deviation of 10.

### 2. Setting Proportion and Confidence Level
We specify that at least 95% of the population (`proportion = 0.95`) should fall within the interval, with 99% confidence (`confidence_level = 0.99`).

### 3. Calculating Mean and Standard Deviation
We calculate the sample mean and the sample standard deviation (using `ddof=1`).

### 4. Calculating Degrees of Freedom
The degrees of freedom (`dof`) are the sample size minus one.

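### 5. Finding the Chi-Squared Critical Value
The chi-squared critical value is obtained for the given confidence level and degrees of freedom. It is the lower-tail quantile: the value that a chi-squared variable with `dof` degrees of freedom exceeds with probability equal to the confidence level.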

### 6. Calculating the Tolerance Factor
The tolerance factor (`k`) is calculated using a formula that incorporates the chi-squared critical value, the degrees of freedom, the sample size, and the standard normal critical value for the given proportion.
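
Written out, the two-sided factor for normally distributed data takes the following form. This is a widely used approximation, often attributed to Howe; the symbols $p$ (proportion), $\gamma$ (confidence level), and $\nu$ (degrees of freedom) are notation introduced here for illustration.

$$
k = \sqrt{\frac{\nu \left(1 + \frac{1}{n}\right) z_{(1+p)/2}^{2}}{\chi^{2}_{1-\gamma,\,\nu}}}, \qquad \nu = n - 1
$$

Here $z_{(1+p)/2}$ is the standard normal quantile at probability $(1+p)/2$, and $\chi^{2}_{1-\gamma,\,\nu}$ is the chi-squared quantile at lower-tail probability $1-\gamma$. The interval is then $\bar{x} \pm k\,s$, where $\bar{x}$ is the sample mean and $s$ the sample standard deviation.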

### 7. Calculating the Tolerance Interval
Finally, the lower and upper bounds of the tolerance interval are calculated as the mean minus and plus the tolerance factor times the standard deviation.

Running the code gives the lower and upper bounds of the tolerance interval for the dataset. The interval is read as follows: with 99% confidence, at least 95% of the population's chemical concentration measurements fall within this range.