**Confidence Interval**

**Background**
In quality control, especially for high-value items, destructive sampling is a necessary but costly way to verify product quality. Because the test destroys every item it evaluates, cost constraints force small sample sizes.


**Scenario**
A manufacturer of print-heads for personal computers wants to estimate the mean durability of its print-heads, measured as the number of characters printed before failure. Because the test is destructive, the manufacturer studies only a small sample of print-heads.


**Data**
A total of 15 print-heads were randomly selected and tested until failure. The durability of each print-head (in millions of characters) was recorded as follows:
1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29


**Assignment Tasks**
A. Build 99% Confidence Interval Using Sample Standard Deviation


In [1]:
from scipy import stats
import numpy as np
import pandas as pd

In [8]:
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
confidence_level = 0.99

In [3]:
# Sample size
n = len(data)
print(n)
# Degrees of freedom
df = n - 1
print(df)

15
14


In [4]:
# mean
mean = np.mean(data)
print(mean)

1.2386666666666666


In [5]:
# sample standard deviation (ddof=1) and standard error of the mean
std_dev = np.std(data, ddof=1)
standard_error = std_dev / np.sqrt(n)
print(standard_error)

0.04987476379384733
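
For reference, the quantity computed in this cell can be written out as below. The symbols are notation introduced here for illustration: $x_i$ are the 15 observations, $\bar{x}$ is the sample mean, $s$ is the sample standard deviation (the `ddof=1` version), and $\mathrm{SE}$ is the standard error of the mean.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\mathrm{SE} = \frac{s}{\sqrt{n}}
```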


In [6]:
# since the population standard deviation is unknown, we use the t-score
t_score = stats.t.ppf((1 + confidence_level) / 2, df)
t_score

np.float64(2.976842734370834)

In [7]:
# confidence Interval
confident = stats.t.interval(confidence_level,df,loc=mean, scale=standard_error)
confident

(np.float64(1.0901973384384906), np.float64(1.3871359948948425))

**Conclusion**
The 99% confidence interval for the mean durability of the print-heads is 1.09 million to 1.39 million characters. I used the t-distribution (with 14 degrees of freedom) because the population standard deviation is unknown and has to be estimated from a small sample (15 data points). The heavier tails of the t-distribution widen the interval to reflect that extra uncertainty. The interval also assumes that durability is approximately normally distributed.
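
As a cross-check, the same t-interval can be assembled by hand as mean ± t-score × standard error. The sketch below is self-contained and repeats the data from the Data section. The names `margin`, `manual_interval` and `library_interval` are illustrative and do not appear in the cells above.

```python
import numpy as np
from scipy import stats

# Durability data (millions of characters), copied from the Data section
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48,
        1.20, 1.33, 1.18, 1.22, 1.29]
confidence_level = 0.99

n = len(data)
mean = np.mean(data)

# Sample standard deviation (ddof=1) divided by sqrt(n) gives the standard error
standard_error = np.std(data, ddof=1) / np.sqrt(n)

# Two-sided critical value with n - 1 degrees of freedom
t_score = stats.t.ppf((1 + confidence_level) / 2, n - 1)

# Margin of error and the interval built by hand
margin = t_score * standard_error
manual_interval = (mean - margin, mean + margin)

# The same interval from SciPy, for comparison
library_interval = stats.t.interval(confidence_level, n - 1, loc=mean, scale=standard_error)

print(manual_interval)
print(library_interval)
```

Both printed tuples should agree up to floating-point rounding, which shows that `stats.t.interval` is a shortcut for the formula rather than a different method.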

B. Build 99% Confidence Interval Using Known Population Standard Deviation


In [9]:
from scipy import stats
import numpy as np

In [10]:
# since the population standard deviation is known (0.2), we use the z-score
z_score = stats.norm.ppf((1 + confidence_level) / 2)
z_score

np.float64(2.5758293035489004)

In [11]:
# confidence interval: the scale is the standard error sigma / sqrt(n), not sigma itself
confident = stats.norm.interval(confidence_level, loc=mean, scale=0.2 / np.sqrt(n))
np.round(confident, 3)

array([1.106, 1.372])

**Conclusion**
The 99% confidence interval for the mean durability of the print-heads is 1.11 million to 1.37 million characters. I used the normal distribution because the population standard deviation (0.2 million characters) is known, so the standard error of the mean is 0.2 / √15 ≈ 0.052 and nothing has to be estimated from the sample. With a sample this small, the z-interval also relies on durability being approximately normally distributed.
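
For a mean, the z-interval is built from the standard error σ/√n rather than from σ itself. The sketch below is self-contained and places the known-σ interval next to the t-interval from Task A. The names `sigma`, `z_standard_error`, `t_standard_error`, `z_interval` and `t_interval` are illustrative and do not appear in the cells above.

```python
import numpy as np
from scipy import stats

# Durability data (millions of characters), copied from the Data section
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48,
        1.20, 1.33, 1.18, 1.22, 1.29]
confidence_level = 0.99
sigma = 0.2  # known population standard deviation given in Task B

n = len(data)
mean = np.mean(data)

# Known sigma: the standard error of the mean is sigma / sqrt(n)
z_standard_error = sigma / np.sqrt(n)
z_interval = stats.norm.interval(confidence_level, loc=mean, scale=z_standard_error)

# Unknown sigma (Task A): estimate it with the sample standard deviation
t_standard_error = np.std(data, ddof=1) / np.sqrt(n)
t_interval = stats.t.interval(confidence_level, n - 1, loc=mean, scale=t_standard_error)

print("z-interval:", np.round(z_interval, 3))
print("t-interval:", np.round(t_interval, 3))
```

With these data the z-interval comes out somewhat narrower than the t-interval, mainly because the z critical value (about 2.58) is smaller than the t critical value with 14 degrees of freedom (about 2.98).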