## Estimation And Confidence Intervals

Background

In quality control, especially for high-value items, destructive sampling is a necessary but costly way to verify product quality.
Because the test that determines whether an item meets the quality standard also destroys the item,
cost constraints force sample sizes to be small.

    
Scenario

A manufacturer of print-heads for personal computers wants to estimate the mean durability of its print-heads, measured as the number of characters printed before failure.
Because the test is destructive, the manufacturer runs the study on a small sample of print-heads.


## Data

A total of 15 print-heads were randomly selected and tested until failure. The durability of each print-head (in millions of characters) was recorded as follows:
1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29


## Assignment Tasks


a. Build 99% Confidence Interval Using Sample Standard Deviation

Assuming the sample is representative of the population, construct a 99% confidence interval for the mean number of characters printed before the print-head fails using the sample standard deviation.
Explain the steps you take and the rationale behind using the t-distribution for this task.
    
    
b. Build 99% Confidence Interval Using Known Population Standard Deviation

If it were known that the population standard deviation is 0.2 million characters, 
construct a 99% confidence interval for the mean number of characters printed before failure.



In [1]:


'''
Step 1: Calculate the sample mean.
Step 2: Determine the critical value.
Step 3: Compute the margin of error.
Step 4: Construct the confidence interval.
'''

'''
Explanation of Using t-Distribution vs. z-Distribution

The t-distribution is used when the population standard deviation is unknown and must be
estimated from the sample, which matters most when the sample size is small (𝑛<30).
It accounts for the additional variability in the estimate of the standard deviation,
and it assumes the underlying measurements are approximately normally distributed.

The z-distribution is appropriate when the population standard deviation is known, or when the
sample size is large enough that the sample standard deviation is a reliable stand-in and the
Central Limit Theorem makes the sample mean approximately normal.

'''
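To make the difference between the two distributions concrete, the following minimal sketch compares the 99% critical values for this sample size. It only uses `scipy.stats.norm.ppf` and `scipy.stats.t.ppf`; the list of degrees of freedom in the loop is an illustrative choice, not part of the assignment.

```python
from scipy.stats import norm, t

confidence_level = 0.99
alpha = 1 - confidence_level
n = 15  # sample size from the Data section

# Two-sided critical values at the 99% level
z_critical = norm.ppf(1 - alpha / 2)
t_critical = t.ppf(1 - alpha / 2, df=n - 1)

print(f"z critical value: {z_critical:.4f}")
print(f"t critical value (df={n - 1}): {t_critical:.4f}")

# The t critical value shrinks toward the z value as the degrees of freedom grow
# (the df values below are arbitrary illustration points)
for df in (14, 29, 100, 1000):
    print(f"df={df:>4}: t = {t.ppf(1 - alpha / 2, df=df):.4f}")
```

With only 14 degrees of freedom the t critical value is noticeably larger than the z value, which is why the t-based interval is wider for the same standard error.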

In [4]:
# Task a: Confidence Interval Using Sample Standard Deviation

import numpy as np
from scipy.stats import t, norm

# Durability of each print-head, in millions of characters
durability = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
n = len(durability)
mean = np.mean(durability)
print(f'Sample Mean: {mean:.3f}')
# ddof=1 gives the sample (n - 1) standard deviation
sample_std = np.std(durability, ddof=1)

confidence_level = 0.99
alpha = 1 - confidence_level

# Two-sided critical value from the t-distribution with n - 1 degrees of freedom
t_critical = t.ppf(1 - alpha/2, df=n-1)
margin_of_error_a = t_critical * (sample_std / np.sqrt(n))
ci_a = (mean - margin_of_error_a, mean + margin_of_error_a)
print('Confidence Interval using sample standard deviation:', ci_a)

Sample Mean: 1.239
Confidence Interval using sample standard deviation: (1.0901973384384906, 1.3871359948948425)
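
As a cross-check on the manual calculation, the same interval can be obtained from SciPy's built-in helpers. This is a sketch under the same assumptions as Task a; `standard_error`, `lower`, and `upper` are names introduced here for illustration.

```python
import numpy as np
from scipy import stats

# Durability of each print-head, in millions of characters (from the Data section)
durability = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
              1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
n = len(durability)
mean = np.mean(durability)

# stats.sem returns s / sqrt(n) and uses ddof=1 by default
standard_error = stats.sem(durability)

# The confidence level is passed positionally, which works across SciPy versions
lower, upper = stats.t.interval(0.99, df=n - 1, loc=mean, scale=standard_error)
print(f"99% t-interval: ({lower:.4f}, {upper:.4f})")
```

The result should agree with `ci_a` above to within floating-point rounding.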


In [5]:
# Task b: Confidence Interval Using Known Population Standard Deviation
# Reuses n, mean, and alpha from the Task a cell
pop_std = 0.2
z_critical = norm.ppf(1 - alpha/2)
margin_of_error_b = z_critical * (pop_std / np.sqrt(n))
ci_b = (mean - margin_of_error_b, mean + margin_of_error_b)
print('Confidence Interval using population standard deviation:', ci_b)

Confidence Interval using population standard deviation: (1.1056514133957607, 1.3716819199375725)
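
The two intervals can also be compared directly. The sketch below recomputes both with SciPy's `interval` helpers and reports their widths; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np
from scipy import stats

# Durability of each print-head, in millions of characters (from the Data section)
durability = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
              1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
n = len(durability)
mean = np.mean(durability)
pop_std = 0.2  # population standard deviation assumed known in Task b

# Task a: t-interval based on the sample standard deviation
lower_a, upper_a = stats.t.interval(0.99, df=n - 1, loc=mean,
                                    scale=stats.sem(durability))
# Task b: z-interval based on the known population standard deviation
lower_b, upper_b = stats.norm.interval(0.99, loc=mean,
                                       scale=pop_std / np.sqrt(n))

width_a = upper_a - lower_a
width_b = upper_b - lower_b
print(f"t-interval: ({lower_a:.4f}, {upper_a:.4f}), width {width_a:.4f}")
print(f"z-interval: ({lower_b:.4f}, {upper_b:.4f}), width {width_b:.4f}")
```

From the outputs above, the t-interval is about 0.297 million characters wide and the z-interval about 0.266. The sample standard deviation is slightly smaller than 0.2, but the larger t critical value more than offsets that, so the interval from Task a is the wider of the two.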
