# Estimation and Confidence Intervals Assignment

#### Background
In quality control, especially for high-value items, destructive sampling is a necessary but costly way to verify product quality. Because the test that determines whether an item meets the quality standard also destroys it, cost constraints force small sample sizes.

#### Scenario
A manufacturer of print-heads for personal computers wants to estimate the mean durability of its print-heads, measured as the number of characters printed before failure. Because the test is destructive, the study uses a small sample.

#### Data
A total of 15 print-heads were randomly selected and tested until failure. The durability of each print-head (in millions of characters) was recorded as follows:

1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29


In [None]:
import numpy as np
data = np.array([1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 
                 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29])

n = len(data)
sample_mean = np.mean(data)
sample_std = np.std(data, ddof=1)  # ddof=1 divides by n - 1 (sample standard deviation)
(sample_mean, sample_std, n)

(1.2386666666666666, 0.18661427836285438, 15)

a. 99% Confidence Interval using Sample Standard Deviation (t-distribution)

#### Why t-distribution?
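- When the **population standard deviation (σ) is unknown** and the sample is small (n < 30),  
  we estimate σ with the sample standard deviation s and use the **t-distribution** instead of the z-distribution.  
- This assumes the durability values come from an approximately normal population.  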
- Degrees of Freedom (df) = n - 1 = 14.  
- Formula:  

  $$\text{CI} = \bar{x} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$$
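
To make the choice of distribution concrete, the short sketch below compares two-sided 99% critical values from the t-distribution with the standard normal value. The degrees of freedom other than 14 are illustrative choices, not part of the assignment, and the dictionary name `t_crits` is introduced here only for the example.

```python
from scipy import stats

alpha = 0.01  # 99% confidence level, as in this assignment

# Two-sided critical value from the standard normal distribution
z_crit = stats.norm.ppf(1 - alpha / 2)

# t critical values for a few degrees of freedom.
# df = 14 matches n = 15; the other values are illustrative only.
t_crits = {dof: stats.t.ppf(1 - alpha / 2, dof) for dof in (4, 14, 29, 100, 1000)}

for dof, t_crit in t_crits.items():
    print(f"df={dof:>4}: t_critical={t_crit:.4f}  (z_critical={z_crit:.4f})")
```

The t critical value is always larger than the z value and shrinks toward it as the degrees of freedom grow, which is why the distinction matters most for small samples such as this one.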


In [None]:
from scipy import stats

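alpha = 0.01                    # significance level for a 99% confidence interval
df = n - 1                      # degrees of freedom
se = sample_std / np.sqrt(n)    # standard error of the mean

# Two-sided critical value: alpha/2 in each tail
t_critical = stats.t.ppf(1 - alpha/2, df)

margin_error_t = t_critical * se

ci_lower_t = sample_mean - margin_error_t
ci_upper_t = sample_mean + margin_error_t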

(ci_lower_t, ci_upper_t)


(1.0952316686385626, 1.3821016646947706)

In [None]:
# Cross-check the manual calculation with SciPy's built-in interval helper
ci_t = stats.t.interval(0.99, df=n-1, loc=sample_mean, scale=se)
print("99% CI (t-distribution):", ci_t)

b. 99% Confidence Interval using Known Population Standard Deviation (z-distribution)
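
For reference, when σ is known the interval has the same structure as the t-based interval in part a, but it uses a standard normal critical value and σ in place of s. The worked arithmetic below takes σ = 0.2 as given, with values rounded to four decimals:

$$\text{CI} = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \approx 1.2387 \pm 2.5758 \cdot \frac{0.2}{\sqrt{15}} \approx 1.2387 \pm 0.1330$$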

In [11]:
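pop_std = 0.2  # known population standard deviation σ (given)
z_critical = stats.norm.ppf(1 - alpha/2)  # two-sided critical value from the standard normal
margin_error_z = z_critical * (pop_std / np.sqrt(n))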

ci_lower_z = sample_mean - margin_error_z
ci_upper_z = sample_mean + margin_error_z

(ci_lower_z, ci_upper_z)


(1.1056514133957607, 1.3716819199375725)

In [None]:
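# Cross-check the z interval with SciPy's built-in helper
se_z = pop_std / np.sqrt(n)  # standard error using the known σ (separate name so `se` is not overwritten)
ci_z = stats.norm.interval(0.99, loc=sample_mean, scale=se_z)
ci_z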

# Final Conclusion 
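- Using the **t-distribution** (σ unknown, s ≈ 0.187), the 99% confidence interval is approximately (1.095, 1.382) million characters, a width of about 0.287.  
- Using the **z-distribution** (σ known = 0.2), the 99% confidence interval is approximately (1.106, 1.372) million characters, a width of about 0.266.  
- The t interval is wider even though s is smaller than σ, because the t critical value (about 2.977 with 14 degrees of freedom) exceeds the z critical value (about 2.576). The heavier tails account for the extra uncertainty of estimating σ from a small sample.  
- In practice σ is rarely known, so the sample standard deviation and the t-distribution are the usual choice.  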
