# Estimation and Confidence Intervals

## Background

A manufacturer of print-heads wants to estimate the mean durability 
(number of characters printed before failure, in millions).

Because testing destroys the product, only 15 print-heads were tested.

We will construct 99% confidence intervals in two cases:

1. Using sample standard deviation (t-distribution)
2. Using known population standard deviation (z-distribution)


In [1]:
import numpy as np
import scipy.stats as stats

# Given sample data (in millions of characters)
data = np.array([
    1.13, 1.55, 1.43, 0.92, 1.25,
    1.36, 1.32, 0.85, 1.07, 1.48,
    1.20, 1.33, 1.18, 1.22, 1.29
])

# Sample size
n = len(data)

# Sample mean
mean = np.mean(data)

# Sample standard deviation
s = np.std(data, ddof=1)

print("Sample Size (n):", n)
print("Sample Mean:", round(mean, 4))
print("Sample Standard Deviation:", round(s, 4))


Sample Size (n): 15
Sample Mean: 1.2387
Sample Standard Deviation: 0.1932


## Part A: 99% Confidence Interval (Using Sample Standard Deviation)

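Since:
- Population standard deviation is unknown and must be estimated by s
- Sample size is small (n = 15)

We use the t-distribution with n - 1 = 14 degrees of freedom. This assumes durability is approximately normally distributed.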

Formula:

CI = x̄ ± t(α/2, n-1) × (s / √n)

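Where:
- x̄ = sample mean
- s = sample standard deviation
- n = sample size
- t(α/2, n-1) = critical value of the t-distribution with n-1 degrees of freedom (α = 0.01 for 99% confidence)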


In [2]:
# Confidence level
confidence = 0.99
alpha = 1 - confidence

# t critical value
t_critical = stats.t.ppf(1 - alpha/2, df=n-1)

# Margin of error
margin_t = t_critical * (s / np.sqrt(n))

# Confidence Interval
lower_t = mean - margin_t
upper_t = mean + margin_t

print("t Critical Value:", round(t_critical, 4))
print("99% CI using t-distribution:")
print("(", round(lower_t, 4), ",", round(upper_t, 4), ")")


t Critical Value: 2.9768
99% CI using t-distribution:
( 1.0902 , 1.3871 )
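
As a cross-check on the manual calculation, the same interval can be obtained from SciPy's built-in `stats.t.interval`. This is a minimal sketch, not part of the original analysis; the names `sem`, `lower_chk`, and `upper_chk` are introduced here for illustration, and the data array is repeated so the block runs on its own.

```python
import numpy as np
import scipy.stats as stats

# Same sample as above (millions of characters)
data = np.array([
    1.13, 1.55, 1.43, 0.92, 1.25,
    1.36, 1.32, 0.85, 1.07, 1.48,
    1.20, 1.33, 1.18, 1.22, 1.29
])

n = len(data)
mean = np.mean(data)

# Standard error of the mean: s / sqrt(n); stats.sem uses ddof=1 by default
sem = stats.sem(data)

# Confidence level is passed positionally (its keyword name differs across SciPy versions)
lower_chk, upper_chk = stats.t.interval(0.99, df=n - 1, loc=mean, scale=sem)

print(f"99% t-interval (built-in): ({lower_chk:.4f}, {upper_chk:.4f})")
```

The result should agree with the manually computed interval to four decimal places.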


## Part B: 99% Confidence Interval (Using Known Population Standard Deviation)

If the population standard deviation is known (σ = 0.2), we use the z-distribution (standard normal).

Formula:

CI = x̄ ± z(α/2) × (σ / √n)


In [3]:
# Known population standard deviation
sigma = 0.2

# z critical value
z_critical = stats.norm.ppf(1 - alpha/2)

# Margin of error
margin_z = z_critical * (sigma / np.sqrt(n))

# Confidence Interval
lower_z = mean - margin_z
upper_z = mean + margin_z

print("Z Critical Value:", round(z_critical, 4))
print("99% CI using z-distribution:")
print("(", round(lower_z, 4), ",", round(upper_z, 4), ")")


Z Critical Value: 2.5758
99% CI using z-distribution:
( 1.1057 , 1.3717 )
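
The z-interval can be cross-checked the same way with `stats.norm.interval`. This is a sketch under the notebook's stated assumption that σ = 0.2 is known; `lower_chk` and `upper_chk` are illustrative names, and the data array is repeated so the block is self-contained.

```python
import numpy as np
import scipy.stats as stats

# Same sample as above (millions of characters)
data = np.array([
    1.13, 1.55, 1.43, 0.92, 1.25,
    1.36, 1.32, 0.85, 1.07, 1.48,
    1.20, 1.33, 1.18, 1.22, 1.29
])

n = len(data)
mean = np.mean(data)
sigma = 0.2  # assumed known population standard deviation

# Standard error uses sigma, not the sample standard deviation
lower_chk, upper_chk = stats.norm.interval(0.99, loc=mean, scale=sigma / np.sqrt(n))

print(f"99% z-interval (built-in): ({lower_chk:.4f}, {upper_chk:.4f})")
```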


## Conclusion

Using t-distribution:
Appropriate when the population standard deviation is unknown and must be estimated from the sample, which matters most when the sample is small (here n = 15).

Using z-distribution:
Appropriate when the population standard deviation is known (here σ = 0.2).

Both intervals estimate the true mean durability at the 99% confidence level:
if the sampling were repeated many times, about 99% of intervals built this way would contain the true mean.
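
The 99% confidence level describes the long-run behavior of the procedure, which a small simulation can illustrate. The sketch below is not part of the original analysis: the "true" mean (1.25) and standard deviation (0.2) are hypothetical values chosen only for the simulation, and the population is assumed to be normal.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)

# Hypothetical population parameters, chosen only for this illustration
true_mean, true_sigma = 1.25, 0.2
n, trials = 15, 10_000

# Draw many samples of size n and build a 99% t-interval from each
samples = rng.normal(true_mean, true_sigma, size=(trials, n))
means = samples.mean(axis=1)
sds = samples.std(axis=1, ddof=1)

t_crit = stats.t.ppf(0.995, df=n - 1)
margins = t_crit * sds / np.sqrt(n)

# Fraction of intervals that contain the true mean
covered = (means - margins <= true_mean) & (true_mean <= means + margins)
coverage = covered.mean()

print(f"Empirical coverage over {trials} samples: {coverage:.4f}")
```

The empirical coverage should land close to 0.99, varying slightly with the random seed.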

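The t-interval (width 0.2969) is slightly wider than the z-interval (width 0.2660), even though s = 0.1932 is smaller than σ = 0.2.
Its larger critical value (2.9768 vs. 2.5758) accounts for the additional uncertainty from estimating the population standard deviation.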
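
To see how the extra uncertainty shrinks as the sample grows, the 99% critical values of the t-distribution can be compared with the z critical value. This is an illustrative sketch; the list of degrees of freedom in `dfs` is an arbitrary choice, with df = 14 matching this notebook's sample.

```python
import scipy.stats as stats

# Two-sided 99% confidence puts 0.5% in each tail
z_crit = stats.norm.ppf(0.995)

# Degrees of freedom to compare (df = 14 corresponds to n = 15)
dfs = [4, 14, 29, 99, 999]
t_crits = [stats.t.ppf(0.995, df=df) for df in dfs]

print(f"z critical value: {z_crit:.4f}")
for df, t_crit in zip(dfs, t_crits):
    print(f"df = {df:4d}: t critical value = {t_crit:.4f}")
```

The t critical value exceeds the z value at every df and approaches it as df increases, which is why the two intervals nearly coincide for large samples.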
