In [1]:
import warnings
warnings.filterwarnings('ignore')
import numpy as np

# Estimation and Confidence Intervals

## Background

In quality control, especially for high-value items, destructive sampling is a necessary but costly way to verify product quality. Because the test that determines whether an item meets the quality standard also destroys the item, cost constraints limit testing to small samples.

## Data

A total of 15 print-heads were randomly selected and tested until failure. The durability of each print-head (in millions of characters) was recorded as follows:
1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29


In [2]:
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]

## Assignment Tasks

#### a. Build 99% Confidence Interval Using Sample Standard Deviation

Assuming the sample is representative of the population, construct a 99% confidence interval for the mean number of characters printed before the print-head fails using the sample standard deviation. Explain the steps you take and the rationale behind using the t-distribution for this task.

#### Solution

##### Step 1: Calculate the sample mean. 

In [3]:
sample_mean = np.mean(data)
print(f"Sample Mean: {sample_mean}")

Sample Mean: 1.2386666666666666


##### Step 2: Calculate the sample standard deviation.

In [4]:
sample_std = np.std(data, ddof=1)
print(f'Sample Standard Deviation: {sample_std}')

Sample Standard Deviation: 0.19316412956959936


##### Step 3: Calculate the degrees of freedom.

In [5]:
n = len(data)
df = n-1
print(f'Degrees of Freedom: {df}')

Degrees of Freedom: 14


##### Step 4: Calculate the t-score for a 99% confidence interval.

In [6]:
from scipy.stats import t

In [7]:
t_score = t.ppf(0.995, df)  # two-sided 99% interval: 0.5% in each tail, so use the 1 - 0.01/2 = 0.995 quantile
print(f'T-Score: {t_score}')

T-Score: 2.97684273411266


##### Step 5: Calculate the margin of error.

In [8]:
margin_error = t_score * (sample_std / np.sqrt(n))
print(f'Margin of Error: {margin_error}')

Margin of Error: 0.1484693282152996


##### Step 6: Calculate the confidence interval.

In [9]:
confidence_interval = (sample_mean - margin_error, sample_mean + margin_error)
print(f'Confidence Interval: {confidence_interval}')

Confidence Interval: (1.090197338451367, 1.3871359948819662)


So, the 99% confidence interval for the mean number of characters printed before failure using the sample standard deviation is approximately 1.090 to 1.387 million characters.
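
As a cross-check on the six steps above, the same interval can be obtained in a single call. This is a minimal sketch, assuming SciPy's `scipy.stats.t.interval` and `scipy.stats.sem`; the helper name `t_confidence_interval` is hypothetical and not part of the original notebook.

```python
import numpy as np
from scipy import stats

def t_confidence_interval(values, confidence=0.99):
    """Two-sided t-based confidence interval for the mean (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    n = values.size
    mean = values.mean()
    # stats.sem uses ddof=1 by default, i.e. it returns s / sqrt(n)
    std_err = stats.sem(values)
    # The confidence level is passed positionally so the call works across SciPy versions
    lower, upper = stats.t.interval(confidence, n - 1, loc=mean, scale=std_err)
    return float(lower), float(upper)

data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
lower, upper = t_confidence_interval(data)
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

If the manual calculation is correct, the printed endpoints should match the interval from Step 6 to three decimal places.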

Rationale for using the t-distribution:

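We use the t-distribution because the population standard deviation is unknown and must be estimated by the sample standard deviation. That estimate adds uncertainty of its own, which the t-distribution (here with n - 1 = 14 degrees of freedom) captures through heavier tails than the normal distribution. With a sample as small as 15 the difference is material, so the t-based interval is wider than a z-based interval built from the same standard error. The procedure also assumes that print-head durability is approximately normally distributed.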
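
To illustrate the rationale, the sketch below compares the 99% two-sided critical value of the t-distribution with the corresponding z value for several sample sizes. Only n = 15 comes from this assignment; the other sample sizes are illustrative assumptions, and the variable names are hypothetical.

```python
from scipy.stats import norm, t

confidence = 0.99
upper_quantile = 1 - (1 - confidence) / 2  # 0.995 for a two-sided 99% interval

z_crit = norm.ppf(upper_quantile)

# Sample sizes other than 15 are illustrative, not taken from the assignment data
t_crits = {}
for n in (5, 15, 30, 100, 1000):
    t_crits[n] = t.ppf(upper_quantile, df=n - 1)
    print(f"n={n:>4}  t={t_crits[n]:.4f}  z={z_crit:.4f}  ratio={t_crits[n] / z_crit:.3f}")
```

The t critical value is always larger than the z value and approaches it as n grows, which is why the choice of distribution matters most for small samples such as this one.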

#### b. Build 99% Confidence Interval Using Known Population Standard Deviation

If it were known that the population standard deviation is 0.2 million characters, construct a 99% confidence interval for the mean number of characters printed before failure.


#### Solution

When the population standard deviation σ is known, the standard error σ/√n is known exactly rather than estimated from the sample, so we use the standard normal (z) distribution instead of the t-distribution.

In [10]:
population_std = 0.2 # in millions
print(f'Population Standard Deviation is Given: {population_std}')

Population Standard Deviation is Given: 0.2


Z-score for a 99% confidence interval:

In [11]:
z_score = 2.576  # 0.995 quantile of the standard normal, rounded to three decimals (two-sided 99% interval)
print(f'Z-Score: {z_score}')

Z-Score: 2.576
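
The tabulated value 2.576 can also be derived rather than typed in. The following is a small sketch using `statistics.NormalDist` from the Python standard library; the name `z_exact` is hypothetical, and the notebook itself continues with the rounded constant.

```python
from statistics import NormalDist

confidence = 0.99
# 0.995 quantile of the standard normal: 0.5% of the probability lies in each tail
z_exact = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
print(f"z critical value: {z_exact:.4f}")  # prints 2.5758
```

The difference between 2.576 and the unrounded value is small enough that it does not change the interval at three decimal places.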


Calculate Margin of Error:

In [12]:
margin_error_b = z_score * (population_std / np.sqrt(n))
print(f'Margin of Error: {margin_error_b}')

Margin of Error: 0.13302406799773742


Calculate Confidence Interval:

In [13]:
confidence_interval_b = (sample_mean - margin_error_b, sample_mean + margin_error_b)
print(f"Confidence Interval: {confidence_interval_b}")

Confidence Interval: (1.105642598668929, 1.371690734664404)


So, the 99% confidence interval for the mean number of characters printed before failure using the known population standard deviation is approximately 1.106 to 1.372 million characters.
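
The part (b) calculation can be wrapped into a reusable function. This is a minimal sketch using only the standard library; `z_confidence_interval` is a hypothetical helper name, and it uses the unrounded z quantile rather than the constant 2.576, so its endpoints may differ from the cell output above in the fifth decimal place.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(values, sigma, confidence=0.99):
    """Two-sided z-based interval for the mean when sigma is known (hypothetical helper)."""
    n = len(values)
    # Unrounded standard normal quantile instead of a table value
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    centre = mean(values)
    return centre - margin, centre + margin

data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
lower_b, upper_b = z_confidence_interval(data, sigma=0.2)
print(f"99% CI (sigma known): ({lower_b:.4f}, {upper_b:.4f})")
```

Passing a different `confidence` value, for example 0.95, gives the corresponding narrower interval without any other change.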