# Estimation And Confidence Intervals

Background

In quality control, especially for high-value items, destructive sampling is a necessary but costly way to verify product quality. Because the test that determines whether an item meets the quality standard destroys the item, cost constraints force small sample sizes.


Scenario

A manufacturer of print-heads for personal computers is interested in estimating the mean durability of their print-heads in terms of the number of characters printed before failure. To assess this, the manufacturer conducts a study on a small sample of print-heads due to the destructive nature of the testing process.


Data

A total of 15 print-heads were randomly selected and tested until failure. The durability of each print-head (in millions of characters) was recorded as follows:
1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29

Assignment Tasks

==>a. Build 99% Confidence Interval Using Sample Standard Deviation
Assuming the sample is representative of the population, construct a 99% confidence interval for the mean number of characters printed before the print-head fails using the sample standard deviation. Explain the steps you take and the rationale behind using the t-distribution for this task.

==>b. Build 99% Confidence Interval Using Known Population Standard Deviation
If it were known that the population standard deviation is 0.2 million characters, construct a 99% confidence interval for the mean number of characters printed before failure.
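
For reference, the two intervals requested in tasks a and b have the standard textbook forms sketched below. The symbols are assumptions introduced here, not notation from the assignment: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $\sigma$ the known population standard deviation, $n = 15$ the sample size, and $\alpha = 0.01$ for a 99% confidence level.

```latex
% Task a: sigma unknown, small sample -> t-distribution with n - 1 degrees of freedom
\bar{x} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}

% Task b: sigma known -> standard normal (z) distribution
\bar{x} \;\pm\; z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
```

In both cases the standard error divides by $\sqrt{n}$, and the critical value is taken at $1 - \alpha/2 = 0.995$ because the 1% error probability is split across two tails.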



In [2]:
#the problem gives no population standard deviation and the sample is
#small (n<30), so we use the t-distribution

import numpy as np
from scipy import stats

#step 1 check that the population standard deviation is unknown

df=np.array([1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 
            0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29])

#step 2 check that the sample size is small (n<30), so we use the t-distribution

dflen=len(df)
print("length of sample=",dflen)
print()

#step 3 calculate sample mean x

mean=df.mean()
print("mean of sample is=",mean)
print()


#step 4 set the confidence level (i.e. 99%)

CI=0.99

#step 5 calculate the t critical value for alpha=1%

alpha=1-CI   #significance level: the chance that the interval misses the true mean, i.e. 0.01
            #ppf = percent point function (inverse of the CDF)

t_critical=stats.t.ppf(1-alpha/2, dflen-1)  #1-alpha/2 gives the upper critical value of a two-tailed interval
                                 #dflen-1 is the degrees of freedom

print("tcritical value",round(t_critical,4))
print()
#step 6 calculate the standard deviation of the sample

s=df.std(ddof=1)  #ddof=1 divides by n-1, giving the sample standard deviation
print("standard deviation=",s)
print()


#step 7 find margin of error  E=t*s/root(n)

epsilon=t_critical*s/np.sqrt(dflen)
print("Margin of error=",round(epsilon,4))
print()

#step 8 construct the confidence interval (lower bound, upper bound)

confidenceinterval=(round(float(mean-epsilon),4),round(float(mean+epsilon),4))
print(confidenceinterval)
print()
print(f"99% confidence interval for the mean durability of the print-heads is: [{mean-epsilon:.2f}, {mean+epsilon:.2f}] (in millions of characters).")
print()
print("This means we are 99% confident that the true population mean lies within this interval.")

length of sample= 15

mean of sample is= 1.2386666666666666

tcritical value 2.9768

standard deviation= 0.19316412956959936

Margin of error= 0.1485

(1.0902, 1.3871)

99% confidence interval for the mean durability of the print-heads is: [1.09, 1.39] (in millions of characters).

This means we are 99% confident that the true population mean lies within this interval.
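
As an independent cross-check of part a, the same interval can be obtained in one call. This is a minimal sketch, assuming SciPy's `stats.t.interval` and `stats.sem` are available; the variable names (`data`, `standard_error`, `t_low`, `t_high`) are illustrative and not from the cell above.

```python
import numpy as np
from scipy import stats

# Same 15 durability measurements (millions of characters) as in part a.
data = np.array([1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32,
                 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29])

n = len(data)
sample_mean = data.mean()
standard_error = stats.sem(data)  # s / sqrt(n); stats.sem uses ddof=1 by default

# Two-sided 99% t-interval with n - 1 degrees of freedom.
t_low, t_high = stats.t.interval(0.99, df=n - 1, loc=sample_mean, scale=standard_error)

print(f"99% t-interval: ({t_low:.4f}, {t_high:.4f})")
```

Run as written, this should print an interval of roughly (1.0902, 1.3871), which is the sample mean plus or minus t·s/√n with n = 15.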


In [1]:
#the population standard deviation is given as 0.2, so we use the z distribution
#because the standard deviation is known rather than estimated from the sample


import numpy as np
from scipy import stats
from math import sqrt

df=np.array([1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 
            0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29])

n=len(df)

#1 calculate the sample mean

mean=df.mean()
print("sample mean=",mean)
print()

#2 choose between the t and z distributions

#here the population standard deviation is known, so we use the z distribution

#3 set the confidence level

alpha=1-0.99

print("The confidence level is 99%")
print()


#4 determine the z critical value

zcritical=stats.norm.ppf(1-alpha/2)  #1-alpha/2 because alpha is split across
                                     #the two tails

print("The critical value when distributed on two sides",round(zcritical,4))
print()

#5 set the population standard deviation

sigma=0.2  #given in the problem as the population standard deviation

print("Population Standard Deviation",sigma)
print()

#6 find the margin of error E=z*sigma/root(n)

epsilon=zcritical*sigma/sqrt(n)

print("margin of error",round(epsilon,4))
print()

#7 construct the confidence interval

confidence_Interval=(round(float(mean-epsilon),4),round(float(mean+epsilon),4))
print("we are 99% confident that the population mean lies within this confidence interval")
print()
print('confidence interval for the mean number of characters printed before failure:',confidence_Interval)

sample mean= 1.2386666666666666

The confidence level is 99%

The critical value when distributed on two sides 2.5758

Population Standard Deviation 0.2

margin of error 0.133

we are 99% confident that the population mean lies within this confidence interval

confidence interval for the mean number of characters printed before failure: (1.1057, 1.3717)


# 99% confidence interval for the mean number of characters printed before failure: (1.1057, 1.3717)
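
The part b interval can be cross-checked in the same way, and the two critical values compared. This is a minimal sketch, assuming SciPy's `stats.norm.interval` is available; the names `data`, `z_low`, `z_high`, `z_crit` and `t_crit` are illustrative and not from the cells above.

```python
import numpy as np
from scipy import stats

# Same 15 durability measurements (millions of characters) as in the cells above.
data = np.array([1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32,
                 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29])

n = len(data)
sample_mean = data.mean()
sigma = 0.2  # population standard deviation given in task b

# Two-sided 99% z-interval: mean +/- z * sigma / sqrt(n).
z_low, z_high = stats.norm.interval(0.99, loc=sample_mean, scale=sigma / np.sqrt(n))

# Critical values at the 0.995 quantile for the z and t (14 df) distributions.
z_crit = stats.norm.ppf(0.995)
t_crit = stats.t.ppf(0.995, df=n - 1)

print(f"99% z-interval: ({z_low:.4f}, {z_high:.4f})")
print(f"z critical = {z_crit:.4f}, t critical (df={n - 1}) = {t_crit:.4f}")
```

The z critical value (about 2.576) is smaller than the t critical value with 14 degrees of freedom (about 2.977). That is why the interval built from the known σ = 0.2 is narrower than the t-interval from part a, even though 0.2 is slightly larger than the sample standard deviation.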