### a. Build a 99% confidence interval using the sample standard deviation

In [2]:
import numpy as np
from scipy import stats

## Next, let's input the data and calculate the sample mean and sample standard deviation

Since we are dealing with a small sample (n = 15) and the population standard deviation is unknown, we use the t-distribution to calculate the confidence interval. The t-distribution is similar to the normal distribution, but it has heavier tails to account for the extra uncertainty that comes from estimating the standard deviation from a small sample.
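
To make the "heavier tails" point concrete, here is a small sketch that compares two-sided 99% critical values of the t-distribution with the normal critical value. The degrees of freedom in `dfs` are arbitrary illustrative choices; only df = 14 corresponds to this problem.

```python
import numpy as np
from scipy import stats

confidence = 0.99
alpha = 1 - confidence          # significance level

# Two-sided critical value of the standard normal distribution
z_crit = stats.norm.ppf(1 - alpha / 2)

# Two-sided critical values of the t-distribution for several
# illustrative degrees of freedom (df = 14 matches n = 15)
dfs = [5, 14, 30, 1000]
t_crits = [stats.t.ppf(1 - alpha / 2, df) for df in dfs]

for df, t_crit in zip(dfs, t_crits):
    print(f"df = {df:>4}: t* = {t_crit:.4f}")
print(f"normal   : z* = {z_crit:.4f}")
```

The t critical value is always larger than the normal one and approaches it as the degrees of freedom grow, which is why a t-based interval is wider for small samples.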

The formula for the confidence interval using the t-distribution is
  CI = x̄ ± t* · (s/√n)

where x̄ is the sample mean, s is the sample standard deviation, n is the sample size, and t* is the critical value of the t-distribution with n-1 degrees of freedom for a confidence level of (1-α), that is, a significance level of α. For a two-sided interval, t* is the (1-α/2) quantile of that distribution.

After computing the sample statistics, we can calculate the critical value using the stats.t.ppf function:

In [5]:
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
n = len(data)
mean = np.mean(data)
std_dev = np.std(data, ddof = 1)  # ddof=1 to calculate the sample standard deviation

In [6]:
alpha = 0.01   # significance level of 1%
df = n - 1    # degrees of freedom
t_star = stats.t.ppf(1 - alpha/2, df)

# Finally we calculate the confidence interval

In [8]:
lower_ci = mean - t_star*(std_dev/np.sqrt(n))
upper_ci = mean + t_star*(std_dev/np.sqrt(n))
print(f"The 99% confidence interval is ({lower_ci}, {upper_ci})")

The 99% confidence interval is (1.0901973384384906, 1.3871359948948425)
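
As a cross-check on the manual calculation, the same interval can be obtained in one call. This is a sketch using `stats.t.interval` together with `stats.sem`, which computes s/√n with `ddof=1` by default. The confidence level is passed positionally because the keyword name for that argument has changed between SciPy versions.

```python
import numpy as np
from scipy import stats

data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
n = len(data)
mean = np.mean(data)

# Standard error of the mean: sample standard deviation / sqrt(n)
std_err = stats.sem(data)

# Two-sided 99% interval from the t-distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.99, n - 1, loc=mean, scale=std_err)
print(f"The 99% confidence interval is ({ci_low}, {ci_high})")
```

The result should agree with the interval printed above, up to floating-point rounding.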


### b. Build a 99% confidence interval using a known population standard deviation

If we know the population standard deviation, we can use the normal distribution to calculate the confidence interval. The formula for the confidence interval using the normal distribution is
  CI = x̄ ± z* · (σ/√n)

where x̄ is the sample mean, σ is the population standard deviation, n is the sample size, and z* is the critical value of the standard normal distribution for a confidence level of (1-α), that is, a significance level of α. For a two-sided interval, z* is the (1-α/2) quantile.

We can calculate the critical value using the stats.norm.ppf function:


In [11]:
z_star = stats.norm.ppf(1-alpha/2)
print(z_star)

2.5758293035489004


Assuming we know the population standard deviation is 0.2 million characters, we can calculate the confidence interval as follows:

In [13]:

pop_std_dev = 0.2
lower_ci = mean - z_star * (pop_std_dev / np.sqrt(n))
upper_ci = mean + z_star * (pop_std_dev / np.sqrt(n))
print(f"The 99% confidence interval is ({lower_ci}, {upper_ci})")


The 99% confidence interval is (1.1056514133957607, 1.3716819199375725)
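
The z-based interval can be cross-checked the same way with `stats.norm.interval`. The sketch below also computes the half-widths of both intervals so they can be compared directly. The value 0.2 is the population standard deviation assumed in the text, not something estimated from the data.

```python
import numpy as np
from scipy import stats

data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
n = len(data)
mean = np.mean(data)
pop_std_dev = 0.2   # assumed known population standard deviation

# Two-sided 99% interval from the normal distribution
z_low, z_high = stats.norm.interval(0.99, loc=mean,
                                    scale=pop_std_dev / np.sqrt(n))
print(f"The 99% confidence interval is ({z_low}, {z_high})")

# Half-widths (margins of error) of the z-based and t-based intervals
z_half = (z_high - z_low) / 2
t_half = stats.t.ppf(0.995, n - 1) * stats.sem(data)
print(f"z margin of error: {z_half:.4f}")
print(f"t margin of error: {t_half:.4f}")
```

Here the z-based interval is narrower than the t-based one: the assumed σ of 0.2 is slightly larger than the sample standard deviation, but the smaller critical value z* more than offsets that.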
