# Estimation And Confidence Intervals

In [1]:
# Importing Libraries
import numpy as np
from scipy import stats

In [2]:
# Loading data: print-head durability, in millions of characters printed before failure
data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85, 1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
data

[1.13,
 1.55,
 1.43,
 0.92,
 1.25,
 1.36,
 1.32,
 0.85,
 1.07,
 1.48,
 1.2,
 1.33,
 1.18,
 1.22,
 1.29]

## a). 99% Confidence Interval Using Sample Standard Deviation

To construct a 99% confidence interval for the mean number of characters printed before the print-head fails using the sample standard deviation, the following steps were taken:

### 1. Calculate the Sample Mean and Standard Deviation

- **Sample Mean ($\bar{x}$)**: The average durability of the print-heads, calculated as the sum of all observed values divided by the number of observations.

$\bar{x} = \frac{\sum x_i}{n}, \quad n = 15$

- **Sample Standard Deviation ($s$)**: A measure of the dispersion of the data points around the sample mean. Dividing by $n-1$ instead of $n$ (Bessel's correction) offsets the bias that comes from measuring deviations from the sample mean rather than the unknown population mean.

$s = \sqrt{\frac{\sum(x_i - \bar{x})^2}{n-1}}$

In [3]:
# Calculation of sample mean
sample_mean = np.mean(data)
print("Sample mean: ", round(sample_mean, 3))

Sample mean:  1.239


In [4]:
# Calculation of sample standard deviation
sample_std = np.std(data, ddof=1)  # ddof=1 applies Bessel's correction (divide by n-1)
print("Sample Standard Deviation: ", round(sample_std,3))

Sample Standard Deviation:  0.193


### 2. Determine the t-critical value

- **Why t-distribution?**:
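  - The t-distribution is used when the population standard deviation is unknown and has to be estimated from the sample. The difference from the z-distribution matters most for small samples (a common rule of thumb is $n<30$), and the method assumes the measurements are approximately normally distributed.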
  - The t-distribution accounts for the additional uncertainty introduced by estimating the population standard deviation from the sample.
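
To make the "additional uncertainty" concrete, the sketch below compares two-tailed 99% critical values from the t-distribution with the z-critical value. The degrees of freedom other than 14 are illustrative choices, not values from this exercise, and the dictionary name `t_by_dof` is introduced here only for the example.

```python
from scipy import stats

confidence_level = 0.99
alpha = 1 - confidence_level

# Two-tailed z-critical value; it does not depend on the sample size
z_critical = stats.norm.ppf(1 - alpha / 2)

# Two-tailed t-critical values; df=14 matches this notebook, the rest are illustrative
t_by_dof = {dof: stats.t.ppf(1 - alpha / 2, df=dof) for dof in (4, 14, 29, 99, 999)}

for dof, t_critical in t_by_dof.items():
    print(f"df={dof:>4}: t={t_critical:.3f}  z={z_critical:.3f}")
```

The t-critical value is always larger than the z-critical value and approaches it as the degrees of freedom grow, which is why the distinction matters mostly for small samples.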

In [6]:
# Degrees of freedom (n - 1)
dof = len(data) - 1
print('Degrees of freedom = ', dof)

Degrees of freedom =  14


In [16]:
# Confidence level
confidence_level = 0.99
alpha = 1 - confidence_level

# Calculation of t-Critical value
t_critical = stats.t.ppf(1 - alpha/2, df=dof)  # two-tailed, using dof computed above
print('t-Critical = ',round(t_critical,3))

t-Critical =  2.977


- **t-critical value**: For a 99% confidence interval with $n-1 = 14$ degrees of freedom, the two-tailed critical value is approximately 2.977.

### 3. Calculate the Margin of Error

- **Formula**: 

$\text{Margin of Error} = t_{\text{critical}} \times \frac{s}{\sqrt{n}}$

In [17]:
# Calculating Margin of Error
margin_error = t_critical*(sample_std/np.sqrt(len(data)))
print("Margin of error = ", round(margin_error,3))

Margin of error =  0.148


- **Explanation**:
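  - The margin of error is the half-width of the confidence interval: the largest distance we expect between the sample mean and the true population mean at the chosen confidence level. It grows with the t-critical value and the sample standard deviation, and shrinks as the sample size increases (through the $\sqrt{n}$ in the denominator).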
  - For this example, the margin of error is approximately 0.148 million characters.
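
As a rough illustration of how the sample size drives the margin of error, the sketch below wraps the formula in a small helper. The function name `t_margin_of_error` is hypothetical, and the sample sizes 30 and 60 are assumptions for illustration only; they hold the sample standard deviation fixed at the rounded value from this notebook, which a real larger sample would not guarantee.

```python
import numpy as np
from scipy import stats

def t_margin_of_error(sample_std, n, confidence_level=0.99):
    """Half-width of a two-sided t-interval for the mean (hypothetical helper)."""
    alpha = 1 - confidence_level
    t_critical = stats.t.ppf(1 - alpha / 2, df=n - 1)
    return t_critical * sample_std / np.sqrt(n)

s = 0.193  # rounded sample standard deviation from this notebook

# n=15 is the actual sample size; 30 and 60 are hypothetical
for n in (15, 30, 60):
    print(f"n={n:>2}: margin of error = {t_margin_of_error(s, n):.3f}")
```

Both factors work in the same direction: a larger sample lowers the t-critical value slightly and lowers $s/\sqrt{n}$ more substantially.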

### 4. Construct the Confidence Interval

- **Formula**:

$\text{Confidence Interval} = \left(\bar{x} - \text{Margin of Error}, \bar{x} + \text{Margin of Error}\right)$

In [18]:
# Construct confidence interval
confi_interval = (round(sample_mean-margin_error,3),round(sample_mean + margin_error,3))
print("99% Confidence interval = ",confi_interval)

99% Confidence interval =  (1.09, 1.387)


- **Explanation**:
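  - The 99% confidence interval is the range within which we are 99% confident that the true population mean lies: if the sampling were repeated many times, about 99% of the intervals built this way would contain the true mean.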
  - In this case, the confidence interval is approximately (1.09, 1.387) million characters.
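
As a cross-check on the step-by-step calculation, the same interval can be obtained in one call. This is a minimal sketch using `stats.sem` (which defaults to `ddof=1`) and `stats.t.interval`; the confidence level is passed positionally because its keyword name has changed between SciPy versions.

```python
import numpy as np
from scipy import stats

data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]

n = len(data)
sample_mean = np.mean(data)
standard_error = stats.sem(data)  # s / sqrt(n), with Bessel's correction

# Arguments: confidence level, degrees of freedom, centre, scale
lower, upper = stats.t.interval(0.99, n - 1, loc=sample_mean, scale=standard_error)
print(f"99% t-interval: ({lower:.3f}, {upper:.3f})")
```

The result should agree with the manually constructed interval up to rounding.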

## Rationale for Using the t-Distribution

- **Unknown Population Standard Deviation**: When the population standard deviation is unknown, we rely on the sample standard deviation as an estimate. This introduces extra variability, which the t-distribution accounts for by having heavier tails compared to the z-distribution.

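- **Small Sample Size**: With few observations, the sample standard deviation is a noisy estimate of the population standard deviation. The t-distribution compensates with a larger critical value (2.977 here, versus 2.576 for the z-distribution), which widens the interval and gives a more conservative estimate of the population mean. As the sample size grows, the t-distribution converges to the z-distribution.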

Using the t-distribution in such cases lets the confidence interval reflect the added uncertainty from the limited sample size and the estimated standard deviation, provided the underlying data are approximately normal.
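
The rationale above can be checked with a small simulation. The sketch below is not part of the original analysis: it draws repeated samples of size 15 from a hypothetical normal population (the mean 1.24 and standard deviation 0.2 are assumptions chosen only for illustration) and counts how often intervals built with the t-critical and z-critical values, both using the estimated $s$, contain the true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)    # fixed seed so the sketch is reproducible
n, trials = 15, 20_000
true_mean, true_std = 1.24, 0.2   # hypothetical population parameters

alpha = 0.01
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
z_crit = stats.norm.ppf(1 - alpha / 2)

# Each row is one simulated sample of n observations
samples = rng.normal(true_mean, true_std, size=(trials, n))
means = samples.mean(axis=1)
std_errors = samples.std(axis=1, ddof=1) / np.sqrt(n)

# Fraction of intervals (mean +/- critical value * SE) that contain the true mean
abs_error = np.abs(means - true_mean)
coverage_t = np.mean(abs_error <= t_crit * std_errors)
coverage_z = np.mean(abs_error <= z_crit * std_errors)

print(f"t-based coverage: {coverage_t:.3f}")
print(f"z-based coverage with estimated s: {coverage_z:.3f}")
```

Under these assumptions the t-based intervals should cover the true mean close to 99% of the time, while pairing the z-critical value with an estimated standard deviation falls short of the nominal level.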

## b). 99% Confidence Interval Using Known Population Standard Deviation

### 1. Determine the z-critical value

- **Why z-distribution?**:
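  - The z-distribution is used when the population standard deviation is known, so no extra uncertainty from estimating it needs to be accounted for.
  - It assumes the sample mean is normally distributed, which holds when the population itself is normal or, approximately, when the sample is large (central limit theorem).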

In [15]:
# Given Population Standard Deviation
pop_std = 0.2
print('Population Standard Deviation =',pop_std)

Population Standard Deviation = 0.2


In [21]:
# Confidence level
confidence_level = 0.99
alpha = 1 - confidence_level

# Calculate critical z-value
critical_z = stats.norm.ppf(1-alpha/2)
print("Critical z-value:", round(critical_z,3))

Critical z-value: 2.576


- **z-critical value**: For a 99% confidence interval, the critical value from the z-distribution is approximately 2.576.

### 2. Calculate the Margin of Error

- **Formula**:

$\text{Margin of Error} = z_{\text{critical}} \times \frac{\sigma}{\sqrt{n}}$

In [23]:
# Calculating Margin of Error
margin_err = critical_z *(pop_std/np.sqrt(len(data)))
print("Margin of error =", round(margin_err,3))

Margin of error = 0.133


- **Explanation**:
  - With the population standard deviation ($\sigma$) known, the margin of error is calculated similarly but uses the z-critical value instead of the t-critical value.
  - The margin of error in this case is approximately 0.133 million characters.

### 3. Construct the Confidence Interval

- **Formula**:

$\text{Confidence Interval} = \left(\bar{x} - \text{Margin of Error}, \bar{x} + \text{Margin of Error}\right)$

In [25]:
# Construct confidence interval
confidence_interval = (round(sample_mean - margin_err,3), round(sample_mean + margin_err,3))
print("99% Confidence interval =", confidence_interval)

99% Confidence interval = (1.106, 1.372)


- **Explanation**:
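  - The confidence interval is constructed the same way as in part (a). It is slightly narrower here (width 0.266 versus 0.297) because the z-critical value (2.576) is smaller than the t-critical value (2.977), even though the given $\sigma = 0.2$ is slightly larger than the sample standard deviation $s = 0.193$.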
  - The confidence interval is approximately (1.106, 1.372) million characters.
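
To compare the two parts side by side, the sketch below recomputes both intervals with SciPy's built-in `interval` methods and reports their widths. It is a cross-check under the same inputs as the notebook (the data list and the given population standard deviation of 0.2), not a replacement for the steps above.

```python
import numpy as np
from scipy import stats

data = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
        1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]

n = len(data)
sample_mean = np.mean(data)
pop_std = 0.2  # population standard deviation given in part (b)

# Part (b): z-interval with known sigma
z_low, z_high = stats.norm.interval(0.99, loc=sample_mean, scale=pop_std / np.sqrt(n))

# Part (a): t-interval with the sample standard deviation
t_low, t_high = stats.t.interval(0.99, n - 1, loc=sample_mean, scale=stats.sem(data))

print(f"z-interval: ({z_low:.3f}, {z_high:.3f}), width {z_high - z_low:.3f}")
print(f"t-interval: ({t_low:.3f}, {t_high:.3f}), width {t_high - t_low:.3f}")
```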

####  **Author Information:**
- **Author:** Er. Pradeep Kumar
- **LinkedIn:** [https://www.linkedin.com/in/pradeep-kumar-1722b6123/](https://www.linkedin.com/in/pradeep-kumar-1722b6123/)

#### **Disclaimer:**
This Jupyter Notebook and its contents are shared for educational purposes. The author, Pradeep Kumar, retains ownership and rights to the original content. Any modifications or adaptations should be made with proper attribution and permission from the author.