You are a quality control analyst in a factory that produces light bulbs. The factory claims that the average lifetime of their light bulbs is 1,000 hours. However, you suspect that the true average lifetime of the bulbs **might be more** than this, so you decide to test it.

**Given:**

*   Population Mean ($\mu_0$): 1,000 hours
*   Sample Size ($n$): 36 bulbs
*   Sample Mean ($\bar{x}$): 990 hours
*   Significance Level ($\alpha$): 0.05



**Hypotheses:**

*   Null Hypothesis ($H_0$): $\mu \leq 1{,}000$ hours (the true average lifetime is at most 1,000 hours)

*   Alternative Hypothesis ($H_a$): $\mu > 1{,}000$ hours (the true average lifetime is greater than 1,000 hours)

**Goal:**
Determine whether to reject the null hypothesis based on the provided information.

We are given neither the population standard deviation $\sigma$ nor the sample standard deviation $S$, so we cannot compute a z score.

This means that we need to use the t test, which estimates the spread from the sample itself.
So first we will calculate the sample standard deviation $S$. For this we need the data for each of the 36 bulbs:


In [None]:
import pandas as pd
df = pd.read_csv('/bulb_lifetime_data.csv')
df

Unnamed: 0,Bulb #,Lifetime (hours)
0,1,988
1,2,995
2,3,1002
3,4,993
4,5,989
5,6,1001
6,7,987
7,8,996
8,9,1003
9,10,1004


In [None]:
# getting all the lifetime values into a variable:

lifetime_values = df['Lifetime (hours)'].values
ltv = lifetime_values

# calculating the mean of all ltv values:
mean_ltv = ltv.mean()
print(f'The sample mean (x̄) of our 36 bulbs is:\n\t{mean_ltv}')

The sample mean (x̄) of our 36 bulbs is:
	995.25


In [None]:
import math

# sum of squared deviations from the sample mean
deviation_sum = sum((x - mean_ltv) ** 2 for x in ltv)

# calculating population standard deviation of the dataset (divide by n)
psd = math.sqrt(deviation_sum / len(ltv))

# calculating sample standard deviation of the dataset
# (divide by n - 1; the parentheses matter, since sum / n - 1 is a different quantity)
ssd = math.sqrt(deviation_sum / (len(ltv) - 1))

print(f"\nSample standard deviation of the dataset is\n\t{round(ssd, 3)}")


Sample standard deviation of the dataset is
	6.098


So the sample standard deviation is $S$ = 6.098
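As a cross-check on the manual formula, the standard library's `statistics` module computes both kinds of standard deviation directly. The sketch below uses only the first ten lifetimes shown in the table above as an illustrative subset, so its numbers will not match the 36-bulb result; the variable names are hypothetical.

```python
import math
import statistics

# First ten lifetimes from the table above (illustrative subset, not all 36 bulbs)
sample = [988, 995, 1002, 993, 989, 1001, 987, 996, 1003, 1004]

n = len(sample)
mean = sum(sample) / n
squared_dev_sum = sum((x - mean) ** 2 for x in sample)

# Manual formulas: divide by n for the population, by n - 1 for the sample
pop_sd_manual = math.sqrt(squared_dev_sum / n)
sample_sd_manual = math.sqrt(squared_dev_sum / (n - 1))

# Library equivalents
pop_sd_lib = statistics.pstdev(sample)
sample_sd_lib = statistics.stdev(sample)

print(f"population SD: manual={pop_sd_manual:.3f}, library={pop_sd_lib:.3f}")
print(f"sample SD:     manual={sample_sd_manual:.3f}, library={sample_sd_lib:.3f}")
```

The sample standard deviation is always slightly larger than the population one, because it divides the same sum by $n - 1$ instead of $n$.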


Now that we have $S$ we can calculate:
*   the standard error of the mean: $S_{\bar{x}} = \frac{S}{\sqrt{n}}$
*   the t score: $t = \frac{\bar{x} - \mu_0}{S_{\bar{x}}}$



In [None]:
# calculating S sub x bar:
ssubxbar = ssd / math.sqrt(len(ltv))
print(f'\nS sub x bar =\n\t{round(ssubxbar, 3)}')



S sub x bar =
	1.016


So $S_{\bar{x}}$ = 1.016

In [None]:
# calculating the t score:

tscore = (mean_ltv - 1000) / ssubxbar
print(f'\nThe t score =\n\t{round(tscore, 3)}')



The t score =
	-4.674


The t value is -4.674
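As a hand check, plugging the values reported by the cells above ($S = 6.098$, $n = 36$, $\bar{x} = 995.25$, $\mu_0 = 1000$) into the two formulas gives approximately the same numbers:

$$S_{\bar{x}} = \frac{S}{\sqrt{n}} = \frac{6.098}{\sqrt{36}} \approx 1.016, \qquad t = \frac{\bar{x} - \mu_0}{S_{\bar{x}}} = \frac{995.25 - 1000}{1.016} \approx -4.67$$

The sign is negative because the sample mean lies below the hypothesized mean.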

In [None]:
# getting the critical value from the t distribution (instead of a printed t table):

from scipy.stats import t

alpha = 0.05  # Significance level
n = len(ltv)  # Sample size
d_f = n - 1  # Degrees of freedom

# For a right-tailed test:
critical_value = t.ppf(1 - alpha, d_f)

print(f'The critical_value is:\n\t{round(critical_value, 3)}')

The critical_value is:
	1.69
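The critical value depends on how many tails the test has. A right-tailed test puts all of $\alpha$ in the upper tail, while a two-tailed test splits it in half. The sketch below compares the two, assuming 35 degrees of freedom (36 bulbs); the variable names are illustrative.

```python
from scipy.stats import t

alpha = 0.05  # significance level
d_f = 35      # degrees of freedom, assuming n = 36 bulbs

# Right-tailed test: all of alpha sits in the upper tail
right_tailed = t.ppf(1 - alpha, d_f)

# Two-tailed test: alpha is split between both tails
two_tailed = t.ppf(1 - alpha / 2, d_f)

print(f"right-tailed critical value: {right_tailed:.3f}")
print(f"two-tailed critical value:   {two_tailed:.3f}")
```

The two-tailed value is larger because each tail only receives $\alpha / 2$.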


In [None]:
# concluding based on the critical value and the t value:

if tscore > critical_value:
    print("\nReject the null hypothesis")
else:
    print("\nFail to reject the null hypothesis")


Fail to reject the null hypothesis
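An equivalent way to reach the same decision is to compare a p-value with $\alpha$ instead of comparing the t score with the critical value. The sketch below assumes the t score reported above (-4.674) and 35 degrees of freedom; `t.sf` is SciPy's survival function, which gives the upper-tail probability needed for a right-tailed test.

```python
from scipy.stats import t

alpha = 0.05      # significance level
d_f = 35          # degrees of freedom, assuming n = 36 bulbs
t_score = -4.674  # t score reported above

# Right-tailed p-value: probability of a t score at least this large under H0
p_value = t.sf(t_score, d_f)

reject = bool(p_value < alpha)
print(f"p-value = {p_value:.5f}")
print("Reject the null hypothesis" if reject else "Fail to reject the null hypothesis")
```

Because the t score is far into the lower tail, the right-tailed p-value is close to 1, which is well above $\alpha$.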


Given the t value of -4.674 and a critical value of 1.69, the t value does not exceed the critical value, so we fail to reject the null hypothesis.

Based on the data of 36 bulbs, we do not have enough evidence to conclude that the average lifetime of a bulb is greater than 1000 hours.

Failing to reject the null hypothesis does not prove that it is true. Still, the sample mean lies below 1000 hours and the t value is strongly negative, so the data are fully consistent with an average lifetime that is **equal to or lower than 1000 hours**.