### Independent two-sample t-test (difference of means, unpaired samples)

#### Set up the null and alternative hypotheses
- H0 (null hypothesis): the mean lifetimes of Maker A and Maker B bulbs do not differ
- H1 (alternative hypothesis): Maker A bulbs have a longer mean lifetime than Maker B bulbs
- One-tailed test
- The population variances are unknown and are not assumed to be equal (Welch's t-test)

In [10]:
import numpy as np
import pandas as pd
import scipy.stats as ss

#### Read the CSV file

In [11]:
df = pd.read_csv("data/bulbs.csv")
print(df.shape)
display(df.head())

(30, 2)


Unnamed: 0,A,B
0,1111,1124
1,1163,1009
2,1136,1045
3,1161,1109
4,1175,1073


#### Sample means, standard deviations, and sizes

In [12]:
from scipy import stats

# Sample means
M1 = np.mean(df['A'])
M2 = np.mean(df['B'])

# Sample standard deviations (ddof=1 uses the n-1 denominator)
sd1 = np.std(df['A'], ddof=1)
sd2 = np.std(df['B'], ddof=1)

# Sample sizes
N1 = len(df['A'])
N2 = len(df['B'])

#### Calculate the test statistic t and the degrees of freedom

In [13]:
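# Welch's t statistic: difference of means over its estimated standard error
stat_t = (M1 - M2) / np.sqrt(sd1**2 / N1 + sd2**2 / N2)
print(stat_t)

# Welch-Satterthwaite approximation of the degrees of freedom
m11 = sd1**2 / N1 + sd2**2 / N2
m1 = m11**2
m2 = sd1**4 / (N1**2 * (N1 - 1)) + sd2**4 / (N2**2 * (N2 - 1))
df_val = m1 / m2
print(df_val)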

3.0110303151993656
56.970260691248434
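
Written out, the two quantities computed in the cell above are Welch's t statistic and the Welch-Satterthwaite degrees of freedom. The symbols follow the variable names in the code (`M1`, `M2` for the sample means, `sd1`, `sd2` for the sample standard deviations written here as $s_1$, $s_2$, and `N1`, `N2` for the sample sizes); $\nu$ is notation introduced here for `df_val`.

$$
t = \frac{M_1 - M_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}},
\qquad
\nu = \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}
{\dfrac{s_1^4}{N_1^2\,(N_1 - 1)} + \dfrac{s_2^4}{N_2^2\,(N_2 - 1)}}
$$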


#### Critical region

In [14]:
# Critical region: reject H0 when t exceeds the upper 5% point
t_upper = ss.t.ppf(0.95, df=df_val)  # upper 5% point of the t distribution
print(t_upper)

1.6720433079284318
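
As a minimal sketch of what the critical value means, the snippet below recomputes it from the degrees of freedom printed earlier, once with `ppf` and once with `isf` (the inverse survival function), which expresses the same upper-tail point directly. The names `alpha`, `dof`, `t_crit`, and `t_crit_isf` are illustrative and not part of the notebook.

```python
from scipy import stats

alpha = 0.05              # significance level used in this section
dof = 56.970260691248434  # Welch degrees of freedom printed earlier

# Upper critical value: the point with probability alpha to its right
t_crit = stats.t.ppf(1 - alpha, df=dof)

# isf(alpha) is the same quantity, stated in terms of the upper tail
t_crit_isf = stats.t.isf(alpha, df=dof)

print(t_crit, t_crit_isf)
```

Both calls should agree up to floating-point rounding; for a one-tailed test in the other direction the critical value would be `stats.t.ppf(alpha, df=dof)` instead.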


#### Check whether the test statistic t is in the critical region

In [15]:
if t_upper < stat_t:
    print("Reject H0")
else:
    print("Fail to reject H0")

Reject H0


#### Calculate the p-value

In [16]:
# One-sided p-value P(T >= stat_t); by symmetry this equals cdf(-stat_t)
p_val = stats.t.cdf(-stat_t, df=df_val)
print(p_val)

if stat_t > 0 and p_val < 0.05:
    print("Reject H0")
else:
    print("Fail to reject H0")

0.0019383134855873061
Reject H0
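
For a one-sided "greater" alternative, the p-value is the upper-tail probability P(T >= t). The sketch below recomputes it from the statistic and degrees of freedom printed earlier, using the survival function `sf` and, equivalently, the CDF at `-t`. The variable names are illustrative, and the two-sided value is shown only for comparison.

```python
from scipy import stats

t_obs = 3.0110303151993656  # test statistic printed earlier
dof = 56.970260691248434    # Welch degrees of freedom printed earlier

p_sf = stats.t.sf(t_obs, df=dof)     # upper tail, P(T >= t_obs)
p_cdf = stats.t.cdf(-t_obs, df=dof)  # same value by symmetry of the t distribution
p_two_sided = 2 * stats.t.sf(abs(t_obs), df=dof)  # for comparison only

print(p_sf, p_cdf, p_two_sided)
```

Using `sf(t_obs)` rather than a form based on `abs(t_obs)` keeps the one-sided p-value large when the statistic is negative, so no separate sign check is needed.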


#### Using the ttest_ind function

In [17]:
# equal_var=False selects Welch's t-test; alternative='greater' tests mean(A) > mean(B)
stat_t, p_val = ss.ttest_ind(df["A"], df["B"], equal_var=False, alternative='greater')
print(stat_t, p_val)

if stat_t > 0 and p_val < 0.05:
    print("Reject H0")
else:
    print("Fail to reject H0")

3.0110303151993656 0.0019383134855873061
Reject H0
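
The manual steps and the `ttest_ind` call can be checked against each other. The following is a minimal sketch, assuming synthetic normally distributed data rather than the values in `data/bulbs.csv`; the helper `welch_t_greater` and the generated arrays `a` and `b` are hypothetical and introduced only for illustration.

```python
import numpy as np
from scipy import stats


def welch_t_greater(a, b):
    """One-sided Welch t-test of H1: mean(a) > mean(b). Returns (t, dof, p)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size  # s1^2 / N1
    vb = b.var(ddof=1) / b.size  # s2^2 / N2
    t_stat = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    p = stats.t.sf(t_stat, df=dof)  # upper-tail probability
    return t_stat, dof, p


# Hypothetical data, not the bulbs.csv values
rng = np.random.default_rng(0)
a = rng.normal(1150, 40, size=30)
b = rng.normal(1100, 60, size=30)

t_manual, dof_manual, p_manual = welch_t_greater(a, b)
res = stats.ttest_ind(a, b, equal_var=False, alternative="greater")

print(t_manual, p_manual)
print(res.statistic, res.pvalue)
```

The two printed lines should match up to floating-point rounding. Note that the `alternative` argument of `ttest_ind` requires SciPy 1.6 or later.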


#### Conclusion
The test statistic t is positive and the one-sided p-value (about 0.0019) is below the significance level of 0.05, so the null hypothesis is rejected.  
That is, Maker A bulbs have a significantly longer mean lifetime than Maker B bulbs.