### Paired two-sample t-test for the difference in means

#### Set up the null and alternative hypotheses
- H0 (null hypothesis): There is no difference in sales before and after the campaign
- H1 (alternative hypothesis): Sales increased after the campaign
- One-sided test
- The population variance is unknown
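Stated in symbols, the hypotheses above can be sketched as follows. The symbol $\mu_d$ is introduced here for illustration and denotes the population mean of the paired differences (after minus before):

$$
H_0: \mu_d = 0 \qquad \text{versus} \qquad H_1: \mu_d > 0
$$

Because H1 only considers an increase, the critical region lies entirely in the upper tail of the distribution of the test statistic.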

In [1]:
import numpy as np
import pandas as pd
import scipy.stats as ss

#### Read the data
Read the CSV file

In [2]:
df = pd.read_csv("data/sales_camp.csv")
print(df.shape)
display(df.head())

(18, 2)


| | before | after |
|---|---|---|
| 0 | 1685 | 1717 |
| 1 | 1963 | 1866 |
| 2 | 2283 | 2449 |
| 3 | 2456 | 2469 |
| 4 | 2331 | 2223 |


#### Sample mean and size
Compute the mean and unbiased standard deviation of the paired differences, and the sample size

In [3]:
diff = df["after"]-df["before"]
sample_mean = np.mean(diff)  # mean of the paired differences
unbiased_std = np.std(diff, ddof=1)  # unbiased (ddof=1) standard deviation of the differences
n = df.shape[0]  # sample size (number of pairs)

#### Calculate the test statistic t
Under H0 the mean difference is 0, so t is the sample mean difference divided by its standard error

In [4]:
t = (sample_mean - 0)/(unbiased_std / np.sqrt(n))
print(t)

1.0894893777993577
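For reference, the calculation in the cell above corresponds to the following formula, where $d_i$ is the difference (after minus before) for pair $i$. The symbols $\bar{d}$ and $s_d$ are notation introduced here for `sample_mean` and `unbiased_std`:

$$
t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2}
$$

Assuming the differences are approximately normally distributed, t follows a t distribution with $n-1$ degrees of freedom under H0.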


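#### Critical region
For the one-sided test at the 5% significance level, the critical region is t greater than the upper 5% point of the t distribution with n-1 degrees of freedom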

In [5]:
# Critical region: t > t_upper
t_upper = ss.t.ppf(0.95, df=n-1)  # upper 5% point of the t distribution with n-1 degrees of freedom
print(t_upper)

1.7396067260750672
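As a sanity check on the critical value, the following minimal sketch confirms that the area to the right of the upper 5% point is 0.05. It assumes n = 18 (taken from the shape printed earlier); the names `alpha` and `tail_area` are introduced here for illustration.

```python
import scipy.stats as ss

n = 18        # sample size, as reported by df.shape in the notebook
alpha = 0.05  # significance level used in this section

# Upper 5% point: the value with 95% of the distribution below it
t_upper = ss.t.ppf(1 - alpha, df=n - 1)

# Survival function P(T > t_upper); should give back alpha
tail_area = ss.t.sf(t_upper, df=n - 1)

print(t_upper, tail_area)
```

Since `ppf` is the inverse of `cdf`, and `sf` equals 1 minus `cdf`, the two calls undo each other up to floating-point error.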


#### Check whether the test statistic t is in the critical region

In [6]:
if t_upper < t:
    print("Reject H0")
else:
    print("Retain H0")

Retain H0


#### Calculate the p-value

In [9]:
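# Area in one tail beyond |t|; this equals the upper-tail p-value only when t > 0,
# which is why the decision below also checks the sign of t
p_val = ss.t.cdf(-np.abs(t), df=n-1)
print(p_val)

if t > 0 and p_val < 0.05:
    print("Reject H0")
else:
    print("Retain H0")

0.14557086186684673
Retain H0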
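An alternative way to get the one-sided p-value is to take the upper-tail probability directly with `ss.t.sf`, which gives the correct value for either sign of t, so no separate sign check is needed. The sketch below is an illustration only: the helper `upper_tail_p` and the demo values are hypothetical and are not the notebook's data.

```python
import numpy as np
import scipy.stats as ss

def upper_tail_p(diff):
    """Return (t, p) for H1: mean difference > 0 (hypothetical helper)."""
    diff = np.asarray(diff, dtype=float)
    n = diff.size
    t = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
    # sf(t) = P(T > t): the upper-tail probability, valid for any sign of t
    return t, ss.t.sf(t, df=n - 1)

# Made-up paired differences for illustration (not the campaign data)
demo_up = [3.0, 5.0, 2.0, 6.0, 4.0]
demo_down = [-3.0, -5.0, -2.0, -6.0, -4.0]

t_up, p_up = upper_tail_p(demo_up)
t_down, p_down = upper_tail_p(demo_down)
print(t_up, p_up)
print(t_down, p_down)
```

For the negative differences the p-value comes out close to 1, so comparing `p` with 0.05 alone already gives the right decision.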


#### Using the ttest_1samp function

In [11]:
stat_t, p_val = ss.ttest_1samp(df["after"]-df["before"], 0.0, alternative='greater')
print(stat_t, p_val)

# With alternative='greater', p_val is already the one-sided p-value,
# so the sign check is redundant but harmless
if stat_t > 0 and p_val < 0.05:
    print("Reject H0")
else:
    print("Retain H0")

1.089489377799358 0.1455708618668467
Retain H0
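SciPy also provides `ttest_rel`, which takes the two paired samples directly instead of their differences. The sketch below suggests it gives the same result as `ttest_1samp` on the differences. It uses only the five rows shown by `df.head()` earlier, so the printed values will differ from the full 18-row result, and it assumes SciPy 1.6 or later for the `alternative` argument.

```python
import numpy as np
import scipy.stats as ss

# First five rows shown by df.head(); NOT the full 18-row data set
before = np.array([1685, 1963, 2283, 2456, 2331], dtype=float)
after = np.array([1717, 1866, 2449, 2469, 2223], dtype=float)

# Paired test on the two samples, H1: mean(after - before) > 0
res_rel = ss.ttest_rel(after, before, alternative='greater')

# One-sample test on the differences against 0, same alternative
res_1s = ss.ttest_1samp(after - before, 0.0, alternative='greater')

print(res_rel.statistic, res_rel.pvalue)
print(res_1s.statistic, res_1s.pvalue)
```

Note the argument order: with `alternative='greater'`, passing `after` first tests whether sales after the campaign are larger.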


#### Conclusion
The one-sided p-value (about 0.146) is greater than the significance level of 0.05, so the null hypothesis is not rejected.  
In other words, it cannot be said that sales increased after the campaign.