### Statistical test exercises

In [None]:
import numpy as np
import pandas as pd
import scipy.stats as ss

#### 1. Can the population mean of weight be regarded as 100 g? (one-sample t-test)

H0: The population mean of weight is 100 g.

In [None]:
csv_in = 'stat_test_sample1.csv'
df1 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df1.shape)
display(df1.head())

In [None]:
print(df1['Weight'].mean())
t, p = ss.ttest_1samp(df1['Weight'], 100.0)
print(t, p)

##### (Conclusion) p < 0.05, so the population mean of the weight CANNOT be regarded as 100 g.
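
As a minimal sketch of what `ss.ttest_1samp` computes, the t statistic and the two-sided p-value can be reproduced by hand. The `weights` array below is made-up illustrative data, not the contents of `stat_test_sample1.csv`.

```python
import numpy as np
import scipy.stats as ss

# Hypothetical sample (made-up values, NOT from stat_test_sample1.csv)
weights = np.array([101.2, 99.8, 102.5, 100.9, 103.1, 98.7, 101.8, 102.2])
mu0 = 100.0  # hypothesized population mean under H0

n = len(weights)
# t = (sample mean - mu0) / (unbiased sample std / sqrt(n))
t_manual = (weights.mean() - mu0) / (weights.std(ddof=1) / np.sqrt(n))
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * ss.t.sf(abs(t_manual), df=n - 1)

t_scipy, p_scipy = ss.ttest_1samp(weights, mu0)
print(t_manual, p_manual)
print(t_scipy, p_scipy)
```

Both printed lines should agree, which shows that the test only depends on the sample mean, the sample standard deviation, and the sample size.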

#### 2. Does the population mean of blood pressure decrease after taking a medicine? (paired t-test)

H0: The population mean of blood pressure does NOT decrease after taking the medicine.

In [None]:
csv_in = 'stat_test_sample2.csv'
df2 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df2.shape)
display(df2.head())

In [None]:
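# Compare the sample means before and after taking the medicine
print(df2['BP_before'].mean(), df2['BP_after'].mean())
# One-sample t-test on the paired differences (equivalent to a paired t-test)
t, p = ss.ttest_1samp(df2['BP_after']-df2['BP_before'], 0.0)
# ttest_1samp returns a two-sided p-value; halve it for the one-sided test (valid when t < 0)
print(t, p/2)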

##### (Conclusion) t < 0 and p/2 < 0.05, so the population mean of the blood pressure significantly decreased after taking this medicine.
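
The halving of the p-value relies on the sign of t. A small sketch of that rule follows, together with a check that a one-sample test on the differences matches `ss.ttest_rel`. The helper name `one_sided_p_less` and the blood pressure values are assumptions made for illustration only.

```python
import numpy as np
import scipy.stats as ss

def one_sided_p_less(t, p_two_sided):
    """One-sided p-value for the alternative 'mean < reference' (hypothetical helper)."""
    # Halving is only valid when t points in the direction of the alternative
    return p_two_sided / 2 if t < 0 else 1 - p_two_sided / 2

# Made-up paired measurements (NOT from stat_test_sample2.csv)
bp_before = np.array([142.0, 150.0, 138.0, 160.0, 155.0, 147.0, 151.0, 144.0])
bp_after = np.array([135.0, 148.0, 130.0, 152.0, 156.0, 140.0, 143.0, 139.0])

diff = bp_after - bp_before
t1, p1 = ss.ttest_1samp(diff, 0.0)          # one-sample t-test on the differences
t2, p2 = ss.ttest_rel(bp_after, bp_before)  # paired t-test on the two columns

print(t1, one_sided_p_less(t1, p1))
print(t2, one_sided_p_less(t2, p2))
```

If t were positive, `p/2` would understate the one-sided p-value, which is why the conclusion checks both `t < 0` and `p/2 < 0.05`.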

#### 3. Are the population means of exam results different between males and females? (unpaired t-test, two-sided)

H0: The population means of exam results are NOT different.

In [None]:
csv_in = 'stat_test_sample3.csv'
df3 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df3.shape)
display(df3.head())

In [None]:
df3_M = df3[ df3['M_or_F'] == 'M' ]
df3_F = df3[ df3['M_or_F'] == 'F' ]
print(df3_M['Points'].mean(), df3_F['Points'].mean())
t, p = ss.ttest_ind(df3_M['Points'], df3_F['Points'], equal_var=False)
print(t, p)

##### (Conclusion) p > 0.05, so there is no evidence that the population mean of males' points is significantly different from that of females' points.

#### 3-2. Is the population mean of females higher than that of males? (unpaired t-test, one-sided)

H0: The population mean of females is NOT higher than that of males.

In [None]:
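print(df3_M['Points'].mean(), df3_F['Points'].mean())
# ttest_ind returns a two-sided p-value
t, p = ss.ttest_ind(df3_M['Points'], df3_F['Points'], equal_var=False)
print(t, p)
# One-sided p-value for "female > male": p/2, valid only when t < 0 (male mean < female mean)
print(p/2)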

##### (Conclusion) p > 0.05, so there is no evidence that the population mean of females' points is significantly higher than that of males' points.

#### 4. Are the population MEDIANs of exam results different between males and females? (Mann-Whitney U test, also known as the Wilcoxon rank-sum test)

H0: The population medians of exam results are NOT different.

In [None]:
csv_in = 'stat_test_sample3.csv'
df4 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df4.shape)
display(df4.head())

In [None]:
df4_M = df4[ df4['M_or_F'] == 'M' ]
df4_F = df4[ df4['M_or_F'] == 'F' ]
print(df4_M['Points'].median(), df4_F['Points'].median())
u, p = ss.mannwhitneyu(df4_M['Points'], df4_F['Points'], alternative='two-sided')
print(u, p)

##### (Conclusion) p < 0.05, so the population MEDIANs of male and female points are significantly different.
Note that this conclusion is the opposite of the one in 3., even though exactly the same data are used.  
"0.05" is NOT A GOLDEN THRESHOLD; it has no special justification.  
To understand the data, we should judge from the results of several analyses taken together.  

#### 5. Does the population MEDIAN of blood pressure decrease after taking a medicine? (Wilcoxon signed-rank test)

H0: The population median of blood pressure does NOT decrease after taking the medicine.

In [None]:
csv_in = 'stat_test_sample2.csv'
df5 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df5.shape)
display(df5.head())

In [None]:
d = df5['BP_after'] - df5['BP_before']
ranks = ss.rankdata(np.abs(d))
rank_plus = np.sum((d>0)*ranks)
rank_minus = np.sum((d<0)*ranks)
print('rank sum:', rank_plus-rank_minus)
T, p = ss.wilcoxon(d)
print(T, p/2)

##### (Conclusion) The signed rank sum of BP_after - BP_before is negative (that is, blood pressure moved downward after taking the medicine), and p/2 < 0.05 (one-sided), so the population median of the blood pressure significantly decreased after taking this medicine.
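
To connect the hand-computed rank sums with the value returned by `ss.wilcoxon`, here is a minimal sketch on made-up differences (no zeros and no tied absolute values, chosen to keep the ranking unambiguous). With the default two-sided alternative, SciPy documents the statistic as the smaller of the two rank sums.

```python
import numpy as np
import scipy.stats as ss

# Made-up differences (after - before); NOT from stat_test_sample2.csv
d = np.array([-12.0, -8.0, 3.0, -15.0, -5.0, 2.0, -9.0, -11.0, 4.0, -7.0])

ranks = ss.rankdata(np.abs(d))     # rank the absolute differences
rank_plus = np.sum(ranks[d > 0])   # sum of ranks of the positive differences
rank_minus = np.sum(ranks[d < 0])  # sum of ranks of the negative differences

T, p = ss.wilcoxon(d)              # two-sided by default
print(rank_plus, rank_minus)
print(T, p)
```

The sign of `rank_plus - rank_minus` gives the direction of the change, while `T` and `p` measure how strong the evidence is.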

#### 6. Are rows and columns significantly associated in a small 2x2 cross table? (Fisher's exact test)
Data: Results of two survey questions: "Do you live in Tokyo?" and "Do you like baseball?"  

H0: "Live in Tokyo or not" and "like baseball or not" are NOT associated.

- Hint: Program for count by values: see "Cross Tabulation (クロス集計)" on MOOCs AI-0102.

In [None]:
csv_in = 'stat_test_sample6.csv'
df6 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df6.shape)
display(df6.head())

In [None]:
cross_tab = pd.crosstab(df6['Live_in_Tokyo'], df6['Like_baseball'])
display(cross_tab)

In [None]:
odds, p = ss.fisher_exact(cross_tab)
print(odds, p)

##### (Conclusion) p < 0.05, so "Live in Tokyo" and "Like baseball" are significantly associated.
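
The first value returned by `ss.fisher_exact` is the sample odds ratio of the 2x2 table. A minimal sketch on a made-up table (not the counts in `stat_test_sample6.csv`) shows how it relates to the four cells:

```python
import numpy as np
import scipy.stats as ss

# Made-up 2x2 table (NOT from stat_test_sample6.csv)
# columns: like baseball = No, Yes
table = np.array([[8, 2],   # live in Tokyo = No
                  [1, 5]])  # live in Tokyo = Yes

# Sample odds ratio: (a * d) / (b * c)
odds_manual = (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])

odds, p = ss.fisher_exact(table)  # two-sided by default
print(odds_manual)
print(odds, p)
```

An odds ratio far from 1 indicates a strong association in the sample; the p-value says whether that association is plausible under independence.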

#### 7. Are rows and columns significantly associated in a general cross table? (chi-squared test of independence)
Data: Results of two survey questions: "Where are you from (Kyushu/Shikoku/Honshu/Hokkaido)?" and "Do you like baseball?"  

H0: Area and "like baseball" are NOT associated.

In [None]:
csv_in = 'stat_test_sample7.csv'
df7 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df7.shape)
display(df7.head())

In [None]:
cross_tab = pd.crosstab(df7['Area'], df7['Like_baseball'])
display(cross_tab)

In [None]:
chi2, p, dof, expected = ss.chi2_contingency(cross_tab)
print(chi2, p, dof, expected)

##### (Conclusion) p > 0.05, so there is no evidence that "Area" is significantly associated with "Like baseball".
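
The `expected` array returned by `ss.chi2_contingency` holds the counts that would be expected under independence. As a sketch, they can be rebuilt from the row and column totals, and the statistic follows from them. The table below is made up; it has 3 degrees of freedom, so SciPy's continuity correction (applied only when there is 1 degree of freedom) does not come into play.

```python
import numpy as np
import scipy.stats as ss

# Made-up 4x2 table: rows = areas, columns = like baseball (No, Yes)
observed = np.array([[12, 18],
                     [ 9, 11],
                     [30, 40],
                     [ 8, 12]])

row_sums = observed.sum(axis=1, keepdims=True)
col_sums = observed.sum(axis=0, keepdims=True)
# Expected count under independence: row total * column total / grand total
expected_manual = row_sums * col_sums / observed.sum()

chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = ss.chi2.sf(chi2_manual, dof_manual)

chi2, p, dof, expected = ss.chi2_contingency(observed)
print(chi2_manual, p_manual, dof_manual)
print(chi2, p, dof)
```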

#### 8. Can the observed frequencies be regarded as the same as the expected frequencies? (chi-squared goodness-of-fit test)
Data: The frequency of the "most lucky" zodiac sign in a daily horoscope: is it uniform?  

H0: The distribution of the "most lucky" zodiac sign is uniform.

- Hint: Program for count by values: see "List unique values and their counts in columns (列の値の種類と個数一覧)" on MOOCs AI-0102.

In [None]:
csv_in = 'stat_test_sample8.csv'
df8 = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df8.shape)
display(df8.head())

In [None]:
lucky = df8['most_lucky'].value_counts()
print(lucky)

In [None]:
chi2, p = ss.chisquare(lucky)
print(chi2, p)

##### (Conclusion) p > 0.05, so there is no evidence that the distribution of the "most lucky" zodiac sign is significantly different from uniform.
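
When `ss.chisquare` is called with only the observed counts, it assumes that all categories are equally likely. A minimal sketch with made-up counts for 12 signs (not the data in `stat_test_sample8.csv`) reproduces the statistic and p-value by hand:

```python
import numpy as np
import scipy.stats as ss

# Made-up counts for 12 zodiac signs (NOT from stat_test_sample8.csv)
observed = np.array([35, 28, 31, 26, 33, 30, 29, 34, 27, 32, 30, 30])

# Under a uniform H0, every category has the same expected count
expected = np.full(len(observed), observed.mean())

chi2_manual = ((observed - expected) ** 2 / expected).sum()
# Degrees of freedom: number of categories - 1
p_manual = ss.chi2.sf(chi2_manual, df=len(observed) - 1)

chi2, p = ss.chisquare(observed)
print(chi2_manual, p_manual)
print(chi2, p)
```

A non-uniform hypothesis can be tested by passing expected counts through the `f_exp` argument of `ss.chisquare`.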

#### Tukey-Kramer method (multiple comparison)

Data: Exam results of students after three kinds of lectures (A, B, C)  

H0: For every pair of lectures (A-B, B-C, C-A), the population means of exam results are equal.

##### Read data

In [None]:
csv_in = 'stat_test_sample_tukey_kramer.csv'
df_tk = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0)
print(df_tk.shape)
display(df_tk.head())

In [None]:
df_tk_A = df_tk[ df_tk['Lecture']=='A' ]
df_tk_B = df_tk[ df_tk['Lecture']=='B' ]
df_tk_C = df_tk[ df_tk['Lecture']=='C' ]

##### Levene's test (equality of variance)

In [None]:
W, p = ss.levene(df_tk_A['Points'], df_tk_B['Points'], df_tk_C['Points'])
print(W, p)

##### p > 0.05, so the variances can be regarded as equal; proceed to the Tukey-Kramer method
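
One detail worth knowing about `ss.levene` is its `center` argument: the default is `'median'` (often called the Brown-Forsythe variant), while `center='mean'` gives the original Levene test. The sketch below compares them on made-up points for three groups; the values are illustrative only.

```python
import numpy as np
import scipy.stats as ss

# Made-up exam points (NOT from stat_test_sample_tukey_kramer.csv)
a = np.array([62.0, 70.0, 58.0, 75.0, 66.0, 81.0, 69.0, 73.0])
b = np.array([55.0, 48.0, 61.0, 52.0, 67.0, 59.0, 50.0, 63.0])
c = np.array([71.0, 64.0, 78.0, 69.0, 83.0, 74.0, 66.0, 77.0])

W_default, p_default = ss.levene(a, b, c)                 # default: center='median'
W_median, p_median = ss.levene(a, b, c, center='median')  # same as the default
W_mean, p_mean = ss.levene(a, b, c, center='mean')        # original Levene test

print(W_default, p_default)
print(W_median, p_median)
print(W_mean, p_mean)
```

The median-centered version is generally more robust for skewed data, which may be why it is the default.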

In [None]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Usage: pairwise_tukeyhsd(data, labels)  # labels can be any string or value

In [None]:
results = pairwise_tukeyhsd(df_tk['Points'], df_tk['Lecture'])
print(results)

##### (Conclusion) The population means of points differ significantly between lectures A and B, and between lectures B and C.

#### Two-way ANOVA followed by the Tukey-Kramer method

##### Read data
NOTE: columns used as factors must be read as strings (dtype 'object').  
This is why pd.read_csv(..., dtype={'mg':'object'}) is used below.  

In [None]:
csv_in = 'stat_test_sample_2way_anova.csv'
df_at = pd.read_csv(csv_in, delimiter=',', skiprows=0, header=0,
                 dtype={'mg':'object'})
print(df_at.shape)
print(df_at.info())
display(df_at.head())
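
To illustrate the note about `dtype`, here is a small sketch that reads the same made-up CSV text with and without `dtype={'mg': 'object'}`. The column names follow the cell above, but the values are invented. With a numeric `mg`, the formula interface would treat the column as a continuous variable rather than a categorical factor.

```python
import io
import pandas as pd

# Made-up CSV text (NOT the contents of stat_test_sample_2way_anova.csv)
csv_text = "Ingredient,mg,Humidity\nA,50,30.1\nA,100,32.4\nB,50,29.8\nB,100,35.0\n"

df_num = pd.read_csv(io.StringIO(csv_text))                        # mg inferred as a number
df_obj = pd.read_csv(io.StringIO(csv_text), dtype={'mg': 'object'})  # mg kept as strings

print(df_num['mg'].dtype)
print(df_obj['mg'].dtype)
print(df_obj['mg'].tolist())
```

Keeping `mg` as strings also allows it to be concatenated with another string column to build one label per group.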

##### Two-way ANOVA

Import additional libraries

In [None]:
import statsmodels.api as sm
from statsmodels.formula.api import ols

Execute 2-way ANOVA

In [None]:
formula = 'Humidity ~ Ingredient + mg + Ingredient:mg'  # 2-way (Ingredient and mg)
model = ols(formula, df_at).fit()
print(model.summary())
aov_table = sm.stats.anova_lm(model, typ=2)  # Usually use Type II ANOVA
print(aov_table)

In [None]:
df_at['labels'] = df_at['Ingredient']+df_at['mg']  # make unique labels for each group
display(df_at.head())

In [None]:
results = pairwise_tukeyhsd(df_at['Humidity'], df_at['labels'])
print(results)

##### (Conclusion) There is no evidence that the population means differ significantly for the pairs A100-B50, A150-B100, A150-B150, and B100-B150. The population means of all other pairs are significantly different from each other.