In [1]:
ENV["LINES"] = 200
ENV["COLUMNS"] = 200
using Distributions
using DataFrames
using HypothesisTests
using RCall

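# Round to two significant digits for a compact table.
rd(x) = round(x; sigdigits=2)

dfs = 4:13    # degrees of freedom (one column each)
ts = 0:0.1:6  # t-values (one row each)
# Two-sided P-value 2*P(T >= t); dfs' is a row vector, so broadcasting yields a matrix.
tbl_pvals = @. rd(2ccdf(TDist(dfs'), ts))
# Named colnames rather than names to avoid shadowing Base.names.
colnames = ["t-value", ("df=$k" for k in dfs)...]
data_pvals = DataFrame([ts tbl_pvals], colnames)
print(data_pvals)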

61×11 DataFrame
 Row │ t-value  df=4     df=5     df=6     df=7     df=8     df=9     df=10    df=11    df=12    df=13
     │ Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────
   1 │     0.0   1.0      1.0     1.0      1.0      1.0      1.0      1.0      1.0      1.0      1.0
   2 │     0.1   0.93     0.92    0.92     0.92     0.92     0.92     0.92     0.92     0.92     0.92
   3 │     0.2   0.85     0.85    0.85     0.85     0.85     0.85     0.85     0.85     0.84     0.84
   4 │     0.3   0.78     0.78    0.77     0.77     0.77     0.77     0.77     0.77     0.77     0.77
   5 │     0.4   0.71     0.71    0.7      0.7  

The table above lists two-sided P-values for the t-distribution.
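
For reference, each table entry is the two-sided tail probability of a t-distributed variable. The following is a short restatement in math notation of what `2ccdf(TDist(dfs'), ts)` computes; the symbols $T_\nu$ and $F_\nu$ are introduced here for illustration and do not appear in the notebook code:

$$
P = 2\,\Pr\bigl(T_\nu \ge |t|\bigr) = 2\,\bigl(1 - F_\nu(|t|)\bigr),
$$

where $T_\nu$ follows the t-distribution with $\nu$ degrees of freedom and $F_\nu$ is its cumulative distribution function.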

The table below is taken from 中原治『基礎から学ぶ統計学』, p. 169.

<img width=350 src="IMG_1453.jpeg">

In [2]:
x_A = [118, 132, 120, 115, 113]
x_B = [129, 126, 134, 135, 131]
n_A = length(x_A)
n_B = length(x_B)
@show xbar_A = mean(x_A)
@show xbar_B = mean(x_B)
@show s_A = std(x_A)
@show s_B = std(x_B)
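# Degrees of freedom for the pooled (equal-variance) two-sample t-test.
@show df = n_A + n_B - 2
# Pooled standard deviation.
@show s_p = sqrt(((n_A - 1) * s_A^2 + (n_B - 1) * s_B^2) / df)
# t-statistic for the difference in means.
@show t = (xbar_A - xbar_B) / (s_p * sqrt(1/n_A + 1/n_B))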
;

xbar_A = mean(x_A) = 119.6
xbar_B = mean(x_B) = 131.0
s_A = std(x_A) = 7.436396977031282
s_B = std(x_B) = 3.6742346141747673
df = (n_A + n_B) - 2 = 8
s_p = sqrt(((n_A - 1) * s_A ^ 2 + (n_B - 1) * s_B ^ 2) / df) = 5.865151319446071
t = (xbar_A - xbar_B) / (s_p * sqrt(1 / n_A + 1 / n_B)) = -3.073234036297997


Everything up to this point can be computed with a pocket calculator.
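
As a sketch of that hand calculation, the pooled standard deviation and the t-statistic can be written out with the numbers shown above ($s_A^2 = 55.3$ and $s_B^2 = 13.5$ are the squares of the printed standard deviations):

$$
s_p = \sqrt{\frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}}
    = \sqrt{\frac{4 \times 55.3 + 4 \times 13.5}{8}}
    = \sqrt{34.4} \approx 5.865
$$

$$
t = \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}}
  = \frac{119.6 - 131.0}{5.865 \times \sqrt{0.4}}
  \approx \frac{-11.4}{3.709} \approx -3.073
$$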

Reading the two-sided P-value table above at df = 8 and |t| ≈ 3.1 gives a P-value of approximately 0.015.

The calculations below do not use the table.

The next cell computes the P-value with R's `pt`.

In [3]:
@rput df t
@show rcopy(R"""2*pt(abs(t), df, lower.tail=F)""");

rcopy(R"2*pt(abs(t), df, lower.tail=F)") = 0.015272804322800957


The next cell computes it with R's `t.test`.

In [4]:
@rput x_A x_B
R"""
t.test(x_A, x_B, var.equal=T)
"""

RObject{VecSxp}

	Two Sample t-test

data:  x_A and x_B
t = -3.0732, df = 8, p-value = 0.01527
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -19.954001  -2.845999
sample estimates:
mean of x mean of y 
    119.6     131.0 



In [5]:
@rput x_A x_B
R"""
t.test(x_A, x_B)
"""

RObject{VecSxp}

	Welch Two Sample t-test

data:  x_A and x_B
t = -3.0732, df = 5.8431, p-value = 0.02261
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -20.536084  -2.263916
sample estimates:
mean of x mean of y 
    119.6     131.0 



The following cells do the same calculation in Julia.

In [6]:
@show 2ccdf(TDist(df), abs(t));

2 * ccdf(TDist(df), abs(t)) = 0.015272804322800973


In [7]:
EqualVarianceTTest(x_A, x_B)

Two sample t-test (equal variance)
----------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -11.4
    95% confidence interval: (-19.95, -2.846)

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           0.0153

Details:
    number of observations:   [5,5]
    t-statistic:              -3.073234036297997
    degrees of freedom:       8
    empirical standard error: 3.709447398198281
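
As a rough cross-check, the 95% confidence interval reported above can be recovered from the point estimate and the standard error. This assumes the critical value $t_{0.975,\,8} \approx 2.306$, a standard table value that is not computed anywhere in this notebook:

$$
(\bar{x}_A - \bar{x}_B) \pm t_{0.975,\,8} \cdot \mathrm{SE}
  \approx -11.4 \pm 2.306 \times 3.709
  \approx -11.4 \pm 8.55,
$$

which gives roughly $(-19.95,\ -2.85)$, in agreement with the interval printed by `EqualVarianceTTest`.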


In [8]:
xxA = copy(x_A)
xxA[3] = 130
@show x_A xxA x_B
EqualVarianceTTest(xxA, x_B)

x_A = [118, 132, 120, 115, 113]
xxA = [118, 132, 130, 115, 113]
x_B = [129, 126, 134, 135, 131]


Two sample t-test (equal variance)
----------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -9.4
    95% confidence interval: (-19.23, 0.4269)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.0585

Details:
    number of observations:   [5,5]
    t-statistic:              -2.205819295980482
    degrees of freedom:       8
    empirical standard error: 4.261455150532504


In [9]:
UnequalVarianceTTest(x_A, x_B)

Two sample t-test (unequal variance)
------------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -11.4
    95% confidence interval: (-20.54, -2.264)

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           0.0226

Details:
    number of observations:   [5,5]
    t-statistic:              -3.0732340362979964
    degrees of freedom:       5.843139917416074
    empirical standard error: 3.7094473981982814
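
The non-integer degrees of freedom in the Welch output can be reproduced by hand. The following is a minimal sketch using the Welch-Satterthwaite approximation with only the `Statistics` standard library; the names `v_A`, `v_B`, `se_welch`, `t_welch`, and `df_welch` are illustrative and are not part of HypothesisTests.

```julia
using Statistics  # standard library: mean, var

# Same data as in the cells above.
x_A = [118, 132, 120, 115, 113]
x_B = [129, 126, 134, 135, 131]

# Variance of each sample mean, s^2 / n (no pooling).
v_A = var(x_A) / length(x_A)
v_B = var(x_B) / length(x_B)

# Standard error of the difference in means and the resulting t-statistic.
se_welch = sqrt(v_A + v_B)
t_welch = (mean(x_A) - mean(x_B)) / se_welch

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch = (v_A + v_B)^2 /
           (v_A^2 / (length(x_A) - 1) + v_B^2 / (length(x_B) - 1))

@show se_welch t_welch df_welch;
```

Because both groups have the same size here, the standard error and the t-statistic coincide with the equal-variance case; only the degrees of freedom (and therefore the P-value and the confidence interval) differ.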


In [10]:
xxA = copy(x_A)
xxA[3] = 130
@show x_A xxA x_B
UnequalVarianceTTest(xxA, x_B)

x_A = [118, 132, 120, 115, 113]
xxA = [118, 132, 130, 115, 113]
x_B = [129, 126, 134, 135, 131]


Two sample t-test (unequal variance)
------------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -9.4
    95% confidence interval: (-20.14, 1.339)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.0749

Details:
    number of observations:   [5,5]
    t-statistic:              -2.205819295980482
    degrees of freedom:       5.355801180341501
    empirical standard error: 4.261455150532504
