# Two-sided hypothesis tests on the mean using R

This notebook illustrates how to run hypothesis tests on the mean in R, and when each test applies.

![](http://d33wubrfki0l68.cloudfront.net/61710dface4b4dcef2ba78dba1a38d7d509be9cd/48d1a/img/hypothesis_testing/hypothesis_testing_flow_mean.png)

Note: multi-factor analyses (such as two-way ANOVA) are not covered in this notebook.

### One sample t-test
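Suppose $X \sim \mathcal{N}(\mu,\,\sigma^{2})$ with $\sigma$ unknown. <br/>
Let $x = \{ x_i = X_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $X$.<br/>
Let $\mu_0$ be the hypothesized mean of $X$, to be checked against the observed sample $x$.<br/>
The null hypothesis is $H_0 : \{ \mu =\mu_0 \}$, tested against the two-sided alternative $H_1 : \{ \mu \neq \mu_0 \}$.<br/>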


In [2]:
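# Simulate 50 draws from a normal distribution with mean 10 and standard deviation 5
x = rnorm(50, 10, 5)
# Two-sided test of H0: mu = 10
t.test(x, mu=10)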


	One Sample t-test

data:  x
t = -1.6214, df = 49, p-value = 0.1113
alternative hypothesis: true mean is not equal to 10
95 percent confidence interval:
  7.520678 10.265025
sample estimates:
mean of x 
 8.892852 
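
The statistic reported by `t.test` can be recomputed by hand as $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, with a two-sided p-value from the Student distribution with $n - 1$ degrees of freedom. The following is a minimal sketch under that assumption; the seed and the variable names are illustrative and not part of the original notebook.

```r
# Illustrative seed, only so that the sketch is reproducible
set.seed(1)
x   <- rnorm(50, 10, 5)   # same simulation settings as the cell above
mu0 <- 10                 # hypothesized mean
n   <- length(x)

# t = (sample mean - mu0) / (sample sd / sqrt(n))
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)

res <- t.test(x, mu = mu0)
c(manual_t = t_stat, t_test_t = unname(res$statistic))
c(manual_p = p_val,  t_test_p = res$p.value)
```

Both pairs of values should agree, which shows that `t.test` is a convenience wrapper around this formula.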


### One sample Z-test
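Suppose $X \sim \mathcal{N}(\mu,\,\sigma^{2})$ with $\sigma$ known. <br/>
Let $x = \{ x_i = X_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $X$.<br/>
Let $\mu_0$ be the hypothesized mean of $X$, to be checked against the observed sample $x$.<br/>
The null hypothesis is $H_0 : \{ \mu =\mu_0 \}$, tested against the two-sided alternative $H_1 : \{ \mu \neq \mu_0 \}$.<br/>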


In [4]:
# Install requirements (execute once)
install.packages("BSDA")


In [7]:
# Load library
library(BSDA)

In [9]:
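# Simulate 50 draws from a normal distribution with mean 10 and standard deviation 5
x = rnorm(50, 10, 5)
# sigma.x is the known population standard deviation
z.test(x, mu=10, sigma.x = 5)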


	One-sample z-Test

data:  x
z = 2.1604, p-value = 0.03074
alternative hypothesis: true mean is not equal to 10
95 percent confidence interval:
 10.14172 12.91353
sample estimates:
mean of x 
 11.52762 
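
The z statistic can also be checked with base R alone, since $z = (\bar{x} - \mu_0) / (\sigma / \sqrt{n})$. The sketch below plugs in the sample mean printed above (11.52762) together with $n = 50$ and $\sigma = 5$ from the call; the variable names are illustrative.

```r
# Values taken from the z.test call and its printed output above
xbar  <- 11.52762   # sample mean reported by z.test
mu0   <- 10         # hypothesized mean
sigma <- 5          # known population standard deviation
n     <- 50         # sample size

se     <- sigma / sqrt(n)                       # standard error of the mean
z_stat <- (xbar - mu0) / se                     # z statistic
p_val  <- 2 * pnorm(-abs(z_stat))               # two-sided p-value
ci     <- xbar + c(-1, 1) * qnorm(0.975) * se   # 95 percent confidence interval

round(c(z = z_stat, p = p_val), 4)
round(ci, 4)
```

Up to rounding, these values should reproduce the statistic, p-value and confidence interval printed by `z.test`.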


### Paired t-test

Suppose each unit is measured twice, giving paired observations $(x_{1,i}, x_{2,i})_{i=1,\dots,n}$. <br/>
Suppose the differences $d_i = x_{2,i} - x_{1,i}$ are i.i.d. draws from $\mathcal{N}(\mu_d,\,\sigma_d^{2})$ with $\sigma_d$ unknown.<br/>
Let $\mu_1$ and $\mu_2$ be the means of the first and second measurements, so that $\mu_d = \mu_2 - \mu_1$.<br/>
The null hypothesis is $H_0 : \{ \mu_d = \mu_2 - \mu_1 = 0 \}$.<br/>


In [32]:
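# Simulated stand-in for paired data: the two halves of one sample play the role
# of the first and second measurements (real paired data come from the same units)
x = rnorm(50, 10, 5)
x1 = x[1:25]
x2 = x[26:50]
t.test(x1, x2, paired=TRUE)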


	Paired t-test

data:  x1 and x2
t = 2.7067, df = 24, p-value = 0.01232
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.7084793 5.2581980
sample estimates:
mean of the differences 
               2.983339 
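
A paired t-test is a one-sample t-test applied to the within-pair differences. The sketch below illustrates this on hypothetical before/after measurements; the data-generating choices, the seed and the variable names are assumptions made for the example.

```r
# Illustrative seed and hypothetical paired measurements on 25 units
set.seed(3)
before <- rnorm(25, 10, 5)
after  <- before + rnorm(25, 1, 2)   # second measurement on the same units

# Paired test on the two measurement vectors
paired_res <- t.test(after, before, paired = TRUE)
# One-sample test on the differences, with H0: mean difference = 0
diff_res   <- t.test(after - before, mu = 0)

c(paired = unname(paired_res$statistic), one_sample = unname(diff_res$statistic))
c(paired = paired_res$p.value,           one_sample = diff_res$p.value)
```

The two calls should return the same statistic and p-value, which is why the normality assumption bears on the differences rather than on each measurement separately.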


### Wilcoxon signed-rank test

Suppose each unit is measured twice, giving paired observations $(x_{1,i}, x_{2,i})_{i=1,\dots,n}$. <br/>
Suppose the differences $d_i = x_{2,i} - x_{1,i}$ are i.i.d. draws from an unknown continuous distribution $\mathcal{F}_{D}$ that is symmetric about $\theta$.<br/>
Let $\theta$ be the location shift between the two measurements.<br/>
The null hypothesis is $H_0 : \{ \theta = 0 \}$.<br/>

In [3]:
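# Simulate 50 draws from a logistic distribution (location 10, scale 5)
# and use the two halves as stand-ins for the paired measurements
x = rlogis(50, 10, 5)
x1 = x[1:25]
x2 = x[26:50]
wilcox.test(x1, x2, paired = TRUE)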


	Wilcoxon signed rank test

data:  x1 and x2
V = 274, p-value = 0.001816
alternative hypothesis: true location shift is not equal to 0
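
The statistic `V` is the sum of the ranks of the absolute differences, taken over the pairs whose difference is positive. The sketch below recomputes it by hand; note that R forms the differences as `x1 - x2`. The seed and the variable names are illustrative.

```r
# Illustrative seed; continuous data, so no ties or zero differences are expected
set.seed(4)
x1 <- rlogis(25, 10, 5)
x2 <- rlogis(25, 10, 5)

d        <- x1 - x2                    # within-pair differences, as R computes them
ranks    <- rank(abs(d))               # ranks of the absolute differences
V_manual <- sum(ranks[d > 0])          # sum of ranks attached to positive differences

res <- wilcox.test(x1, x2, paired = TRUE)
c(manual_V = V_manual, wilcox_V = unname(res$statistic))
```

Under $H_0$, `V` is expected to lie near $n(n+1)/4$, so values far from that midpoint count as evidence of a location shift.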


### Two-sample t-test

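Suppose $X \sim \mathcal{N}(\mu_x,\,\sigma^{2})$ and $Y \sim \mathcal{N}(\mu_y,\,\sigma^{2})$ are independent, with the same unknown $\sigma$. <br/>
Let $x = \{ x_i = X_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $X$.<br/>
Let $y = \{ y_i = Y_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $Y$.<br/>
Let $\mu_x$ and $\mu_y$ be the means of $X$ and $Y$, estimated from the samples $x$ and $y$.<br/>
The null hypothesis is $H_0 : \{ \mu_x = \mu_y \}$.<br/>
Note that `t.test` runs the Welch test by default, which does not assume equal variances; pass `var.equal=TRUE` to get the pooled-variance Student test that matches the assumption above.<br/>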

In [5]:
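# Two independent samples drawn from the same normal distribution
x = rnorm(50, 10, 5)
y = rnorm(50, 10, 5)
# Default call: Welch two-sample t-test
t.test(x, y)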


	Welch Two Sample t-test

data:  x and y
t = 0.00039684, df = 97.982, p-value = 0.9997
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.120507  2.121356
sample estimates:
mean of x mean of y 
 9.475463  9.475039 
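
When the two variances are assumed equal, the Student statistic uses the pooled variance $s_p^2 = \frac{(n_x - 1) s_x^2 + (n_y - 1) s_y^2}{n_x + n_y - 2}$ and $n_x + n_y - 2$ degrees of freedom. The sketch below recomputes it by hand and compares it with `t.test(..., var.equal = TRUE)`; the seed and the variable names are illustrative.

```r
# Illustrative seed, only so that the sketch is reproducible
set.seed(5)
x  <- rnorm(50, 10, 5)
y  <- rnorm(50, 10, 5)
nx <- length(x)
ny <- length(y)

# Pooled variance estimate under the equal-variance assumption
sp2    <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
p_val  <- 2 * pt(-abs(t_stat), df = nx + ny - 2)

pooled <- t.test(x, y, var.equal = TRUE)   # Student test with pooled variance
welch  <- t.test(x, y)                     # default Welch test

c(manual_t = t_stat, pooled_t = unname(pooled$statistic))
c(pooled_df = unname(pooled$parameter), welch_df = unname(welch$parameter))
```

The Welch degrees of freedom are never larger than $n_x + n_y - 2$; with equal sample sizes and similar variances the two tests give nearly identical results.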


### Mann-Whitney test

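Suppose $X \sim \mathcal{F}_{X}$ and $Y \sim \mathcal{F}_{Y}$ are independent, with both distributions unknown. <br/>
Let $x = \{ x_i = X_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $X$.<br/>
Let $y = \{ y_i = Y_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $Y$.<br/>
The null hypothesis is $H_0 : \{ \mathcal{F}_{X} = \mathcal{F}_{Y} \}$, tested against a location shift between the two distributions.<br/>
In R this test is run with `wilcox.test`, which reports it as the Wilcoxon rank sum test.<br/>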

In [6]:
# Two independent samples drawn from the same logistic distribution
x = rlogis(50, 10, 5)
y = rlogis(50, 10, 5)
wilcox.test(x, y)


	Wilcoxon rank sum test with continuity correction

data:  x and y
W = 1570, p-value = 0.02762
alternative hypothesis: true location shift is not equal to 0
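
The statistic `W` is the sum of the ranks of `x` in the pooled sample, minus its smallest possible value $n_x(n_x+1)/2$. Without ties, this equals the number of pairs $(x_i, y_j)$ with $x_i > y_j$. The sketch below checks both forms; the seed and the variable names are illustrative.

```r
# Illustrative seed; continuous data, so no ties are expected
set.seed(6)
x  <- rlogis(50, 10, 5)
y  <- rlogis(50, 10, 5)
nx <- length(x)

r        <- rank(c(x, y))                              # ranks in the pooled sample
W_manual <- sum(r[seq_along(x)]) - nx * (nx + 1) / 2   # rank-sum form
W_pairs  <- sum(outer(x, y, ">"))                      # pair-counting form

res <- wilcox.test(x, y)
c(rank_sum = W_manual, pairs = W_pairs, wilcox_W = unname(res$statistic))
```

Under $H_0$, `W` is expected to lie near $n_x n_y / 2$, which is 1250 for two samples of size 50.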


### One-way ANOVA

Suppose $X \sim \mathcal{N}(\mu_x,\,\sigma^{2})$, $Y \sim \mathcal{N}(\mu_y,\,\sigma^{2})$ and $Z \sim \mathcal{N}(\mu_z,\,\sigma^{2})$ are independent, with the same unknown $\sigma$. <br/>
Let $x = \{ x_i = X_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $X$.<br/>
Let $y = \{ y_i = Y_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $Y$.<br/>
Let $z = \{ z_i = Z_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $Z$.<br/>
Let $\mu_x$, $\mu_y$ and $\mu_z$ be the means of $X$, $Y$ and $Z$, estimated from the samples $x$, $y$ and $z$.<br/>
The null hypothesis is $H_0 : \{ \mu_x = \mu_y = \mu_z \}$.<br/>

In [29]:
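# Three independent samples drawn from the same normal distribution
x = rnorm(50, 10, 8)
y = rnorm(50, 10, 8)
z = rnorm(50, 10, 8)
# Stack the observations and label each one with its group
xyz = c(x,y,z)
group = c(rep("A", 50), rep("B", 50), rep("C", 50))
# Fit the one-way model and print the ANOVA table
xyz.aov = aov(xyz ~ group)
summary(xyz.aov)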

             Df Sum Sq Mean Sq F value Pr(>F)
group         2    109   54.46   0.611  0.544
Residuals   147  13101   89.13               
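
The F value in the table is the ratio of the between-group mean square to the within-group mean square (here $54.46 / 89.13 \approx 0.611$). The sketch below recomputes both sums of squares by hand on freshly simulated data; the seed and the variable names are illustrative.

```r
# Illustrative seed, only so that the sketch is reproducible
set.seed(7)
xyz   <- c(rnorm(50, 10, 8), rnorm(50, 10, 8), rnorm(50, 10, 8))
group <- factor(rep(c("A", "B", "C"), each = 50))
k     <- nlevels(group)   # number of groups
N     <- length(xyz)      # total number of observations

group_means <- tapply(xyz, group, mean)
group_sizes <- tapply(xyz, group, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - mean(xyz))^2)
ss_within  <- sum((xyz - ave(xyz, group))^2)   # ave() returns each observation's group mean

F_stat <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_val  <- pf(F_stat, k - 1, N - k, lower.tail = FALSE)

tab <- summary(aov(xyz ~ group))[[1]]   # ANOVA table as a data frame
c(manual_F = F_stat, aov_F = tab[["F value"]][1])
c(manual_p = p_val,  aov_p = tab[["Pr(>F)"]][1])
```

The degrees of freedom $k - 1 = 2$ and $N - k = 147$ match the `Df` column of the table above.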

### Kruskal Wallis test

Suppose $X \sim \mathcal{F}_{X}$, $Y \sim \mathcal{F}_{Y}$ and $Z \sim \mathcal{F}_{Z}$ are independent, with all three distributions unknown. <br/>
Let $x = \{ x_i = X_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $X$.<br/>
Let $y = \{ y_i = Y_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $Y$.<br/>
Let $z = \{ z_i = Z_i(\omega) \}_{i=1,\dots,n}$ be an i.i.d. sample of size $n$ from $Z$.<br/>
The null hypothesis is $H_0 : \{ \mathcal{F}_{X} = \mathcal{F}_{Y} = \mathcal{F}_{Z} \}$.<br/>

In [27]:
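# Three independent samples drawn from the same logistic distribution
x = rlogis(50, 10, 8)
y = rlogis(50, 10, 8)
z = rlogis(50, 10, 8)
# Stack the observations and label each one with its group
xyz = c(x,y,z)
group = c(rep("A", 50), rep("B", 50), rep("C", 50))
kruskal.test(xyz ~ group)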


	Kruskal-Wallis rank sum test

data:  xyz by group
Kruskal-Wallis chi-squared = 2.5967, df = 2, p-value = 0.273
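
The Kruskal-Wallis statistic replaces the observations by their ranks in the pooled sample: $H = \frac{12}{N(N+1)} \sum_g \frac{R_g^2}{n_g} - 3(N+1)$, where $R_g$ is the rank sum of group $g$. It is compared with a chi-squared distribution with (number of groups $- 1$) degrees of freedom. The sketch below recomputes it by hand, assuming no ties; the seed and the variable names are illustrative.

```r
# Illustrative seed; continuous data, so no tie correction is needed
set.seed(8)
xyz   <- c(rlogis(50, 10, 8), rlogis(50, 10, 8), rlogis(50, 10, 8))
group <- factor(rep(c("A", "B", "C"), each = 50))
N     <- length(xyz)

r         <- rank(xyz)                  # ranks in the pooled sample
rank_sums <- tapply(r, group, sum)      # R_g: rank sum per group
n_g       <- tapply(r, group, length)   # n_g: size of each group

H     <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_g) - 3 * (N + 1)
p_val <- pchisq(H, df = nlevels(group) - 1, lower.tail = FALSE)

res <- kruskal.test(xyz ~ group)
c(manual_H = H,     kruskal_H = unname(res$statistic))
c(manual_p = p_val, kruskal_p = res$p.value)
```

With tied observations, `kruskal.test` additionally divides $H$ by a tie-correction factor, which this sketch leaves out.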
