T-tests and their non-parametric equivalents

1. Two-sample t-test
2. Paired t-test
3. Wilcoxon signed rank test
4. Mann-Whitney U test

1. Performing unpaired t-tests in R

paired=FALSE and alternative="two.sided" are the defaults for t.test(), so you do not need to specify them

In [None]:
control<-c(91, 87, 99, 77, 88, 91)
treatment<-c(101, 110, 103, 93, 99, 104)
mydata<-data.frame(control,treatment)
mydata

# Pass the two columns directly; data= is only used with the formula (~) interface
boxplot(mydata$control, mydata$treatment, names=c("control", "treatment"), ylab="Time", xlab="Group")

In [None]:
t.test(mydata$treatment,mydata$control)
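
As a quick check of the defaults, the sketch below reuses the same control and treatment values and compares the bare call with one that spells the options out. It also shows that t.test() defaults to var.equal=FALSE (the Welch test); the var.equal=TRUE call is included only for contrast and is not part of the original analysis.

```r
control <- c(91, 87, 99, 77, 88, 91)
treatment <- c(101, 110, 103, 93, 99, 104)

# Bare call: relies entirely on the defaults
res_default <- t.test(treatment, control)

# Same test with the defaults written out explicitly
res_explicit <- t.test(treatment, control, paired = FALSE,
                       alternative = "two.sided", var.equal = FALSE)

# Both calls give the same p-value
all.equal(res_default$p.value, res_explicit$p.value)

# Pooled-variance (Student) t-test, shown for comparison only
res_pooled <- t.test(treatment, control, var.equal = TRUE)
res_default$method
res_pooled$method
```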

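The alternative hypothesis can be set with alternative="less", "greater", or "two.sided" (the default)

Compare the following results and note that the order of the arguments matters: alternative describes the first sample relative to the second, so "less" tests whether the mean of the first sample is less than the mean of the second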

In [None]:
t.test(mydata$control,mydata$treatment,
       alternative="less")

In [None]:
t.test(mydata$control,mydata$treatment,
       alternative="greater")

In [None]:
t.test(mydata$treatment,mydata$control,
       alternative="greater")
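
To make the effect of argument order concrete, the following sketch (same data as above) pulls out the p-values from the three calls. It illustrates that swapping the samples and flipping the direction together gives the same test, while keeping the order and flipping the direction gives the complementary p-value.

```r
control <- c(91, 87, 99, 77, 88, 91)
treatment <- c(101, 110, 103, 93, 99, 104)

p_less    <- t.test(control, treatment, alternative = "less")$p.value
p_greater <- t.test(control, treatment, alternative = "greater")$p.value
p_swapped <- t.test(treatment, control, alternative = "greater")$p.value

# Swapping the samples AND the direction is the same test
all.equal(p_less, p_swapped)

# For a fixed order, the two one-sided p-values sum to 1
all.equal(p_less + p_greater, 1)
```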

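Tilde (~) vs. comma (,): know when to use each

t.test(variable1,variable2)   comma: each group's measurements are in a separate vector
t.test(variable1~variable2)   tilde: variable1 holds the measurements and variable2 is a two-level grouping factor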
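
The two forms describe the same test on differently shaped data. As a minimal sketch, the control/treatment values from section 1 can be reshaped from wide to long with stack(); the names wide, long, values and ind are illustrative (values and ind are the column names stack() produces).

```r
control <- c(91, 87, 99, 77, 88, 91)
treatment <- c(101, 110, 103, 93, 99, 104)
wide <- data.frame(control, treatment)

# stack() reshapes to long format: 'values' holds the measurements,
# 'ind' is a factor naming the column each value came from
long <- stack(wide)
head(long)

# Comma form on the wide data, tilde form on the long data
res_comma <- t.test(wide$control, wide$treatment)
res_tilde <- t.test(values ~ ind, data = long)

# Same test, same p-value
all.equal(res_comma$p.value, res_tilde$p.value)
```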

We will use the iris data set in R

In [None]:
data(iris)
head(iris)

In [None]:
library(plyr)
Iris.sum<- ddply(iris, c("Species"),summarise,
                         mean1=mean(Sepal.Length),mean2=mean(Sepal.Width), mean3=mean(Petal.Length))
Iris.sum
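
If plyr is not installed, a similar per-species summary can be sketched in base R with aggregate(). The name iris_means is illustrative, and the output columns keep the original variable names rather than mean1, mean2, mean3.

```r
# Mean of three measurements for each species, using base R only
iris_means <- aggregate(cbind(Sepal.Length, Sepal.Width, Petal.Length) ~ Species,
                        data = iris, FUN = mean)
iris_means
```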

We will test whether two of the iris species, versicolor and virginica, have different mean Sepal.Width

First subset to these two species

In [None]:
# Keep versicolor and virginica; droplevels() removes the now-empty "setosa" level
IrisV<-droplevels(subset(iris, Species!="setosa"))
summary(IrisV)
boxplot(Sepal.Width~Species, data=IrisV)

We run the t-test with ~ because there is one column with the measurements and one column with the grouping factor

In [None]:
t.test(Sepal.Width~Species, data=IrisV)
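
The printed output can also be accessed piece by piece, since t.test() returns a list-like object. The sketch below stores the result in a variable (res is an illustrative name) and extracts the parts most often reported.

```r
IrisV <- droplevels(subset(iris, Species != "setosa"))
res <- t.test(Sepal.Width ~ Species, data = IrisV)

res$statistic  # t statistic
res$parameter  # degrees of freedom (Welch approximation)
res$p.value    # two-sided p-value
res$conf.int   # 95% CI for the difference in means (versicolor minus virginica)
res$estimate   # the two group means
```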

2. Performing paired t-tests in R

Use the paired=TRUE option

Test whether Sepal.Length differs from Sepal.Width within each iris flower

In [None]:
t.test(iris$Sepal.Length,iris$Sepal.Width, paired=TRUE,alternative="two.sided")
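
A paired t-test is the same as a one-sample t-test on the within-pair differences. The following sketch checks this on the iris columns; paired_res and diff_res are illustrative names.

```r
# Paired test on the two columns
paired_res <- t.test(iris$Sepal.Length, iris$Sepal.Width, paired = TRUE)

# One-sample test on the differences against a mean of 0
diff_res <- t.test(iris$Sepal.Length - iris$Sepal.Width, mu = 0)

# Identical t statistic and p-value
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
```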

3. Wilcoxon signed rank test

Non-parametric tests are used when the data are not normally distributed and the sample is too small for the central limit theorem to justify a t-test

Use in place of a paired t-test for non-normal data

We will use a fertilizer data set: biomass measured before (pre) and after (post) fertilizer addition

In [None]:
Fert.pre<-c(18.2,17.6,16.8,18.8,17.4,18.7,15.2,18.8,16.5,15.9)
Fert.post<-c(20.1,19.7,19.1,19.1,16.4,15.9,18.4,17.1,17.1,16.7)

Fert.all<-cbind(Fert.pre,Fert.post)

Fert.all

boxplot(Fert.pre,Fert.post, ylab="Biomass", xlab="Treatment", names=c("pre","post"))
hist(Fert.pre, xlab="Pre Biomass")
hist(Fert.post, xlab="Post Biomass")
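
For paired data, the normality assumption of the t-test concerns the differences rather than each sample on its own. As a rough sketch (not part of the original analysis), the differences can be inspected with a Q-Q plot and a Shapiro-Wilk test; with only 10 pairs such checks have little power, so treat the result as a guide only. Fert.diff and sw are illustrative names.

```r
Fert.pre  <- c(18.2, 17.6, 16.8, 18.8, 17.4, 18.7, 15.2, 18.8, 16.5, 15.9)
Fert.post <- c(20.1, 19.7, 19.1, 19.1, 16.4, 15.9, 18.4, 17.1, 17.1, 16.7)

# Within-pair differences (post minus pre)
Fert.diff <- Fert.post - Fert.pre

# Visual check: points far from the line suggest non-normality
qqnorm(Fert.diff)
qqline(Fert.diff)

# Formal check: a small p-value suggests the differences are not normal
sw <- shapiro.test(Fert.diff)
sw
```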

We will test whether biomass is greater post-fertilizer than pre-fertilizer

In [None]:
wilcox.test(Fert.post,Fert.pre,paired=TRUE, alternative="greater")
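
To see where the reported V statistic comes from, the sketch below recomputes it by hand: rank the absolute differences, then sum the ranks belonging to positive differences. The variable names are illustrative, and the shortcut assumes no zero differences, which holds for these data.

```r
Fert.pre  <- c(18.2, 17.6, 16.8, 18.8, 17.4, 18.7, 15.2, 18.8, 16.5, 15.9)
Fert.post <- c(20.1, 19.7, 19.1, 19.1, 16.4, 15.9, 18.4, 17.1, 17.1, 16.7)

# Differences in the same order as the wilcox.test() call (post minus pre)
Fert.diff <- Fert.post - Fert.pre

# Rank the absolute differences, then add up the ranks of the positive ones
ranks <- rank(abs(Fert.diff))
V_manual <- sum(ranks[Fert.diff > 0])

res <- wilcox.test(Fert.post, Fert.pre, paired = TRUE, alternative = "greater")

# The hand-computed value matches the statistic R reports
all.equal(V_manual, unname(res$statistic))
```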

4. Mann-Whitney U test

Use in place of a two-sample (unpaired) t-test if the data are not normal

First we will put all our data into one dataframe

Create control data frame named Fert1

In [None]:
Biomass<-c(19.1,16.4,15.9,18.4,17.1,15.2,18.8,16.5,16.0,17.2,16.7)

Fert1<-as.data.frame(Biomass)
Fert1
# nrow() gives the number of rows; length() of a data frame gives the number of columns
Fert1$Type<-rep("Control",nrow(Fert1))
Fert1

Create treatment data frame Fert2

In [None]:
Biomass<-c(18.2,20.1,17.6,16.8,18.9,19.7,19.3,17.4,18.7)
Fert2<-as.data.frame(Biomass)
Fert2
# nrow() gives the number of rows; length() of a data frame gives the number of columns
Fert2$Type<-rep("Treated",nrow(Fert2))
Fert2

Because they have the same column names "Biomass" and "Type" we can bind them together using rbind

In [None]:
Fert3<-as.data.frame(rbind(Fert1,Fert2))
Fert3
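
The same long-format data frame can also be built in one step, without the intermediate data frames. This is only an alternative sketch; Control, Treated and Fert_long are illustrative names.

```r
Control <- c(19.1, 16.4, 15.9, 18.4, 17.1, 15.2, 18.8, 16.5, 16.0, 17.2, 16.7)
Treated <- c(18.2, 20.1, 17.6, 16.8, 18.9, 19.7, 19.3, 17.4, 18.7)

# One measurement column plus one label column, repeated to match group sizes
Fert_long <- data.frame(
  Biomass = c(Control, Treated),
  Type = rep(c("Control", "Treated"),
             times = c(length(Control), length(Treated)))
)

# Confirm the group sizes
table(Fert_long$Type)
```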

Look at the histogram: the distribution looks non-normal

In [None]:
hist(Fert3$Biomass, xlab="Biomass")

In [None]:
boxplot(Biomass~Type, data=Fert3, xlab="Type",ylab="Biomass")

Run the Mann-Whitney test (in R this is wilcox.test with paired=FALSE)

In [None]:
wilcox.test(Biomass~Type, data=Fert3, paired=FALSE)
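
The W statistic that R reports can be reproduced by hand: rank all observations together, sum the ranks of the first group, and subtract that group's minimum possible rank sum. The sketch below does this with the two groups as plain vectors (illustrative names); the first group is Control, matching the first factor level used by the formula call.

```r
Control <- c(19.1, 16.4, 15.9, 18.4, 17.1, 15.2, 18.8, 16.5, 16.0, 17.2, 16.7)
Treated <- c(18.2, 20.1, 17.6, 16.8, 18.9, 19.7, 19.3, 17.4, 18.7)

# Rank the pooled data, then sum the ranks that belong to the first group
r  <- rank(c(Control, Treated))
n1 <- length(Control)
W_manual <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

res <- wilcox.test(Control, Treated, paired = FALSE)

# The hand-computed value matches the statistic R reports
all.equal(W_manual, unname(res$statistic))
```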