# T-test Practice

In this notebook, we will see how to use the t-test on some data sets using the t.test() function. 


Our example data set is the birth weight data (`birthwt` from the `MASS` package). We will use hypothesis tests to see whether some variables have a statistically significant effect on birth weight.

We will start by comparing the birth weight between smoking and non-smoking mothers. 



In [None]:
library(tidyverse)


# Load data from MASS into a tibble
birthwt <- as_tibble(MASS::birthwt)

# Rename variables
birthwt <- birthwt %>%
  rename(birthwt.below.2500 = low, 
         mother.age = age,
         mother.weight = lwt,
         mother.smokes = smoke,
         previous.prem.labor = ptl,
         hypertension = ht,
         uterine.irr = ui,
         physician.visits = ftv,
         birthwt.grams = bwt)

# Change factor level names
birthwt <- birthwt %>%
  mutate(race = recode_factor(race, `1` = "white", `2` = "black", `3` = "other")) %>%
  mutate_at(c("mother.smokes", "hypertension", "uterine.irr", "birthwt.below.2500"),
            ~ recode_factor(.x, `0` = "no", `1` = "yes"))


In [None]:

# Do a boxplot 
qplot(x = mother.smokes, y = birthwt.grams,
      geom = "boxplot", data = birthwt,
      xlab = "Mother smokes", 
      ylab = "Birthweight (grams)",
      fill = I("lightblue"))


The boxplot suggests that birth weight may be associated with smoking. Let's see if there is a statistically significant difference.

Let's compute the mean and standard deviation of birth weight for each group, as well as the **standard error**.

In [None]:
birthwt %>%
  group_by(mother.smokes) %>%
  summarize(num.obs = n(),
            mean.birthwt = round(mean(birthwt.grams), 0),
            sd.birthwt = round(sd(birthwt.grams), 0),
            se.birthwt = round(sd(birthwt.grams) / sqrt(num.obs), 0))

To assess the significance of the difference, we will use the `t.test` function. 


In [None]:
birthwt.t.test <- t.test(birthwt.grams ~ mother.smokes, data = birthwt)
birthwt.t.test

The p-value is small (below the usual 0.05 threshold), so the difference is statistically significant. The mean birth weight for smokers is 2772 grams versus 3056 grams for non-smokers.
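
To see where these numbers come from, here is a minimal sketch that reproduces the default (Welch, unequal-variance) t-test by hand. It works on the raw `MASS::birthwt` columns (`bwt`, `smoke`) so that it runs on its own; the names `bw`, `se.diff`, `t.stat` and `df.welch` are introduced here for illustration only.

```r
# Raw columns from MASS::birthwt: bwt = birth weight (grams), smoke = 0/1
bw <- MASS::birthwt
x <- bw$bwt[bw$smoke == 0]   # non-smokers
y <- bw$bwt[bw$smoke == 1]   # smokers

# Standard error of the difference in means (variances not assumed equal)
se.diff <- sqrt(var(x) / length(x) + var(y) / length(y))

# Welch t-statistic and Welch-Satterthwaite degrees of freedom
t.stat <- (mean(x) - mean(y)) / se.diff
df.welch <- se.diff^4 /
  ((var(x) / length(x))^2 / (length(x) - 1) +
   (var(y) / length(y))^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p.value <- 2 * pt(-abs(t.stat), df = df.welch)
c(t = t.stat, df = df.welch, p = p.value)
```

These values should agree with the `t`, `df` and `p-value` lines printed by `t.test`.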

We can do the same analysis with a different syntax: 


In [None]:
with(birthwt, t.test(x=birthwt.grams[mother.smokes=="no"], 
                     y=birthwt.grams[mother.smokes=="yes"]))

We can access individual pieces of information as follows:

In [None]:
names(birthwt.t.test)


In [None]:
birthwt.t.test$p.value   # p-value

In [None]:
birthwt.t.test$estimate  # group means
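
The other components returned by `t.test` can be pulled out the same way. The sketch below refits the test on the raw `MASS::birthwt` columns so that it is self-contained; `tt` and `diff.means` are names made up for this example.

```r
# Refit the test on the raw data (smoke: 0 = non-smoker, 1 = smoker)
tt <- t.test(bwt ~ smoke, data = MASS::birthwt)

tt$statistic   # t-statistic
tt$parameter   # degrees of freedom
tt$conf.int    # 95% confidence interval for the difference in means

# Difference in group means: non-smokers minus smokers
diff.means <- unname(tt$estimate[1] - tt$estimate[2])
diff.means
```

The confidence interval is for the difference in means, so `diff.means` should fall inside it.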

## Non-parametric Case

If the data do not follow a normal distribution, the t-test is still approximately valid when the sample size is large, because the sample mean is then approximately normally distributed. Otherwise, you should run a **non-parametric test**, which does not assume normality.

We can use the **Wilcoxon rank-sum test** in that case. 


In [None]:
birthwt.wilcox.test <- wilcox.test(birthwt.grams ~ mother.smokes, data=birthwt, conf.int=TRUE)
birthwt.wilcox.test

In [None]:
# OR 

with(birthwt, wilcox.test(x=birthwt.grams[mother.smokes=="no"], 
                     y=birthwt.grams[mother.smokes=="yes"]))

Again, the small p-value shows that we can reject the null hypothesis.

**Keep in mind** that this test can be used for independent samples or paired samples. For **paired samples** it is called the **Wilcoxon signed-rank test**, but you still call the same function, with the `paired=TRUE` option.
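
As a sketch of the paired case, the example below uses R's built-in `sleep` data (the same data used in the paired t-test section of this notebook), where each patient has one measurement per group. The names `g1`, `g2` and `sleep.wilcox` are illustrative, and `exact = FALSE` is a choice made here because the paired differences contain ties.

```r
# Paired measurements: same patients, two groups
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

# paired = TRUE switches to the signed-rank test;
# exact = FALSE requests the normal approximation (the differences have ties)
sleep.wilcox <- wilcox.test(g1, g2, paired = TRUE, exact = FALSE)
sleep.wilcox
```

The printed header should name the signed-rank test rather than the rank-sum test.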



**To check whether our data follow a normal distribution, we can make a q-q plot.**


In [None]:
p.birthwt <- ggplot(data = birthwt, aes(sample = birthwt.grams))

p.birthwt + stat_qq() + stat_qq_line()

In [None]:
# Separate plots for different values of smoking status
p.birthwt + stat_qq() + stat_qq_line() + facet_grid(. ~ mother.smokes)


This is not too bad: there is no large deviation from the normal distribution.

Here’s what we would see if the data were **right-skewed.**

In [None]:
set.seed(12345)
fake.data <- data.frame(x = rexp(200))
p.fake <- ggplot(fake.data, aes(sample = x))
qplot(x, data = fake.data)

In [None]:
p.fake + stat_qq() + stat_qq_line()


If a q-q plot shows this type of deviation from the line, be careful about relying on the normality assumption.

---

### Categorical Variables

If we have categorical variables as opposed to continuous ones, we can use **Fisher's exact test** or the **Chi-squared test**.

In [None]:
# First create a contingency table of low birth weight by smoking status:

weight.smoke.tbl <- with(birthwt, table(birthwt.below.2500, mother.smokes))
weight.smoke.tbl

In [None]:
# Use Fisher's exact test 
birthwt.fisher.test <- fisher.test(weight.smoke.tbl)
birthwt.fisher.test

This also shows a significant association between low birth weight and the mother's smoking status.

We can also use the Chi-squared test: 

In [None]:
chisq.test(weight.smoke.tbl)

You get essentially the same answer by running the Chi-squared test, but the output is less informative: you do not get an estimate or a confidence interval for the odds ratio.
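
To make the odds ratio concrete, here is a rough sketch that computes the sample (cross-product) odds ratio and a Wald-type 95% interval directly from the 2x2 table. It uses the raw `MASS::birthwt` columns (`low`, `smoke`) so that it runs on its own; `tbl`, `odds.ratio` and `wald.ci` are names made up for this example. Note that `fisher.test` reports a conditional maximum likelihood estimate, so its value is close to, but not identical to, the cross-product ratio.

```r
bw <- MASS::birthwt
tbl <- table(low = bw$low, smoke = bw$smoke)   # rows: low (0/1), cols: smoke (0/1)

# Cross-product odds ratio: odds of low birth weight, smokers vs non-smokers
odds.ratio <- (tbl["0", "0"] * tbl["1", "1"]) / (tbl["0", "1"] * tbl["1", "0"])

# Wald 95% interval on the log-odds scale, then back-transformed
se.log.or <- sqrt(sum(1 / tbl))
wald.ci <- exp(log(odds.ratio) + c(-1, 1) * qnorm(0.975) * se.log.or)

odds.ratio
wald.ci
fisher.test(tbl)$estimate   # conditional MLE, for comparison
```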

---

### Paired T-test 

Here is an example of a paired t-test. Take a look at the data:

In [None]:
help(sleep)

In [None]:
sleep

As you can see, the two groups are **NOT** independent. They are actually the **same** 10 patients, and each patient received both drugs: `extra` is the increase in hours of sleep under drug 1 (`group == 1`) and under drug 2 (`group == 2`). We have to use a **paired t-test** here.

In [None]:
plot(extra ~ group, data = sleep)

In [None]:
sleep.t <- with(sleep,
     t.test(extra[group == 1],
            extra[group == 2], paired = TRUE))
sleep.t

In [None]:
## The sleep *prolongations*
sleep1 <- with(sleep, extra[group == 2] - extra[group == 1])
summary(sleep1)
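
A paired t-test is the same as a one-sample t-test on the within-subject differences. The sketch below checks this on the `sleep` data; `g1`, `g2`, `paired.t` and `onesample.t` are names introduced here for illustration.

```r
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

# Paired test on the two vectors
paired.t <- t.test(g1, g2, paired = TRUE)

# One-sample test on the differences, with H0: mean difference = 0
onesample.t <- t.test(g1 - g2, mu = 0)

c(paired = unname(paired.t$statistic),
  one.sample = unname(onesample.t$statistic))
```

Both calls should report the same t-statistic, degrees of freedom and p-value.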

**Take a look** at the following outputs. In each case, would you reject the null hypothesis?

In [None]:

# H1: group1 slept more than group2 

sleep.t2 <- with(sleep,
     t.test(extra[group == 1],
            extra[group == 2], paired = TRUE, alternative = "greater"))
sleep.t2

In [None]:

# H1: group1 slept less than group2 


sleep.t3 <- with(sleep,
     t.test(extra[group == 1],
            extra[group == 2], paired = TRUE, alternative = "less"))
sleep.t3

### Another example: 

Suppose that the manufacturer claims that the mean lifetime of a light bulb is more than 10,000 hours. In a sample of 30 light bulbs, it was found that they only last 9,900 hours on average. Assume the sample standard deviation is 125 hours. At .05 significance level, can we reject the claim by the manufacturer?

The null hypothesis is $H_0: \mu \ge 10000$. This is what the manufacturer claims.
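
Written out, this is a one-sided, one-sample test. Here $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size and $\mu_0 = 10000$ the hypothesized value (the symbol $\mu_0$ is notation introduced for this summary):

$$
H_0: \mu \ge \mu_0, \qquad H_1: \mu < \mu_0, \qquad t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
$$

Under $H_0$ with $\mu = \mu_0$, the statistic $t$ follows a t distribution with $n - 1$ degrees of freedom, so we reject $H_0$ at level $\alpha$ when $t$ falls below the lower-tail critical value $t_{\alpha,\, n-1}$.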

**YOUR TURN:** Find the t-statistic below: 


In [None]:
xbar = <YOUR CODE HERE>   # sample mean
mu0 = <YOUR CODE HERE>    # hypothesized mean
s = <YOUR CODE HERE>      # sample standard deviation
n = <YOUR CODE HERE>      # sample size

t = (xbar - mu0) / (s / sqrt(n))
t


Compute the critical value at alpha=0.05

In [None]:
help(qt)

In [None]:
alpha = 0.05

t.alpha = <YOUR CODE HERE>  # use qt() function similar to qnorm() function 

The test statistic -4.3818 is less than the critical value of -1.6991. Hence, at .05 significance level, we can reject the claim that mean lifetime of a light bulb is above 10,000 hours.
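
As a reminder of how `qt()` relates to `qnorm()`, here is a small sketch with made-up values that differ from the exercise (`alpha.demo` and `df.demo` are hypothetical). Both functions take a lower-tail probability; `qt()` additionally needs the degrees of freedom.

```r
alpha.demo <- 0.01   # hypothetical significance level
df.demo <- 10        # hypothetical degrees of freedom

z.crit <- qnorm(alpha.demo)              # lower-tail critical value, standard normal
t.crit <- qt(alpha.demo, df = df.demo)   # lower-tail critical value, t distribution
c(z = z.crit, t = t.crit)

# pt() is the inverse of qt(): this recovers alpha.demo
pt(t.crit, df = df.demo)
```

Because the t distribution has heavier tails, its critical value lies further from zero than the normal one.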

Alternatively, instead of using the critical value, we apply the pt function to compute the lower tail p-value of the test statistic.

In [None]:
pval = pt(t, df=n-1) # lower tail p-value

pval

---

**YOUR TURN:**

For the following data set, find out if there was an **improvement** in the scores of the subjects. Note that you have to create groups to use the t.test() function. 


This data seems to be about the same group of subjects and has a column `score1` and a column `score2` and is asking if there is an improvement. So we can assume that this is a paired t-test. $H_0$ should be "no improvement"; how do you write it mathematically? Do you do a one-tailed or two-tailed test? 

In [None]:
subject = seq(1,11)
score1 = c(3,3,3,12,15,16,17,19,23,24,32)
score2 = c(20,13,13,20,29,32,23,20,25,15,30)
df <-data.frame(cbind(subject,score1,score2))
head(df)

In [None]:
<YOUR CODE HERE> 

---
### Two-sample T-test: 

The following data contain observations on the degree of polymerization for paper specimens for which viscosity times concentration fell in a certain middle range, x=(418,421,421,422,425,427,431,434,437,439,446,447,448,453,454,463,465), and a higher range, y=(429,430,430,431,436,437,440,441,445,446,447).

We want to test whether the observations in the middle range and the higher range come from populations with different means, at significance level alpha=0.05.



Run a **two-sample t.test** to see if they are from different populations. HINT: We are looking for a difference of means of two samples. So, it should be two-tailed. 

In [None]:
x = c(418,421,421,422,425,427,431,434,437,439,446,447,448,453,454,463,465)   
y = c(429,430,430,431,436,437,440,441,445,446,447)

In [None]:
help(t.test)

In [None]:

t.test(x, y, alternative="two.sided", mu=0, var.equal=FALSE, conf.level=0.95) 

**What is your interpretation?** Why do we need to use `var.equal`? 
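
As a hint, the sketch below compares the two settings of `var.equal` on simulated data (not the exercise data). The samples `a` and `b`, their sizes and their standard deviations are all made up for illustration.

```r
set.seed(1)
a <- rnorm(20, mean = 0, sd = 1)   # hypothetical sample with small spread
b <- rnorm(8,  mean = 0, sd = 5)   # hypothetical sample with large spread

welch  <- t.test(a, b, var.equal = FALSE)   # Welch test: separate variances
pooled <- t.test(a, b, var.equal = TRUE)    # classic test: pooled variance

# Degrees of freedom used by each version
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
```

The pooled test always uses n1 + n2 - 2 degrees of freedom, while the Welch test adjusts the degrees of freedom downward when the sample variances differ.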