# W7: Uncertainty and P-Values - Solutions

This week we are going to focus on interpreting standard errors, t-statistics, and p-values. By the end of this notebook, you should have a better understanding of what p-values mean and do not mean. 

We will use data from a study in which students were randomly assigned to tutoring. The dataset has two main variables: 
- **`Mentored`**: `TRUE` if the student participated in tutoring, `FALSE` if the student did not participate in tutoring
- **`Final`**: Each student's final score in the course 

In [None]:
#begin by loading the estimatr package and the dataset 
library(estimatr) 
data <- read.csv("final_scores.csv")
head(data)

To explain the relationship between the standard deviation and the standard error, I will first show you what a sampling distribution is. In the code chunks below, calculate the mean and standard deviation of the final scores. 

In [None]:
mean.score <- mean(data$Final) # average final score
mean.score 

sd.score <- sd(data$Final) # standard deviation of final scores
sd.score 

The standard deviation tells us the spread of the final scores in the class whereas the mean tells us the average score in the class. 

In [None]:
# you do not have to understand the code below; run the cells and I will explain the output 
library(tidyverse)
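set.seed(94608) # fixes the random draws so everyone gets the same results
# draw 100 rows (with replacement) from x and return the mean final score
mean.function <- function(x){
    resample <- sample_n(x, 100, replace = TRUE)
    mean.sample <- mean(resample$Final)
    return(mean.sample)
}
# repeat the draw 1000 times to build the sampling distribution
sample.means <- replicate(1000, mean.function(data))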

hist(sample.means, 
    main = "Sampling Distribution of Sample Means",
    xlab = "Sample Mean",
    ylab = "Frequency")
abline(v = mean(sample.means), col = "red")

std.error <- sd(sample.means)
std.error

mean.sampling.dist <- mean(sample.means) 
mean.sampling.dist

What I did here is: 
- I created a function that draws a sample of size 100 (with replacement) from the students' final scores and calculates the mean of that sample
- I then repeated the process 1000 times and stored the results in `sample.means` (1000 sample means)
- I plotted the distribution of those sample means - this is called a sampling distribution
- I then took the standard deviation of that sampling distribution - this is called the standard error

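Compare the mean of the sampling distribution (`mean.sampling.dist`) to the mean in our data (`mean.score`). What do you observe?
- The two values should be nearly identical. Each sample is drawn from `data`, so the sample means center on the mean of the data.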
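
The link between the standard deviation and the standard error can also be checked on simulated data. The sketch below is a minimal illustration, assuming a made-up vector of scores (`scores` and its parameters are hypothetical, not the course data) and using base R's `sample()` instead of `sample_n()`:

```r
set.seed(1)
# hypothetical population of 5000 scores; NOT the course dataset
scores <- rnorm(5000, mean = 75, sd = 10)
n <- 100

# draw 1000 samples of size n with replacement and keep each sample's mean
sim.means <- replicate(1000, mean(sample(scores, n, replace = TRUE)))

mean(scores)            # mean of the data
mean(sim.means)         # mean of the sampling distribution
sd(sim.means)           # simulated standard error
sd(scores) / sqrt(n)    # analytical standard error of the mean
```

The first two numbers should be close to each other, and so should the last two: the standard error is roughly the standard deviation divided by the square root of the sample size.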

## Standard Error

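What is the standard error of the difference in mean final scores between tutored and untutored students? Let's calculate it from the sample of final scores we draw below (`our.sample`), where group 1 is the tutored students and group 2 is the untutored students.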

$$\text{Standard Error} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
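
As a quick worked example with made-up numbers (assumed for illustration, not taken from the dataset), suppose $s_1 = 10$, $n_1 = 50$, $s_2 = 12$, and $n_2 = 50$:

$$\text{Standard Error} = \sqrt{\frac{10^2}{50} + \frac{12^2}{50}} = \sqrt{2 + 2.88} = \sqrt{4.88} \approx 2.21$$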

In [None]:
set.seed(94608)

our.sample <- sample_n(data, 100, replace = TRUE) 

#calculate the following using the sd() and length() functions
s1 <- sd(our.sample$Final[our.sample$Mentored == TRUE]) #standard dev for tutored
s1
s2 <- sd(our.sample$Final[our.sample$Mentored == FALSE]) #standard dev for not tutored
s2
n1 <- length(our.sample$Final[our.sample$Mentored == TRUE]) #sample size of tutored, using the length function
n1
n2 <- length(our.sample$Final[our.sample$Mentored == FALSE]) #sample size of not tutored, using the length function
n2

#mathematically calculate using the formula above 
#hint: use sqrt() to calculate the square root 
se <- sqrt((s1^2/n1) + 
               (s2^2/n2)) 
se

## Statistical Significance

Let's say we want to know whether assignment to tutoring improved students' final scores in the course. 

What is our null hypothesis? What is our alternative hypothesis? 

- **Null hypothesis:** There is no difference in mean final scores between students who were tutored and students who were not.
- **Alternative hypothesis:** There is a difference in mean final scores between students who were tutored and students who were not.

First, manually calculate the difference in means, the standard error (using the formula above), and the t-statistic for the final scores between students who were and were not tutored in the course. 

In [None]:
diff.in.means <- mean(data$Final[data$Mentored == TRUE]) - mean(data$Final[data$Mentored == FALSE])
se.dim <- sqrt((sd(data$Final[data$Mentored == TRUE])^2/length(data$Final[data$Mentored == TRUE])) + 
               (sd(data$Final[data$Mentored == FALSE])^2/length(data$Final[data$Mentored == FALSE])))
t.stat <- diff.in.means/se.dim

diff.in.means
se.dim
t.stat

Now, using the `difference_in_means()` function, check your calculations. You should get the same difference in means (estimate), standard error, and t-statistic. 

In [None]:
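# condition1 is the comparison group and condition2 the tutored group, so the estimate is TRUE minus FALSE
difference_in_means(Final ~ Mentored, data = data, condition1 = FALSE, condition2 = TRUE) 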
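
The manual formula can also be cross-checked without `estimatr`. The sketch below assumes simulated data (the `toy` data frame is hypothetical, standing in for `final_scores.csv`) and uses base R's `t.test()`, whose default Welch test uses the same unequal-variance standard error as the formula above. The `stderr` element of the result is available in R 3.6.0 and later.

```r
set.seed(2)
# hypothetical data standing in for final_scores.csv
toy <- data.frame(
  Mentored = rep(c(TRUE, FALSE), each = 50),
  Final = c(rnorm(50, mean = 80, sd = 8), rnorm(50, mean = 75, sd = 10))
)

tutored <- toy$Final[toy$Mentored]
untutored <- toy$Final[!toy$Mentored]

# manual calculation, same formula as above
dim.manual <- mean(tutored) - mean(untutored)
se.manual <- sqrt(var(tutored) / length(tutored) + var(untutored) / length(untutored))
t.manual <- dim.manual / se.manual

# Welch two-sample t-test (var.equal = FALSE is the default)
welch <- t.test(tutored, untutored)

# the manual and built-in values should agree
c(manual = t.manual, t.test = unname(welch$statistic))
c(manual = se.manual, t.test = welch$stderr)
```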

We observe a p-value of nearly 0. What does this mean? 
- It means that, if the null hypothesis were true, there would be a nearly 0 percent chance of observing a difference in means at least as extreme as the one we observed. It does not mean that there is a nearly 0 percent chance that the null hypothesis is true.
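
One way to make "if the null hypothesis were true" concrete is a permutation test, which is a different technique from the t-test that `difference_in_means()` reports. The sketch below assumes simulated scores (`final` and `mentored` are hypothetical, not the course data): it shuffles the tutoring labels many times, which is what the data would look like if tutoring made no difference, and counts how often the shuffled difference is at least as extreme as the observed one.

```r
set.seed(3)
# hypothetical scores: 40 tutored and 40 untutored students
final <- c(rnorm(40, mean = 80, sd = 10), rnorm(40, mean = 74, sd = 10))
mentored <- rep(c(TRUE, FALSE), each = 40)

observed <- mean(final[mentored]) - mean(final[!mentored])

# under the null, the labels are exchangeable: shuffle them and recompute the difference
null.diffs <- replicate(5000, {
  shuffled <- sample(mentored)
  mean(final[shuffled]) - mean(final[!shuffled])
})

# two-sided p-value: share of shuffled differences at least as extreme as the observed one
p.perm <- mean(abs(null.diffs) >= abs(observed))
observed
p.perm
```

A small `p.perm` says that differences as large as the observed one are rare when the labels carry no information, which is the same logic behind the p-value above.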

Can you think of another way to confirm your results? (**Hint**: Think about the t-statistic)
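- We can confirm these results by looking at the t-statistic, whose absolute value is much larger than 1.96, the approximate critical value for a two-sided test at the 5% significance level in large samples. 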
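
The comparison against 1.96 and the p-value are two views of the same decision. The sketch below is a minimal illustration, assuming a made-up t-statistic and degrees of freedom (`t.example` and `df.example` are hypothetical, not results from the dataset):

```r
# large-sample critical value for a two-sided test at the 5% level (about 1.96)
qnorm(0.975)

# hypothetical t-statistic and degrees of freedom, for illustration only
t.example <- 2.5
df.example <- 98

# critical value from the t distribution with df.example degrees of freedom
crit <- qt(0.975, df = df.example)

# two-sided p-value: probability of a t-statistic at least this far from 0 under the null
p.example <- 2 * pt(-abs(t.example), df = df.example)

abs(t.example) > crit   # TRUE means we reject the null at the 5% level
p.example < 0.05        # the same decision, expressed through the p-value
```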