# Uncertainty and Statistical Significance (Cont.)

### PS 3 Week 7 Discussion - Yue Lin


--------

Today, we'll look at the data on final scores in a large undergraduate class where students were randomly assigned to attend tutoring. The final scores for all of the students in the course were recorded and participation in tutoring was noted. 

-------

**Data Dictionary/Codebook**

`Mentored`: `TRUE` = student participated in tutoring, `FALSE` = student did not participate in tutoring

`Final`: student score on final

--------


In [None]:
# Run this cell
library(estimatr)
scores <- read.csv("final_scores.csv")
head(scores)

## Uncertainty

How did students do on the Final? Let's have a look at the average score and standard deviation in the class.

In [None]:
# mean
NULL

In [None]:
# SD
NULL

Imagine that I can only get data for 50 students: I randomly sample 50 students and ask them for their final scores. We now treat the entire dataset (the whole class) as the population and these 50 students as a sample.

In [None]:
# Just run the code below; you do not need to understand it

set.seed(101) # set.seed to keep the result consistent across laptops
our_sample <- scores[sample(1:nrow(scores), 50, replace = FALSE),] 
hist(our_sample$Final, col = "gray")
abline(v = mean(our_sample$Final), col = "red", lwd = 3) 
# draw a red line to show the mean Final score in the 50-student sample

In [None]:
# What's the mean of the Final score for these 50 students?
NULL

# What's the standard deviation of the Final score for these 50 students?
NULL

Compare this to the mean of all of the students in the dataset! 

If we drew a sample of 50 students like this 1000 times, what sample means might we get? Let's take a look at the first 10:

In [None]:
# Just run the code below; you do not need to understand it
samples <- list()
sample_means <- array()

for(i in c(1:1000)){
    samples[i] <- list(scores[sample(1:nrow(scores), 50, replace = FALSE),]) # sample 50 students without replacement
    sample_means[i] <- mean(samples[[i]]$Final) # calculate their average score
}

head(sample_means, 10)

In [None]:
# Here's the range of outcomes we got with 1000 different samples:
min(sample_means)
max(sample_means)

**Uncertainty** is the amount of error/noise in an estimate. Here, it shows up as how much the sample mean changes from one sample of 50 students to the next.
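
To make this concrete, here is a minimal sketch on made-up data (the `population` vector and its uniform shape are assumptions for illustration, not the class data). It draws many samples of 50 and compares the spread of the sample means with the usual rule of thumb, the population standard deviation divided by the square root of the sample size. The two numbers should come out close, though not identical.

```r
set.seed(1)

# Hypothetical population of 2000 scores (made up; NOT the class data)
population <- runif(2000, min = 40, max = 100)
n <- 50

# Draw 1000 samples of size n and record the mean of each one
sim_means <- replicate(1000, mean(sample(population, n, replace = FALSE)))

sd(sim_means)             # how much the sample means vary from sample to sample
sd(population) / sqrt(n)  # rule-of-thumb prediction for that spread
```

The smaller this spread, the less a single sample mean is expected to miss the population mean by.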

In [None]:
# Let's have a look at the histogram
hist(sample_means)
abline(v = mean(scores$Final), col = "red", lwd = 3) # draw a red line at the population mean; the sample means cluster around it

# Recall: whatever the shape of the population distribution, the sampling distribution (i.e., the distribution of the sample means) is approximately Normal when the sample size is large enough.

## Standard Error

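How uncertain is the difference in mean Final scores between the tutored and non-tutored students? Let's find the standard error of that difference using the first sample of Final scores we took (`our_sample$Final`). In the formula below, group 1 is the students who were not tutored and group 2 is the students who were tutored; $s$ is a group's standard deviation and $n$ is its size.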

$$\text{Standard Error} = \sqrt{\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}}$$

In [None]:
s1 <- ... #standard dev for not tutored
s2 <- ... #standard dev for tutored
n1 <- ... #sample size of not tutored, using the length function
n2 <- ... #sample size of tutored, using the length function

se <- ... #mathematically calculate using the formula above
se

What does the value we got for the standard error tell us?
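
As a worked illustration of the formula, here is a small sketch on made-up data. The data frame `toy`, its columns `tutored` and `score`, and the group sizes are all hypothetical and are not taken from `final_scores.csv`.

```r
set.seed(2)

# Made-up scores for two hypothetical groups (NOT the class data)
toy <- data.frame(
  tutored = rep(c(FALSE, TRUE), times = c(28, 22)),
  score   = c(rnorm(28, mean = 70, sd = 10), rnorm(22, mean = 75, sd = 10))
)

score_0 <- toy$score[!toy$tutored]  # group 1: not tutored
score_1 <- toy$score[toy$tutored]   # group 2: tutored

# Standard error of the difference in means: sqrt(s1^2 / n1 + s2^2 / n2)
toy_se <- sqrt(sd(score_0)^2 / length(score_0) + sd(score_1)^2 / length(score_1))
toy_se
```

The result is in the same units as the scores: it is roughly how far the difference in group means would typically move if we drew a new sample.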

## Statistical Significance

Back to our original question: Did receiving tutoring affect student performance on the Final? 

**Null Hypothesis:** Tutoring had no effect on Final scores: on average, scores for students who did not receive tutoring were the same as scores for students who did receive tutoring.  
**Alternative Hypothesis:** Tutoring did have an effect on Final scores: on average, scores for students who did not receive tutoring were different from scores for students who did receive tutoring.

### t-statistic

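The t-statistic tells us how many standard errors our estimate is away from zero, the value we would expect if there were no treatment effect. The larger its absolute value, the less likely it is that an estimate of the size we saw would arise by chance alone.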

Let's compare the average Final score for the students that were tutored in our sample of 50 students to the sampling distribution of means if there were no treatment effect (not tutored).

In [None]:
#Find the average Final performance for the students that were tutored in our sample
mean_sample_tutored = NULL
mean_sample_tutored #our estimate from the sample

In [None]:
#Find the mean Final score for the students who were not tutored in each sample
#Run this cell; you do not need to understand the code here

no_effect_means <- array() 
no_effect_scores <- numeric(0)
for(i in c(1:1000)){
    no_effect_means[i] <- mean(samples[[i]][!samples[[i]]$Mentored,]$Final)
    no_effect_scores = c(no_effect_scores, samples[[i]][!samples[[i]]$Mentored,]$Final)
}

In [None]:
#Visualize our results
hist(no_effect_means, col = "gray")#, xlim = range(40, 75))
abline(v = mean_sample_tutored, col = "red", lwd = 3) # draw a red line to show the mean of the tutored group, and compare

**What does the histogram above tell us about the hypotheses?** 

Let's confirm by finding the t-statistic.

$$ t = \frac{\text{estimate}}{\text{SE}} $$

The estimate in this case is the difference in the mean Final score between the two groups (tutored minus not tutored).

In [None]:
# We already have the mean for the tutored students (mean_sample_tutored)
# First, find the mean for the students who were not tutored in our sample
mean_sample_not_tutored <- ...

estimate <- ...
t_statistic <- ...

estimate #print estimate
t_statistic #print t-statistic

Is this t-statistic larger than 1.96 or smaller than -1.96?
What does it tell us about the effect of our treatment?
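
The sketch below walks through the arithmetic with made-up numbers (`example_estimate` and `example_se` are hypothetical values, not results from the class data). It also shows where the 1.96 cutoff comes from under a Normal approximation, which is an assumption that works best with reasonably large samples.

```r
# Hypothetical numbers chosen for illustration (NOT results from the class data)
example_estimate <- 5  # difference in mean scores, in points
example_se       <- 2  # standard error of that difference

example_t <- example_estimate / example_se
example_t                # 2.5: the estimate is 2.5 standard errors away from zero
abs(example_t) > 1.96    # TRUE: beyond the cutoff for the 5% level

# The cutoff is the 97.5th percentile of the standard Normal distribution
qnorm(0.975)

# Two-sided p-value under the Normal approximation (about 0.012 here)
example_p <- 2 * pnorm(-abs(example_t))
example_p
```

A t-statistic beyond ±1.96 goes hand in hand with a p-value below 0.05, so the two checks lead to the same conclusion.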

### Using the difference_in_means() function

Today's exercise so far was meant to show the intuition behind these statistical calculations. In practice, we'll use the `difference_in_means()` function from the `estimatr` package to calculate the effect, standard error, t-statistic, and p-value.
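
To see that a by-hand calculation and a packaged routine agree, here is a sketch on made-up data (the data frame `toy`, the 0/1 indicator `z`, and the outcome `y` are hypothetical names, not columns of `scores`). It checks the manual t-statistic against base R's Welch t-test, which uses the same standard error formula. If `estimatr` is installed, it also prints the `difference_in_means()` output, which should report comparable numbers for a simple two-group comparison like this one.

```r
set.seed(3)

# Made-up data: z is a hypothetical 0/1 treatment indicator, y a hypothetical outcome
toy <- data.frame(z = rep(c(0, 1), each = 25))
toy$y <- 70 + 4 * toy$z + rnorm(50, sd = 10)

y0 <- toy$y[toy$z == 0]  # control group
y1 <- toy$y[toy$z == 1]  # treated group

# By hand: estimate, standard error, t-statistic
manual_estimate <- mean(y1) - mean(y0)
manual_se <- sqrt(var(y1) / length(y1) + var(y0) / length(y0))
manual_t <- manual_estimate / manual_se

# Base R's Welch t-test (treated minus control)
welch <- t.test(y1, y0)
c(manual = manual_t, t_test = unname(welch$statistic))

# Optional: the same comparison with estimatr, if the package is available
if (requireNamespace("estimatr", quietly = TRUE)) {
  print(estimatr::difference_in_means(y ~ z, data = toy))
}
```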

**Let's compare the results we calculated for `our_sample` to the results evaluated using `difference_in_means()`**

In [None]:
...

**Now let's take a look at the `difference_in_means` for the entire `scores` dataset.**

In [None]:
...

**Let's interpret the results:** 

What is the treatment effect? Standard Error? t-statistic? p-value?

What do the t-statistic and p-value tell us about the treatment effect? Why? 
