# Estimation and Hypothesis Testing

In [1]:
library(tidyverse)

── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 2.2.1     ✔ purrr   0.2.5
✔ tibble  1.4.2     ✔ dplyr   0.7.5
✔ tidyr   0.8.1     ✔ stringr 1.3.1
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()


In [2]:
options(repr.plot.width=4, repr.plot.height=3)

## Set random number seed for reproducibility

In [3]:
set.seed(42)
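As a minimal sketch of what seeding buys us, the cell below re-seeds with the same value and checks that the draws repeat. The names `first_draw` and `second_draw` are illustrative and not part of the original notebook; the final `set.seed(42)` is there only so that the cells that follow see the same random stream as before.

```r
# Re-seeding with the same value reproduces the same draws.
set.seed(42)
first_draw <- rnorm(3)

set.seed(42)
second_draw <- rnorm(3)

identical(first_draw, second_draw)  # TRUE

# Restore the seed so later cells are unaffected by this demonstration.
set.seed(42)
```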

## Point estimates

In [4]:
x <- rnorm(10)

In [5]:
x

### Mean

Manual calculation

In [6]:
sum(x)/length(x)

Using built-in function

In [7]:
mean(x)

### Median

Manual calculation

In [8]:
x_sorted <- sort(x)

In [9]:
length(x)

Since there is an even number of observations, we need the average of the middle two data points.

In [10]:
sum(x_sorted[5:6])/2

Using built-in function

In [11]:
median(x)
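The manual steps above assume an even number of observations. A sketch that also covers the odd case might look like the following; `manual_median`, `odd_demo`, and `even_demo` are hypothetical names introduced here for illustration.

```r
# Hypothetical helper: median by sorting and picking the middle value(s).
manual_median <- function(v) {
  v_sorted <- sort(v)
  k <- length(v_sorted)
  if (k %% 2 == 1) {
    # Odd length: the single middle element.
    v_sorted[(k + 1) / 2]
  } else {
    # Even length: average of the two middle elements.
    mean(v_sorted[c(k / 2, k / 2 + 1)])
  }
}

odd_demo <- c(7, 1, 5)      # illustrative data
even_demo <- c(7, 1, 5, 3)  # illustrative data

manual_median(odd_demo)   # 5
manual_median(even_demo)  # 4
```

Both results should agree with the built-in `median()` on the same vectors.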

### Quantiles

The median is just the 50th percentile. We can use R to get any percentile we like.

In [12]:
quantile(x, 0.5)

In [13]:
quantile(x, seq(0,1,length.out = 5))
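To make the link between quantiles and the earlier summaries concrete, here is a small sketch on a made-up vector (`q_demo` is illustrative, not from the notebook). It relies on R's default quantile definition (`type = 7`); other `type` values can give different answers for the inner quantiles.

```r
q_demo <- c(1, 3, 5, 7)  # illustrative data, not from the notebook

# The 0.5 quantile coincides with the median.
all.equal(quantile(q_demo, 0.5, names = FALSE), median(q_demo))  # TRUE

# Probabilities 0, 0.25, 0.5, 0.75, 1 give the minimum, the three
# quartiles, and the maximum.
five_num <- quantile(q_demo, seq(0, 1, length.out = 5), names = FALSE)
five_num[1] == min(q_demo)  # TRUE
five_num[5] == max(q_demo)  # TRUE
```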

## Interval estimates

### Confidence intervals

In [23]:
ci <- 0.95

In [24]:
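alpha <- 1 - ci                        # significance level
n <- length(x)                         # sample size
m <- mean(x)                           # sample mean
s <- sd(x)                             # sample standard deviation
se <- s/sqrt(n)                        # standard error of the mean
me <- qt(1-alpha/2, df=n-1) * se       # margin of error from the t distribution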
c(m - me, m + me)
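Written out, the calculation in the cell above corresponds to the usual t-based interval for a mean, where $\bar{x}$ is `m`, $s$ is `s`, $n$ is `n`, and $t_{1-\alpha/2,\,n-1}$ is the value returned by `qt(1-alpha/2, df=n-1)`. The interval assumes the observations are independent draws from an approximately normal population.

$$
\bar{x} \;\pm\; t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
$$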

Note that confidence intervals get wider as the required confidence level increases.

In [25]:
ci <- 0.99

In [26]:
alpha <- 1 - ci
n <- length(x)
m <- mean(x)
s <- sd(x)
se <- s/sqrt(n)
me <- qt(1-alpha/2, df=n-1) * se
c(m - me, m + me)

### Making a function

In [27]:
conf <- function(x, ci = 0.95) {
    alpha <- 1 - ci
    n <- length(x)
    m <- mean(x)
    s <- sd(x)
    se <- s / sqrt(n)
    me <- qt(1 - alpha / 2, df = n - 1) * se
    c(m - me, m + me)
}

In [28]:
conf(x)
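One way to sanity-check a hand-rolled interval is to compare it with the one reported by the built-in `t.test()`. The sketch below repeats the same calculation on a small made-up vector (`ci_demo` and the other `*_demo` names are illustrative) so that it runs on its own.

```r
ci_demo <- c(4.9, 5.1, 5.6, 4.7, 5.3, 5.0)  # illustrative data

# Manual 95% t-based interval, same steps as the function above.
n_demo <- length(ci_demo)
me_demo <- qt(0.975, df = n_demo - 1) * sd(ci_demo) / sqrt(n_demo)
manual_ci <- c(mean(ci_demo) - me_demo, mean(ci_demo) + me_demo)

# Interval reported by the built-in one-sample t-test.
builtin_ci <- as.numeric(t.test(ci_demo, conf.level = 0.95)$conf.int)

all.equal(manual_ci, builtin_ci)  # TRUE
```

In the notebook itself, `t.test(x)$conf.int` should likewise agree with `conf(x)`.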

### Coverage

In 1,000 experiments, we expect the true mean (0) to lie within the estimated 95% CI about 950 times. The exact count varies from run to run.

In [46]:
n_expt <- 1000
n <- 10
cls <- t(replicate(n_expt, conf(rnorm(n))))

In [47]:
sum(cls[,1] < 0 & 0 < cls[,2])
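How far from 950 should the observed count be allowed to stray? Assuming the experiments are independent and each interval covers the true mean with probability 0.95, the count follows a binomial distribution. The sketch below works out its mean and spread without drawing any new random numbers; the names `n_expt_demo`, `nominal`, `expected_hits`, and `sd_hits` are illustrative.

```r
n_expt_demo <- 1000   # number of simulated experiments
nominal <- 0.95       # nominal coverage of each interval

# Mean and standard deviation of the number of intervals covering the truth.
expected_hits <- n_expt_demo * nominal                   # 950
sd_hits <- sqrt(n_expt_demo * nominal * (1 - nominal))   # about 6.9

# Central 95% range for the count under the binomial model.
qbinom(c(0.025, 0.975), size = n_expt_demo, prob = nominal)
```

A count within a dozen or so of 950 is therefore unremarkable. Replacing `sum()` with `mean()` in the cell above reports the same result as a proportion, which is easier to compare with the nominal 0.95.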

## Hypothesis testing