# Lab week 11 - Outcomes


In this lab you should read through and run the code in the lab sheet and complete the lab assessment. By the end of this lab you should be able to use R to:

* Use two-sided confidence intervals to make a decision.
* Construct one-sided confidence intervals.
* Use the p-value to make a test decision.


Please run the following line of code to improve graph quality.



In [None]:
options(repr.plot.width=4, repr.plot.height=4, repr.plot.res = 120)

**Also run the next block of code please. It ensures that random sampling is reproducible. You need to re-run this block every time you restart the lab sheet!!!**

In [None]:
## You MUST run this code (every time you open the lab sheet).
aux <- version
if (((as.numeric(aux$major) >= 3) && (as.numeric(aux$minor) >= 6)) || (as.numeric(aux$major) >= 4)) {
  RNGkind(kind = "Mersenne-Twister", normal.kind = "Inversion", sample.kind = "Rounding")
} else {
  RNGkind(kind = "Mersenne-Twister", normal.kind = "Inversion")
}

set.seed(12345)
cat("1st check: 5 = ",sample(1:6,1),"\n",sep="")
cat("2nd check: 9 = ",ceiling(runif(1,0,10)),"\n",sep="")
cat("If the statements above are right, then it is ok.\n",sep="")
cat("\n",sep="")
cat("If you get a warning message as below, this is ok. \n   'Warning message in RNGkind(kind = ",'"',"Mersenne-Twister",'"',
    ", normal.kind = ",'"',"Inversion",'"',", :\n",'   "',"non-uniform 'Rounding' sampler used",'"',"\n",sep="")

# Hypothesis Testing via Confidence Intervals


Last week we learned how to construct confidence intervals. This week, we will learn how such confidence intervals enable us to make test decisions for hypothesis tests.


# Exercise 1.1: Test on the Mean via Confidence Interval

Let's go back to the "pregnancy" datafile, which amongst other information contains the mother's weight in pounds.
We will now build confidence intervals similar to last week to test the following hypothesis pair: $H_0: \mu = 126$ vs. $H_1: \mu \neq 126$, where $\mu$ is the population mean of pregnancy weights.
To do so we will:

- Treat the entire dataframe as the "original" sample.
- Bootstrap 2000 replications of the sample mean.
- Construct a confidence interval.
- Make a test decision.

Load the data file first and recall the following code for the first two steps from above.

In [None]:
pregnancy =  read.csv("week11-pregnancy.csv", header = T, sep = ",")
head(pregnancy)

In [None]:
set.seed(1400)

original = pregnancy$Maternal.Pregnancy.Weight


bootstrap_mean = function(replications)
{ 
  boot_mean = c()
  for(i in 1 : replications)
     { 
       bootstrap_sample = original[sample(length(original), size = length(original), replace = T)]
       boot_mean[i] = mean(bootstrap_sample)
     }
  return(boot_mean)    
}

result <- bootstrap_mean(2000)

The variable "result" now contains 2000 bootstrapped means of maternal pregnancy weight. Construct confidence intervals based on those 2000 means as indicated in your assignment, and make a decision regarding the above hypothesis pair based on those intervals. 

Hint: This decision can be made based on whether $\mu_0 = 126$ falls inside (do NOT reject $H_0$) or outside (reject $H_0$) the confidence interval.
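The decision rule in the hint can be sketched as follows. This is only an illustration under stated assumptions: `toy_sample` and `toy_result` are hypothetical stand-ins generated from made-up data (not the pregnancy file), and the 95% level is assumed here, so use whatever level your assignment specifies.

```r
# Hypothetical stand-in for the 2000 bootstrapped means (NOT the lab data).
set.seed(1)
toy_sample <- rnorm(200, mean = 128, sd = 20)
toy_result <- replicate(2000, mean(sample(toy_sample, replace = TRUE)))

# Assumed 95% level: cut off 2.5% in each tail of the bootstrap distribution.
ci <- quantile(toy_result, probs = c(0.025, 0.975))
ci

# Reject H0 only if the hypothesised value lies outside the interval.
mu_0 <- 126
reject_H0 <- (mu_0 < ci[1]) || (mu_0 > ci[2])
reject_H0
```

The same two lines (`quantile(...)` and the comparison with `mu_0`) apply unchanged to the vector `result` from the cell above.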

In [None]:
# use this block to construct confidence intervals.

# One-sided confidence intervals

Note that in Exercise 1.1 we tested whether the population mean of mothers' pregnancy weight is equal to ($H_0$) or unequal to ($H_1$) 126 pounds. Correspondingly, we had to construct a two-sided confidence interval. However, confidence intervals can also be one-sided, depending on the situation at hand. For example, if one is only interested in finding a lower limit for the parameter of interest, a "lower one-sided CI" $[LB,\infty)$ is helpful.
Analogously, $(-\infty, UB]$ is called an upper one-sided CI.
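A minimal sketch of both one-sided intervals from bootstrap replications, assuming a level of `alpha = 0.05` and using hypothetical, made-up delay data (`toy_sample`) rather than the lab file:

```r
# Hypothetical stand-in data: made-up delays in minutes (NOT the flight file).
set.seed(2)
toy_sample <- rexp(300, rate = 1 / 15)
toy_result <- replicate(2000, mean(sample(toy_sample, replace = TRUE)))

alpha <- 0.05  # assumed level; use the one given in your assignment

# Lower one-sided CI [LB, Inf): all of alpha is cut off in the lower tail.
LB <- quantile(toy_result, probs = alpha)

# Upper one-sided CI (-Inf, UB]: all of alpha is cut off in the upper tail.
UB <- quantile(toy_result, probs = 1 - alpha)

c(lower_bound = unname(LB), upper_bound = unname(UB))
```

Unlike the two-sided case, the full `alpha` sits in a single tail. The lower one-sided interval is the one to compare against $\mu_0$ when the alternative is $\mu > \mu_0$, and the upper one-sided interval when the alternative is $\mu < \mu_0$.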


# Exercise 1.2:

Repeat the process from Exercise 1.1 for the flight data from a couple of weeks ago. Treat the dataframe below as your sample which you bootstrap from. Test whether the true average delay is equal to 16 minutes.  

In [None]:
flights =  read.csv("week11-united_summer2015.csv", header = T, sep = ",")
head(flights)
nrow(flights)

In [None]:
set.seed(1400)

# use this block to repeat the bootstrap and find the one sided confidence intervals. Then make your test decision(s). 


# Exercise 1.3: Test on the Proportion via Confidence Interval

We will now aim to perform a hypothesis test on the true proportion $\pi$ of flights headed for Newark Liberty International Airport (EWR). This information can be accessed via the 'Destination' variable. We will use the subset command for this. As always, use `?subset` if you want to learn more about this command.
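As a quick illustration of how `subset` filters rows, here is a sketch on a tiny hypothetical data frame. The `Delay` column and all values are made up for this example; only the `Destination` column name mirrors the lab file.

```r
# Hypothetical toy data frame; the values are invented for illustration.
toy_flights <- data.frame(
  Destination = c("EWR", "LAX", "EWR", "ORD", "LAX"),
  Delay = c(5, 12, -3, 40, 0)
)

# Keep only the rows whose destination is EWR.
toy_newark <- subset(toy_flights, Destination == "EWR")

nrow(toy_newark)                      # 2 rows match
nrow(toy_newark) / nrow(toy_flights)  # sample proportion: 0.4
```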

In [None]:
summary(flights)

# number of flights
nrow(flights)

# number of flights that went to Newark (EWR)
Newark = subset(flights, flights$Destination == "EWR")
nrow(Newark)

# relative frequency
nrow(Newark)/nrow(flights)


As we can see, the sample proportion of flights headed for Newark is roughly 9.56%. Let's assume United Airlines claims that 10% of their domestic flights have Newark as their destination and that, motivated by the sample result, we want to falsify this claim, i.e. test whether the true proportion is unequal to 10%. In other words: Is the sample result of 9.56% different enough from 10% to conclude that the true proportion is not 10%? We will now repeat the process from above to build a confidence interval for the true proportion and test this assertion. That is, we will simulate the distribution of the sample proportion via 2000 bootstrapped replications based on the sample from above. Please fill in the blanks in the code below to do so.

In [None]:
# THIS CODE BLOCK WILL HAVE TO BE MANIPULATED BY YOU TO WORK!
# To do so, replace all "...j..." sections with the correct code for j=1,2,3,4.


set.seed(1400)

original = flights$Destination


bootstrap_proportion = function(replications)
{ 
  boot_proportion = c()
  for(i in 1 : replications)
     { 
       bootstrap_sample = original[sample(length(original), size = ...1... , replace = T)]
       EWR.subset = subset(...2..., bootstrap_sample == "EWR")
       boot_proportion[i] = length(...3...)/length(bootstrap_sample)
      
     }
  return(...4...)    
}

result <- bootstrap_proportion(2000)

The vector "result" now contains 2000 replicated proportions. If you like, create a histogram to visualise the distribution of those replications.

In [None]:
hist(result)

In [None]:
# use this block to find the confidence intervals based on your 2000 replications. Then make your test decision(s). 

# Test on the Proportion via p-value.

Instead of using confidence intervals to arrive at a test decision, we can alternatively use **p-values**, as you have learned in lecture. Study the corresponding lecture material, then take a look at the following example.

Do you remember the court example featuring Mike, who was unhappy with the composition of his jury panel?

Similarly to Mike, your imaginary friend Freddy feels betrayed by his jury composition. In particular, Freddy is quite upset about the supposedly random selection of the jury panel, which consists of 100 people. Freddy is 65 years old and feels that his chances of a "not-guilty" verdict depend on the number of senior citizens on the jury. Without further details about why he feels that way or the crime he committed, let us investigate whether Freddy is right to feel betrayed.
Freddy was told by his friend Francesca that exactly 32% of the population in his district were senior citizens, and he therefore felt lied to when he found out that only 24 of the 100 people on the jury panel were seniors.

To evaluate how likely or unlikely such an outcome really is under the given presumption that 32% of the population are seniors, we will follow the same steps as performed in lecture. First we take one random sample of the population described above.

In [None]:
# creating the categories
sample_space = c("Senior","NonSenior")

# taking a random sample  
sample_100 = sample(sample_space, size = 100, replace = T, prob = c(0.32, 0.68))

# getting the count of the categories 
summary.factor(sample_100)

# getting the proportion of the categories 
proportion_categories = summary.factor(sample_100)/100
proportion_categories

Clearly, the statistic of interest for our analysis is the sample proportion. If you run the code above multiple times, the output will vary since every simulation will have a different outcome. Recall that a statistic is computed from the sample and hence changes its value from sample to sample.
We now want to simulate such a sample 1000 times, record the number of seniors each time (equivalently, the sample proportion multiplied by 100), and report its distribution via a histogram to get an idea of how unlucky Freddy really was. Familiarise yourself with the following lines of code, then run them.

In [None]:
set.seed(1400)

sample_space = c("Senior","NonSenior")
samples <- c()

for (i in 1: 1000) {
    # as above code
    sample_100 = sample(sample_space, size = 100, replace = T, prob = c(0.32, (1-0.32)))

    # getting the count category
    samples[i] = sum(sample_100 == "Senior")
  }

hist(samples)
mean(samples)

Based on the histogram, do you think that Freddy is right to claim he was lied to, or did Francesca tell the truth and Freddy was just unlucky?

Now, let's use R to visualise how unlucky Freddy would have been if 32% was indeed the true proportion.

In [None]:
hist(samples, breaks = 30, xlab = "Number of Senior Citizens", main = "")

# showing observed value on the horizontal axis
observed_statistic = 24
points(observed_statistic, 0, pch = 16, col = "red")

# Exercise 2.1
We can calculate how many of our 1000 samples contained 24 or fewer seniors. That percentage is represented by the area at and to the left of the red dot in the above histogram. Recall from lecture that this area is commonly referred to as the **p-value**. P-values are very important in inferential statistics as they allow us to make test decisions. We hence recommend that you read and internalise the definition: **The p-value is the probability of observing such a result (24) or a more extreme result (<24) given that the null hypothesis is true.** Note that "a more extreme result" here means observing fewer than 24 seniors.
A test decision based on the p-value is made according to the following rule: **If the p-value is smaller than (or equal to) the chosen significance level, then the null hypothesis can be rejected.**

We will now calculate the p-value for our test using R. Recall that "samples" is a vector containing the observed number of seniors (out of 100) of 1000 simulations.

In [None]:
# extracting the values that are <= 24 and calculating the corresponding proportion.
p_value = length(samples[samples <= 24]) /1000
p_value

Since the p-value of 0.046 is below 0.05, Freddy has sufficient evidence at the 5% significance level to reject the null hypothesis and conclude that the true proportion of senior citizens is less than 32%, which suggests that Francesca did not tell him the truth.
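As an optional cross-check on the simulated p-value, one could compare it with the exact probability under the null hypothesis. The sketch below assumes that the number of seniors among 100 independently drawn panel members follows a binomial distribution with success probability 0.32, which is the same model the simulation above draws from. The name `exact_p` is introduced here for illustration only.

```r
# Exact probability of 24 or fewer seniors out of 100
# when each panel member is a senior with probability 0.32.
exact_p <- pbinom(24, size = 100, prob = 0.32)
exact_p

# Equivalent calculation: add up P(X = k) for k = 0, ..., 24.
sum(dbinom(0:24, size = 100, prob = 0.32))
```

The simulated value will not match exactly, since it is based on only 1000 random samples, but the two should be close.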

# Exercise 2.2

Francesca acted as Freddy's companion and is on trial herself. Unlike Freddy, she is hoping for people under the age of 65 to be sitting on her jury panel. Based on our analysis above, it appears that she only wanted to comfort Freddy when she told him that 32% of the population are seniors. In fact, a day before she told Freddy, her lawyer had told her that the proportion of non-seniors is believed to be 75%.

When she gets 85 Non-Seniors on her jury panel, she feels extremely lucky, questions her lawyer's information and suspects that the actual proportion of Non-Seniors exceeds 75%.

Perform the corresponding hypothesis test on the proportion of Non-Senior citizens. Follow the steps from above.

Based on your analysis, how likely was Francesca to end up with a jury panel of 85 (or more) Non-Seniors? Is she right to feel lucky with that result or is 85 an outcome that could be expected under the assumption that 75% of the population are Non-Seniors?

In [None]:
set.seed(12345)
sample_space = c("Senior","NonSenior")

# use this cell to calculate the p-value based on the above information.

Please finalise this week's lab sheet now by answering the last of your assignment questions. Remember to round all results to 3 decimal places. Trailing zeros can, as always, be omitted. For example, state 2.5 as 2.5 and not as 2.500.

Good luck!