# Hypothesis Testing

Hypothesis tests allow us to take a sample of data from a population and make inferences about the plausibility of competing hypotheses.

For example, in the upcoming “promotions” activity, you’ll study data collected from a psychology study in the 1970s to infer whether gender-based discrimination existed in the banking industry as a whole.

## Needed packages

Loading the tidyverse package loads the following packages at once:

- ggplot2 for data visualization
- dplyr for data wrangling
- tidyr for converting data to “tidy” format
- readr for importing spreadsheet data into R
- the more advanced purrr, tibble, stringr, and forcats packages

In [None]:
library(tidyverse)
library(infer)
library(moderndive)
library(nycflights13)
library(ggplot2movies)

## Promotions activity

### Does gender affect promotions at a bank?

Say you are working at a bank in the 1970s and you are submitting your resume to apply for a promotion. Will your gender affect your chances of getting promoted?

The moderndive package contains the data on the 48 applicants in the promotions data frame. Let’s explore this data first:

In [None]:
promotions

Let’s perform an exploratory data analysis of the relationship between the two categorical variables decision and gender:

In [None]:
ggplot(promotions, aes(x = gender, fill = decision)) +
  geom_bar() +
  labs(x = "Gender of name on resume")

It appears that resumes with female names were much less likely to be accepted for promotion. Let’s quantify these promotion rates by tallying the decisions for each group using the dplyr package for data wrangling; the proportion of resumes accepted for promotion in each group follows from these counts:

In [None]:
promotions %>% 
  group_by(gender, decision) %>% 
  summarize(n = n())
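
The tally gives counts rather than proportions. As a minimal base R sketch, the proportions follow by dividing each count by its group total. The counts hard-coded below (21 of 24 male-named and 14 of 24 female-named resumes promoted) are an assumption about what the tally prints for the promotions data; replace them with the values from your own output if they differ.

```r
# Counts assumed from the tally of promotions (an assumption; verify against your output)
counts <- matrix(
  c(21, 3,    # male:   promoted, not promoted
    14, 10),  # female: promoted, not promoted
  nrow = 2, byrow = TRUE,
  dimnames = list(gender = c("male", "female"),
                  decision = c("promoted", "not"))
)

# Proportion of each decision within each gender (each row sums to 1)
props <- prop.table(counts, margin = 1)
props

# Difference in promotion proportions, male minus female
diff_in_props <- props["male", "promoted"] - props["female", "promoted"]
diff_in_props
```

The male-minus-female order mirrors the order = c("male", "female") argument used later with calculate().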

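The moderndive package also includes promotions_shuffled, a version of promotions in which the gender column has been randomly shuffled:

In [None]: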
promotions_shuffled

In [None]:
ggplot(promotions_shuffled, aes(x = gender, fill = decision)) +
  geom_bar() +
  labs(x = "Gender of resume name")

Compared to the earlier barplot of the original promotions data, the gap between the “male” and “female” promotion rates has changed. Let’s also tally the decisions for each group, from which the proportions of resumes accepted for promotion follow:

In [None]:
promotions_shuffled %>% 
  group_by(gender, decision) %>% 
  summarize(n = n())
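
A single random shuffle of the gender labels, the kind of operation a data frame like promotions_shuffled reflects, can be sketched in base R. The stand-in data frame promotions_sketch and its counts (21 of 24 and 14 of 24 promoted) are assumptions made so the example runs on its own; the seed is arbitrary.

```r
# Stand-in for the 48-row promotions data; the counts are an assumption
promotions_sketch <- data.frame(
  decision = c(rep("promoted", 21), rep("not", 3),    # male-named resumes
               rep("promoted", 14), rep("not", 10)),  # female-named resumes
  gender = rep(c("male", "female"), each = 24)
)

set.seed(76)  # arbitrary seed, only for reproducibility

# Shuffle the gender labels while leaving the decisions in place
shuffled_sketch <- promotions_sketch
shuffled_sketch$gender <- sample(shuffled_sketch$gender)

# Tally decisions by gender after the shuffle
table(shuffled_sketch$gender, shuffled_sketch$decision)
```

Because the labels are reassigned at random, any remaining gap between the two groups' promotion rates is due to chance alone.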

## Understanding hypothesis tests

First, a hypothesis is a statement about the value of an unknown population parameter. In our resume activity, our population parameter is the difference in population proportions $p_m - p_f$, where $p_m$ and $p_f$ are the proportions of resumes with male and female names that are promoted. Hypothesis tests can involve any of the population parameters in Table 8.8 of the 6 inference scenarios we’ll cover in this book and more.

Second, a hypothesis test consists of a test between two competing hypotheses: 1) a null hypothesis $H_0$ (pronounced “H-naught”) versus 2) an alternative hypothesis $H_A$ (also denoted $H_1$).

- $H_0$: men and women are promoted at the same rate
- $H_A$: men are promoted at a higher rate than women
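
These two hypotheses can also be written in terms of the population parameter. The notation here is an assumption: $p_m$ and $p_f$ denote the population proportions of male-named and female-named resumes that are promoted.

```latex
\begin{aligned}
H_0 &: p_m - p_f = 0 \\
H_A &: p_m - p_f > 0
\end{aligned}
```

The one-sided alternative corresponds to the direction = "right" argument used when the p-value is shaded and computed.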



## Conducting hypothesis tests

### specify variables

In [None]:
promotions %>% 
  specify(formula = decision ~ gender, success = "promoted")

### hypothesize the null

In [None]:
promotions %>% 
  specify(formula = decision ~ gender, success = "promoted") %>% 
  hypothesize(null = "independence")

### generate replicates

In [None]:
promotions %>% 
  specify(formula = decision ~ gender, success = "promoted") %>% 
  hypothesize(null = "independence") %>% 
  generate(reps = 1000, type = "permute")

### calculate summary statistics

In [None]:
null_distribution <- promotions %>% 
  specify(formula = decision ~ gender, success = "promoted") %>% 
  hypothesize(null = "independence") %>% 
  generate(reps = 1000, type = "permute") %>% 
  calculate(stat = "diff in props", order = c("male", "female"))
null_distribution
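
To make the pipeline less of a black box, here is a base R sketch of the same idea: shuffle the gender labels many times and recompute the difference in proportions each time. It is not infer's implementation. The stand-in data frame promotions_sketch, its counts, and the helper diff_in_props() are assumptions introduced for illustration.

```r
# Stand-in for the 48-row promotions data; the counts are an assumption
promotions_sketch <- data.frame(
  decision = c(rep("promoted", 21), rep("not", 3),    # male-named resumes
               rep("promoted", 14), rep("not", 10)),  # female-named resumes
  gender = rep(c("male", "female"), each = 24)
)

# Hypothetical helper: promotion proportion for male minus that for female
diff_in_props <- function(decision, gender) {
  mean(decision[gender == "male"] == "promoted") -
    mean(decision[gender == "female"] == "promoted")
}

set.seed(76)  # arbitrary seed, only for reproducibility

# 1000 replicates: permute gender, then recompute the statistic
null_stats <- replicate(
  1000,
  diff_in_props(promotions_sketch$decision, sample(promotions_sketch$gender))
)

summary(null_stats)
```

Since shuffling breaks any link between gender and decision, the resulting values should be centered near zero.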

In [None]:
obs_diff_prop <- promotions %>% 
  specify(decision ~ gender, success = "promoted") %>% 
  calculate(stat = "diff in props", order = c("male", "female"))
obs_diff_prop

### visualize the p-value

In [None]:
visualize(null_distribution, binwidth = 0.1)

In [None]:
visualize(null_distribution, bins = 10) + 
  shade_p_value(obs_stat = obs_diff_prop, direction = "right")

In [None]:
null_distribution %>% 
  get_p_value(obs_stat = obs_diff_prop, direction = "right")
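
For a right-sided test, a simulation-based p-value is the proportion of null replicates at least as large as the observed statistic. The base R sketch below illustrates that calculation by hand; it is not infer's source code. The stand-in data, its counts, and the helper diff_in_props() are assumptions.

```r
# Stand-in for the 48-row promotions data; the counts are an assumption
promotions_sketch <- data.frame(
  decision = c(rep("promoted", 21), rep("not", 3),    # male-named resumes
               rep("promoted", 14), rep("not", 10)),  # female-named resumes
  gender = rep(c("male", "female"), each = 24)
)

# Hypothetical helper: promotion proportion for male minus that for female
diff_in_props <- function(decision, gender) {
  mean(decision[gender == "male"] == "promoted") -
    mean(decision[gender == "female"] == "promoted")
}

# Observed statistic on the unshuffled data
obs_stat <- diff_in_props(promotions_sketch$decision, promotions_sketch$gender)

set.seed(76)  # arbitrary seed, only for reproducibility

# Null distribution from 1000 permutations of the gender labels
null_stats <- replicate(
  1000,
  diff_in_props(promotions_sketch$decision, sample(promotions_sketch$gender))
)

# Right-sided p-value: share of replicates at least as extreme as observed
p_value <- mean(null_stats >= obs_stat)
p_value
```

The exact value depends on the seed and the number of replicates, so it will vary slightly from run to run.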

## Comparison with confidence intervals

Recall the code that constructed the null distribution:

In [None]:
null_distribution <- promotions %>% 
  specify(formula = decision ~ gender, success = "promoted") %>% 
  hypothesize(null = "independence") %>% 
  generate(reps = 1000, type = "permute") %>% 
  calculate(stat = "diff in props", order = c("male", "female"))

In [None]:
bootstrap_distribution <- promotions %>% 
  specify(formula = decision ~ gender, success = "promoted") %>% 
  # Change 1 - Remove hypothesize():
  # hypothesize(null = "independence") %>% 
  # Change 2 - Switch type from "permute" to "bootstrap":
  generate(reps = 1000, type = "bootstrap") %>% 
  calculate(stat = "diff in props", order = c("male", "female"))

Using bootstrap_distribution, we first compute the percentile-based confidence interval:

In [None]:
percentile_ci <- bootstrap_distribution %>% 
  get_confidence_interval(level = 0.95, type = "percentile")
percentile_ci
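
The percentile method can be sketched by hand in base R: resample the rows with replacement, recompute the difference in proportions, and take the middle 95% of the results. This illustrates the idea and is not infer's implementation. The stand-in data, its counts, and the helper diff_in_props() are assumptions.

```r
# Stand-in for the 48-row promotions data; the counts are an assumption
promotions_sketch <- data.frame(
  decision = c(rep("promoted", 21), rep("not", 3),    # male-named resumes
               rep("promoted", 14), rep("not", 10)),  # female-named resumes
  gender = rep(c("male", "female"), each = 24)
)

# Hypothetical helper: promotion proportion for male minus that for female
diff_in_props <- function(data) {
  mean(data$decision[data$gender == "male"] == "promoted") -
    mean(data$decision[data$gender == "female"] == "promoted")
}

set.seed(76)  # arbitrary seed, only for reproducibility
n <- nrow(promotions_sketch)

# 1000 bootstrap replicates: resample rows with replacement, recompute the statistic
boot_stats <- replicate(1000, {
  resampled <- promotions_sketch[sample(n, replace = TRUE), ]
  diff_in_props(resampled)
})

# 95% percentile interval: the 2.5th and 97.5th percentiles
percentile_sketch <- quantile(boot_stats, probs = c(0.025, 0.975))
percentile_sketch
```

Unlike the permutation step, resampling whole rows keeps each decision paired with its gender, so the bootstrap distribution is centered near the observed difference rather than near zero.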

In [None]:
visualize(bootstrap_distribution) + 
  shade_confidence_interval(endpoints = percentile_ci)

In [None]:
se_ci <- bootstrap_distribution %>% 
  get_confidence_interval(level = 0.95, type = "se", 
                          point_estimate = obs_diff_prop)
se_ci
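
The standard error method builds the interval around the point estimate instead of reading it off the percentiles. The formula below is a sketch of the usual formulation, not a transcription of infer's source. It assumes $\widehat{\text{SE}}$ is the standard deviation of the bootstrap distribution and $\hat{p}_m - \hat{p}_f$ is the observed difference stored in obs_diff_prop.

```latex
\left(\hat{p}_m - \hat{p}_f\right) \pm 1.96 \cdot \widehat{\text{SE}}
```

The multiplier 1.96 is the approximate value for a 95% interval and relies on the bootstrap distribution being roughly normal in shape.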

In [None]:
visualize(bootstrap_distribution) + 
  shade_confidence_interval(endpoints = se_ci)

## Interpreting hypothesis tests

### Two possible outcomes

- If the p-value is less than the pre-specified significance level $\alpha$, we reject the null hypothesis $H_0$ in favor of $H_A$.
- If the p-value is greater than or equal to $\alpha$, we fail to reject the null hypothesis $H_0$.
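
The two outcomes amount to a single comparison, sketched below in base R. The helper interpret_p_value() and the default alpha = 0.05 are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: applies the decision rule for a given significance level
interpret_p_value <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "reject H0 in favor of HA"
  } else {
    "fail to reject H0"
  }
}

interpret_p_value(0.03)  # p-value below alpha
interpret_p_value(0.05)  # p-value equal to alpha
interpret_p_value(0.20)  # p-value above alpha
```

A p-value exactly equal to alpha falls into the "fail to reject" case, because the rule uses a strict inequality.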

