# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
</div>
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats
import seaborn as sns
import matplotlib.pyplot as plt
import math

In [2]:
# Load the Stata file through the public pandas API
data = pd.read_stata('data/us_job_market_discrimination.dta')

## 1. What test is appropriate for this problem? Does CLT apply?

We are only interested in whether or not applicants receive callbacks. The 'call' column contains binary data, so each resume is a Bernoulli trial and the number of callbacks in each group follows a binomial distribution. Because we are comparing the callback rates of two independent groups (white-sounding and black-sounding names), we use a two-sample test for the difference between two proportions. Lastly, because each group contains far more than 30 resumes, we use a z-test.

The Central Limit Theorem applies because the sample is large (4,870 resumes), so the sampling distribution of each group's callback proportion is approximately normal.
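
A common rule of thumb for the normal approximation to a proportion is that each group should contain at least 10 callbacks and at least 10 non-callbacks. The sketch below checks that rule. The counts (2,435 resumes per group, with 235 and 157 callbacks) are assumptions: they are consistent with the margin of error printed in section 3, but they are hard-coded here rather than read from the data file, and `clt_ok` is a helper name made up for this illustration.

```python
# Assumed counts per group: (number of resumes, number of callbacks).
# These are hard-coded for illustration, not read from the .dta file.
groups = {"w": (2435, 235), "b": (2435, 157)}


def clt_ok(n, successes, threshold=10):
    """Rule of thumb: at least `threshold` successes and failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


results = {name: clt_ok(n, k) for name, (n, k) in groups.items()}
print(results)
```

If either group failed the check, an exact test (for example Fisher's exact test) would be a safer choice than the z-test.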

## 2. What are the null and alternate hypotheses?

**Null hypothesis:** Race does not have an impact on the rate of callbacks for resumes.

**Alternate hypothesis:** Race does have an impact on the rate of callbacks for resumes.
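
The same hypotheses can be written in symbols. Here $p_w$ and $p_b$ denote the true callback rates for resumes with white-sounding and black-sounding names; this notation is introduced for illustration and does not appear in the exercise itself.

$$
H_0: p_w - p_b = 0 \qquad\qquad H_A: p_w - p_b \neq 0
$$

Because the alternate hypothesis allows a difference in either direction, the test is two-sided.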

## 3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

We will be using a 95% confidence level to calculate the margin of error, confidence interval, and p-value.

### Bootstrapping

In [3]:
def calculate_prob(data):
    """Convert an array of callback counts to proportions (count / len(data)).

    Dividing by len(data) is only the right denominator because the
    simulated counts below are drawn with n equal to the number of replicates.
    """
    data = np.asarray(data)

    # Vectorized division replaces the element-by-element loop
    return data / len(data)

In [4]:
w = data[data.race=='w']
b = data[data.race=='b']

# TOTAL NUMBER OF CALLBACKS
w_sum = sum(w.call)
b_sum = sum(b.call)


# CALLBACK PROBABILITY
w_prob = w_sum / len(w)
b_prob = b_sum / len(b)
prob_diff = w_prob - b_prob


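# GENERATING SAMPLES (PARAMETRIC SIMULATION)
# EACH DRAW IS THE NUMBER OF CALLBACKS IN ONE SIMULATED EXPERIMENT OF len(w) (OR len(b)) RESUMES
w_sample = np.random.binomial(len(w), w_prob, size=len(w))
b_sample = np.random.binomial(len(b), b_prob, size=len(b))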


# CALCULATE CALLBACK PROBABILITIES ON ABOVE SAMPLES
w_sample_prob = calculate_prob(w_sample)
b_sample_prob = calculate_prob(b_sample)
prob_differences = w_sample_prob - b_sample_prob


# SHIFT BOTH ARRAYS SO THEY HAVE THE SAME MEAN (SINCE THE NULL HYPOTHESIS ASSUMES IDENTICAL CALLBACK PROBABILITIES)
# THE COMMON VALUE CANCELS IN THE SUBTRACTION, SO THE SHIFTED DIFFERENCES ARE CENTERED ON ZERO
mean = np.mean(prob_differences)
w_shifted = w_sample_prob - np.mean(w_sample_prob) + mean
b_shifted = b_sample_prob - np.mean(b_sample_prob) + mean
prob_diff_shifted = w_shifted - b_shifted


# MARGIN OF ERROR (ANALYTICAL STANDARD ERROR OF THE DIFFERENCE, z = 1.96 FOR 95% CONFIDENCE)
se = math.sqrt((w_prob * (1 - w_prob)) / len(w) + (b_prob * (1 - b_prob)) / len(b))
margin_error = 1.96 * se
print('Margin of error:', margin_error)


# CONSTRUCT CONFIDENCE INTERVAL
conf_int = np.percentile(prob_differences, [2.5, 97.5])
print('Confidence interval:', list(conf_int))


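# FIND P-VALUE (TWO-SIDED, TO MATCH THE ALTERNATE HYPOTHESIS)
# A RESULT OF 0.0 MEANS NO SIMULATED DIFFERENCE WAS AS EXTREME AS THE OBSERVED ONE,
# I.E. P < 1 / NUMBER OF REPLICATES
p = np.sum(np.abs(prob_diff_shifted) >= abs(prob_diff)) / len(prob_diff_shifted)
print('P-value:', p)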

Margin of error: 0.015255406349886438
Confidence interval: [0.017248459958932233, 0.04722792607802875]
P-value: 0.0
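
The cell above simulates callback counts from a binomial distribution, which is a parametric approach. For comparison, here is a minimal sketch of a nonparametric version: a bootstrap confidence interval from resampling each group with replacement, and a two-sided permutation test for the p-value. It is not the notebook's method. The arrays are rebuilt from assumed counts (2,435 resumes per group, 235 and 157 callbacks) that match the printed output but are not read from the data file, and the seed and the number of replicates are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility

# Assumed counts, hard-coded for illustration (not read from the .dta file)
n_w, n_b = 2435, 2435
calls_w, calls_b = 235, 157

# Rebuild 0/1 callback arrays from the counts
w_calls = np.concatenate([np.ones(calls_w), np.zeros(n_w - calls_w)])
b_calls = np.concatenate([np.ones(calls_b), np.zeros(n_b - calls_b)])
observed_diff = w_calls.mean() - b_calls.mean()

n_reps = 10_000

# Bootstrap: resample each group with replacement and record the difference
boot_diffs = np.empty(n_reps)
for i in range(n_reps):
    w_bs = rng.choice(w_calls, size=n_w, replace=True)
    b_bs = rng.choice(b_calls, size=n_b, replace=True)
    boot_diffs[i] = w_bs.mean() - b_bs.mean()
ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])

# Permutation test: shuffle the pooled outcomes, which simulates the
# null hypothesis that the name on the resume does not matter
pooled = np.concatenate([w_calls, b_calls])
perm_diffs = np.empty(n_reps)
for i in range(n_reps):
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[:n_w].mean() - shuffled[n_w:].mean()
p_value = np.mean(np.abs(perm_diffs) >= abs(observed_diff))

print('Observed difference:', observed_diff)
print('Bootstrap 95% CI:', [ci_low, ci_high])
print('Permutation p-value:', p_value)
```

With 10,000 replicates the smallest nonzero p-value this test can report is 0.0001, so a printed value of 0.0 should be read as "below that resolution" rather than as exactly zero.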


### Frequentist statistical testing

In [5]:
# STANDARD ERROR, MARGIN OF ERROR, Z-STATISTIC, AND TWO-SIDED P-VALUE
std_err = math.sqrt((w_prob * (1 - w_prob)) / len(w) + (b_prob * (1 - b_prob)) / len(b))
mrgn = 1.96 * std_err
z = (prob_diff - 0) / std_err
p = stats.norm.sf(abs(z)) * 2

print('Margin of error: ', mrgn)
print('Confidence interval: [', prob_diff - mrgn, ',', prob_diff + mrgn, ']')
print('P-value:', p)

Margin of error:  0.015255406349886438
Confidence interval: [ 0.016777447859559147 , 0.047288260559332024 ]
P-value: 3.862565207522622e-05
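
The cell above uses the unpooled standard error for both the confidence interval and the z-statistic. A common textbook variant pools the two groups when computing the standard error for the test, since the null hypothesis assumes a single shared callback rate. The sketch below shows that variant under assumed counts (2,435 resumes per group, 235 and 157 callbacks, hard-coded rather than read from the data file). It uses `math.erfc` for the two-sided normal tail, which is mathematically the same quantity as `stats.norm.sf(abs(z)) * 2`.

```python
import math

# Assumed counts, hard-coded for illustration (not read from the .dta file)
n_w, n_b = 2435, 2435
calls_w, calls_b = 235, 157

p_w = calls_w / n_w
p_b = calls_b / n_b
diff = p_w - p_b

# Pooled callback rate under the null hypothesis of no difference
p_pool = (calls_w + calls_b) / (n_w + n_b)
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))

z_pooled = diff / se_pooled

# Two-sided p-value from the standard normal distribution
p_pooled = math.erfc(abs(z_pooled) / math.sqrt(2))

print('Pooled z-statistic:', z_pooled)
print('Pooled p-value:', p_pooled)
```

Both variants lead to the same conclusion here because the two standard errors are nearly identical at this sample size.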


## 4. Write a story describing the statistical significance in the context of the original problem.

While it is no surprise racial discrimination is still prevalent, we wonder how much of an impact it has on job applications.

We have a relatively large sample of 4,870 resumes. Identical resumes were randomly assigned either a white-sounding or a black-sounding name. Looking only at callback success, we performed a hypothesis test.

Assuming race does not impact the probability of receiving a callback, we ask how likely it would be to observe a difference in callback rates as large as the one in the sample (about 3.2 percentage points). A p-value of about 0.00004 tells us that such a difference is extremely unlikely to occur if race does not affect callback success, and the 95% confidence interval for the difference runs from roughly 1.7 to 4.7 percentage points. The results lead us to believe that, unfortunately, racial discrimination is still an issue in the U.S. job market.

Having a black-sounding name attached to a resume decreases the likelihood of the applicant getting callbacks from potential employers.

## 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

No. The analysis shows that race affects callback probabilities, but it does not show that race is the most important factor, because no other factor was tested. Gender inequality is still a problem in the 21st century, so gender may very well have an impact on callback success as well, especially in STEM fields. Level of education may play a part too. I would perform hypothesis tests to determine whether sex and level of education affect callback probabilities, and then compare the sizes of those effects with the effect of race.
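
One way to start on that follow-up is to wrap the two-proportion z-test in a function that can be pointed at any two-valued column. The sketch below is an illustration under assumptions: `compare_callback_rates` is a made-up helper name, the column name `sex` with values `'f'` and `'m'` is assumed rather than confirmed from the data file, and the toy DataFrame contains invented numbers that only demonstrate the call pattern and say nothing about the real dataset.

```python
import math

import pandas as pd


def compare_callback_rates(df, column, group_a, group_b, outcome="call"):
    """Two-sided pooled z-test for the callback rates of two groups."""
    a = df.loc[df[column] == group_a, outcome]
    b = df.loc[df[column] == group_b, outcome]

    diff = a.mean() - b.mean()

    # Pooled rate under the null hypothesis of equal callback rates
    pooled = (a.sum() + b.sum()) / (len(a) + len(b))
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(a) + 1 / len(b)))

    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p_value


# Toy data with invented numbers, used only to show how the function is called
toy = pd.DataFrame({
    "sex": ["f"] * 200 + ["m"] * 200,
    "call": [1] * 30 + [0] * 170 + [1] * 10 + [0] * 190,
})

diff, z, p_value = compare_callback_rates(toy, "sex", "f", "m")
print('Difference:', diff, 'z:', z, 'p-value:', p_value)
```

Testing factors one at a time shows whether each matters, but ranking them by importance would need a model that includes all of them together, such as a logistic regression.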