# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
</div>
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


## Question 1

+ The appropriate test to use for this problem is a permutation A/B test. The problem states that names were randomly assigned to resumes, so under the null hypothesis the two groups differ only by chance. The data has close to 5,000 rows, so we can assume n is sufficiently large. Finally, we can assume that more than 50,000 resumes have been submitted in the United States, so the sample is less than 10% of the population and the observations can be treated as independent. Therefore, the CLT applies.
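
The sample-size part of this argument can be checked with the usual success-failure rule of thumb (at least 10 callbacks and at least 10 non-callbacks in each group). The sketch below is only illustrative: the group size of 2,435 and the 157 callbacks for black-sounding names are assumptions that are consistent with the confidence interval computed later but are not printed in this notebook, and `success_failure_ok` is a hypothetical helper name.

```python
# Assumed counts (only the 235 white-sounding callbacks are printed in this notebook).
n_w, calls_w = 2435, 235   # white-sounding names: resumes sent, callbacks received
n_b, calls_b = 2435, 157   # black-sounding names: resumes sent, callbacks received

def success_failure_ok(n, successes, threshold=10):
    """Return True if both successes and failures number at least `threshold`."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

print('white-sounding group passes:', success_failure_ok(n_w, calls_w))
print('black-sounding group passes:', success_failure_ok(n_b, calls_b))
```

Both groups clear the threshold by a wide margin under these assumed counts, which supports using a normal approximation for the sample proportions.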

## Question 2

+ We will be evaluating the null hypothesis that the sound of an applicant's name has no bearing on whether or not he or she was requested for an interview. The alternate hypothesis is that the sound does have a bearing on interview requests. 

In [5]:
w = data[data.race=='w']
b = data[data.race=='b']

In [6]:
### Compute the p-value using a permutation test.

# Create function for computing permutation samples

def permutation_sample(data1, data2):

    data = np.concatenate((data1, data2))

    permuted_data = np.random.permutation(data)

    perm_sample_1 = permuted_data[:len(data1)]
    perm_sample_2 = permuted_data[len(data1):]

    return perm_sample_1, perm_sample_2

# Create a function for computing multiple permutation replicates

def draw_perm_reps(data_1, data_2, func, size=1):

    perm_replicates = np.empty(size)

    for i in range(size):

        perm_sample_1, perm_sample_2 = permutation_sample(data_1,data_2)

        perm_replicates[i] = func(perm_sample_1, perm_sample_2)

    return perm_replicates

# Define a function for computing the fraction of black-sounding applicants that received interview requests 
# (this is our test statistic)

def frac_request(white,black):
    
    frac = np.sum(black)/len(black)
    
    return frac

# Store the 'call' columns of the black-sounding and white-sounding dataframes as numpy arrays

w_requests = np.array(w['call'])
b_requests = np.array(b['call'])

# Calculate the empirical fraction of black-sounding names that received interview requests

b_frac_actual = np.sum(b_requests) / len(b_requests)

# Draw 10000 permutation samples and compute the permutation replicates

perm_reps = draw_perm_reps(w_requests,b_requests, frac_request, size=10000)

# Calculate the p-value: the fraction of permutation replicates at or below the observed fraction

p = np.sum(perm_reps<=b_frac_actual)/len(perm_reps)

print('The p-value is:', p)

The p-value is: 0.0
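
Exercise 3 also asks for a bootstrap approach. One possible sketch is a percentile bootstrap for the difference in callback fractions, shown below. To keep the example self-contained, the 0/1 arrays are rebuilt from assumed counts (2,435 resumes per group, 235 and 157 callbacks) rather than loaded from the data file; in the notebook itself, `w_requests` and `b_requests` could be passed in instead. The function name `bootstrap_diff_of_fracs` and the seed are illustrative choices.

```python
import numpy as np

# Assumed counts, used only to rebuild 0/1 callback arrays for this sketch.
w_calls = np.concatenate((np.ones(235), np.zeros(2435 - 235)))
b_calls = np.concatenate((np.ones(157), np.zeros(2435 - 157)))

def bootstrap_diff_of_fracs(x, y, rng, size=10000):
    """Resample each group with replacement and return the differences in means."""
    reps = np.empty(size)
    for i in range(size):
        x_bs = rng.choice(x, size=len(x), replace=True)
        y_bs = rng.choice(y, size=len(y), replace=True)
        reps[i] = np.mean(x_bs) - np.mean(y_bs)
    return reps

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable
bs_reps = bootstrap_diff_of_fracs(w_calls, b_calls, rng)

# 95% percentile interval and the corresponding half-width (margin of error)
bs_low, bs_high = np.percentile(bs_reps, [2.5, 97.5])
bs_margin = (bs_high - bs_low) / 2

print('Bootstrap 95% CI:', np.array([bs_low, bs_high]))
print('Bootstrap margin of error:', bs_margin)
```

Unlike the permutation test, the bootstrap resamples within each group, so it estimates the uncertainty of the observed difference rather than simulating the null hypothesis.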


In [7]:
### margin of error, and confidence interval of difference in proportions using frequentist approach

# Calculate the fraction of requests for each group, and the empirical difference between them

w_frac = np.sum(w_requests) / len(w_requests)
b_frac = np.sum(b_requests) / len(b_requests)
empirical_diff = w_frac - b_frac

# Calculate the standard error

s_e = np.sqrt(w_frac*(1-w_frac)/len(w_requests) + b_frac*(1-b_frac)/len(b_requests))

# Calculate the critical z-value

z_critical = stats.norm.ppf(.975)

# Calculate the margin of error

margin_of_error = s_e * z_critical

# Calculate the 95% confidence interval

conf_low = empirical_diff - margin_of_error
conf_high= empirical_diff + margin_of_error

confidence_interval = np.array([conf_low,conf_high])

print(confidence_interval)

[0.01677773 0.04728798]
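
The cell above prints only the interval; its half-width, about 0.0153, is the margin of error. A frequentist p-value could be obtained with a two-proportion z-test, sketched below. This is one common choice rather than the only one, and it again relies on assumed counts (2,435 resumes per group, 235 and 157 callbacks) that are consistent with the interval above but not printed in the notebook.

```python
import numpy as np
from scipy import stats

# Assumed counts (consistent with the confidence interval printed above).
n_w, calls_w = 2435, 235
n_b, calls_b = 2435, 157

p_w = calls_w / n_w
p_b = calls_b / n_b

# Under the null hypothesis both groups share one callback rate, so pool them.
p_pool = (calls_w + calls_b) / (n_w + n_b)
se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))

# z-statistic and two-sided p-value from the standard normal distribution
z = (p_w - p_b) / se_pool
p_value = 2 * stats.norm.sf(abs(z))

# Margin of error from the unpooled standard error, as in the cell above
se_unpooled = np.sqrt(p_w * (1 - p_w) / n_w + p_b * (1 - p_b) / n_b)
margin = stats.norm.ppf(0.975) * se_unpooled

print('z-statistic:', z)
print('two-sided p-value:', p_value)
print('margin of error:', margin)
```

The pooled standard error is used for the test because the null hypothesis assumes a single common rate, while the unpooled version is used for the interval because it makes no such assumption.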


## Analysis of hypothesis test and confidence interval

+ None of the 10,000 permutation replicates produced a callback fraction for black-sounding names as low as the one observed, so the p-value is below 1/10,000. This is strong evidence against the null hypothesis. In other words, the racial connotation of a person's name does seem to have some bearing on whether or not that person receives a request for an interview.
+ The 95% confidence interval we constructed from the difference in sample proportions points the same way. A difference of 0, where the true proportions of white-sounding and black-sounding names that receive requests are the same, does not fall inside the confidence interval. Therefore, we are confident that, on average, the proportion of white-sounding names that receive requests is larger than the proportion of black-sounding names that receive requests.
+ However, these results DO NOT imply that race/name is the most important factor in callback success. They only imply that the two factors seem to be related. If we wanted to discover the most important factors, we might consider examining the correlation coefficient matrix of all of the variables and running further hypothesis tests on those values that seem significantly large.
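
As a rough sketch of the correlation-matrix idea, the snippet below ranks a few variables by the absolute value of their correlation with `call`. The DataFrame here is randomly generated stand-in data, not the study data, so its numbers mean nothing; only the column names `call`, `education`, `yearsexp`, and `honors` are borrowed from the dataset. On the real data, something like `data.corr(numeric_only=True)['call']` might be used instead, assuming a recent pandas version.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in data (hypothetical values; NOT the resume dataset).
toy = pd.DataFrame({
    'call': rng.integers(0, 2, size=n),
    'education': rng.integers(0, 5, size=n),
    'yearsexp': rng.integers(1, 27, size=n),
    'honors': rng.integers(0, 2, size=n),
})

# Correlation of every other column with 'call', ranked by absolute size
corr_with_call = toy.corr()['call'].drop('call')
ranked = corr_with_call.abs().sort_values(ascending=False)

print(ranked)
```

A large correlation would only flag a variable for follow-up testing; it would not by itself show that the variable causes callbacks, since only the name was randomly assigned in this experiment.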