# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [27]:
import pandas as pd
import numpy as np
from scipy import stats

In [28]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [29]:
#Examining the data
data.head(10)

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit
5,b,1,4,2,6,1,0,0,0,266,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
6,b,1,4,2,5,0,1,0,0,13,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
7,b,1,3,4,21,0,1,0,1,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit
8,b,1,4,3,3,0,0,0,0,316,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
9,b,1,4,2,6,0,1,0,0,263,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private


In [30]:
#What columns are in the data?
data.columns

Index(['id', 'ad', 'education', 'ofjobs', 'yearsexp', 'honors', 'volunteer',
       'military', 'empholes', 'occupspecific', 'occupbroad', 'workinschool',
       'email', 'computerskills', 'specialskills', 'firstname', 'sex', 'race',
       'h', 'l', 'call', 'city', 'kind', 'adid', 'fracblack', 'fracwhite',
       'lmedhhinc', 'fracdropout', 'fraccolp', 'linc', 'col', 'expminreq',
       'schoolreq', 'eoe', 'parent_sales', 'parent_emp', 'branch_sales',
       'branch_emp', 'fed', 'fracblack_empzip', 'fracwhite_empzip',
       'lmedhhinc_empzip', 'fracdropout_empzip', 'fraccolp_empzip',
       'linc_empzip', 'manager', 'supervisor', 'secretary', 'offsupport',
       'salesrep', 'retailsales', 'req', 'expreq', 'comreq', 'educreq',
       'compreq', 'orgreq', 'manuf', 'transcom', 'bankreal', 'trade',
       'busservice', 'othservice', 'missind', 'ownership'],
      dtype='object')

In [31]:
#Number of white applicants
w = data[data.race=='w']
w_count = w.shape[0]
w_count

2435

In [32]:
#Number of black applicants
b = data[data.race=='b']
b_count = b.shape[0]
b_count

2435

So it looks like we have an equal number of black- and white-sounding applicants.

In [33]:
# callback rates (callbacks / resumes) for white- and black-sounding names
rate_w = sum(data[data.race=='w'].call) / w_count
rate_b = sum(data[data.race=='b'].call) / b_count
print(rate_w, rate_b)

0.09650924024640657 0.06447638603696099


Expressed as percentages, the interview rate was 9.65% for applicants with white-sounding names and 6.45% for applicants with black-sounding names. In relative terms, an applicant with a white-sounding name was approximately (9.65-6.45) / 6.45 = 49.6% more likely to get an interview than one with a black-sounding name. Whether this difference is statistically significant is examined later in this notebook.

   **1. What test is appropriate for this problem? Does CLT apply?**

What we're really comparing in this problem is the proportion of black-sounding applicants who received interviews with that of white-sounding applicants, which makes this a two-sample test of proportions (a z-test). The CLT applies because 1) the samples are random: names were randomly assigned to the resumes, so no systematic bias should be present; 2) the observations can be treated as independent, since each sample is less than 10% of the total population of job applications; and 3) the sampling distribution is approximately normal, since both the number of callbacks and the number of non-callbacks exceed 10 in each group.
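
The normality condition can be checked numerically. The following is a minimal sketch that uses only the callback rates and group size printed earlier in this notebook; the helper names `success_failure_counts` and `clt_conditions_met` are hypothetical, and the threshold of 10 is the usual rule of thumb rather than something taken from the dataset.

```python
def success_failure_counts(rate, n):
    """Return (successes, failures) implied by a proportion and a sample size."""
    successes = round(rate * n)
    return successes, n - successes


def clt_conditions_met(rate, n, threshold=10):
    """Rule of thumb: at least `threshold` successes AND `threshold` failures."""
    successes, failures = success_failure_counts(rate, n)
    return successes >= threshold and failures >= threshold


# Values copied from the notebook output above
n = 2435
rate_w = 0.09650924024640657
rate_b = 0.06447638603696099

for label, rate in [("white-sounding", rate_w), ("black-sounding", rate_b)]:
    successes, failures = success_failure_counts(rate, n)
    print(label, successes, failures, clt_conditions_met(rate, n))
```

Both groups pass the check by a wide margin, so a normal approximation to the difference in proportions is reasonable here.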
   
   **2. What are the null and alternate hypotheses?**

Null Hypothesis: Race plays no role in interview rate; the true proportion of white-sounding applicants who receive interviews equals that of black-sounding applicants (p_w = p_b).

Alternate Hypothesis: Race plays a role in interview rate; the true proportion of white-sounding applicants who receive interviews differs from that of black-sounding applicants (p_w != p_b).

**3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.**

Frequentist approach:

In [34]:
#Hypotheses
H0 = "The proportion of white applicants that received interviews is equal to that of black applicants."
H1 = "The proportion of white applicants that received interviews is different from that of black applicants."
alpha = 0.01

#Calculating the test statistic
diff_obs = rate_w - rate_b

#Calculate the pooled standard error (used for the test, because H0 assumes one common rate)
rate_pooled = (rate_w * w_count + rate_b * b_count) / (w_count + b_count)
sep = np.sqrt(rate_pooled * (1 - rate_pooled) * (1 / w_count + 1 / b_count))

#Calculate the z-score
z_score = (diff_obs - 0) / sep

#Calculate margin of error from the unpooled standard error (no assumption of equal rates)
seu = np.sqrt(rate_w * (1 - rate_w) / w_count + rate_b * (1 - rate_b) / b_count)
moe = seu * stats.norm.ppf(.975)
ci_lower, ci_upper = diff_obs - moe, diff_obs + moe
print(f"The margin of error is {moe:.4f}, and the 95% confidence interval goes from {ci_lower:.4f} to {ci_upper:.4f}")

#Calculate the two-sided p-value, matching the two-sided alternate hypothesis
p_value = 2 * stats.norm.sf(abs(z_score))
print(f"p = {p_value:.1e}")

The margin of error is 0.0153, and the 95% confidence interval goes from 0.0168 to 0.0473
p = 4.0e-05
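
As a cross-check, the textbook two-proportion z-test can be written with the standard library alone. This is a sketch under stated assumptions, not the notebook's own method: the function name `two_proportion_ztest` is hypothetical, and the callback counts 235 and 157 are derived from the rates printed earlier (rate × 2435). It uses the pooled standard error, with the factor (1/n1 + 1/n2), for the test statistic, and the unpooled standard error for the confidence interval.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2, conf=0.95):
    """Two-sided z-test and confidence interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Pooled proportion: under H0 both groups share one callback rate.
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the interval: the rates are not assumed equal.
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    moe = NormalDist().inv_cdf(1 - (1 - conf) / 2) * se_unpooled
    return z, p_value, moe, (diff - moe, diff + moe)


# Counts derived from the printed rates: 235 and 157 callbacks out of 2435 each.
z, p_value, moe, ci = two_proportion_ztest(235, 2435, 157, 2435)
print(f"z = {z:.2f}, p = {p_value:.1e}, moe = {moe:.4f}, CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Splitting the two standard errors this way keeps the test consistent with the null hypothesis while letting the interval describe the observed rates.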


Bootstrap approach:

In [35]:
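#Define Bootstrap function
def draw_bs_reps(data, func, size = 10000, n_reps = None):
    """Draw n_reps bootstrap replicates of func, each computed on a resample of length size.

    If n_reps is not given, size is used for both (the behaviour relied on below)."""
    if n_reps is None:
        n_reps = size
    array = np.empty(n_reps)
    for i in range(n_reps):
        bs_samples = np.random.choice(data, size)
        array[i] = func(bs_samples)
    return array

#Define rate
def rate(data):
    return np.sum(data)/len(data)

#Make 2435 bootstrap replicates per group, each from a resample the same length as the original data
bs_reps_w = draw_bs_reps(w.call, rate, size=2435)
bs_reps_b = draw_bs_reps(b.call, rate, size=2435)

#Find confidence interval of the difference between the bootstraps
ci_lower, ci_upper = np.percentile(bs_reps_w - bs_reps_b, [2.5, 97.5])
#Bootstrap margin of error: half the width of the percentile interval
bs_moe = (ci_upper - ci_lower) / 2
print(ci_lower, ci_upper, round(bs_moe, 4))

0.016427104722792615 0.04722792607802875 0.0154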


In [36]:
#Calculating the test statistic
diff_obs = rate_w - rate_b

# Simulate the null hypothesis: draw both groups from the pooled calls, ignoring race
bs_reps_w_re = draw_bs_reps(data.call, rate, size=2435)
bs_reps_b_re = draw_bs_reps(data.call, rate, size=2435)

# Get replicates of the difference in callback rates under the null: bs_reps
bs_reps = bs_reps_w_re - bs_reps_b_re

# Compute and print the one-sided p-value: the fraction of replicates at least as large as the observed difference
p = np.sum(bs_reps >= diff_obs) / len(bs_reps)
print("P-value = ", p)

P-value =  0.0
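
A closely related alternative is a permutation test, which reshuffles the race labels instead of resampling with replacement. The sketch below is a swapped-in technique, not the method used in the cell above: `permutation_p_value` is a hypothetical helper, it uses only the standard library, and the outcome vectors are rebuilt from the counts implied by the callback rates printed earlier (235 and 157 callbacks out of 2435).

```python
import random


def permutation_p_value(calls_a, calls_b, n_perm=500, seed=0):
    """Two-sided permutation p-value for the difference in callback rates."""
    rng = random.Random(seed)
    pooled = list(calls_a) + list(calls_b)
    n_a, total = len(calls_a), sum(pooled)
    n_b = len(pooled) - n_a
    observed = abs(sum(calls_a) / n_a - sum(calls_b) / n_b)

    extreme = 0
    for _ in range(n_perm):
        # Drawing n_a outcomes without replacement is equivalent to shuffling the labels.
        hits_a = sum(rng.sample(pooled, n_a))
        diff = hits_a / n_a - (total - hits_a) / n_b
        if abs(diff) >= observed:
            extreme += 1

    # Add-one correction so the estimate is never reported as exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Outcome vectors rebuilt from the implied counts (an assumption, see above).
calls_w = [1] * 235 + [0] * 2200
calls_b = [1] * 157 + [0] * 2278
print(permutation_p_value(calls_w, calls_b))
```

The add-one correction gives a smallest reportable p-value of 1 / (n_perm + 1), which makes explicit that a simulated p-value is limited by the number of replicates.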


Both methods give a 95% confidence interval for the difference between interview rates for applicants with white- and black-sounding names that lies entirely above zero. Both p-values are also below the chosen alpha value (0.01); the bootstrap p-value of 0.0 means that none of the 2,435 simulated differences was as large as the observed one. We therefore reject the null hypothesis in favor of the alternate hypothesis.

   **4. Write a story describing the statistical significance in the context of the original problem.**

To determine whether the interview rates are significantly different for black- and white-sounding applicants, I performed a two-sample hypothesis test of proportions using two different methods that reached the same conclusion: applicants with white-sounding names are significantly more likely to receive an interview than those with black-sounding names, with p-values well below the 0.01 significance level. Because names were randomly assigned to the resumes, all other factors are held constant on average.

White-sounding applicants had an interview rate of 9.65% while black-sounding applicants had an interview rate of 6.45%, meaning that white-sounding applicants were 49.6% more likely to receive an interview. Both the frequentist and the bootstrap 95% confidence intervals for the difference in interview rates exclude zero; the bootstrap interval runs from 1.64 to 4.72 percentage points.

Thus, the results point to a bias against black-sounding names in the employment decisions in this study. Whether this bias reflects conscious discrimination or is subconscious would need further investigation.

   **5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?**
   
No. This analysis only shows that callback rates differ between applicants whose names sound black and those whose names sound white, with everything else held constant by random assignment. It says nothing about how large the effect of race/name is relative to other factors. To determine whether race/name is the most important factor in callback success, we would need to estimate the impact of every other factor (years of relevant experience, work in school, military experience) and compare those with that of race/name before reaching a conclusion.
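
One simple way to start such a comparison is to compute the callback-rate gap for each binary factor in turn. The sketch below is illustrative only: `callback_gap` is a hypothetical helper, and the toy rows are made up to show the shape of the data, not taken from the dataset. The column names `honors`, `military`, and `call` do appear in `data.columns`.

```python
def callback_gap(rows, factor):
    """Callback rate where `factor` is 1 minus the callback rate where it is 0."""
    with_factor = [r["call"] for r in rows if r[factor] == 1]
    without_factor = [r["call"] for r in rows if r[factor] == 0]
    return sum(with_factor) / len(with_factor) - sum(without_factor) / len(without_factor)


# Hypothetical toy rows, NOT real data; they only illustrate the input format.
toy_rows = [
    {"honors": 1, "military": 0, "call": 1},
    {"honors": 1, "military": 1, "call": 0},
    {"honors": 0, "military": 0, "call": 0},
    {"honors": 0, "military": 1, "call": 0},
]

for factor in ("honors", "military"):
    print(factor, callback_gap(toy_rows, factor))
```

In the notebook, `data.to_dict("records")` would supply the rows for the 0/1 columns. Comparing raw gaps ignores correlations between factors, so a regression that includes all factors at once (for example a logistic regression) would be the more complete amendment.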