# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
****

In [1]:
import pandas as pd
import scipy.stats
from statsmodels.stats import proportion
import numpy as np

In [2]:
# pd.read_stata is the public pandas entry point for Stata (.dta) files
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [5]:
len(data)

4870

In [6]:
data.columns

Index(['id', 'ad', 'education', 'ofjobs', 'yearsexp', 'honors', 'volunteer',
       'military', 'empholes', 'occupspecific', 'occupbroad', 'workinschool',
       'email', 'computerskills', 'specialskills', 'firstname', 'sex', 'race',
       'h', 'l', 'call', 'city', 'kind', 'adid', 'fracblack', 'fracwhite',
       'lmedhhinc', 'fracdropout', 'fraccolp', 'linc', 'col', 'expminreq',
       'schoolreq', 'eoe', 'parent_sales', 'parent_emp', 'branch_sales',
       'branch_emp', 'fed', 'fracblack_empzip', 'fracwhite_empzip',
       'lmedhhinc_empzip', 'fracdropout_empzip', 'fraccolp_empzip',
       'linc_empzip', 'manager', 'supervisor', 'secretary', 'offsupport',
       'salesrep', 'retailsales', 'req', 'expreq', 'comreq', 'educreq',
       'compreq', 'orgreq', 'manuf', 'transcom', 'bankreal', 'trade',
       'busservice', 'othservice', 'missind', 'ownership'],
      dtype='object')

In [7]:
data['call'].unique()

array([ 0.,  1.])

In [8]:
data['race'].unique()

array(['w', 'b'], dtype=object)

## What test is appropriate for this problem? Does CLT apply?

In [9]:
# Separate the data for whites and blacks
w = data[data.race=='w']
b = data[data.race=='b']

# Count number of observations in each data set
print("There are " + str(len(w)) + " observations in the white sounding names dataset.")
print("There are " + str(len(b)) + " observations in the black sounding names dataset.")

# Count number of callbacks in each data set
print("There are " + str(sum(data.loc[data['race']=='w', 'call'])) + " white callbacks.")
print("There are " + str(sum(data.loc[data['race']=='b', 'call'])) + " black callbacks.")

There are 2435 observations in the white sounding names dataset.
There are 2435 observations in the black sounding names dataset.
There are 235.0 white callbacks.
There are 157.0 black callbacks.


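I will use a two-sample z-test for proportions, because the outcome ('call') is binary and I am comparing the callback rates of two independent groups. The Central Limit Theorem applies: the resumes were assigned to the two groups at random, each group is large (n = 2435), and each group contains well over 10 callbacks and 10 non-callbacks (235 and 2200 for white-sounding names, 157 and 2278 for black-sounding names), so the sampling distribution of each sample proportion is approximately normal.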
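The large-sample condition for proportions can be checked directly. The sketch below is illustrative only: the helper name `clt_condition_met` and the threshold of 10 are assumptions (10 is a common rule of thumb, not a value taken from the exercise), and the counts are copied from the output printed above.

```python
def clt_condition_met(callbacks, n, threshold=10):
    """Return True if both successes and failures reach the rule-of-thumb threshold."""
    successes = callbacks       # resumes that received a call
    failures = n - callbacks    # resumes that did not
    return successes >= threshold and failures >= threshold

# Counts taken from the printed output above: (callbacks, observations)
groups = {"white-sounding": (235, 2435), "black-sounding": (157, 2435)}

for label, (callbacks, n) in groups.items():
    print(label, "rate =", round(callbacks / n, 4),
          "| normal approximation reasonable:", clt_condition_met(callbacks, n))
```

Both groups pass the check, which supports using the normal approximation for the sample proportions.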

## What are the null and alternate hypotheses?
H_null: The callback rates for white-sounding and black-sounding names are equal (p_w = p_b) <br>
H_alternative: The callback rates for white-sounding and black-sounding names are different (p_w ≠ p_b)

## Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

#### Bootstrap Approach

In [10]:
# generate bootstrap replicates function
def generate_bootstrap_replicates(data, func):
    sample = np.random.choice(data, len(data))
    return func(sample)

# calculate proportion
def calculate_proporation(data):
    return np.sum(data)/len(data)

In [11]:
# generate bootstrap replicates for white and black sounding names
w_bs_replicates = []
b_bs_replicates = []
for i in range(1000):
    w_bs_replicates.append(generate_bootstrap_replicates(w['call'], calculate_proporation))
    b_bs_replicates.append(generate_bootstrap_replicates(b['call'], calculate_proporation))

In [12]:
# calculate margins of error and confidence intervals
alpha = 5 # significance level (percentage)
w_confidence_interval = np.percentile(w_bs_replicates, [alpha/2, 100 - alpha/2])
b_confidence_interval = np.percentile(b_bs_replicates, [alpha/2, 100 - alpha/2])
print("Confidence interval for white-sounding names: " + str(w_confidence_interval))
print("Confidence interval for black-sounding names: " +  str(b_confidence_interval))
# margin of error is half the width of the confidence interval
print("Margin of error for white-sounding names is +/-", (w_confidence_interval[1] - w_confidence_interval[0]) / 2)
print("Margin of error for black-sounding names is +/-", (b_confidence_interval[1] - b_confidence_interval[0]) / 2)

Confidence interval for white-sounding names: [ 0.08501027  0.10841889]
Confidence interval for black-sounding names: [ 0.0550308   0.07474333]
Margin of error for white-sounding names is +/- 0.011704312115
Margin of error for black-sounding names is +/- 0.00985626283368


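The 95% bootstrap confidence intervals for the two callback rates do not overlap. Non-overlapping intervals are a conservative criterion for a difference at the 0.05 significance level, so we reject the null hypothesis and conclude that race has a significant impact on the rate of callbacks for resumes.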
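The confidence intervals above do not yield a p-value by themselves. One resampling-based way to obtain one is a permutation test on the difference in callback rates, sketched below. This is a different technique from the bootstrap replicates above, not the method used in the cells of this notebook. To keep the sketch self-contained, the 0/1 call arrays are rebuilt from the counts printed earlier (an assumption; in the notebook, `w['call'].values` and `b['call'].values` could be used instead). The seed and the number of permutations are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

# Rebuild the 0/1 callback arrays from the counts printed earlier (assumption)
w_calls = np.concatenate([np.ones(235), np.zeros(2435 - 235)])
b_calls = np.concatenate([np.ones(157), np.zeros(2435 - 157)])
observed_diff = w_calls.mean() - b_calls.mean()

# Under the null hypothesis the race label is irrelevant, so pool the
# outcomes and reassign them to two groups of the original sizes
pooled = np.concatenate([w_calls, b_calls])
n_w = len(w_calls)
n_perm = 10000
perm_diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[:n_w].mean() - shuffled[n_w:].mean()

# Two-sided p-value: share of permuted differences at least as extreme
p_value_perm = np.mean(np.abs(perm_diffs) >= abs(observed_diff))
print("Observed difference in callback rates:", observed_diff)
print("Permutation p-value:", p_value_perm)
```

With 10,000 permutations the smallest nonzero p-value this procedure can report is 1/10,000, so a result of 0 should be read as "less than 0.0001" rather than exactly zero.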

#### Frequentist Approach

In [13]:
counts = np.array([sum(data.loc[data['race']=='w', 'call']), sum(data.loc[data['race']=='b', 'call'])])
nobs = np.array([len(w), len(b)])

z_stat, p_value = proportion.proportions_ztest(counts, nobs)
confidence_interval = proportion.proportion_confint(counts, nobs, alpha=0.05, method='normal')
w_confidence_interval = np.array([confidence_interval[0][0], confidence_interval[1][0]])
b_confidence_interval = np.array([confidence_interval[0][1], confidence_interval[1][1]])

print("The p-value from the z proportion test is", p_value)
print("Confidence interval for white-sounding names: " + str(w_confidence_interval))
print("Confidence interval for black-sounding names: " +  str(b_confidence_interval))
# margin of error is half the width of the confidence interval
print("Margin of error for white-sounding names is +/-", (w_confidence_interval[1] - w_confidence_interval[0]) / 2)
print("Margin of error for black-sounding names is +/-", (b_confidence_interval[1] - b_confidence_interval[0]) / 2)

The p-value from the z proportion test is 3.98388683759e-05
Confidence interval for white-sounding names: [ 0.08478067  0.10823781]
Confidence interval for black-sounding names: [ 0.05472141  0.07423136]
Margin of error for white-sounding names is +/- 0.01172856595
Margin of error for black-sounding names is +/- 0.00975497877459


The p-value (about 4 × 10^-5) is far below the significance level of 0.05, so we reject the null hypothesis and conclude that race has a significant impact on the rate of callbacks for resumes.
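As a cross-check, the same quantities can be computed by hand from the callback counts. The sketch below uses only the Python standard library and assumes the pooled-variance form of the two-proportion z-test and the normal-approximation (Wald) interval for each group; the variable names are illustrative and the counts are copied from the output above.

```python
import math
from statistics import NormalDist

# Counts taken from the output above
calls_w, n_w = 235, 2435
calls_b, n_b = 157, 2435

p_w = calls_w / n_w
p_b = calls_b / n_b

# Two-proportion z-test with a pooled estimate of the callback rate
p_pool = (calls_w + calls_b) / (n_w + n_b)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))
z_stat = (p_w - p_b) / se_pool
p_value = math.erfc(abs(z_stat) / math.sqrt(2))  # two-sided normal tail area

# Normal-approximation margin of error for each group's callback rate
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
moe_w = z_crit * math.sqrt(p_w * (1 - p_w) / n_w)
moe_b = z_crit * math.sqrt(p_b * (1 - p_b) / n_b)

print("z statistic:", z_stat)
print("p-value:", p_value)
print("White-sounding names:", p_w, "+/-", moe_w)
print("Black-sounding names:", p_b, "+/-", moe_b)
```

These values should agree with the statsmodels output above up to floating-point rounding.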

## Write a story describing the statistical significance in the context of the original problem.
I investigated whether race affected callback rates for resumes. The data were collected by the Abdul Latif Jameel Poverty Action Lab (J-PAL) between 2000 and 2002 in Chicago and Boston, and contain about 5,000 fictitious resume submissions for 1,300 jobs. Resumes with white-sounding names received callbacks about 9.7% of the time (235 of 2435), compared with about 6.4% (157 of 2435) for resumes with black-sounding names. Both the bootstrap and the z-test for proportions indicate that this difference is statistically significant (p ≈ 4 × 10^-5). Because the names were assigned to the resumes at random, I concluded that the perceived race of the applicant's name affected callbacks.

## Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

I would not conclude that race/name is the most important factor in callback success. My analysis did not include other factors that could affect callback success, such as education or years of experience, which are also recorded in the dataset. I would have to analyze those factors as well before comparing their impact to that of race/name. In addition, the data only include jobs in Chicago and Boston, which are urban areas, so the results may not be representative of jobs in rural areas. To address this, I would have to collect data from across the nation so that the sample is representative.