# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
</div>
****

In [34]:
import pandas as pd
import numpy as np
from scipy import stats

np.random.seed(24)

In [28]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [29]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [30]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


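For this question, both a bootstrap analysis and a two-sample z-test for proportions are appropriate for testing whether callback rates differ between black- and white-sounding names. The CLT applies because the resumes are independent observations (race was randomly assigned) and the sample is large: each group contains well over 10 callbacks and well over 10 non-callbacks, so the sampling distribution of each sample proportion is approximately normal.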

$H_{0}$: Probability of callback for white names = Probability of callback for black names.

$H_{A}$: Probability of callback for white names $\neq$ Probability of callback for black names.
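The CLT claim can be checked with the usual success/failure rule of thumb (at least 10 successes and 10 failures per group). The sketch below is only illustrative: the helper `clt_ok` is a hypothetical name, and the group sizes and the black-name callback count are assumptions inferred from the notebook's printed output rather than read from the data file.

```python
# Success/failure check for the normal approximation to a sample proportion.
# ASSUMPTION: n = 2435 per group and 157 black-name callbacks are inferred
# from the printed confidence intervals; 235 is the count printed earlier.
groups = {
    "black": {"n": 2435, "callbacks": 157},
    "white": {"n": 2435, "callbacks": 235},
}


def clt_ok(n, successes, threshold=10):
    """Return True when both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


for name, g in groups.items():
    print(name, "passes success/failure check:", clt_ok(g["n"], g["callbacks"]))
```

If either group failed this check, an exact or resampling-based test would be the safer choice.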


In [31]:
black = data[data.race=='b']
white = data[data.race=='w']
all_calls = data['call']

## Frequentist Approach

In [32]:
#Calculate proportion of resumes that resulted in a callback (success)
b_total = len(black)
b_success = black.call.sum()
b_prop = b_success/b_total

w_total = len(white)
w_success = white.call.sum()
w_prop = w_success/w_total

total_n = w_total + b_total
total_suc = w_success + b_success
p_pool = total_suc/total_n

#calculate margin of error and 95-percent confidence intervals
z_star = 1.96

SE_black = np.sqrt((b_prop*(1-b_prop))/b_total)
ci_min_b, ci_max_b = (b_prop - (z_star*SE_black)).round(3), (b_prop + (z_star*SE_black)).round(3)

print('Margin of error for black-sounding names =', (z_star*SE_black).round(4))
print('95-Percent CI for black-sounding names: [%g, %g]' %(ci_min_b, ci_max_b))

SE_white = np.sqrt((w_prop*(1-w_prop))/w_total)
ci_min_w, ci_max_w = (w_prop - (z_star*SE_white)).round(3), (w_prop + (z_star*SE_white)).round(3)

print()
print('Margin of error for white-sounding names =', (z_star*SE_white).round(4))
print('95-Percent CI for white-sounding names: [%g, %g]' %(ci_min_w, ci_max_w))


#Calculate parameters for z-test
SE = np.sqrt((p_pool*(1-p_pool))/b_total + (p_pool*(1-p_pool))/w_total)
prop_diff = b_prop - w_prop
Z = prop_diff/SE

#calculate two-sided p-value (valid whichever sign Z has)
p_value = (2*stats.norm.sf(abs(Z))).round(7)

print()
print('p-value =',p_value)



Margin of error for black-sounding names = 0.0098
95-Percent CI for black-sounding names: [0.055, 0.074]

Margin of error for white-sounding names = 0.0117
95-Percent CI for white-sounding names: [0.085, 0.108]

p-value = 3.98e-05
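The intervals above describe each group separately. Since the hypotheses concern the difference between the two proportions, it can help to put a margin of error and confidence interval on that difference directly. The following is a minimal sketch using only the standard library; the counts are assumptions inferred from the printed output (2435 resumes per group, 157 and 235 callbacks), not values read from the data file.

```python
from math import sqrt
from statistics import NormalDist

# ASSUMPTION: counts inferred from the notebook's printed output.
n_b, x_b = 2435, 157   # black-sounding names: resumes, callbacks
n_w, x_w = 2435, 235   # white-sounding names: resumes, callbacks

p_b, p_w = x_b / n_b, x_w / n_w
diff = p_b - p_w

# Confidence interval for the difference uses the unpooled standard error.
se_unpooled = sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)
z_star = NormalDist().inv_cdf(0.975)          # about 1.96
moe = z_star * se_unpooled
ci = (diff - moe, diff + moe)

# The hypothesis test uses the pooled standard error, because under the
# null hypothesis both groups share a single callback probability.
p_pool = (x_b + x_w) / (n_b + n_w)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))
z = diff / se_pooled
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print("difference in proportions:", round(diff, 4))
print("margin of error:", round(moe, 4))
print("95% CI for the difference:", tuple(round(v, 4) for v in ci))
print("z =", round(z, 3), " two-sided p-value =", p_two_sided)
```

Under the assumed counts, the interval lies entirely below zero and the p-value should agree with the one printed above.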


## Bootstrap Method

### Confidence Intervals

In [36]:
def bs_rep_1d(data, func):
    """
    Takes an array of data and a specified function
    Returns the results of a specified function run on a bootstrap sample of those data
    """
    bs_sample = np.random.choice(data, len(data))
    return func(bs_sample)

def get_prop_success(data):
    """
    Takes a boolean array
    Calculates and returns the proportion of True values
    """
    return sum(data)/len(data)

def get_bs_reps(data, func, size):
    """
    Takes an array of data, a specified function, and a number of replicates to take
    Returns an array of bootstrap replicates based on the specified function and size
    """
    bs_reps = np.empty(size)
    for i in range(size):
        bs_reps[i] = bs_rep_1d(data, func)
    return bs_reps

#generate arrays of callbacks for black and white names
b_call = black['call']
w_call = white['call']

#generate bootstrap reps of length 10,000
black_reps = get_bs_reps(b_call, get_prop_success, 10000)
white_reps = get_bs_reps(w_call, get_prop_success, 10000)

#calculate 95% confidence intervals
ci_black = np.percentile(black_reps, [2.5, 97.5])
ci_white = np.percentile(white_reps, [2.5, 97.5])

print('95% Confidence Interval for callbacks with black-sounding name:',ci_black)
print('95% Confidence Interval for callbacks with white-sounding name:',ci_white)

95% Confidence Interval for callbacks with black-sounding name: [0.0550308  0.07474333]
95% Confidence Interval for callbacks with white-sounding name: [0.08459959 0.10841889]
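Because the outcome is binary, resampling a 0/1 array of length n with replacement is equivalent to drawing the number of callbacks from a Binomial(n, observed proportion) distribution. The sketch below uses that shortcut to generate all replicates in one vectorized call and adds a bootstrap interval for the difference in proportions. It is an alternative illustration, not the method used in the cell above, and the counts are assumptions inferred from the printed output.

```python
import numpy as np

rng = np.random.default_rng(24)

# ASSUMPTION: counts inferred from the notebook's printed output.
n_b, x_b = 2435, 157   # black-sounding names: resumes, callbacks
n_w, x_w = 2435, 235   # white-sounding names: resumes, callbacks
n_reps = 10_000

# Each bootstrap replicate of a proportion is a binomial count divided by n.
black_reps = rng.binomial(n_b, x_b / n_b, size=n_reps) / n_b
white_reps = rng.binomial(n_w, x_w / n_w, size=n_reps) / n_w

# Pairing independent replicates gives bootstrap replicates of the difference.
diff_reps = black_reps - white_reps

ci_black = np.percentile(black_reps, [2.5, 97.5])
ci_white = np.percentile(white_reps, [2.5, 97.5])
ci_diff = np.percentile(diff_reps, [2.5, 97.5])

print("95% CI, black-sounding names:", ci_black)
print("95% CI, white-sounding names:", ci_white)
print("95% CI, difference (black - white):", ci_diff)
```

The per-group intervals should land close to the percentile intervals printed above, with small differences due to random sampling.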


### Hypothesis Test

In [22]:
def permut_sample(data, length):
    """
    Takes an array and a specified length.
    Returns two arrays of data permuted from the original array.
    The first returned array has the specified length; the second holds the remaining values.
    """
    calls_perm = np.random.permutation(data)
    perm_sample_black = calls_perm[:length]
    perm_sample_white = calls_perm[length:]
    return perm_sample_black, perm_sample_white
    
def get_prop_diff(d1, d2):
    """
    Takes two boolean arrays.
    Returns the difference of proportions of True values between the arrays
    """
    d1_prop = sum(d1)/len(d1)
    d2_prop = sum(d2)/len(d2)
    return d1_prop-d2_prop

def get_permut_reps(data, length, func, size=1):
    """
    Takes an array of boolean values, a specified length, a function and a size.
    Generates a permutation replicate array with 'size' number of values.
    Length refers to the number of values that should go into the first permutation array
    Function refers to the function that should be applied to permutation arrays
    """
    perm_reps = np.empty(size)
    
    for i in range(size):
        perm_sample_black, perm_sample_white = permut_sample(data, length)
        perm_reps[i] = func(perm_sample_black, perm_sample_white)
        
    return perm_reps

#generate 10,000 permutation replicates of the difference between proportions using all callbacks
permutation_replicates = get_permut_reps(all_calls, len(black), get_prop_diff, 10000)

#calculate the one-sided p-value: the share of permutation replicates at or below the observed (negative) difference
p_value = sum(permutation_replicates <= prop_diff)/len(permutation_replicates)

print('p-value =',p_value)


p-value = 0.0
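A printed p-value of exactly 0.0 only means that no replicate among the 10,000 was as extreme as the observed difference. One common way to avoid reporting zero is to count the observed statistic as one of the replicates, giving (count + 1) / (reps + 1). The sketch below illustrates this with a two-sided test, matching $H_{A}$. It draws the number of callbacks that land in the permuted "black" group from a hypergeometric distribution, which is equivalent to shuffling the pooled 0/1 labels. The counts are assumptions inferred from the printed output.

```python
import numpy as np

rng = np.random.default_rng(24)

# ASSUMPTION: counts inferred from the notebook's printed output.
n_b, x_b = 2435, 157   # black-sounding names: resumes, callbacks
n_w, x_w = 2435, 235   # white-sounding names: resumes, callbacks
n_reps = 10_000

total_calls = x_b + x_w
total_no_calls = (n_b + n_w) - total_calls
observed_diff = x_b / n_b - x_w / n_w

# Shuffling the pooled labels and taking the first n_b of them gives a
# hypergeometric number of callbacks in the permuted "black" group.
perm_b_calls = rng.hypergeometric(total_calls, total_no_calls, n_b, size=n_reps)
perm_w_calls = total_calls - perm_b_calls
perm_diffs = perm_b_calls / n_b - perm_w_calls / n_w

# Two-sided: count replicates at least as far from zero as the observed value.
n_extreme = int(np.sum(np.abs(perm_diffs) >= abs(observed_diff) - 1e-12))
p_raw = n_extreme / n_reps
p_adjusted = (n_extreme + 1) / (n_reps + 1)

print("replicates at least as extreme:", n_extreme)
print("raw p-value:", p_raw)
print("adjusted p-value:", p_adjusted)
```

The adjusted value is never exactly zero, so it can be reported as an upper bound on what the simulation is able to resolve.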


## Significance

Both the frequentist and the resampling approaches show a statistically significant difference in callback rates between resumes with black- and white-sounding names. The z-test gives a p-value of about 4e-05, and none of the 10,000 permutation replicates was at or below the observed difference, so the printed p-value of 0.0 should be read as p < 1/10,000. Because race was randomly assigned to the resumes, we can conclude that the race signaled by the name has a causal effect on callbacks: resumes with black-sounding names are less likely to receive a callback than resumes with white-sounding names.
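Statistical significance says nothing about how large the gap is, so it is worth expressing the result as an effect size as well. The sketch below computes the absolute difference, the ratio of callback rates, and the odds ratio. The counts are assumptions inferred from the printed output (2435 resumes per group, 157 and 235 callbacks).

```python
# ASSUMPTION: counts inferred from the notebook's printed output.
n_b, x_b = 2435, 157   # black-sounding names: resumes, callbacks
n_w, x_w = 2435, 235   # white-sounding names: resumes, callbacks

rate_b = x_b / n_b
rate_w = x_w / n_w

# Absolute gap in callback rates (white minus black).
risk_diff = rate_w - rate_b

# Relative gap: how many times more likely a white-sounding name is to get a callback.
risk_ratio = rate_w / rate_b

# Odds ratio: the quantity a logistic regression coefficient on race would estimate.
odds_ratio = (x_w / (n_w - x_w)) / (x_b / (n_b - x_b))

print("callback rate, black-sounding names:", round(rate_b, 4))
print("callback rate, white-sounding names:", round(rate_w, 4))
print("risk difference:", round(risk_diff, 4))
print("risk ratio:", round(risk_ratio, 3))
print("odds ratio:", round(odds_ratio, 3))
```

Under these counts, resumes with white-sounding names receive roughly 50% more callbacks than those with black-sounding names, a gap of about 3 percentage points.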

## Is Race the Most Important Factor in Callback Success?

Not necessarily. Our work above demonstrates that resumes with black-sounding names are statistically less likely to receive a callback, but it does not compare race against anything else. There are a number of other variables in this dataset that could also affect whether a potential employer makes a callback. On the whole, only 8% of resumes actually received callbacks.

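To determine which variable has the most influence on callback success, you could run a series of regressions, one for each individual variable, and compare how well each explains the outcome. Because the outcome is binary, logistic regression is a more natural fit than linear regression, with a pseudo R-squared taking the place of R-squared. The variable with the highest value would be the one most strongly associated with callback success. Fitting a single model that includes all variables would additionally show whether the effect of race persists once the other factors are accounted for.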