# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
</div>
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('/Users/vickimoore/Downloads/EDA_racial_discrimination/data/us_job_market_discrimination.dta')

# Information about the dataset

In [3]:
# number of callbacks based on race associated with names
print(sum(data[data.race=='w'].call))
print(sum(data[data.race=='b'].call))

235.0
157.0


In [8]:
data.head()

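|   | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|----|----|-----------|--------|----------|--------|-----------|----------|----------|---------------|-----|---------|--------|-------|----------|----------|-------|------------|------------|---------|-----------|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |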


# Statistical tests of the impact of race on applicants receiving calls

Because the outcome is binary (call or no call) and we are comparing two independent groups, the appropriate tests are a chi-squared test of independence and a two-sample z-test of proportions. First, we will look at chi-squared analysis, and then a z-test. A z-test is reasonable in place of a t-test because the sample is very large (n=4870) and the analysis uses every resume in the experiment. The z-test compares proportions rather than means because what matters is the number of calls relative to the number of resumes in each condition.

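The central limit theorem applies here: names were assigned to resumes at random, so the observations can be treated as independent, and each group contains far more than 10 callbacks and 10 non-callbacks (235 and 157 callbacks, respectively), so the sampling distribution of each callback proportion is approximately normal.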
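The usual rule of thumb for the normal approximation to a proportion (at least 10 successes and 10 failures per group) can be checked in a few lines. This is a minimal sketch: the callback counts are the ones printed earlier in the notebook, while the group size of 2,435 is an assumption (an even split of the 4,870 resumes).

```python
# Callback counts come from the notebook output; the group size is an
# assumption (4,870 resumes split evenly between the two name groups).
calls = {"w": 235, "b": 157}
n_per_group = 2435

clt_ok = {}
for race, successes in calls.items():
    failures = n_per_group - successes
    rate = successes / n_per_group
    # Rule of thumb: the normal approximation is reasonable when both
    # the success count and the failure count are at least 10.
    clt_ok[race] = successes >= 10 and failures >= 10
    print(f"{race}: rate={rate:.4f}, calls={successes}, "
          f"no calls={failures}, condition met={clt_ok[race]}")
```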

We can also run a permutation test on the study data. If race had no effect, the race labels would be interchangeable, so reshuffling the pooled call outcomes between the two groups many times shows how large a difference in proportions arises by chance alone. This test will follow the chi-squared and z-statistic tests.

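The null hypothesis for each test is that the perceived race of an applicant's name does not influence whether the applicant receives a call, so the callback rates for the two groups are equal. The alternate hypothesis is that the callback rates differ.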

## Chi-squared contingency tests
First, we will look at the results of a manually calculated chi-squared test for an in-depth perspective on the calculations. Then, we will employ SciPy's `chi2_contingency` function for a chi-squared contingency test.

In [89]:
w = data[data.race=='w']
b = data[data.race=='b']

#Manual calculation of chi-squared test for a 2x2 table. DF = 1.
def chisq(a1,a2,b1,b2):
    # a1, a2: calls and non-calls for the first group; b1, b2: same for the second group.
    total = a1+a2+b1+b2
    # Expected count for each cell = (row total / grand total) * column total.
    exp_a1 = ((a1+a2)/total)*(a1+b1)
    exp_a2 = ((a1+a2)/total)*(a2+b2)
    exp_b1 = ((b1+b2)/total)*(a1+b1)
    exp_b2 = ((b1+b2)/total)*(a2+b2)
    # Sum of (observed - expected)^2 / expected over the four cells.
    chi = (((a1-exp_a1)**2)/exp_a1)+(((a2-exp_a2)**2)/exp_a2)+(((b1-exp_b1)**2)/exp_b1)+(((b2-exp_b2)**2)/exp_b2)
    print('Chi-squared value:', chi)
    # Critical values for DF = 1: 3.84 (p = 0.05), 6.63 (p = 0.01), 10.83 (p = 0.001).
    if chi >= 10.83:
        print('p < 0.001')
    elif chi >= 6.63:
        print('p <= 0.01')
    elif chi >= 3.84:
        print('p <= 0.05')
    else:
        print('p > 0.05')
    return chi
chisq(np.sum(w.call == 1), np.sum(w.call == 0), np.sum(b.call == 1), np.sum(b.call == 0))

Chi-squared value: 16.8790504143
p < 0.001


16.879050414270221

In [85]:
#Chi-squared test with Scipy calculation.
obs = np.array([[np.sum(w.call == 1), np.sum(w.call == 0)], [np.sum(b.call == 1), np.sum(b.call == 0)]])
# Call the function once and unpack the statistic, p-value, degrees of freedom, and expected counts.
chi2_stat, chi2_p, chi2_dof, chi2_expected = stats.chi2_contingency(obs)
print("Chi-2 statistic:",chi2_stat, "p-value:",chi2_p)

Chi-2 statistic: 16.4490285842 p-value: 4.99757838996e-05


The two calculations give slightly different chi-squared values (16.9 and 16.4, respectively). The manual calculation applies no continuity correction, while SciPy's `chi2_contingency` applies Yates' continuity correction by default when there is one degree of freedom, which lowers the statistic slightly. The manual calculation reads its p-value from critical values for one degree of freedom in reference tables (p < 0.001 for this statistic), and the SciPy function reports a p-value of 0.00005. Both results lead us to reject the null hypothesis that race does not affect callbacks, and with p-values this low the result is not ambiguous.
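The gap between the two statistics can be checked by switching off the Yates' continuity correction that `chi2_contingency` applies by default to 2×2 tables. The sketch below is a minimal check rather than part of the original analysis: it rebuilds the contingency table from the callback counts printed earlier and assumes 2,435 resumes per group (half of the 4,870 total). It also uses `stats.chi2.sf` to get an exact p-value in place of a table lookup.

```python
import numpy as np
from scipy import stats

# Assumption: 2,435 resumes per group; callback counts are from the notebook.
n_per_group = 2435
obs = np.array([[235, n_per_group - 235],   # white-sounding: calls, no calls
                [157, n_per_group - 157]])  # black-sounding: calls, no calls

# Default behaviour: Yates' continuity correction is applied when dof == 1.
chi2_corrected, p_corrected, dof, expected = stats.chi2_contingency(obs)

# Without the correction, the statistic should match the manual calculation.
chi2_plain, p_plain, _, _ = stats.chi2_contingency(obs, correction=False)

# Exact upper-tail p-value for the uncorrected statistic with one degree of freedom.
p_from_sf = stats.chi2.sf(chi2_plain, df=1)

print("corrected:  ", chi2_corrected, p_corrected)
print("uncorrected:", chi2_plain, p_plain)
```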

## Z-statistic on proportions
Because the two groups need not contain the same number of resumes, we compare the proportion of resumes in each group that received a call rather than raw callback counts, and base the z-statistic on these proportions.

In [86]:
b_pr = np.sum(b.call) / len(b)
w_pr = np.sum(w.call) / len(w)

diff_pr = b_pr - w_pr
pooled_pr = ((b_pr*len(b)) + (w_pr*len(w))) / (len(b) + len(w))
se = np.sqrt(pooled_pr * (1 - pooled_pr) * ((1 / len(b)) + (1 / len(w))))

z = diff_pr / se
p_z = (1 - stats.norm.cdf(abs(z))) * 2
print("z-statistic:",z, "p-value of z-statistic:", p_z)

z-statistic: -4.10841215243 p-value of z-statistic: 3.98388683758e-05


The z-statistic is -4.11, and its two-sided p-value is 0.00004, indicating that, again, the null hypothesis that race is not a factor in calls should be unambiguously rejected.
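For a 2×2 table, the square of the pooled two-proportion z-statistic equals the uncorrected chi-squared statistic, so the two tests are two views of the same comparison. The following sketch illustrates this under the assumption of 2,435 resumes per group (half of the 4,870 total). `two_proportion_ztest` is a hypothetical helper name used for illustration, not a SciPy function.

```python
import numpy as np
from scipy import stats

def two_proportion_ztest(successes_1, n_1, successes_2, n_2):
    """Pooled two-sample z-test of proportions (two-sided)."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / se
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value

# Assumption: 2,435 resumes per group; callback counts are from the notebook.
z, p = two_proportion_ztest(157, 2435, 235, 2435)
print("z:", z, "z squared:", z ** 2, "p-value:", p)
```

Under these assumptions, `z ** 2` should agree with the manually calculated chi-squared value of about 16.88.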

## Margin of error and confidence interval with z-statistic
We will use 1.96, the critical z-value that bounds the central 95% of a standard normal distribution, and multiply it by the standard error of the difference in proportions to get the margin of error (moe). The standard error `se` is the pooled value calculated for the z-test above.

In [87]:
moe = 1.96 * se
ci_lower = diff_pr - moe
ci_upper = diff_pr + moe
print("Difference between proportions:",diff_pr)
print("Margin of error:",moe, "95% confidence interval:",ci_lower,ci_upper)

Difference between proportions: -0.0320328542094
Margin of error: 0.0152819123109 95% confidence interval: -0.0473147665203 -0.0167509418986


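The estimated difference between proportions (black-sounding minus white-sounding names) is -0.032, with a 95% confidence interval of -0.047 to -0.017 calculated from the z-test of proportions. The interval excludes zero, which is consistent with rejecting the null hypothesis: resumes with black-sounding names received callbacks at a rate roughly 2 to 5 percentage points lower.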
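The interval above reuses the pooled standard error from the z-test, which is the standard error under the null hypothesis. A confidence interval is more commonly built from the unpooled standard error, and a bootstrap percentile interval offers a check that does not lean on the normal approximation. Below is a minimal sketch of both, assuming 2,435 resumes per group (half of the 4,870 total). Because the outcome is binary, resampling a group with replacement and counting its calls is equivalent to a binomial draw, which is the shortcut used here.

```python
import numpy as np

# Assumption: 2,435 resumes per group; callback counts are from the notebook.
n = 2435
calls_b, calls_w = 157, 235
p_b, p_w = calls_b / n, calls_w / n
diff = p_b - p_w

# Unpooled (Wald) standard error of the difference in proportions.
se_unpooled = np.sqrt(p_b * (1 - p_b) / n + p_w * (1 - p_w) / n)
wald_ci = (diff - 1.96 * se_unpooled, diff + 1.96 * se_unpooled)

# Bootstrap percentile interval: each replicate resamples both groups
# with replacement, expressed here as binomial draws at the observed rates.
rng = np.random.default_rng(74)
n_reps = 10_000
boot_b = rng.binomial(n, p_b, size=n_reps) / n
boot_w = rng.binomial(n, p_w, size=n_reps) / n
boot_ci = np.percentile(boot_b - boot_w, [2.5, 97.5])

print("Unpooled 95% CI: ", wald_ci)
print("Bootstrap 95% CI:", tuple(boot_ci))
```

With these counts the unpooled interval is about -0.047 to -0.017, nearly identical to the pooled one, so the choice of standard error does not change the conclusion.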

## Permutation test
A permutation test estimates how often a difference in callback proportions at least as large as the observed one would arise if race labels were unrelated to calls, by repeatedly shuffling the call outcomes between the two groups.

In [88]:
#Permutation replicate functions.
def permutation_sample(data1, data2):
    data = np.concatenate((data1, data2))
    permuted_data = np.random.permutation(data)
    perm_sample_1 = permuted_data[:len(data1)]
    perm_sample_2 = permuted_data[len(data1):]
    return perm_sample_1, perm_sample_2

def draw_perm_reps(data1, data2, func, size=1):
    perm_replicates = np.empty(size)
    for l in range(size):
        perm_sample_1, perm_sample_2 = permutation_sample(data1, data2)
        perm_replicates[l] = func(perm_sample_1, perm_sample_2)
    return perm_replicates

def mean_diff(data1, data2):
    diff_mean = np.mean(data1) - np.mean(data2)
    return diff_mean

#Applied to calls received by race.
np.random.seed(74)
raw_diff_means = mean_diff(w.call, b.call)
perm_replicates = draw_perm_reps(w.call, b.call, mean_diff, size=10000)
p_racecall = np.sum(perm_replicates >= raw_diff_means) / len(perm_replicates)
print("P-value for influence of race on calls:",p_racecall)

P-value for influence of race on calls: 0.0


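This permutation test reports a p-value of 0.0, meaning that none of the 10,000 permuted datasets produced a difference in callback rates as large as the observed one. The true p-value is therefore not exactly zero but is below roughly 1 in 10,000 for this one-sided comparison, so we reject the null hypothesis. Taken together with the previous tests, this suggests that race has a meaningful impact on the probability of receiving a call.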
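A p-value of exactly zero is an artifact of the finite number of permutations. One common convention, sketched here as an alternative and not as the notebook's method, counts the observed labeling as one of the permutations, so the estimate becomes (extreme + 1) / (permutations + 1), and compares absolute differences for a two-sided test. The sketch assumes 2,435 resumes per group and rebuilds the call arrays from the callback counts printed earlier.

```python
import numpy as np

# Assumption: 2,435 resumes per group; callback counts are from the notebook.
n = 2435
w_calls = np.concatenate([np.ones(235), np.zeros(n - 235)])
b_calls = np.concatenate([np.ones(157), np.zeros(n - 157)])
observed = w_calls.mean() - b_calls.mean()

rng = np.random.default_rng(74)
pooled = np.concatenate([w_calls, b_calls])
n_perm = 10_000
perm_diffs = np.empty(n_perm)
for i in range(n_perm):
    # Shuffle the pooled outcomes and split them back into two groups.
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[:n].mean() - shuffled[n:].mean()

# Two-sided count of permuted differences at least as extreme as observed.
n_extreme = np.sum(np.abs(perm_diffs) >= abs(observed))
# Adding 1 to numerator and denominator keeps the estimate above zero.
p_two_sided = (n_extreme + 1) / (n_perm + 1)
print("Two-sided permutation p-value:", p_two_sided)
```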

# Conclusion

Each statistical test employed here provides strong evidence that perceived race, as signaled by the name on a resume, is a significant factor in whether an applicant receives a call. These tests do not show that race is the most important factor, because they do not compare its effect with that of other resume attributes; they only provide a strong enough signal to indicate that race is unambiguously a factor. To evaluate the relative impact of race versus other factors, further univariate tests of individual factors can be done. The most useful analysis would be multivariate, for example a logistic regression of `call` on race and the other resume attributes, in order to determine the impact of race in conjunction with other factors.