# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
</div>
****

In [20]:
import pandas as pd
import numpy as np
from scipy import stats
%matplotlib inline

In [2]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [15]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [16]:
data.call.value_counts()

0.0    4478
1.0     392
Name: call, dtype: int64

In [None]:
#Q: What test is appropriate for this problem? Does CLT apply?
#A: We want to determine whether there is a statistically significant difference in callback rate between the b and w groups.
#Because 'call' is binary, a two-sample z-test for the difference between two proportions is appropriate.
#The sample is large (392 callbacks among 4870 resumes, 157 of them for b) and the observations can be treated as independent, so the CLT applies
#H0: There is no difference between the proportion of callbacks for b and w (p_w - p_b = 0)
#H1: There is a difference between the proportion of callbacks for b and w (p_w - p_b != 0)
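
A common rule of thumb for the normal approximation is that each group should contain at least 10 successes and 10 failures. The sketch below checks that condition from the counts printed earlier in the notebook. The equal group size of 2435 is an assumption (half of the 4870 rows), and `normal_approx_ok` is a hypothetical helper name, not part of the original analysis.

```python
# Counts taken from the notebook outputs; the equal split is an assumption.
n_b = n_w = 2435          # assumed: 4870 resumes split evenly between b and w
calls_b = 157             # callbacks for black-sounding names (printed above)
calls_w = 392 - calls_b   # the remaining callbacks went to white-sounding names


def normal_approx_ok(successes, n, threshold=10):
    """Return True if both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


for label, calls, n in [("b", calls_b, n_b), ("w", calls_w, n_w)]:
    # Print group, successes, failures, and whether the rule of thumb holds.
    print(label, calls, n - calls, normal_approx_ok(calls, n))
```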

In [17]:
len(data)

4870

In [31]:
#Generate sample size of each category
count = data.groupby("race").count() 
count_b = count.loc["b","id"]
count_w = count.loc["w","id"]

In [61]:
#Generate proportion of calls for each category and the sample difference between w and b call proportion
sums = data.groupby("race").sum() 
calls_b = sums.loc["b","call"]
calls_w = sums.loc["w","call"]
prop_b = calls_b/count_b
prop_w = calls_w/count_w
sample_diff = prop_w - prop_b
sample_diff

0.032032854209445585

In [43]:
#Calculate pooled standard proportion
p = ((prop_b)*(count_b)+(prop_w)*(count_w))/(count_b+count_w)
p

0.080492813141683772

In [52]:
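#Calculate the standard error of the difference under H0, using the pooled proportion:
#se = sqrt(p * (1 - p) * (1/n_b + 1/n_w)); both 1/n terms must be multiplied by p * (1 - p)
se = np.sqrt(p * (1 - p) * (1/float(count_b) + 1/float(count_w)))
round(se, 6)

0.007797

In [55]:
#Calculate z-statistic
z = (prop_w - prop_b)/se
round(z, 3)

4.108

In [56]:
#Two-sided p-value from the standard normal distribution.
#A difference at least as extreme as 3.2 percentage points would occur with probability of about 0.00004 if H0 were true,
#so at our significance level of .05 we reject the null hypothesis that the two proportions are equal
p_value = 2 * stats.norm.sf(abs(z))
print('%.2e' % p_value)

3.98e-05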
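
As a cross-check that does not need the dataset, the pooled two-proportion z-test can be sketched from the raw counts alone. This is a minimal sketch, not the notebook's own code: it assumes Python 3.8+ for `statistics.NormalDist`, assumes group sizes of 2435 each, and derives the 235 callbacks for w as 392 minus 157. The function name `two_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Pooled two-sided z-test for H0: p_1 == p_2. Returns (z, p_value)."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Pooled proportion: the best estimate of the common rate under H0.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    # Standard error of the difference under H0.
    se = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / se
    # Two-sided p-value: twice the lower-tail area beyond -|z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Assumed counts: 235 of 2435 callbacks for w, 157 of 2435 for b.
z_stat, p_val = two_proportion_z_test(235, 2435, 157, 2435)
print("z = %.3f, p = %.2e" % (z_stat, p_val))
```

Working from counts rather than the DataFrame keeps the arithmetic visible, which makes a misplaced parenthesis in the standard error easier to spot.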

In [None]:
#Write a story describing the statistical significance in the context of the original problem.
#Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

In [66]:
variances = data.groupby("race").var() 
var_b = variances.loc["b","call"]
var_w = variances.loc["w","call"]
sum_of_vars = var_b + var_w
sum_of_vars

0.14757499471306801

In [67]:
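#Calculate a 95% CI for the difference in proportions; the critical z for 95% confidence is 1.96
#The margin of error is z times the standard error of the difference, sqrt(var_b/n_b + var_w/n_w), not z times the summed variances
z_crit = 1.96
se_diff = np.sqrt(var_b/float(count_b) + var_w/float(count_w))
moe = z_crit * se_diff
CI_upper = sample_diff + moe
CI_lower = sample_diff - moe
print(round(CI_lower, 4), round(CI_upper, 4))

(0.0168, 0.0473)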

In [None]:
#Conclusion: The callback rate for w is about 3.2 percentage points higher than for b (roughly 9.7% vs. 6.4%).
#With the pooled standard error sqrt(p*(1-p)*(1/n_b + 1/n_w)), about 0.0078, this gives a z of about 4.11 and a two-sided p-value of about 0.00004,
#so a gap this large would be very unlikely if names had no effect, and we reject the null hypothesis at the .05 level.
#The 95% CI for the difference, roughly 1.7 to 4.7 percentage points, excludes zero, which points the same way.
#Because names were assigned at random to the resumes, the evidence suggests that race, as signalled by the name, affects the callback rate.
#This does not show that race/name is the most important factor: the analysis looks at a single variable. To compare factors we could run this same
#analysis across other columns in the table (education, yearsexp, ...), or fit a model such as a logistic regression on several columns at once