# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats
import math

In [2]:
# Load the Stata file through the public pandas entry point
data = pd.read_stata('us_job_market_discrimination.dta')

In [3]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [4]:
# number of callbacks for black-sounding names, and white sounding names
sum(data[data.race=='b'].call), sum(data[data.race=='w'].call)

(157.0, 235.0)

In [5]:
# sample size
n = len(data)

In [6]:
# sample size of black-sounding names
nb = len(data[data.race=='b'])

In [7]:
# sample size of white-sounding names
nw = len(data[data.race=='w'])

In [8]:
# number of callbacks for black-sounding names
cb = sum(data[data.race=='b'].call)

In [9]:
# number of callbacks for white-sounding names
cw = sum(data[data.race=='w'].call)
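
The counts and sample sizes computed in the cells above can also be obtained in one pass with a `groupby`. The sketch below is a possible shortcut, not the notebook's method; the `toy` frame is a hypothetical stand-in for the real data so that the example runs on its own.

```python
import pandas as pd

# Hypothetical toy data with the same two columns as the real dataset
toy = pd.DataFrame({
    "race": ["b", "b", "b", "b", "w", "w", "w", "w"],
    "call": [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
})

# Per group: number of callbacks (sum), sample size (count), callback rate (mean)
summary = toy.groupby("race")["call"].agg(["sum", "count", "mean"])
print(summary)
```

On the real data, replacing `toy` with `data` should give the same quantities as `cb`, `cw`, `nb`, and `nw`, plus the callback rates.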

## 1. What test is appropriate for this problem? Does CLT apply?

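A two-sample z-test for the difference between two proportions is appropriate for this problem. The CLT applies because names are assigned to resumes at random, the observations can be treated as independent, and both groups are large, with well over 10 callbacks (157 and 235) and well over 10 non-callbacks each.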
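
A common rule of thumb for applying the CLT to a proportion is that each group has at least 10 successes and 10 failures. The check below is a minimal sketch of that rule: the callback counts come from the notebook output, while the group size of 2435 is an assumption inferred from the printed callback rates, and `clt_ok` is a hypothetical helper.

```python
# Callback counts are from the notebook output; the group size of 2435 is an
# assumption inferred from the printed callback rates (157 / 0.0645 ~ 2435).
cb, cw = 157, 235
nb = nw = 2435

def clt_ok(successes, n, threshold=10):
    """Rule of thumb: at least `threshold` successes and `threshold` failures."""
    return successes >= threshold and (n - successes) >= threshold

print(clt_ok(cb, nb), clt_ok(cw, nw))
```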

## 2. What are the null and alternate hypotheses?

Null hypothesis: race has zero impact on the rate of callbacks for resumes.

Alternative hypothesis: resumes with black-sounding names receive a lower rate of callbacks than resumes with white-sounding names (a one-sided test).

H0: Rb = Rw

Ha: Rb < Rw

## 3. Compute margin of error, confidence interval, and p-value.

In [10]:
# Rate of callbacks for black-sounding names
Rb = cb / nb
# Rate of callbacks for white-sounding names
Rw = cw / nw

In [11]:
# mean of rate of callbacks for black-sounding names
mb = Rb
# mean of rate of callbacks for white-sounding names
mw = Rw

In [12]:
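# Sample variance of the 0/1 callback indicator in each group
stdev_b2 = (cb * (1-mb)**2 + (nb-cb) * (0-mb)**2) / (nb-1)
stdev_w2 = (cw * (1-mw)**2 + (nw-cw) * (0-mw)**2) / (nw-1)
# Standard error of the difference in callback rates
# (computed from the two separate variances, i.e. unpooled, despite the variable name)
stdev_pooled = math.sqrt(stdev_b2/nb + stdev_w2/nw)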
stdev_pooled

0.007784969307159196
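
Written out, the cell above computes the following, where $c_b$, $n_b$, $m_b$ stand for `cb`, `nb`, `mb` (and likewise for the `w` group). The simplification to $m_b(1-m_b)$ is a standard identity for 0/1 data, added here for reference rather than taken from the notebook.

```latex
\begin{align*}
s_b^2 &= \frac{c_b\,(1 - m_b)^2 + (n_b - c_b)\,(0 - m_b)^2}{n_b - 1}
       = \frac{n_b}{n_b - 1}\, m_b\,(1 - m_b), \\
\mathrm{SE} &= \sqrt{\frac{s_b^2}{n_b} + \frac{s_w^2}{n_w}}
\end{align*}
```

The identity follows from $c_b = n_b m_b$, which gives $c_b(1-m_b)^2 + (n_b - c_b)m_b^2 = n_b m_b (1 - m_b)$.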

If the null hypothesis H0 is right, then Rw = Rb, or mw = mb. This implies the difference is zero: md = mw - mb = 0.
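
Because the null hypothesis says both groups share one callback rate, a common alternative is to estimate the standard error from the pooled proportion. The sketch below shows that variant for comparison; it is not what the notebook computes, and the group size of 2435 is an assumption inferred from the printed callback rates.

```python
import math

# Callback counts from the notebook output; group size 2435 is an assumption
cb, cw = 157, 235
nb = nw = 2435

# Single callback rate shared by both groups under H0
p_pool = (cb + cw) / (nb + nw)

# Standard error of the difference in rates under H0
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / nb + 1 / nw))

# z statistic for the observed difference Rw - Rb
z_pooled = (cw / nw - cb / nb) / se_pooled
print(se_pooled, z_pooled)
```

With groups this large the pooled and unpooled standard errors are nearly identical, so the conclusion does not depend on the choice.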

In [13]:
# critical value
# From the Z table, z = 2.33 is the one-sided 99% critical value
# (a two-sided 99% interval would use 2.576 instead)
z = 2.33
# margin of error (stored in std_err, despite the name)
std_err = z * stdev_pooled
std_err

0.018138978485680926

In [14]:
# Interval for the difference under the null hypothesis (centered on 0)
confid_upper = 0 + std_err
confid_lower = 0 - std_err
print(confid_lower,confid_upper)

-0.018138978485680926 0.018138978485680926


So if the null hypothesis is true, about 99% of samples would show a difference in callback rates between white-sounding and black-sounding names (md = mw - mb) below 1.8 percentage points.
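
The interval above is centered on zero, the value assumed under the null hypothesis. A conventional confidence interval is instead centered on the observed difference. The sketch below computes a two-sided 99% interval that way, as an illustration rather than the notebook's method; the group size of 2435 is an assumption inferred from the printed callback rates.

```python
import math
from scipy import stats

# Callback counts from the notebook output; group size 2435 is an assumption
cb, cw = 157, 235
nb = nw = 2435
Rb, Rw = cb / nb, cw / nw
md = Rw - Rb  # observed difference in callback rates

# Unpooled standard error using p(1-p)/n (negligibly different from the n-1 form)
se = math.sqrt(Rb * (1 - Rb) / nb + Rw * (1 - Rw) / nw)

# Two-sided 99% critical value from the standard normal distribution
z_crit = stats.norm.ppf(0.995)
moe = z_crit * se
ci = (md - moe, md + moe)
print(moe, ci)
```

If this interval excludes zero, the data are inconsistent with equal callback rates at the 1% level.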

In [15]:
# Sample mean difference
md = Rw - Rb
md

0.032032854209445585

The actual sample difference in callback rates between white-sounding and black-sounding names is 3.2 percentage points. This is well above the upper limit of the interval (1.8 percentage points). Therefore, the null hypothesis can be rejected in favor of the alternative hypothesis.

In [16]:
# actual z score 
z_score = md / stdev_pooled
z_score

4.114705266723095

In [17]:
# p-value
# From Z table, for z_score=4.11, p(z<4.11)=0.99998
p_value = 1 - 0.99998
print(p_value)

2.0000000000020002e-05


If the null hypothesis is correct, there is only about a 0.002% probability of observing a sample difference, md, at least as large as the 3.2 percentage points actually observed. Therefore, the null hypothesis is rejected.
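
The table lookup can be replaced by the survival function of the standard normal distribution. This is a minimal sketch using `scipy.stats.norm.sf`, with the z score copied from the notebook output; the two-sided value is included only for comparison, since the alternative hypothesis here is one-sided.

```python
from scipy import stats

# z score as printed by the notebook
z_score = 4.114705266723095

# One-sided p-value: P(Z >= z_score) under the standard normal
p_one_sided = stats.norm.sf(z_score)

# Two-sided p-value, for comparison
p_two_sided = 2 * p_one_sided
print(p_one_sided, p_two_sided)
```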

## 4. Write a story describing the statistical significance in the context of the original problem.

In [18]:
# Rate of callbacks for black-sounding names, and white-sounding names
print(Rb, Rw)

0.064476386037 0.0965092402464


The above analysis shows that the callback rate for white-sounding names (mw) is 9.65%, and the callback rate for black-sounding names (mb) is 6.45%. This gives a difference of 3.2 percentage points (md = mw - mb).
Under the null hypothesis that the two rates are equal, the upper limit of the 99% interval for the difference is 1.8 percentage points. The actual difference of 3.2 percentage points corresponds to a p-value of about 0.002%, so the null hypothesis can be rejected. The perceived race of a name therefore has a statistically significant effect on callbacks. Because names were assigned to the resumes at random, this difference is evidence of racial discrimination in the United States labor market.
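
As a cross-check on the z-test, the same 2x2 table of callbacks by group can be tested with a chi-square test of independence. This is a sketch using `scipy.stats.chi2_contingency`, not part of the original analysis; the group size of 2435 is an assumption inferred from the printed callback rates.

```python
import numpy as np
from scipy import stats

# Callback counts from the notebook output; group size 2435 is an assumption
cb, cw = 157, 235
nb = nw = 2435

# Rows: black-sounding, white-sounding; columns: callback, no callback
table = np.array([[cb, nb - cb],
                  [cw, nw - cw]])

# correction=False so that chi2 equals the square of the pooled z statistic
chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
print(chi2, p, dof)
```

The chi-square p-value is two-sided, so it is roughly twice the one-sided p-value from the z-test.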

## 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

No. The analysis above considered only the race/name factor, so it shows that this factor matters for callback success, not that it is the most important one. Other, unanalyzed factors may be just as important or more so. The analysis could be amended by regressing callbacks on many factors at once, for example with forward stepwise linear regression that uses adjusted R-squared as a penalized goodness-of-fit metric. Because `call` is a binary outcome, logistic regression would be a natural alternative to linear regression.
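
A forward stepwise procedure of this kind could look like the sketch below. It is illustrative only: the data are synthetic, the feature names and the helpers `adjusted_r2` and `forward_stepwise` are hypothetical, and the outcome is continuous rather than the binary `call` column.

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit of y on X (intercept added here)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def forward_stepwise(X, y, names):
    """Greedily add the feature that most improves adjusted R-squared."""
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        # Score each candidate when added to the features chosen so far
        scores = [(adjusted_r2(X[:, selected + [j]], y), j) for j in remaining]
        score, j = max(scores)
        if score <= best:
            break  # no candidate improves the penalized fit
        selected.append(j)
        remaining.remove(j)
        best = score
    return [names[j] for j in selected], best

# Synthetic data (not the resume dataset): two informative features, one noise
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
names = ["signal_a", "signal_b", "noise"]
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

chosen, score = forward_stepwise(X, y, names)
print(chosen, score)
```

The order in which features enter gives a rough ranking of their importance, which is the comparison the single-factor z-test cannot provide.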