
### Examining racial discrimination in the US job market

#### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning black-sounding or white-sounding names to identical résumés and observing the impact on requests for interviews from employers.

#### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes.
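
As a rough illustration of this layout, the sketch below builds a tiny made-up table with the same `race` and `call` columns and summarizes callbacks per group. The rows are hypothetical and are not taken from the real dataset.

```python
import pandas as pd

# Hypothetical toy rows (NOT the real data), using the same 'race'/'call' layout
toy = pd.DataFrame({
    'race': ['b', 'b', 'b', 'b', 'w', 'w', 'w', 'w'],
    'call': [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
})

# Per group: number of resumes, number of callbacks, and callback rate
summary = toy.groupby('race')['call'].agg(['count', 'sum', 'mean'])
print(summary)
```

Because `call` is coded 0/1, the group mean is the callback rate, which is the quantity the exercise compares across the two groups.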

#### Exercise
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does the CLT apply?
   2. What are the null and alternative hypotheses?
   3. Compute the margin of error, confidence interval, and p-value.
   4. Discuss statistical significance.
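
One common way to approach questions 1 to 3 is a two-proportion z-test, which relies on the CLT when each group has enough callbacks and non-callbacks. The sketch below is a minimal version under that assumption; the helper name `two_proportion_ztest` and the counts passed to it are hypothetical, not taken from the dataset.

```python
import numpy as np
from scipy import stats

def two_proportion_ztest(x1, n1, x2, n2, alpha=0.05):
    """Two-sided z-test for p1 - p2 plus a (1 - alpha) confidence interval."""
    p1, p2 = x1 / float(n1), x2 / float(n2)
    diff = p1 - p2
    # Pooled proportion: the common rate assumed under the null hypothesis
    pooled = (x1 + x2) / float(n1 + n2)
    se_null = np.sqrt(pooled * (1 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = diff / se_null
    p_value = 2 * stats.norm.sf(abs(z))
    # Unpooled standard error for the interval around the observed difference
    se_diff = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = stats.norm.ppf(1 - alpha / 2.0) * se_diff
    return z, p_value, margin, (diff - margin, diff + margin)

# Hypothetical counts: 30 callbacks out of 300 versus 18 out of 300
z, p_value, margin, ci = two_proportion_ztest(30, 300, 18, 300)
print(z, p_value, margin, ci)
```

The test statistic uses the pooled standard error because the null hypothesis assumes a single shared rate, while the confidence interval uses the unpooled one because it describes the observed difference.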

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [None]:
import pandas as pd
import numpy as np
from scipy import stats

In [None]:
data = pd.read_stata('us_job_market_discrimination.dta')
print(data.head())


In [None]:

# print(data.describe())
# number of callbacks for black-sounding names
blk_call = sum(data[data.race=='b'].call)
print(blk_call)
# number of callbacks for white-sounding names
wht_call = sum(data[data.race=='w'].call)
print(wht_call)

In [None]:
blacks = data[data.race=='b']
whites = data[data.race=='w']
# print(blacks.describe())
print(len(blacks))
print(len(whites))

In [None]:
# Null hypothesis: race has no effect on getting an interview call
# Alternative hypothesis: the callback rate differs between the two groups
# Significance level: 10%

blacks_called = len(blacks[blacks['call'] == True])
print(blacks_called)
blacks_not_called = len(blacks[blacks['call'] == False])
print(blacks_not_called)
whites_called = len(whites[whites['call'] == True])
print(whites_called)
whites_not_called = len(whites[whites['call'] == False])
print(whites_not_called)

In [None]:
called_back = blacks_called + whites_called
print(called_back)
not_called = blacks_not_called + whites_not_called
print(not_called)
print(len(data))

In [None]:
# Overall callback rate: callbacks divided by all resumes, not by non-callbacks
rate_of_callback = float(called_back) / len(data)
print(rate_of_callback)

In [None]:
expected_call_black = len(blacks) * rate_of_callback
print(expected_call_black)

In [None]:
expected_call_white = len(whites) * rate_of_callback
print(expected_call_white)

In [None]:
expected_no_call_black = len(blacks)- expected_call_black
print(expected_no_call_black)

In [None]:
expected_no_call_white = len(whites)- expected_call_white
print(expected_no_call_white)

In [None]:
import scipy.stats as stats
observed_frequencies = [blacks_not_called, whites_not_called, whites_called, blacks_called]
expected_frequencies = [expected_no_call_black, expected_no_call_white, expected_call_white, expected_call_black]


# A 2x2 table has (num_columns - 1) * (num_rows - 1) = 1 degree of freedom.
# stats.chisquare uses k - 1 - ddof degrees of freedom, where k is the number
# of observed frequencies (4 here), so ddof=2 gives 4 - 1 - 2 = 1.
stats.chisquare(f_obs = observed_frequencies,
                f_exp = expected_frequencies, ddof=2)

In [None]:
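# With expected counts built from a callback rate of 0.087, the chi-square statistic is 18.324 and the p-value at 1 degree of freedom is 1.8634423521092039e-05, which is highly significant. We conclude that race plays a significant role in callbacks. Note that 0.087 is called_back / not_called; the statistic will differ if the pooled rate called_back / len(data) is used instead.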

In [None]:

n1 = len(whites)
n2 = len(blacks)
p1 = float(whites_called) / n1  # callback rate, white-sounding names
p2 = float(blacks_called) / n2  # callback rate, black-sounding names
print(n1)
print(n2)
print(p1)
print(p2)

In [None]:
# Pooled callback proportion across both groups (used for the standard error)
p = float(p1 * n1 + p2 * n2) / (n1 + n2)
print(p)

In [None]:
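# Standard error of the difference under the null, using the pooled proportion p
t = p * (1 - p)                  # pooled variance term
d = float(n1 + n2) / (n1 * n2)   # equals 1/n1 + 1/n2
print(d)
stderr = np.sqrt(t * d)
print(stderr)
# z statistic for the difference in callback rates (white minus black);
# p1 must stay the white-sounding rate and not be overwritten with the pooled p
z = (p1 - p2) / stderr
print(z)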

In [None]:

# Two-sided p-value from the standard normal distribution
p_values = stats.norm.sf(abs(z)) * 2
print(p_values)

In [None]:
# The p-value is below alpha (0.10), so we reject the null hypothesis. The data indicate that resumes with white-sounding names receive callbacks at a higher rate than resumes with black-sounding names.

In [None]:
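# 95% confidence interval for the difference in callback rates (white - black)
p_w, p_b = float(whites_called) / n1, float(blacks_called) / n2
se_diff = np.sqrt(p_w * (1 - p_w) / n1 + p_b * (1 - p_b) / n2)  # unpooled standard error
z_crit = 1.96  # critical value for alpha = 0.05
margin_of_error = z_crit * se_diff
print(margin_of_error)
confidence_interval = [p_w - p_b - margin_of_error, p_w - p_b + margin_of_error]
confidence_interval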

In [None]:
# The confidence interval above uses the 95% confidence level; equivalently, alpha, the error rate we are willing to accept, is 5%. Rejecting at alpha = 5% means that, if race truly had no effect, random variation alone would produce a difference at least this large less than 5% of the time.
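
The question of how often random variation alone would produce such a difference can also be explored by simulation. The sketch below uses a permutation test, a different technique from the z-test and chi-square test in this notebook, on hypothetical 0/1 outcomes; the group sizes, counts, and number of shuffles are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable

# Hypothetical outcomes (NOT the real data): 1 = callback, 0 = no callback
group_a = np.array([1] * 30 + [0] * 270)
group_b = np.array([1] * 18 + [0] * 282)
observed = group_a.mean() - group_b.mean()

# Under the null hypothesis the group labels are exchangeable, so shuffle
# the pooled outcomes and recompute the difference many times
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
n_perm = 5000
diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)
    diffs[i] = shuffled[:n_a].mean() - shuffled[n_a:].mean()

# Two-sided p-value: share of shuffles at least as extreme as the observed
# difference (small tolerance guards against floating-point ties)
p_perm = np.mean(np.abs(diffs) >= abs(observed) - 1e-12)
print(observed, p_perm)
```

Since the notebook's race labels were assigned at random, shuffling them mimics the null hypothesis directly, and the resulting p-value can be compared with the one from the normal approximation.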