
### Examining racial discrimination in the US job market

#### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

#### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes.

#### Exercise
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Discuss statistical significance.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [14]:
import pandas as pd
import numpy as np
from scipy import stats
import scipy.stats as sp
import statsmodels.stats.weightstats as sm

In [3]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [17]:
data.describe()

Unnamed: 0,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,...,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind
count,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,...,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0
mean,3.61848,3.661396,7.842916,0.052772,0.411499,0.097125,0.448049,215.637782,3.48152,0.559548,...,0.106776,0.437166,0.07269,0.082957,0.03039,0.08501,0.213963,0.267762,0.154825,0.165092
std,0.714997,1.219126,5.044612,0.223601,0.492156,0.296159,0.497345,148.127551,2.038036,0.496492,...,0.308866,0.496083,0.259649,0.275854,0.171677,0.278932,0.410141,0.442847,0.361773,0.371308
min,0.0,1.0,1.0,0.0,0.0,0.0,0.0,7.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,3.0,3.0,5.0,0.0,0.0,0.0,0.0,27.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,4.0,4.0,6.0,0.0,0.0,0.0,0.0,267.0,4.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,4.0,4.0,9.0,0.0,1.0,0.0,1.0,313.0,6.0,1.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
max,4.0,7.0,44.0,1.0,1.0,1.0,1.0,903.0,6.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [4]:
data1 = data[['race','call']]
data1.head()

Unnamed: 0,race,call
0,w,0
1,w,0
2,b,0
3,b,0
4,w,0


What test is appropriate for this problem? Does CLT apply?

In [5]:
# separate the two-column frame by the race signalled by the name
data1_b = data1[data1.race == 'b']
data1_w = data1[data1.race == 'w']

In [6]:
# count resumes by callback outcome, where 0: no callback and 1: callback
b_nocall = data1_b[data1_b.call == 0]
print("The number of black-sounding resumes that did not get a call for interview:", len(b_nocall.call))

b_call = data1_b[data1_b.call == 1]
print("The number of black-sounding resumes that got a call for interview:", len(b_call.call))

w_nocall = data1_w[data1_w.call == 0]
print("The number of white-sounding resumes that did not get a call for interview:", len(w_nocall.call))

w_call = data1_w[data1_w.call == 1]
print("The number of white-sounding resumes that got a call for interview:", len(w_call.call))

The number of black-sounding resumes that did not get a call for interview: 2278
The number of black-sounding resumes that got a call for interview: 157
The number of white-sounding resumes that did not get a call for interview: 2200
The number of white-sounding resumes that got a call for interview: 235


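Yes, the CLT applies. The outcome is binary, so the relevant check is the success/failure condition: each of the four counts above (callbacks and non-callbacks for each group) is well above 10. The observations can also be treated as independent because names were assigned to resumes at random. The appropriate test is a two-sample z-test for the difference between two proportions, which asks whether the callback rates for white-sounding and black-sounding names differ in the population.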
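The condition described above can be checked mechanically. The following is a minimal sketch that hard-codes the four counts printed in the previous output instead of re-reading the dataset; the helper name `clt_condition_holds` and the threshold of 10 (the usual rule of thumb) are assumptions made for illustration.

```python
# Counts copied from the output above (not recomputed from the .dta file).
counts = {
    "b_call": 157,
    "b_nocall": 2278,
    "w_call": 235,
    "w_nocall": 2200,
}


def clt_condition_holds(cell_counts, threshold=10):
    """Return True if every success/failure count is at least `threshold`."""
    return all(n >= threshold for n in cell_counts.values())


print("Smallest cell count:", min(counts.values()))
print("Success/failure condition met:", clt_condition_holds(counts))
```

The smallest cell is the number of callbacks for black-sounding names, so that is the count that determines whether the normal approximation is reasonable here.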

In [20]:
# calculating the callback proportion and group size for each name type
p_b_call = len(b_call.call)/(len(b_call.call)+ len(b_nocall.call))
p_b_nocall = 1 - p_b_call
n_b = len(b_call.call)+ len(b_nocall.call)
print("The proportion of black-sounding resumes that got an interview call:", round(p_b_call,2))
print("The proportion of black-sounding resumes that did not get an interview call:",round(p_b_nocall,2) )
print("The number of black-sounding resumes:",n_b)

p_w_call = len(w_call.call)/(len(w_call.call)+ len(w_nocall.call))
p_w_nocall = 1 - p_w_call
n_w = len(w_call.call)+ len(w_nocall.call)
print("The proportion of white-sounding resumes that got an interview call:", round(p_w_call,2))
print("The proportion of white-sounding resumes that did not get an interview call:", round(p_w_nocall,2))
print("The number of white-sounding resumes:",n_w)

# pooled callback proportion across both groups
p_call = (n_b*p_b_call + n_w*p_w_call)/(n_b + n_w)
print("The pooled proportion of interview call:", round(p_call,2))

The proportion of black-sounding resumes that got an interview call: 0.06
The proportion of black-sounding resumes that did not get an interview call: 0.94
The number of black-sounding resumes: 2435
The proportion of white-sounding resumes that got an interview call: 0.1
The proportion of white-sounding resumes that did not get an interview call: 0.9
The number of white-sounding resumes: 2435
The pooled proportion of interview call: 0.08


What are the null and alternate hypotheses?
Compute margin of error, confidence interval, and p-value.
Discuss statistical significance

The question is whether the race signalled by the name affects the callback rate for resumes. Writing p_w and p_b for the population callback proportions for white-sounding and black-sounding names, the hypotheses are:

Null hypothesis: p_w = p_b, i.e. the callback proportion is the same for white-sounding and black-sounding names in the population.

Alternative hypothesis: p_w ≠ p_b, i.e. the callback proportion differs between white-sounding and black-sounding names in the population (two-tailed).

Compute margin of error, confidence interval

Margin of error: z(alpha = 5%)*SE

SE = sqrt(p_b_call*(1-p_b_call)/n_b + p_w_call*(1-p_w_call)/n_w)

z(alpha = 5%) = 1.96 (two-tail test)

CI = ((p_w_call - p_b_call)- me,(p_w_call - p_b_call)+ me)  

In [19]:
se = np.sqrt((p_b_call*(1-p_b_call))/n_b + (p_w_call*(1-p_w_call))/n_w)
print('Standard Error is:',round(se,4))

me = 1.96 * se
print('At 95% level of CI Margin of Error:',round(me,4))

CI = ((p_w_call - p_b_call)- me,(p_w_call - p_b_call)+ me)
print('Confidence Interval:',CI)

Standard Error is: 0.0078
At 95% level of CI Margin of Error: 0.0153
Confidence Interval: (0.016777447859559147, 0.047288260559332024)


For the hypothesis test, the standard error differs from the one used for the confidence interval. Under the null hypothesis the two groups share a single callback proportion, so the standard error is built from the pooled proportion rather than from the two separate sample proportions. The pooled proportion was calculated earlier as p_call = 0.08 (rounded).

In [21]:
# pooled standard error under the null hypothesis (note: SE, not the unpooled se)
SE = np.sqrt((p_call*(1-p_call))/n_b + (p_call*(1-p_call))/n_w)
print('Standard Error is:',round(SE,4))

z1 = (p_w_call - p_b_call)/SE
print('Z-statistic:',round(z1,2))


Standard Error is: 0.0078
Z-statistic: 4.11
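Both the unpooled and the pooled standard error print as 0.0078 at four decimals, so the rounded output alone cannot show that they differ. The sketch below recomputes both at higher precision from the counts reported earlier (157 and 235 callbacks out of 2435 resumes each). It uses only the standard library, and the names `se_unpooled` and `se_pooled` are introduced here for illustration.

```python
import math

# Counts taken from the outputs above: callbacks and group sizes.
calls_b, calls_w = 157, 235
n_b, n_w = 2435, 2435

p_b = calls_b / n_b
p_w = calls_w / n_w
p_pool = (calls_b + calls_w) / (n_b + n_w)

# Unpooled SE: uses each group's own proportion (confidence interval).
se_unpooled = math.sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)

# Pooled SE: uses the common proportion assumed under the null (test statistic).
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))

print("Unpooled SE:", round(se_unpooled, 6))
print("Pooled SE:  ", round(se_pooled, 6))
print("Z-statistic:", round((p_w - p_b) / se_pooled, 2))
```

The two values differ only slightly because the group sizes are equal and the two proportions are fairly close, which is why the choice has little effect on the z-statistic in this dataset.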


The z-statistic of 4.11 is well above the critical value of 1.96 for a two-tailed test at the 5% level, so the p-value is below the significance level of 0.05 and the null hypothesis is rejected. This agrees with the 95% confidence interval for the difference in proportions, which does not contain zero. The data therefore indicate a statistically significant difference in callback rates between resumes with white-sounding names and resumes with black-sounding names.
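The conclusion above compares the z-statistic with the critical value, but the exercise also asks for the p-value itself. A minimal sketch of that calculation follows, again starting from the counts reported earlier. It assumes a two-sided test and uses the standard-library identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal variable; `p_value` and `alpha` are names introduced here.

```python
import math

# Counts taken from the outputs above.
calls_b, calls_w = 157, 235
n_b, n_w = 2435, 2435

p_b, p_w = calls_b / n_b, calls_w / n_w
p_pool = (calls_b + calls_w) / (n_b + n_w)

# Pooled standard error and z-statistic under the null hypothesis.
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))
z = (p_w - p_b) / se_pooled

# Two-sided p-value for a standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05
print("Z-statistic:", round(z, 2))
print("Two-sided p-value:", p_value)
print("Reject the null at alpha = 0.05:", p_value < alpha)
```

With the `stats` module already imported at the top of the notebook, the same tail probability should also be obtainable as `2 * stats.norm.sf(abs(z))`.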