# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
****

In [2]:
import pandas as pd
import numpy as np
from scipy import stats

In [3]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


How is data divided by race?

In [5]:
'Black applicants: {:4.1f}%'.format((sum(data.race=='b') / len(data.race)) * 100)

'Black applicants: 50.0%'

50% Black applicants, 50% White applicants

Number of callbacks for black-sounding names.

In [6]:
callb = sum(data[data.race=='b'].call)
'Black callbacks: {:2.0f}'.format(callb)

'Black callbacks: 157'

Number of callbacks for white-sounding names.

In [7]:
callw = sum(data[data.race=='w'].call)
'White callbacks: {:2.0f}'.format(callw)

'White callbacks: 235'

Total number of callbacks

In [9]:
ncall = sum(data.call)
'Total callbacks: {:2.0f}'.format(ncall)

'Total callbacks: 392'
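The per-race counts and the total computed cell by cell above could also be gathered in one pass with a `groupby`. The following is a minimal sketch on a small made-up frame; the six rows and the names `toy` and `summary` are illustrative only and are not taken from the study data.

```python
import pandas as pd

# Toy stand-in for the real DataFrame: six made-up rows, NOT the study data.
toy = pd.DataFrame({
    'race': ['b', 'b', 'b', 'w', 'w', 'w'],
    'call': [0, 1, 0, 1, 1, 0],
})

# One row per race: callbacks received, resumes sent, and callback rate.
summary = toy.groupby('race')['call'].agg(
    callbacks='sum', resumes='count', rate='mean'
)
print(summary)
```

In the notebook itself, replacing `toy` with `data` should give the same kind of table for the real resumes.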

Proportion of black callbacks

In [10]:
pcallb = callb / ncall
'Proportion of black callbacks: {:4.2f}'.format(pcallb)

'Proportion of black callbacks: 0.40'

In [11]:
pcallw = callw / ncall
'Proportion of white callbacks: {:4.2f}'.format(pcallw)

'Proportion of white callbacks: 0.60'

In [12]:
'White callbacks: {:2.0f}% greater than black callbacks'.format(((pcallw - pcallb) / pcallb) * 100)

'White callbacks: 50% greater than black callbacks'
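Another common way to compare the two groups is a two-proportion z-test on the callback rate per resume, rather than on each group's share of all callbacks. The sketch below is one possible version, not the method used in the rest of this notebook. The helper `two_proportion_ztest` is a hypothetical name, and the group size of 2,435 is an assumption based on the published study design (4,870 resumes split evenly); it does not appear in the output above.

```python
import numpy as np
from scipy import stats


def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * stats.norm.sf(abs(z))  # area in both tails
    return z, p_value


# Callback counts are from the cells above. The group size is an ASSUMPTION;
# in the notebook, use sum(data.race == 'b') and sum(data.race == 'w') instead.
n_per_group = 2435
z, p_value = two_proportion_ztest(157, n_per_group, 235, n_per_group)
print('z = {:.2f}, two-sided p-value = {:.1g}'.format(z, p_value))
```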

If names carried no racial signal, each callback would be equally likely to go to a black-sounding or a
white-sounding name, so the number of black callbacks among N total callbacks would follow a binomial
distribution with p = 0.5. By the Central Limit Theorem, for a sufficiently large number of observations
(N, say greater than 20), this is approximately a normal distribution with a mean of
__p\*N__ and a standard deviation of __sqrt(N\*p\*(1 - p))__.

The null hypothesis is that there is no racism, so the proportion of black callbacks should be 50%.
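Whether the normal approximation is adequate here can be checked against the exact binomial tail. This is a minimal sketch, assuming the callback totals reported above (157 of 392). The variable names are illustrative, and the 0.5 continuity correction is an addition that is not used elsewhere in this notebook.

```python
import numpy as np
from scipy import stats

n_calls, black_calls = 392, 157  # totals reported in the cells above

# Exact lower-tail probability P(X <= 157) for X ~ Binomial(392, 0.5).
p_exact = stats.binom.cdf(black_calls, n_calls, 0.5)

# Normal approximation with the same mean and standard deviation,
# shifted by a continuity correction of 0.5.
mu = n_calls * 0.5
sigma = np.sqrt(n_calls * 0.5 * 0.5)
p_normal = stats.norm.cdf(black_calls + 0.5, mu, sigma)

print('exact: {:.2g}, normal approximation: {:.2g}'.format(p_exact, p_normal))
```

If the two tail probabilities are of the same order, the normal approximation is a reasonable stand-in for the binomial at this sample size.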

In [13]:
mu = ncall * 0.5
'Mean of null hypothesis: {:5.1f}'.format(mu)

'Mean of null hypothesis: 196.0'

In [14]:
std = 0.5 * np.sqrt(ncall)
'Standard deviation of null hypothesis: {:3.1f}'.format(std)

'Standard deviation of null hypothesis: 9.9'
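The null mean and standard deviation can also be sanity-checked by simulation. The sketch below draws many hypothetical callback splits under the null hypothesis; the seed, the number of draws, and the name `sims` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Draw many callback splits under the null: 392 callbacks, each equally
# likely to go to a black-sounding or a white-sounding name.
sims = rng.binomial(n=392, p=0.5, size=200_000)

print('simulated mean: {:.1f}'.format(sims.mean()))
print('simulated std:  {:.1f}'.format(sims.std()))
print('share of draws with <= 157 black callbacks: {:.5f}'.format(np.mean(sims <= 157)))
```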

The actual number of black callbacks lies far from the mean we would expect if that number were drawn from a
normal distribution with the null-hypothesis mean and standard deviation.

In [15]:
'Number of black callbacks lies {:3.1f} standard deviations from expected mean.'.format((callb - mu) / std)

'Number of black callbacks lies -3.9 standard deviations from expected mean.'

Assuming the null hypothesis, the probability that the number of black callbacks would exceed the number observed
is extremely large.

In [16]:
# Upper-tail probability: use the survival function (1 - cdf), not the pdf.
'The probability that the number of black callbacks is greater than {:3.0f} assuming null hypothesis: {:6.3f}%'.format(callb, stats.norm.sf(callb, mu, std) * 100)

'The probability that the number of black callbacks is greater than 157 assuming null hypothesis: 99.996%'

In [17]:
# std is already the standard deviation of the callback count, so it is used as the scale directly.
lolim, hilim = stats.norm.interval(0.95, loc=mu, scale=std)
'The 95% interval for the count under the null hypothesis: {:5.1f}, {:5.1f}'.format(lolim, hilim)

'The 95% interval for the count under the null hypothesis: 176.6, 215.4'

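The standard error of the observed proportion is __sqrt(p\*(1-p)/n)__, where p is the observed proportion of black callbacks and n is the total number of callbacks.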

In [18]:
se = np.sqrt((pcallb * pcallw) / ncall)
'Standard error: {:3.1f}%'.format(se *100)

'Standard error: 2.5%'

The z-score of the data is __(p_obs - p_null) / std_err__.
The p-value is the area under the standard normal distribution's lower tail up to the z-score (a one-tailed test).

In [155]:
z_score = (pcallb - 0.5) / se
p_score = stats.norm.cdf(z_score)
'Z_score of {:3.1f} yields a p_score of {:3.1g}'.format(z_score, p_score)

'Z_score of -4.0 yields a p_score of 3e-05'
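The value reported above is a one-tailed probability. If the alternative hypothesis is simply that the black share of callbacks differs from 50% in either direction, a two-sided p-value is the usual choice. Here is a minimal sketch that recomputes the z-score from the counts shown earlier; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

black_calls, white_calls = 157, 235  # counts reported in the cells above
n_calls = black_calls + white_calls

p_obs = black_calls / n_calls
se = np.sqrt(p_obs * (1 - p_obs) / n_calls)
z = (p_obs - 0.5) / se

p_one_sided = stats.norm.cdf(z)          # lower tail only
p_two_sided = 2 * stats.norm.sf(abs(z))  # both tails

print('one-sided: {:.1g}, two-sided: {:.1g}'.format(p_one_sided, p_two_sided))
```

Because the normal distribution is symmetric, the two-sided value is exactly twice the one-sided value; the conclusion does not change here.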

The margin of error is the critical z-value for the confidence level (1.96 for 95%) multiplied by the standard error.

In [156]:
moe = 1.96 * se
'Margin of error at 95% confidence level: {:3.1f}%'.format(moe * 100)

'Margin of error at 95% confidence level: 4.9%'
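The margin of error translates directly into a confidence interval around the observed proportion. The sketch below assumes the counts reported earlier and takes the critical value from `stats.norm.ppf` instead of hard-coding 1.96; the names `ci_low` and `ci_high` are illustrative.

```python
import numpy as np
from scipy import stats

black_calls, n_calls = 157, 392  # counts reported in the cells above

p_obs = black_calls / n_calls
se = np.sqrt(p_obs * (1 - p_obs) / n_calls)

z_crit = stats.norm.ppf(0.975)  # about 1.96 for a 95% two-sided interval
moe = z_crit * se
ci_low, ci_high = p_obs - moe, p_obs + moe

print('95% CI for the black share of callbacks: {:.1f}% to {:.1f}%'.format(
    ci_low * 100, ci_high * 100))
```

If the interval excludes 50%, the data are inconsistent with the null hypothesis at the 5% level.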

At the high end of the MoE, the proportion of black callbacks is about 45%.

In [157]:
z_score = (pcallb + moe - 0.5) / se
p_score = stats.norm.cdf(z_score)
'Z_score of {:3.1f} yields a p_score of {:3.1g}'.format(z_score, p_score)

'Z_score of -2.1 yields a p_score of 0.02'

The p-value here (about 0.02) would not reject the null hypothesis at a strict 1% level, so at the high end of the
MoE the evidence is weaker. On the low side of the MoE, the observed proportion becomes extremely unlikely under
the null hypothesis. The null hypothesis that there is no racism in hiring seems improbable based on the
'names' data. Testing the other variables for