# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating a black-sounding or white-sounding name, respectively. The 'call' column has two values, 1 and 0, indicating whether or not the resume received a call from an employer.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
# Load the Stata file with the public pandas reader
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)


157.0

In [4]:
data.head()

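| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |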


## 1. What test is appropriate for this problem? Does CLT apply?

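For the CLT to apply:

   1. The sample must be large (typically n > 30); for a proportion, each group should also contain at least about 10 successes and 10 failures.
   2. Observations must be independent of one another.
   3. The two groups being compared must be independent of each other.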

In [5]:
data.shape

(4870, 65)

We can assume that the CLT holds: the data set is large (n = 4870, split equally between the two groups), and because names were randomly assigned to resumes, the observations and the two groups can be treated as independent.
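The large-sample condition can be checked with a few lines of arithmetic. The sketch below assumes the counts reported in this notebook: 157 callbacks for black-sounding names, a 9.65% callback rate for white-sounding names (which corresponds to 235 callbacks), and 4870 resumes split equally between the groups. The helper name `clt_conditions_ok` and the threshold of 10 successes and 10 failures are illustrative conventions, not part of the original analysis.

```python
def clt_conditions_ok(successes, n, min_count=10):
    """Rule-of-thumb check: at least `min_count` successes and failures."""
    failures = n - successes
    return successes >= min_count and failures >= min_count

# Counts assumed from the notebook: 4870 resumes split equally by name type.
n_per_group = 4870 // 2
callbacks = {"b": 157, "w": 235}

for race, k in callbacks.items():
    # Prints the group, its callbacks, its non-callbacks, and the check result
    print(race, k, n_per_group - k, clt_conditions_ok(k, n_per_group))
```

Both groups clear the threshold by a wide margin, so a normal approximation to the sampling distribution of each callback rate seems reasonable here.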

A two-sample t-test is appropriate for comparing the callback rates of resumes with black-sounding and white-sounding names. Each resume either receives a callback or does not, so each outcome can be modeled as a Bernoulli trial, and the callback rate of a group is the sample mean of its 0/1 outcomes.<br>

$\bar{x}_b$ : sample callback rate mean for blacks<br>
$\bar{x}_w$ : sample callback rate mean for whites<br><br>

$$t = \frac{\bar{x}_b - \bar{x}_w}{\sqrt{\frac{s^2_b}{n_b}+{\frac{s^2_w}{n_w}}}}$$
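As a worked instance of the formula above, the following sketch computes the t statistic directly from group counts instead of from the raw data. It assumes 157 callbacks out of 2435 resumes for black-sounding names and 235 out of 2435 for white-sounding names (counts inferred from the figures in this notebook); the function name `two_sample_t` is hypothetical.

```python
import math

def two_sample_t(k_b, n_b, k_w, n_w):
    """t statistic for the difference in means of two 0/1 samples."""
    mean_b, mean_w = k_b / n_b, k_w / n_w
    # Unbiased sample variance of a 0/1 variable: n/(n-1) * p * (1-p)
    var_b = n_b / (n_b - 1) * mean_b * (1 - mean_b)
    var_w = n_w / (n_w - 1) * mean_w * (1 - mean_w)
    return (mean_b - mean_w) / math.sqrt(var_b / n_b + var_w / n_w)

t_stat = two_sample_t(157, 2435, 235, 2435)
print(round(t_stat, 3))  # about -4.115
```

Because the two groups are the same size, this unpooled statistic coincides with the pooled-variance statistic that `stats.ttest_ind` computes by default.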

## 2. What are the null and alternate hypotheses?

$\mu_{b}$ : population callback rate mean for blacks<br>
$\mu_{w}$ : population callback rate mean for whites<br><br>

$h_{0} : \mu_{b} = \mu_{w}$ The population callback rates for blacks and whites are equal<br>

$h_{a} : \mu_{b} \neq \mu_{w}$ The population callback rates for blacks and whites are different

## 3. Compute margin of error, confidence interval, and p-value.

__Standard Error for Difference of Means:__
$$ \sigma_{\bar{x}} = \sqrt{\frac{s^2_b}{n_b} + \frac{s^2_w}{n_w}}$$

__Margin of Error:__
$$ME = z * \sigma_{\bar{x}}$$

__Confidence Interval:__
$$CI = (\bar{x}_w - \bar{x}_b) \pm z * \sigma_{\bar{x}}$$

__Bernoulli Distribution Variance:__
$$ s^2 = p*(1-p)$$

In [6]:
# Callback indicators (1 = callback, 0 = none) for each group
blacks_calls = data.call.loc[data.race=='b']
whites_calls = data.call.loc[data.race=='w']

# Group sizes and callback rates
blacks_n = len(blacks_calls)
whites_n = len(whites_calls)
blacks_rate = blacks_calls.sum()/blacks_n
whites_rate = whites_calls.sum()/whites_n

# Bernoulli variance p*(1-p) for each group
bernoulli_var = lambda p: p*(1-p)
blacks_var = bernoulli_var(blacks_rate)
whites_var = bernoulli_var(whites_rate)

# Two-sample t-test on the raw 0/1 outcomes
tstat, pvalue = stats.ttest_ind(blacks_calls, whites_calls)

# Standard error of the difference in callback rates
diff_standard_error = (blacks_var/blacks_n + whites_var/whites_n)**.5
# Margin of error at the 95% level (z = 1.96)
margin_of_error = diff_standard_error * 1.96
# Difference between white and black callback rates
diff_of_means = whites_rate - blacks_rate
print('Standard error: {:.4f}'.format(diff_standard_error))
print('Margin of error: {:.4f}'.format(margin_of_error))
print("Callback rate difference with 95% CI: {:.4f} +/- {:.4f}".format(diff_of_means, margin_of_error))
print('t-stat: {}; pvalue {}'.format(tstat,pvalue))

Standard error: 0.0078
Margin of error: 0.0153
Callback rate difference with 95% CI: 0.0320 +/- 0.0153
t-stat: -4.114705290861751; pvalue 3.940802103128886e-05


As p < .05, the null hypothesis is rejected at the 5% significance level. This means that there is a statistically significant difference in the callback rate for blacks and whites. This can also be seen in the confidence interval, as the lower bound of the difference is still greater than 0.

## 4. Write a story describing the statistical significance in the context of the original problem.

This dataset records whether each resume sent to an employer received a callback, along with whether the resume carried a white-sounding or a black-sounding name. The sample contains 4870 resumes split equally between black-sounding and white-sounding names. Black-sounding names had a callback rate of 6.45%, while white-sounding names had a callback rate of 9.65%. To determine whether the difference in callback rates is statistically significant, a two-sample $t$-test was used.

__Null Hypothesis:__
The callback rates of black-sounding and white-sounding names are equal ($h_{0} : \mu_{b} = \mu_{w}$)<br>
__Alternate Hypothesis:__
The callback rates of black-sounding and white-sounding names are different ($h_{a} : \mu_{b} \neq \mu_{w}$)

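The difference in callback rates was found to be statistically significant (p < .05), so the null hypothesis is rejected in favor of the alternate hypothesis at the .05 significance level. In practical terms, resumes with white-sounding names received roughly 50% more callbacks than otherwise identical resumes with black-sounding names.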
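As a cross-check on the t-test, a two-proportion z-test is another common way to compare two rates. The sketch below is an alternative, not the method used in this notebook; it assumes 235 callbacks out of 2435 resumes for white-sounding names and 157 out of 2435 for black-sounding names, and the function name `two_proportion_z_test` is hypothetical.

```python
import math

def two_proportion_z_test(k1, n1, k2, n2):
    """Two-sided z-test for equality of two proportions (pooled SE)."""
    p1, p2 = k1 / n1, k2 / n2
    # Under the null hypothesis both groups share one callback rate
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: P(|Z| > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z_test(235, 2435, 157, 2435)
print("z = {:.2f}, p = {:.1e}".format(z, p_value))
```

With samples this large, the z statistic should land very close to the magnitude of the t statistic and lead to the same conclusion.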

## 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

Not necessarily. The test shows a statistically significant difference in callback rates between the two types of names, but it does not compare the size of that effect with the effects of other factors in the dataset, such as gender, zip code, years of experience, or military service. Because names were assigned randomly, those factors should be balanced across the two groups, yet any of them may still influence callbacks more strongly than the name does. Separate analyses could be run to determine whether callback rates differ by each of those factors. A regularized logistic regression could also be fit to estimate the relative weights of all of the features in the dataset.
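To illustrate how a regularized logistic regression weighs several factors at once, here is a minimal sketch in plain Python. The grouped counts are made up for illustration and are NOT taken from the study; the two features (`white_name`, `honors`), the function name `fit_logistic_l2`, and the penalty and learning-rate values are all assumptions. A real analysis would more likely fit the model on the full dataset with a statistics library.

```python
import math

# Hypothetical grouped data, NOT taken from the study:
# (white_name, honors) -> (number of resumes, number of callbacks)
cells = {
    (0, 0): (200, 10),
    (1, 0): (200, 20),
    (0, 1): (200, 30),
    (1, 1): (200, 50),
}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_l2(cells, l2=0.001, lr=1.0, n_iter=5000):
    """Gradient ascent on the L2-penalized log-likelihood (intercept unpenalized)."""
    total = sum(n for n, _ in cells.values())
    n_features = len(next(iter(cells)))
    intercept, weights = 0.0, [0.0] * n_features
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * n_features
        for x, (n, k) in cells.items():
            p = sigmoid(intercept + sum(w * xi for w, xi in zip(weights, x)))
            resid = k - n * p  # observed minus expected callbacks in this cell
            grad_b += resid
            for j, xi in enumerate(x):
                grad_w[j] += resid * xi
        intercept += lr * grad_b / total
        weights = [w + lr * (g / total - l2 * w) for w, g in zip(weights, grad_w)]
    return intercept, weights

intercept, (w_name, w_honors) = fit_logistic_l2(cells)
print("name: {:.2f}, honors: {:.2f}".format(w_name, w_honors))
```

In this toy data the `honors` weight comes out larger than the `white_name` weight, which shows how a feature can have a significant effect without being the most important one.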