# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

## Exercises

In [3]:
import pandas as pd
import numpy as np
from scipy import stats

In [4]:
# load the Stata file through the top-level pandas reader
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [5]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

### Q1: What test is appropriate for this problem? Does CLT apply?
- This problem asks whether race has a significant impact on the callback rate for resumes, that is, whether two categorical variables (race and callback) are related, so **Pearson's $\chi^2$ test** of independence is appropriate.
- The CLT applies. The sample is a sufficiently large set of independent, identically distributed random variables: each data point is independent of the others and drawn from the same probability distribution.

In [7]:
# check sample size
print('sample size is ',data.shape[0])

sample size is  4870


### Q2: What are the null and alternate hypotheses?
- Null hypothesis: race and callback rate are independent.
- Alternative hypothesis: race and callback rate are not independent (they are related).

### Q3: Compute margin of error, confidence interval, and p-value.


In [6]:
def summarize(x):
    mu, s = x.mean(), x.std()
    # standard error of the mean; a margin of error is this value times a critical z
    sem = stats.sem(x)
    # 90% confidence interval from the normal approximation
    CI = stats.norm.interval(0.90, mu, sem)
    print('Sample mean:             {mu:0.2f}\n'
          'Sample variance:         {v:0.3f}\n'
          'Standard error of mean:  {sem:0.4f}\n'
          '90% confidence interval: {CI}'
          .format(mu=mu, v=s**2, sem=sem, CI=CI))

In [7]:
summarize(data[data.race=='b'].call)

Sample mean:             0.06
Sample variance:         0.060
Standard error of mean:  0.0050
90% confidence interval: (0.05628806842760025, 0.07266470299764693)


In [8]:
summarize(data[data.race=='w'].call)

Sample mean:             0.10
Sample variance:         0.087
Standard error of mean:  0.0060
90% confidence interval: (0.08666429585560362, 0.10635418527976473)
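The margin of error at a given confidence level is the critical value multiplied by the standard error (the quantity returned by `stats.sem`). The following is a minimal sketch, assuming the callback counts that appear in this notebook's contingency table (157 of 2435 for 'b', 235 of 2435 for 'w') and rebuilding the 0/1 indicators by hand instead of loading the Stata file. The variable names are illustrative, and the interval for the difference assumes the two groups are independent samples.

```python
import numpy as np
from scipy import stats

# Rebuild 0/1 callback indicators from the counts reported in this notebook.
calls_b = np.r_[np.ones(157), np.zeros(2278)]
calls_w = np.r_[np.ones(235), np.zeros(2200)]

# Critical z value for a two-sided 90% interval.
z_crit = stats.norm.ppf(0.95)

def margin_of_error(x, z=z_crit):
    """Half-width of the normal-approximation confidence interval."""
    return z * stats.sem(x)

moe_b = margin_of_error(calls_b)
moe_w = margin_of_error(calls_w)

# Difference in callback rates, treating the two groups as independent samples.
diff = calls_w.mean() - calls_b.mean()
se_diff = np.sqrt(stats.sem(calls_w) ** 2 + stats.sem(calls_b) ** 2)
moe_diff = z_crit * se_diff

print('Margin of error (b):    {:0.4f}'.format(moe_b))
print('Margin of error (w):    {:0.4f}'.format(moe_w))
print('Difference in rates:    {:0.4f} +/- {:0.4f}'.format(diff, moe_diff))
```

Each margin of error should equal half the width of the corresponding 90% confidence interval printed by `summarize`.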


To get the p-value from **Pearson's $\chi^2$ test**, first compute the contingency table:

In [9]:
# cross-tabulate race against whether the resume received a callback
contingency_table = pd.crosstab(data.race, data.call == 1.0)
contingency_table

| race | call = False | call = True |
|------|--------------|-------------|
| b    | 2278         | 157         |
| w    | 2200         | 235         |


According to the [Wikipedia article](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Problems), the $\chi^2$ test

>will normally be acceptable so long as no more than 20% of the events have expected frequencies below 5. Where there is only 1 degree of freedom, the approximation is not reliable if expected frequencies are below 10.

For this test, the expected frequencies are all greater than 10, so the central limit theorem applies and we can use the approximation. Setting the threshold value $\alpha = 0.05$, the test gives:

In [11]:
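def chi2_test(x):
    # for a 2x2 table (1 degree of freedom), chi2_contingency applies
    # Yates' continuity correction by default
    chi2, p, dof, expected = stats.chi2_contingency(x)
    print('Test statistic:     {chi2:0.2f}\n'
          'Degrees of freedom: {dof}\n'
          'p value:            {p:0.2e}'.format(chi2=chi2, p=p, dof=dof))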

In [13]:
chi2_test(contingency_table)

Test statistic:     16.45
Degrees of freedom: 1
p value:            5.00e-05


Since $p < \alpha$, reject the null hypothesis.
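The expected-frequency condition quoted earlier can also be checked directly. The following is a minimal sketch that re-enters the observed counts from the contingency table by hand (rather than loading the Stata file) and inspects the `expected` array returned by `stats.chi2_contingency`; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Observed counts from the contingency table: rows are race (b, w),
# columns are callback (False, True).
observed = np.array([[2278, 157],
                     [2200, 235]])

chi2, p, dof, expected = stats.chi2_contingency(observed)

# Expected counts under independence: row total * column total / grand total.
print(expected)
print('Smallest expected count:', expected.min())
print('All expected counts >= 10:', bool((expected >= 10).all()))
```

Because both groups contain the same number of resumes, the expected counts are identical across the two rows.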

To quantify the effect size of the observed dependence, calculate the odds ratio:

In [14]:
odds_ratio = (235/2200) / (157/2278)
print('Odds ratio: {odds_ratio:0.2f}'.format(odds_ratio=odds_ratio))

Odds ratio: 1.55


This means the odds that a person with a 'white' name will receive a callback are 1.55 times the odds for a person with a 'black' name.
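A point estimate alone does not show how precisely the odds ratio is known. As a rough sketch, a 95% confidence interval can be attached with the standard large-sample log method, in which the standard error of the log odds ratio is $\sqrt{1/a + 1/b + 1/c + 1/d}$ for the four cell counts. The counts below are re-entered by hand from the contingency table, and the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Cell counts from the contingency table.
a, b = 235, 2200   # white-sounding names: callbacks, no callbacks
c, d = 157, 2278   # black-sounding names: callbacks, no callbacks

odds_ratio = (a / b) / (c / d)

# Large-sample standard error of log(odds ratio).
se_log_or = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Two-sided 95% interval on the log scale, mapped back with exp.
z_crit = stats.norm.ppf(0.975)
ci_low, ci_high = np.exp(np.log(odds_ratio) + np.array([-1, 1]) * z_crit * se_log_or)

print('Odds ratio: {:0.2f}'.format(odds_ratio))
print('95% confidence interval: ({:0.2f}, {:0.2f})'.format(ci_low, ci_high))
```

An interval that excludes 1 is consistent with rejecting the null hypothesis of independence.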

### Q4: Write a story describing the statistical significance in the context of the original problem.

The p-value is the probability of obtaining a result at least as extreme as the one observed, given that the null hypothesis is true. In this case, the p-value is $5.00 \times 10^{-5}$, which is far smaller than the significance level we set ($\alpha = 0.05$). We therefore reject the null hypothesis and conclude that there is a statistically significant relationship between race and callback rate.

### Q5: Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

No, this test alone is not enough to conclude that race/name is the most important factor in callback success. In fact, the test does not measure the importance of race/name relative to other factors at all; it indicates only that there is a significant relationship between race/name and callback rate.

If the goal is to find the most important factor in callback success, the problem becomes much more complicated. One approach is to build a model that predicts callback success (e.g. logistic regression) and then use the fitted model to assess the importance of each of its predictors. Other approaches include traditional statistical analyses such as predictive factor analysis.
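As an illustration of the modeling approach, the following is a minimal sketch of a logistic regression fitted by maximum likelihood with `scipy.optimize.minimize`. It is not the analysis performed in this notebook: it uses race as the only predictor, with the data rebuilt from the contingency-table counts, and all names are illustrative. A real analysis would add the other resume attributes as extra columns of `X` and would more likely use a dedicated modeling library.

```python
import numpy as np
from scipy.optimize import minimize

# Counts from the contingency table, keyed by (race, callback).
counts = {('b', 0): 2278, ('b', 1): 157, ('w', 0): 2200, ('w', 1): 235}

# Expand to one row per resume: x = 1 for a white-sounding name, y = callback.
x = np.concatenate([np.full(n, race == 'w', dtype=float) for (race, _), n in counts.items()])
y = np.concatenate([np.full(n, call, dtype=float) for (_, call), n in counts.items()])
X = np.column_stack([np.ones_like(x), x])  # intercept column plus race indicator

def neg_log_likelihood(beta):
    # Mean negative log-likelihood of the logistic model.
    z = X @ beta
    return np.mean(np.logaddexp(0.0, z) - y * z)

def gradient(beta):
    # Gradient of the mean negative log-likelihood.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (p - y) / len(y)

fit = minimize(neg_log_likelihood, x0=np.zeros(2), jac=gradient, method='BFGS')
intercept, race_coef = fit.x

print('exp(race coefficient): {:0.2f}'.format(np.exp(race_coef)))
```

With a single binary predictor, the exponentiated coefficient is the sample odds ratio, so this model adds nothing beyond the test above; comparing coefficients becomes informative only once further predictors are included.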