# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
</div>
****

In [29]:
import pandas as pd
import numpy as np
from scipy import stats
from statsmodels.stats import proportion
import matplotlib.pyplot as plt
import seaborn as sns

sns.set()


In [30]:
df = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [31]:
# number of callbacks for white-sounding names
sum(df[df.race=='w'].call)

235.0

In [32]:
df.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

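Q1. The Central Limit Theorem applies because n is sufficiently large (> 30) and the observations are independent (as far as we know). Each callback is a Bernoulli trial, so the individual 0/1 outcomes are not themselves normally distributed. With samples this large, however, the sampling distribution of each callback proportion, and of the difference between the two proportions, is approximately normal.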

The hypothesis can be tested using a z test. 
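
One common way to sanity-check the normal approximation for a proportion is the success-failure rule of thumb: each group should contain at least about 10 successes and 10 failures. The sketch below illustrates that check; `normal_approx_ok` is a hypothetical helper, the threshold of 10 is a convention rather than a hard rule, and the counts are made up for illustration and are not taken from the dataset.

```python
def normal_approx_ok(successes, n, threshold=10):
    """Return True when both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


# Hypothetical counts, for illustration only (not taken from the dataset).
print(normal_approx_ok(successes=120, n=1000))  # 120 successes, 880 failures
print(normal_approx_ok(successes=3, n=1000))    # only 3 successes
```

In the notebook, the same check could be applied to each group's callback count and sample size before running the z test.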


Q2. Null Hypothesis: the proportion of callbacks for people with African-American-sounding names is equal to the proportion of callbacks for people with white-sounding names.

Alternate Hypothesis: the proportion of callbacks for people with African-American-sounding names is NOT equal to the proportion of callbacks for people with white-sounding names.




In [38]:
white_call = df.call[df.race=='w']
black_call = df.call[df.race=='b']

In [34]:
# Your solution to Q3 here

<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

In [39]:
len(white_call) == len(black_call) # Check if we have an equal number of observations for each race

True

In [40]:
# inputs for proportions for white and black applicants, respectively 
s_w = white_call.sum()
n_w = int(len(white_call))

s_b  = black_call.sum()
n_b = int(len(black_call))

# Proportions

p_w = s_w /n_w

p_b = s_b / n_b

p_diff = p_w - p_b

# Margin of error
# Z critical of 95% confidence interval is 1.96

moe = np.sqrt(p_w * (1 - p_w)/n_w + p_b * (1 - p_b)/n_b) * 1.96

# Confidence interval for p_w - p_b (same direction as p_diff)

conf_lower, conf_upper = np.array([p_diff - moe, p_diff + moe])

# Z Test

z, p = proportion.proportions_ztest(np.array([s_w, s_b]), np.array([n_w, n_b]), alternative = 'two-sided')

print('Z = ' + str(z))
print('P = ' + str(p))


Z = 4.10841215243
P = 3.98388683759e-05
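
As a cross-check on the library call above, the pooled two-proportion z statistic can also be computed by hand. The following is a minimal sketch using only the standard library. `two_proportion_ztest` is a hypothetical helper name, the counts are made-up numbers rather than values from the dataset, and the sketch assumes the pooled-variance form of the test, which we believe matches the default behaviour of `proportions_ztest`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(s1, n1, s2, n2):
    """Two-sided z test for equality of two proportions (pooled variance)."""
    p1, p2 = s1 / n1, s2 / n2
    # Under the null hypothesis both groups share one success probability
    p_pool = (s1 + s2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, for illustration only (not taken from the dataset).
z_demo, p_demo = two_proportion_ztest(s1=60, n1=500, s2=40, n2=500)
print('Z = ' + str(z_demo))
print('P = ' + str(p_demo))
```

Passing `s_w, n_w, s_b, n_b` from the cell above should give values close to the `Z` and `P` printed there, if the pooled-variance assumption holds.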


In [42]:
# Simulation-based version using a permutation test

def permutation_sample(data_1, data_2):
    """Pool both arrays, shuffle, and split back into two samples of the original sizes."""
    permuted_data = np.random.permutation(np.concatenate((data_1, data_2)))

    return permuted_data[:len(data_1)], permuted_data[len(data_1):]


def diff_of_proportions(data_1, data_2):
    """Difference in proportions of successes between two 0/1 arrays."""

    # Proportion of 1s in data_1 minus proportion of 1s in data_2: diff
    diff = data_1.sum()/len(data_1) - data_2.sum()/len(data_2)

    return diff

def draw_perm_reps(data_1, data_2, func, size=1):
    """Generate multiple permutation replicates."""

    # Initialize array of replicates: perm_replicates
    perm_replicates = np.empty(size)

    for i in range(size):
        # Generate permutation sample
        perm_sample_1, perm_sample_2 = permutation_sample(data_1, data_2)

        # Compute the test statistic
        perm_replicates[i] = abs(func(perm_sample_1, perm_sample_2))

    return perm_replicates


perm_reps = draw_perm_reps(white_call, black_call, diff_of_proportions, size = 10000)

# Two-sided p-value: the replicates are absolute differences, so compare against |p_diff|
p = np.sum(perm_reps >= abs(p_diff)) / len(perm_reps)

print('P = ' + str(p))

P = 0.0001


The p-value is far below alpha (0.05), so we reject the null hypothesis based on this simulation.
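
The permutation test above yields a p-value but not an interval estimate. One way to obtain a bootstrap confidence interval for the difference in proportions is to resample each group with replacement and take percentiles of the resampled differences. The sketch below assumes this percentile-bootstrap approach; `bootstrap_diff_ci` and its parameters are hypothetical names, and the 0/1 arrays are synthetic rather than drawn from the dataset.

```python
import numpy as np


def bootstrap_diff_ci(data_1, data_2, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(data_1) - mean(data_2) on 0/1 arrays."""
    rng = np.random.default_rng(seed)
    data_1 = np.asarray(data_1, dtype=float)
    data_2 = np.asarray(data_2, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group independently, with replacement
        sample_1 = rng.choice(data_1, size=len(data_1), replace=True)
        sample_2 = rng.choice(data_2, size=len(data_2), replace=True)
        diffs[i] = sample_1.mean() - sample_2.mean()
    lower, upper = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper


# Synthetic 0/1 outcomes, for illustration only (not taken from the dataset).
group_a = np.array([1] * 60 + [0] * 440)
group_b = np.array([1] * 40 + [0] * 460)
ci_lower, ci_upper = bootstrap_diff_ci(group_a, group_b, n_boot=2000)
print('95% CI: (' + str(ci_lower) + ', ' + str(ci_upper) + ')')
```

In the notebook, calling this with `white_call` and `black_call` would give an interval that can be compared with the analytic one built from the margin of error.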

Q4. 
    The original question was whether having an African-American-sounding name significantly affects the rate of callbacks for resumes. The provided dataset records the race associated with each resume's name and whether the resume received a callback. From this we formed the null hypothesis that the proportion of callbacks for resumes with African-American-sounding names is the same as the proportion for resumes with white-sounding names. Our confidence level for this test is 95%. 
    
    We then performed a frequentist hypothesis test, a z test comparing the proportions of the two groups. The p-value for the z test was well below our alpha of 0.05, so we rejected the null hypothesis with this method.
    Next, we performed a permutation simulation: we concatenated the two groups and reshuffled them 10000 times to estimate how likely it is, under the null hypothesis, to see a difference in proportions at least as extreme as the one we observed. The p-value was again very low, so we rejected the null hypothesis with this approach as well.


Q5.
    Although our hypothesis test revealed a significant difference in the proportion of callbacks between resumes with African-American-sounding names and resumes with white-sounding names, this does not necessarily mean race is the most important factor. Other factors could contribute to the callback rate, such as years of experience, education level, number of jobs, or even gender. Determining the most important factor(s) would require further data exploration, checking for correlation and covariance among the variables, and additional hypothesis testing.
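
As a first step toward that further exploration, callback rates could be compared within levels of another variable. The sketch below is a rough illustration only: it builds a tiny synthetic frame (the name `toy` and every value in it are made up) that reuses the `race`, `education`, and `call` column names from the dataset, then computes the callback rate per education level and race along with the gap between the two groups.

```python
import pandas as pd

# Synthetic rows, for illustration only (not taken from the dataset).
toy = pd.DataFrame({
    'race':      ['w', 'w', 'b', 'b', 'w', 'w', 'b', 'b'],
    'education': [3, 3, 3, 3, 4, 4, 4, 4],
    'call':      [1, 0, 0, 0, 1, 1, 1, 0],
})

# Callback rate for each (education, race) cell, with race spread into columns
rates = toy.groupby(['education', 'race'])['call'].mean().unstack('race')

# Gap between white-sounding and black-sounding callback rates per education level
rates['gap'] = rates['w'] - rates['b']
print(rates)
```

On the real data, a gap that stays similar across levels of a covariate would suggest the name effect is not simply a by-product of that covariate; a regression model would be a more complete follow-up.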
    