# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

## Summary
***

In [1]:
# IMPORTS
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
import pylab

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


### Q1: What test is appropriate for this problem? Does CLT apply?

<p> First, let's check whether the Central Limit Theorem applies to this problem. Since we are comparing two proportions, the callback rate for black-sounding names and the callback rate for white-sounding names, the following conditions must be met. </p>

1. **The samples are independent.** Based on the background information, it is safe to assume that the samples are independent. It is also safe to assume that the population is more than 10 times the size of the sample. &#9745;
2. **The samples are random.** The 'b' and 'w' values were assigned randomly to the resumes, so it is safe to assume that the samples are random. &#9745;
3. **Each sample includes at least 10 successes and 10 failures.** We need to check that both samples are large enough to justify using the normal distribution. The guideline to check is
 + $n_{1} p_{1} \geq 10$
 + $n_{1} (1 - p_{1}) \geq 10$
 + $n_{2} p_{2} \geq 10$
 + $n_{2} (1 - p_{2}) \geq 10$

In [3]:
# Determine size of each sample: b_names for black sounding names, w_names for white sounding names
b_names = len(data[data.race=='b'])
w_names = len(data[data.race=='w'])

# Determine number of callbacks for each sample: b_call, w_call
b_call = sum(data[data.race=='b'].call)
w_call = sum(data[data.race=='w'].call)

# Determine percentage of names that were called back: rate_b, rate_w
rate_b = b_call / b_names
rate_w = w_call / w_names

print('Number of Black Names Called Back: ', b_call)
print('Total Black Sounding Names: ', b_names)
print('Black Rate of Callback: ', rate_b)

print('\nNumber of White Names Called Back: ', w_call)
print('Total White Sounding Names: ', w_names)
print('White Rate of Callback: ', rate_w)

Number of Black Names Called Back:  157.0
Total Black Sounding Names:  2435
Black Rate of Callback:  0.06447638603696099

Number of White Names Called Back:  235.0
Total White Sounding Names:  2435
White Rate of Callback:  0.09650924024640657


In [4]:
# Check for success and failures amongst black sounding names sample: s_b, f_b
s_b = b_names * rate_b >= 10
f_b = b_names * (1 - rate_b) >= 10

# Check for successes and failures amongst white sounding names sample: s_w, f_w
s_w = w_names * rate_w >= 10
f_w = w_names * (1 - rate_w) >= 10

print(s_b, f_b, s_w, f_w)

True True True True


<div class="span5 alert alert-success">

**Analysis:**

Since both samples have at least 10 successes and at least 10 failures, **the Central Limit Theorem applies**.

We are comparing two proportions, so we will perform a **two-sample Z-test for proportions.**

</div>

***
### Q2: What are the null and alternate hypotheses?

**Null Hypothesis:** The probability of getting a callback is the same for both white-sounding names and black-sounding names.
+ $H_0: p_w - p_b = 0$

**Alternate Hypothesis:** The probability of getting a callback differs between white-sounding names and black-sounding names.
+ $H_a: p_w - p_b \neq 0$

***
### Q3: Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

**3.1 Frequentist Statistical Testing**

In [5]:
# Calculate difference in proportions: p_diff
p_diff = rate_w - rate_b

# Calculate standard error of the difference in proportions (unpooled): std
std = np.sqrt((rate_w * (1 - rate_w) / w_names) + (rate_b * (1 - rate_b) / b_names))
z_crit = 1.96 

# Calculate p-value, Margin of Error, and Confidence Interval: p_val, ME, CI
p_val = stats.norm.sf(abs(p_diff/std))*2
ME = std * z_crit
CI = (p_diff - ME, p_diff + ME)

print('P-Value: ', p_val)
print('Margin of Error: ', ME)
print('95% Confidence Interval: ', CI)

P-Value:  3.862565207522622e-05
Margin of Error:  0.015255406349886438
95% Confidence Interval:  (0.016777447859559147, 0.047288260559332024)


In [10]:
abs(p_diff/std)

4.11555043573
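
The calculation above uses the unpooled standard error for both the confidence interval and the p-value. For the hypothesis test, a common alternative is to pool the two samples when estimating the standard error, since the null hypothesis assumes one shared callback rate. The sketch below hard-codes the counts printed earlier; the names `p_pool`, `se_pool`, `z_pool`, and `p_val_pool` are introduced here for illustration and are not part of the original analysis.

```python
import numpy as np
from scipy import stats

# Counts taken from the output printed earlier in this notebook
b_call, b_names = 157, 2435
w_call, w_names = 235, 2435

# Pooled callback rate under H0 (both groups share one rate)
p_pool = (b_call + w_call) / (b_names + w_names)

# Standard error of the difference in proportions under H0
se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / b_names + 1 / w_names))

# z statistic and two-sided p-value
z_pool = (w_call / w_names - b_call / b_names) / se_pool
p_val_pool = 2 * stats.norm.sf(abs(z_pool))

print('Pooled z statistic: ', z_pool)
print('Pooled p-value: ', p_val_pool)
```

With two samples of equal size and this many observations, the pooled and unpooled results should be very close, so the conclusion should not depend on which standard error is used.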

**3.2 Bootstrap Hypothesis Test**

In [43]:
# Functions
def bootstrap_replicate_1d(data, func):
    """Generate bootstrap replicate"""
    
    return func(np.random.choice(data, size=len(data)))


def draw_bs_reps(data, func, size=1):
    """Draw bootstrap replicates."""
    
    # Initialize array of replicates: bs_replicates
    bs_replicates = np.empty(size)

    # Generate replicates
    for i in range(size):
        bs_replicates[i] = bootstrap_replicate_1d(data, func)

    return bs_replicates

def permutation_sample(data1, data2):
    """Generate a permutation sample from two data sets."""
    
    # Concatenate the data sets: data
    data = np.concatenate((data1, data2))
    
    # Permute the concatenated array (shuffle without replacement): permuted_data
    permuted_data = np.random.permutation(data)
    
    # Split the permuted array into two: perm_sample1, perm_sample2
    perm_sample1 = permuted_data[:len(data1)]
    perm_sample2 = permuted_data[len(data1):]
    
    return perm_sample1, perm_sample2


def draw_perm_reps(data1, data2, size=1):
    """Generate multiple permutation replicates"""
    
    # Initialize array of replicates: perm_replicates
    perm_replicates = np.empty(size)
    
    for i in range(size):
        # Generate Permutation Sample
        perm_sample1, perm_sample2 = permutation_sample(data1, data2)
        
        # Compute the test statistic
        perm_replicates[i] = sum(perm_sample1)/len(perm_sample1) - sum(perm_sample2)/len(perm_sample2)
        
    return perm_replicates

In [44]:
b = data[data.race=='b'].call
w = data[data.race=='w'].call

# Calculate difference in proportions: p_diff
p_diff = rate_w - rate_b

# Set random generator seed
np.random.seed(17)

# Draw bootstrap replicates for black sounding names: b_reps
b_reps = draw_bs_reps(b, np.mean, size=10000)

# Draw bootstrap replicates for white sounding names: w_reps
w_reps = draw_bs_reps(w, np.mean, size=10000)

# Calculate difference in proportions for bootstrap replicates: diff_bs_reps
diff_bs_reps = w_reps - b_reps

# Calculate difference in proportions for permutation replicates: diff_perm_reps
diff_perm_reps = draw_perm_reps(w, b, size=10000)

In [45]:
z_crit = 1.96
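# Two-sided p-value: fraction of permutation replicates at least as extreme as the observed difference
p_val = np.sum(np.abs(diff_perm_reps) >= abs(p_diff)) / len(diff_perm_reps)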
ME = z_crit * np.std(diff_bs_reps)
CI = np.percentile(diff_bs_reps, [2.5, 97.5])

print('P-Value: ', p_val)
print('Margin of Error: ', ME)
print('95% Confidence Interval: ', CI)

P-Value:  0.0
Margin of Error:  0.015300939670857914
95% Confidence Interval:  [0.01683778 0.04722793]


<div class="span5 alert alert-success">

**Analysis:**

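The p-value is below the significance level of 0.05, so we reject the null hypothesis. A permutation p-value of 0.0 means that none of the 10,000 permutation replicates produced a difference as large as the observed one, so the p-value is below 1 in 10,000, which agrees with the frequentist result. The bootstrap margin of error and confidence interval also closely match the frequentist values.

</div>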
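
A permutation test with 10,000 replicates cannot resolve p-values smaller than 1 in 10,000. One way to cross-check the result is an exact test on the 2x2 table of callbacks. The sketch below uses `scipy.stats.fisher_exact`, which is not part of the original analysis; the table is built from the counts printed earlier, and the names `table`, `odds_ratio`, and `p_exact` are introduced here for illustration.

```python
from scipy import stats

# 2x2 contingency table built from the counts printed earlier:
# rows = white-sounding, black-sounding; columns = callback, no callback
table = [[235, 2435 - 235],
         [157, 2435 - 157]]

# Fisher's exact test with a two-sided alternative, matching H_a
odds_ratio, p_exact = stats.fisher_exact(table, alternative='two-sided')

print('Odds ratio: ', odds_ratio)
print('Fisher exact p-value: ', p_exact)
```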

### Q4: Write a story describing the statistical significance in the context of the original problem.

Is there racial discrimination in the job market? The data collected in this study indicate that there is. In the samples analyzed above, a resume with a white-sounding name was roughly 50% more likely to receive a callback than an identical resume with a black-sounding name (about 9.65% versus 6.45%). The difference in callback rates is statistically significant. We are 95% confident that the difference in callback rates between the two groups lies between approximately 1.7 and 4.7 percentage points.
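
The "roughly 50%" figure is a relative difference: the ratio of the two callback rates minus one. It can be checked directly from the counts printed earlier. The names `relative_increase` and `abs_diff_points` are introduced here for illustration.

```python
# Callback rates from the counts printed earlier in the notebook
rate_w = 235 / 2435
rate_b = 157 / 2435

# Relative increase of the white-sounding rate over the black-sounding rate
relative_increase = rate_w / rate_b - 1

# Absolute difference, expressed in percentage points
abs_diff_points = (rate_w - rate_b) * 100

print('Relative increase: {:.1%}'.format(relative_increase))
print('Absolute difference (percentage points): {:.2f}'.format(abs_diff_points))
```

Keeping the two measures apart matters when reading the confidence interval, which describes the absolute difference rather than the relative one.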

***

### Q5: Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

The analysis above shows a statistically significant relationship between race and callbacks. However, it does not show that race is the most important factor in callback success, because we did not analyze any of the other variables in the dataset, such as education or years of experience. To amend the analysis, we would run similar tests for those variables and compare their effects with the effect of race. In addition to testing for significance, we would want to check for correlations between the variables.
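
A first step toward that comparison could be to tabulate the callback rate for each value of another column, in the same way it was done for race. The sketch below is a minimal illustration: the helper `callback_rate_by` and the `toy` frame are hypothetical, and the toy values are made up and are not the study data.

```python
import pandas as pd


def callback_rate_by(df, column):
    """Return callback rate and group size for each value of `column`."""
    return df.groupby(column)['call'].agg(rate='mean', n='size')


# Tiny made-up frame for illustration only (NOT the study data)
toy = pd.DataFrame({
    'race': ['b', 'b', 'w', 'w', 'w', 'b'],
    'honors': [0, 1, 0, 1, 1, 0],
    'call': [0, 1, 0, 1, 1, 0],
})

# Compare callback rates across each candidate variable
for col in ['race', 'honors']:
    print(callback_rate_by(toy, col), '\n')
```

On the real dataset the same helper could be applied to columns such as `education` or `honors`. Because these variables may be correlated with one another, a regression that includes race alongside them would be a natural next step for comparing their effects jointly.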