# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
****

In [34]:
import pandas as pd
import numpy as np
from scipy import stats

In [35]:
# Load the Stata file through the public top-level pandas reader
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [36]:
pd.set_option('display.max_columns', 500)
data.head()


Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,email,computerskills,specialskills,firstname,sex,race,h,l,call,city,kind,adid,fracblack,fracwhite,lmedhhinc,fracdropout,fraccolp,linc,col,expminreq,schoolreq,eoe,parent_sales,parent_emp,branch_sales,branch_emp,fed,fracblack_empzip,fracwhite_empzip,lmedhhinc_empzip,fracdropout_empzip,fraccolp_empzip,linc_empzip,manager,supervisor,secretary,offsupport,salesrep,retailsales,req,expreq,comreq,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,1,0,0,1,0,Allison,f,w,0.0,1.0,0.0,c,a,384.0,0.98936,0.0055,9.527484,0.274151,0.037662,8.706325,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,6,1,1,1,0,Kristen,f,w,1.0,0.0,0.0,c,a,384.0,0.080736,0.888374,10.408828,0.233687,0.087285,9.532859,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,1,1,0,1,0,Lakisha,f,b,0.0,1.0,0.0,c,a,384.0,0.104301,0.83737,10.466754,0.101335,0.591695,10.540329,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,5,0,1,1,1,Latonya,f,b,1.0,0.0,0.0,c,a,384.0,0.336165,0.63737,10.431908,0.108848,0.406576,10.412141,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,5,1,1,1,0,Carrie,f,w,1.0,0.0,0.0,c,a,385.0,0.397595,0.180196,9.876219,0.312873,0.030847,8.728264,0.0,some,,1.0,9.4,143.0,9.4,143.0,0.0,0.204764,0.727046,10.619399,0.070493,0.369903,10.007352,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


1) What test is appropriate for this problem? Does CLT apply?
    
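A two-sample z-test for proportions is appropriate, because the outcome (`call`) is binary and we are comparing the callback rate between two independent groups of resumes. The Central Limit Theorem applies: names were assigned to resumes at random, and each group is large (2,435 resumes) with well over 10 callbacks and 10 non-callbacks, so the sampling distribution of each sample proportion is approximately normal.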
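
The large-sample condition can be checked directly. The sketch below assumes the common rule of thumb that each group needs at least 10 callbacks and 10 non-callbacks; the counts are the ones this notebook prints in the "Finding Callback Ratios" section, and the helper name `success_failure_ok` is hypothetical.

```python
# Counts as reported by this notebook: (callbacks, total resumes) per group.
groups = {"b": (157, 2435), "w": (235, 2435)}

def success_failure_ok(successes, n, threshold=10):
    """Return True if both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Evaluate the rule of thumb for each group.
checks = {race: success_failure_ok(s, n) for race, (s, n) in groups.items()}

for race, (s, n) in groups.items():
    print("race={}: callbacks={}, non-callbacks={}, condition met: {}".format(
        race, s, n - s, checks[race]))
```

If either group failed this check, an exact test would be a safer choice than the normal approximation.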
    
2) What are the null and alternate hypotheses?

Callback rate for a group = (number of resumes in the group that received a callback) / (total number of resumes in the group)

**Null Hypothesis:** The callback rates for black-sounding and white-sounding names are equal

**Alternate Hypothesis:** The callback rates for black-sounding and white-sounding names are not equal

**Confidence:** 95% or .95, **Alpha:** 5% or .05

### Finding Callback Ratios

In [37]:
# number of callbacks for black-sounding names
black_call_success = sum(data[data.race=='b'].call)
print("Blacks were called back {} times".format(black_call_success))

#finding total number of black resumes or candidates
black_call_total = len(data[data.race=='b'])
print("Total number of black candidates is {}".format(black_call_total))


# calculating number of times whites were called back for an interview
white_call_success = sum(data[data['race']=='w'].call)
print("\nWhites were called back {} times".format(white_call_success))

white_call_total = len(data[data.race=='w'])
print("Total number of white candidates is {}".format(white_call_total))



Blacks were called back 157.0 times
Total number of black candidates is 2435

Whites were called back 235.0 times
Total number of white candidates is 2435


### Two Sample Proportion Z-test

In [38]:
# Performing population proportion ztest

import statsmodels.api as sm

z_score, p_value = sm.stats.proportions_ztest([white_call_success, black_call_success], [white_call_total, black_call_total])

print(z_score, p_value)

4.10841215243 3.98388683759e-05
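
As a cross-check, the same statistic can be computed by hand. This is a minimal sketch using only the standard library; it assumes the pooled-proportion form of the two-sample z-test and hard-codes the counts printed earlier. The names `p_pooled`, `se_pooled`, `z_manual` and `p_manual` are introduced here for illustration.

```python
import math

# Counts taken from the notebook output above.
white_success, white_total = 235, 2435
black_success, black_total = 157, 2435

p_white = white_success / white_total
p_black = black_success / black_total

# Pooled callback rate under the null hypothesis of equal rates.
p_pooled = (white_success + black_success) / (white_total + black_total)

# Standard error of the difference in proportions under the null hypothesis.
se_pooled = math.sqrt(
    p_pooled * (1 - p_pooled) * (1 / white_total + 1 / black_total)
)

z_manual = (p_white - p_black) / se_pooled

# Two-sided p-value from the standard normal distribution:
# 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
p_manual = math.erfc(abs(z_manual) / math.sqrt(2))

print(round(z_manual, 4), p_manual)
```

The result should agree with the `proportions_ztest` output above up to rounding.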


### Reminder
Null Hypothesis: The callback rates for black-sounding and white-sounding names are equal

Alternate Hypothesis: The callback rates for black-sounding and white-sounding names are not equal

Confidence: 95% or .95, Alpha: 5% or .05

***

The two-sample proportion z-test gives z ≈ 4.11 and a p-value of about 4.0e-05, far below our alpha of 0.05, so we reject the null hypothesis. A difference this large would be very unlikely to arise by chance if the callback rates were truly equal: resumes with white-sounding names received callbacks at a higher rate (235 of 2,435, about 9.7%) than resumes with black-sounding names (157 of 2,435, about 6.4%).


### Computing Confidence Intervals

In [58]:
#Computing Confidence Interval for white and black callbacks
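import statsmodels.stats.proportion  # load the submodule explicitly so the calls below do not rely on an earlier import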

#Whites confidence interval
ci_low_white, ci_upp_white = statsmodels.stats.proportion.proportion_confint(white_call_success,white_call_total, alpha=.05)

print("The callback confidence interval (95%) for whites is {} to {}".format(round(ci_low_white,4), round(ci_upp_white,4)))

#Blacks confidence interval
ci_low_black, ci_upp_black = statsmodels.stats.proportion.proportion_confint(black_call_success, black_call_total, alpha=.05)

print("\nThe callback confidence interval (95%) for blacks is {} to {}".format(round(ci_low_black,4),round(ci_upp_black,4)))

The callback confidence interval (95%) for whites is 0.0848 to 0.1082

The callback confidence interval (95%) for blacks is 0.0547 to 0.0742


### Computing Margin of Error

In [57]:
#Writing a function to compute margin of error for sample proportion with 95% confidence interval
import math

def MarginofError(call_success, total_candid):
    
    call_success_prop = call_success/total_candid
    
    MOE = 1.96 * math.sqrt((call_success_prop*(1-call_success_prop))/total_candid)
    return MOE

#Margin of Error for White callbacks
moe_whites = MarginofError(white_call_success,white_call_total)
print("The Margin of Error for White callbacks is:", round(moe_whites,4))

#Margin of Error for Black callbacks
moe_blacks = MarginofError(black_call_success, black_call_total)
print("\nThe Margin of Error for Black callbacks is:", round(moe_blacks,4))

The Margin of Error for White callbacks is: 0.0117

The Margin of Error for Black callbacks is: 0.0098
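
The figures above describe each group separately. Since the hypothesis test concerns the difference between the two callback rates, a margin of error and confidence interval for that difference may also be useful. The sketch below assumes the usual unpooled standard error and the same 1.96 critical value used above; the counts are hard-coded from the earlier output, and `diff`, `se_diff`, `moe_diff`, `ci_low_diff` and `ci_upp_diff` are names introduced here.

```python
import math

# Counts taken from the notebook output above.
white_success, white_total = 235, 2435
black_success, black_total = 157, 2435

p_white = white_success / white_total
p_black = black_success / black_total

# Observed difference in callback rates (white minus black).
diff = p_white - p_black

# Unpooled standard error of the difference between two independent proportions.
se_diff = math.sqrt(
    p_white * (1 - p_white) / white_total
    + p_black * (1 - p_black) / black_total
)

# 95% margin of error and confidence interval for the difference.
moe_diff = 1.96 * se_diff
ci_low_diff, ci_upp_diff = diff - moe_diff, diff + moe_diff

print("Difference in callback rates:", round(diff, 4))
print("Margin of error (95%):", round(moe_diff, 4))
print("Confidence interval (95%): {} to {}".format(
    round(ci_low_diff, 4), round(ci_upp_diff, 4)))
```

An interval that excludes zero is consistent with rejecting the null hypothesis at the 5% level.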


5) Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

This hypothesis test does not tell us that race/name is the most important factor in callback success. It only tells us that there is a statistically significant difference between the callback rates for black-sounding and white-sounding names. To judge which factors matter most, a model would have to be developed that includes the other resume and employer variables in the dataset and compares their importance in predicting callback success.