# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating a black-sounding or white-sounding name. The 'call' column has two values, 1 and 0, indicating whether or not the resume received a call from an employer.

The 'b' and 'w' values in race are considered to be assigned randomly to the resumes when presented to the employer.

For this exercise, I will be performing a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

The analysis addresses the following questions:

1. What test is appropriate for this problem? Does the CLT apply?
2. What are appropriate null and alternative hypotheses?
3. What are the margin of error, confidence interval, and p-value?
4. What does the statistical significance mean in the context of the original problem?
5. Does the analysis mean that race/name is the most critical factor in callback success? If not, how should the analysis be amended?


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states


## Modules
---

In [1]:
import pandas as pd
import numpy as np
from scipy import stats
import math
from scipy.stats import chi2_contingency
from statsmodels.stats.proportion import proportions_ztest
pd.set_option('display.max_columns', 100)

## Data
---

In [2]:
data = pd.read_stata('us_job_market_discrimination.dta')
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,email,computerskills,specialskills,firstname,sex,race,h,l,call,city,kind,adid,fracblack,fracwhite,lmedhhinc,fracdropout,fraccolp,linc,col,expminreq,schoolreq,eoe,parent_sales,parent_emp,branch_sales,branch_emp,fed,fracblack_empzip,fracwhite_empzip,lmedhhinc_empzip,fracdropout_empzip,fraccolp_empzip,linc_empzip,manager,supervisor,secretary,offsupport,salesrep,retailsales,req,expreq,comreq,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,1,0,0,1,0,Allison,f,w,0.0,1.0,0.0,c,a,384.0,0.98936,0.0055,9.527484,0.274151,0.037662,8.706325,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,6,1,1,1,0,Kristen,f,w,1.0,0.0,0.0,c,a,384.0,0.080736,0.888374,10.408828,0.233687,0.087285,9.532859,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,1,1,0,1,0,Lakisha,f,b,0.0,1.0,0.0,c,a,384.0,0.104301,0.83737,10.466754,0.101335,0.591695,10.540329,1.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,5,0,1,1,1,Latonya,f,b,1.0,0.0,0.0,c,a,384.0,0.336165,0.63737,10.431908,0.108848,0.406576,10.412141,0.0,5,,1.0,,,,,,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,5,1,1,1,0,Carrie,f,w,1.0,0.0,0.0,c,a,385.0,0.397595,0.180196,9.876219,0.312873,0.030847,8.728264,0.0,some,,1.0,9.4,143.0,9.4,143.0,0.0,0.204764,0.727046,10.619399,0.070493,0.369903,10.007352,0.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [3]:
samplesize_t = len(data)

callbacks_b = sum(data[data.race=='b'].call)
samplesize_b = len(data[data.race=='b'])

callbacks_w = sum(data[data.race=='w'].call)
samplesize_w = len(data[data.race=='w'])

print('total sample size:', samplesize_t)
print('black-sounding names call backs:', callbacks_b)
print('black-sounding names sample size:', samplesize_b)
print('white-sounding names call backs:', callbacks_w)
print('white-sounding names sample size:', samplesize_w)

total sample size: 4870
black-sounding names call backs: 157.0
black-sounding names sample size: 2435
white-sounding names call backs: 235.0
white-sounding names sample size: 2435


## Analysis
---

## Appropriate test and the central limit theorem

* In the sample, resumes with white-sounding names received more callbacks (235) than resumes with black-sounding names (157).


* The purpose of this analysis is to determine whether this difference could be due to random chance alone or whether there is significant evidence of an association between the race suggested by the name and the callback outcome.


* Both groups are large (2,435 resumes each), the race label was assigned to resumes at random, and each group has well over 10 callbacks and 10 non-callbacks, so the **central limit theorem can be applied.**


* The data needed for the analysis are categorical rather than quantitative: the name is either black-sounding or white-sounding, and the outcome is either a callback or no callback.


* To test for an association between two categorical variables, the **chi-square test of independence is appropriate.**
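
The sample-size reasoning above can be checked with a few lines of arithmetic. This is a minimal sketch using the counts printed earlier; the threshold of 10 expected successes and failures per group is a common rule of thumb assumed here, not something stated in the original study.

```python
# Counts taken from the summary output above
n_b, calls_b = 2435, 157   # black-sounding names
n_w, calls_w = 2435, 235   # white-sounding names

# Observed callback rate in each group
rate_b = calls_b / n_b
rate_w = calls_w / n_w

# Pooled callback rate, i.e. the common rate assumed under the null hypothesis
pooled = (calls_b + calls_w) / (n_b + n_w)

# Success-failure rule of thumb (assumed threshold): expected successes
# and expected failures should each be at least 10 in both groups
threshold = 10
conditions_met = all(
    n * p >= threshold
    for n in (n_b, n_w)
    for p in (pooled, 1 - pooled)
)

print(f"callback rate, black-sounding: {rate_b:.4f}")
print(f"callback rate, white-sounding: {rate_w:.4f}")
print(f"success-failure condition met: {conditions_met}")
```

With 196 expected callbacks and 2239 expected non-callbacks per group, the condition holds comfortably, which supports using normal and chi-square approximations.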

### Hypothesis

**Null hypothesis:** Callbacks are independent of the race suggested by the name on a resume. There is no relationship.

**Alternative hypothesis:** There is a relationship between the race suggested by the name on a resume and callbacks for interviews.

**Significance level:** 0.05

### Contingency table, chi-square, and p-value

In [4]:
# Shaping a contingency table for qualitative dataset
ctg_table = data.pivot_table('id', ['call'], 'race', aggfunc= {'id': 'count'})
ctg_table.index = ['Failure', 'Success']
ctg_table.columns = ['Black', 'White']

ctg_table.loc['Total'] = [ ctg_table['Black'].sum(), ctg_table['White'].sum()]
ctg_table['Total'] = ctg_table['Black'] + ctg_table['White']

ctg_table

Unnamed: 0,Black,White,Total
Failure,2278,2200,4478
Success,157,235,392
Total,2435,2435,4870


In [5]:
# builds dataframe values into array for calculations
ctg_array = np.array(ctg_table.values)


# Both groups have the same size, so the expected failures and successes are the same for each group
# Expected failures per group = (total failures * group size) / total sample size
# Expected successes per group = (total successes * group size) / total sample size

expct_failures = (ctg_array[0][2] * ctg_array[2][0]) / ctg_array[2][2]
expct_successes = (ctg_array[1][2] * ctg_array[2][0]) / ctg_array[2][2]
print('Expected failures:', expct_failures, '\nExpected successes:', expct_successes)

Expected failures: 2239.0 
Expected successes: 196.0


In [6]:
# Obtaining Chi-square value
chi_square = (((ctg_array[0][0] - expct_failures)**2/ expct_failures) 
            + ((ctg_array[0][1] - expct_failures)**2/ expct_failures)
            + ((ctg_array[1][0] - expct_successes)**2/ expct_successes) 
            + ((ctg_array[1][1] - expct_successes)**2/ expct_successes))

# Degrees of freedom for a 2x2 table: (rows - 1) * (columns - 1)
dg_freedom = (2-1)*(2-1)

# Obtaining p-value
p_value = 1 - stats.chi2.cdf(chi_square, dg_freedom)


print('chi-square:', chi_square, '\ndegrees of freedom:', dg_freedom, '\np-value:', p_value)

chi-square: 16.8790504143 
degrees of freedom: 1 
p-value: 3.98388683759e-05
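
For reference, the manual calculation above corresponds to the standard Pearson chi-square formulas. The symbols are introduced here for illustration and do not appear in the notebook: $O_{ij}$ is an observed cell count, $E_{ij}$ the expected count, $R_i$ a row total, $C_j$ a column total, and $N$ the total sample size.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
\text{df} = (2-1)(2-1) = 1
```

With the table above, $E_{\text{failure}} = 4478 \cdot 2435 / 4870 = 2239$ and $E_{\text{success}} = 392 \cdot 2435 / 4870 = 196$, and every observed cell differs from its expected count by 39.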



---
### A shorter, less bug-prone way to obtain these results:
* The values differ slightly from the manual calculation because `chi2_contingency` applies Yates' continuity correction by default for 2x2 tables.

In [7]:
# Shaping a contingency table for qualitative dataset
ctg_table = data.pivot_table('id', ['call'], 'race', aggfunc= {'id': 'count'})
ctg_table.index = ['Failure', 'Success']
ctg_table.columns = ['Black', 'White']

# builds dataframe values into array for calculations
ctgory_array = np.array(ctg_table.values)
# chi2_contingency returns main values for conclusions
stat, p, dof, expected = chi2_contingency(ctgory_array)

print('chi_square:', stat, '\np-value:', p, '\ndegrees of freedom:', dof, '\nexpected values:\n', expected)

chi_square: 16.4490285842 
p-value: 4.99757838996e-05 
degrees of freedom: 1 
expected values:
 [[ 2239.  2239.]
 [  196.   196.]]
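
The gap between the manual statistic (16.879) and the `chi2_contingency` result (16.449) is consistent with Yates' continuity correction, which `chi2_contingency` applies by default to 2x2 tables. The sketch below recomputes both versions by hand with the standard library only; the helper names `chi_square` and `p_value_1dof` are made up for this illustration.

```python
import math

# Observed counts from the contingency table above
observed = [[2278, 2200],   # no callback: black-sounding, white-sounding
            [157, 235]]     # callback:    black-sounding, white-sounding

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

def chi_square(observed, yates):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction."""
    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(obs - expected)
            if yates:
                # Yates' correction shrinks each |O - E| by 0.5 (never below 0)
                diff = max(diff - 0.5, 0.0)
            total += diff ** 2 / expected
    return total

def p_value_1dof(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

plain = chi_square(observed, yates=False)
corrected = chi_square(observed, yates=True)
print(f"uncorrected: chi2={plain:.4f}, p={p_value_1dof(plain):.3e}")
print(f"Yates:       chi2={corrected:.4f}, p={p_value_1dof(corrected):.3e}")
```

Passing `correction=False` to `chi2_contingency` should reproduce the uncorrected value from the manual calculation. Either way, the conclusion at the 0.05 level does not change.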


#### Conclusion:

* The obtained p-value of 4.99757838996e-05 is far below our significance level of 0.05, so it is very unlikely that these sample results arose from chance alone.


* We can therefore reject the null hypothesis in favor of the alternative: there is significant evidence of a relationship between the race suggested by the name on a resume and callbacks for job interviews.


Furthermore, this process can be extended and automated to obtain an answer right away as shown below:

In [8]:
# interpret test-statistic
prob = 0.95
critical = stats.chi2.ppf(prob, dof)
print('probability=%.3f, critical=%.3f, stat=%.3f' % (prob, critical, stat))

if abs(stat) >= critical:
    print('Dependent (reject H0) and go for the (H1)Alternative!')
else:
    print('Independent (fail to reject H0)')

probability=0.950, critical=3.841, stat=16.449
Dependent (reject H0) and go for the (H1)Alternative!


---
## Analysis from a different perspective!

### Hypothesis test for difference in sample proportions 

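**Null Hypothesis:** There is no difference in the proportion of callbacks between resumes with black-sounding names and resumes with white-sounding names.

**Alternate Hypothesis:** There is a difference in the proportion of callbacks between resumes with black-sounding names and resumes with white-sounding names.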

**Significance level:** 0.05

#### Successes and trial counts

In [9]:
succss_count = ctg_array[1][:2]
trial_count = ctg_array[2][:2]
print('success_counts:', succss_count, '\ntrial_counts:', trial_count)

success_counts: [157 235] 
trial_counts: [2435 2435]


In [10]:
# z_statistic and p-value
stat, pval = proportions_ztest(succss_count, trial_count)

#critical value at 95% confidence
crt_value = stats.norm.ppf(1-(0.05/2))
print('critical value:', crt_value, '\nz-stat:', stat, '\np-value:', pval)

critical value: 1.95996398454 
z-stat: -4.10841215243 
p-value: 3.98388683759e-05
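
The p-value here matches the uncorrected chi-square result because, for a 2x2 table, the squared z-statistic of a pooled two-proportion test equals the Pearson chi-square statistic. The following sketch recomputes the z-statistic by hand, assuming the pooled standard error that `proportions_ztest` uses by default.

```python
import math

calls_b, n_b = 157, 2435   # black-sounding names
calls_w, n_w = 235, 2435   # white-sounding names

p_b = calls_b / n_b
p_w = calls_w / n_w

# Pooled proportion under H0 (both groups share one callback rate)
p_pool = (calls_b + calls_w) / (n_b + n_w)

# Standard error of the difference in proportions under H0
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))

# z-statistic for the difference (black-sounding minus white-sounding)
z = (p_b - p_w) / se_pool

# Two-sided p-value from the standard normal distribution
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.4f}, z^2 = {z**2:.4f}, p = {p_two_sided:.3e}")
```

The negative sign only reflects the order of the groups: the black-sounding callback rate is subtracted from first, and it is the smaller of the two.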


#### Confidence Intervals and margin of errors

In [11]:
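# Callback proportions for black-sounding and white-sounding names
b_propt = ctg_array[1][0]/ctg_array[2][0]
w_propt = ctg_array[1][1]/ctg_array[2][1]
# Both groups have the same number of resumes, so one count serves for both
t_count = trial_count[0]

# Estimated standard error of the difference in proportions (unpooled)
estm_sigma = np.sqrt(((b_propt*(1-b_propt)/t_count) + (w_propt*(1-w_propt)/t_count)))
# Margin of error at the 95% confidence level
error = estm_sigma*crt_value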
# Proportion differences
propt_diff =  w_propt - b_propt 

# Confidence intervals
low_mrgerror = propt_diff - (error)
high_mrgerror = propt_diff + (error)
print('proportion differences: ', propt_diff, '\nmargin of error: ', error)
print('confidence intervals: ', low_mrgerror, high_mrgerror)

proportion differences:  0.0320328542094 
margin of error:  0.0152551260282
confidence intervals:  0.0167777281812 0.0472879802377


* Based on the low p-value from the analysis, we can reject the null hypothesis in favor of the alternative: there is a significant difference between the callback proportions for resumes with white-sounding names and those with black-sounding names.


* We are also 95% confident that the true difference in callback proportions (white-sounding minus black-sounding) lies between 0.0168 and 0.0473.
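
For reference, the interval computed above follows the usual unpooled formula for a difference in two proportions. The symbols are introduced here for illustration: $\hat{p}_b$ and $\hat{p}_w$ are the sample callback proportions, $n_b$ and $n_w$ the group sizes, and $z_{0.975} \approx 1.96$ the critical value.

```latex
(\hat{p}_w - \hat{p}_b) \;\pm\; z_{0.975}
\sqrt{\frac{\hat{p}_b (1 - \hat{p}_b)}{n_b} + \frac{\hat{p}_w (1 - \hat{p}_w)}{n_w}}
```

With the values above this is roughly $0.0320 \pm 1.96 \times 0.00778 \approx 0.0320 \pm 0.0153$. Unlike the hypothesis test, the interval does not pool the two proportions, because it does not assume the null hypothesis is true.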

## Final Analysis Conclusion

* Both tests indicate a relationship between the race suggested by the name on a resume and callbacks. This might imply that racial discrimination is present in the hiring process.


* I say "might" because we have not studied other variables that could influence the number of callbacks, such as education and work experience, among others. The analysis therefore does not show that race/name is the most critical factor in callback success.


* A further study that includes these other influential variables could provide stronger evidence.

Alredo M. 