# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [5]:
data.columns

Index(['id', 'ad', 'education', 'ofjobs', 'yearsexp', 'honors', 'volunteer',
       'military', 'empholes', 'occupspecific', 'occupbroad', 'workinschool',
       'email', 'computerskills', 'specialskills', 'firstname', 'sex', 'race',
       'h', 'l', 'call', 'city', 'kind', 'adid', 'fracblack', 'fracwhite',
       'lmedhhinc', 'fracdropout', 'fraccolp', 'linc', 'col', 'expminreq',
       'schoolreq', 'eoe', 'parent_sales', 'parent_emp', 'branch_sales',
       'branch_emp', 'fed', 'fracblack_empzip', 'fracwhite_empzip',
       'lmedhhinc_empzip', 'fracdropout_empzip', 'fraccolp_empzip',
       'linc_empzip', 'manager', 'supervisor', 'secretary', 'offsupport',
       'salesrep', 'retailsales', 'req', 'expreq', 'comreq', 'educreq',
       'compreq', 'orgreq', 'manuf', 'transcom', 'bankreal', 'trade',
       'busservice', 'othservice', 'missind', 'ownership'],
      dtype='object')

## 1. What test is appropriate for this problem? Does CLT apply?

We are interested in two variables: the racial connotation of the applicant's name and whether a callback was received. Both variables are categorical and coded with binary values. As such, an appropriate test is a Chi-square test of independence, which compares the observed frequencies with the frequencies that would be expected if the two variables were independent of one another.

In the context of the current dataset, this means that the callback frequency is independent of the racial connotation of an applicant's name.

The central limit theorem applies. Each callback is a Bernoulli outcome and, assuming the resumes are independent, the samples are large (2,435 resumes per group, with 157 and 235 callbacks), so the sampling distribution of each group's callback proportion is approximately normal. The Chi-square test rests on this approximation: the square of a standard normal variable follows a Chi-square distribution with one degree of freedom, which is the reference distribution for a 2x2 table. This allows inferences based on the central limit theorem to be applied.
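The link between the normal approximation and the Chi-square test can be checked numerically. The sketch below is illustrative rather than part of the original analysis: it hard-codes the counts from the contingency table shown later in this notebook, confirms that every expected cell count is large, and shows that the squared pooled two-proportion z statistic matches the Chi-square statistic computed without a continuity correction. The variable names are assumptions made for this example.

```python
import numpy as np
from scipy import stats

# Counts copied from the notebook's contingency table:
# rows are race (b, w); columns are call (0 = no callback, 1 = callback).
observed = np.array([[2278, 157],
                     [2200, 235]])

# Chi-square test WITHOUT Yates' continuity correction, so that it is
# directly comparable to the squared z statistic below.
chi2_uncorrected, p_uncorrected, dof, expected = stats.chi2_contingency(
    observed, correction=False
)

# Rule of thumb for the normal approximation: all expected counts well above 5.
print("smallest expected count:", expected.min())

# Pooled two-proportion z statistic for the difference in callback rates.
n_b, n_w = observed.sum(axis=1)
p_b, p_w = observed[:, 1] / observed.sum(axis=1)
p_pool = observed[:, 1].sum() / observed.sum()
z = (p_w - p_b) / np.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))

print(f"z = {z:.3f}, z^2 = {z**2:.3f}, chi-square = {chi2_uncorrected:.3f}")
```

The statistic reported later in the notebook is slightly smaller because `stats.chi2_contingency` applies the continuity correction by default for 2x2 tables.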

## 2.  What are the null and alternate hypotheses?

The null hypothesis is that the callback rate is independent of the racial connotation of an applicant's name. The alternative hypothesis is that the callback rate differs between the two racial connotations.
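For a 2x2 table, independence is equivalent to equal callback probabilities in the two groups, so the hypotheses can also be written in symbols. The notation $p_b$ and $p_w$ (callback probabilities for resumes with black-sounding and white-sounding names) is introduced here for illustration and does not come from the original exercise:

$$
H_0: p_b = p_w \qquad \text{versus} \qquad H_A: p_b \neq p_w
$$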

## 3. Compute margin of error, confidence interval, and p-value.

In [6]:
# inspect variables of interest
display(data.race.value_counts())
display(data.call.value_counts())

b    2435
w    2435
Name: race, dtype: int64

0.0    4478
1.0     392
Name: call, dtype: int64

In [7]:
# generate contingency table
c_table = pd.crosstab(data.race, data.call)
display(c_table)

| race | call = 0.0 | call = 1.0 |
|------|------------|------------|
| b    | 2278       | 157        |
| w    | 2200       | 235        |


It appears that applicants with white-sounding names have a higher callback rate. Let's conduct a Chi-square test.

In [8]:
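# calculate statistics using a chi-square test
# (for a 2x2 table, chi2_contingency applies Yates' continuity correction by default)
chi, p, df, freq = stats.chi2_contingency(c_table)

In [9]:
# print the p-value in scientific notation so that a small value does not round to 0.0
print(f'The Chi-square statistic is {round(chi, 2)} with a p-value of {p:.1e} and {df} degree of freedom.')

The Chi-square statistic is 16.45 with a p-value of 5.0e-05 and 1 degree of freedom.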
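The exercise also asks for a resampling approach. One possible sketch is a permutation test: under the null hypothesis the race labels are exchangeable, so shuffling the pooled callback outcomes between the two groups shows how large a difference in callback rates arises by chance alone. The block below is an illustration and not part of the original analysis; it rebuilds the 0/1 outcomes from the counts in the contingency table, and the seed, the number of permutations, and the variable names are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

# Rebuild the 0/1 callback outcomes from the contingency table counts.
calls_b = np.r_[np.ones(157), np.zeros(2278)]   # black-sounding names
calls_w = np.r_[np.ones(235), np.zeros(2200)]   # white-sounding names
observed_diff = calls_w.mean() - calls_b.mean()

# Under H0 the labels are exchangeable: pool, shuffle, and split again.
pooled = np.concatenate([calls_b, calls_w])
n_b = len(calls_b)
n_reps = 10_000
perm_diffs = np.empty(n_reps)
for i in range(n_reps):
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[n_b:].mean() - shuffled[:n_b].mean()

# Two-sided p-value; adding 1 to numerator and denominator avoids reporting exactly 0.
n_extreme = np.sum(np.abs(perm_diffs) >= abs(observed_diff))
p_value = (n_extreme + 1) / (n_reps + 1)

print(f"observed difference in callback rates: {observed_diff:.4f}")
print(f"permutation p-value: {p_value:.4f}")
```

With only 10,000 permutations the p-value cannot be resolved below roughly 0.0001, so this approach can confirm that the p-value is very small but not reproduce its exact size.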


In [10]:
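def conf_int(observed):
    """Calculate the difference in proportions and its Wald 95% confidence interval.

    Input is a 2x2 numpy array of counts with one row per group.
    """
    # get sample size of each group (row totals)
    n1, n2 = observed.sum(axis=1)
    # calculate proportions from the first column (call == 0, i.e. no callback)
    p1 = observed[0,0]/n1
    p2 = observed[1,0]/n2
    # delta is the difference in no-callback rates ('b' minus 'w'),
    # which equals the difference in callback rates ('w' minus 'b')
    delta = p1 - p2
    # calculate standard error of the difference
    se = np.sqrt(p1*(1 - p1)/n1 + p2*(1 - p2)/n2)
    # calculate 95% confidence interval (1.96 is the normal critical value)
    ci = (round(delta - 1.96*se, 4), round(delta + 1.96*se, 4))
    return round(delta, 4), ci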

In [11]:
# calculate delta and confidence interval
delta, ci = conf_int(np.array(c_table.values))
print(f'The delta is {delta}. The 95% confidence interval is {ci} with an associated margin of error of {round(delta - ci[0], 3)}.' )

The delta is 0.032. The 95% confidence interval is (0.0168, 0.0473) with an associated margin of error of 0.015.
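A bootstrap version of the confidence interval can be sketched as a cross-check on the Wald interval. This block is illustrative and not part of the original analysis: it rebuilds the 0/1 outcomes from the contingency table counts, resamples each group with replacement, and takes the 2.5th and 97.5th percentiles of the resampled differences. The seed, the number of replicates, and the variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Rebuild the 0/1 callback outcomes from the contingency table counts.
calls_b = np.r_[np.ones(157), np.zeros(2278)]   # black-sounding names
calls_w = np.r_[np.ones(235), np.zeros(2200)]   # white-sounding names

n_reps = 10_000
boot_diffs = np.empty(n_reps)
for i in range(n_reps):
    # Resample each group separately, with replacement, at its original size.
    sample_b = rng.choice(calls_b, size=len(calls_b), replace=True)
    sample_w = rng.choice(calls_w, size=len(calls_w), replace=True)
    boot_diffs[i] = sample_w.mean() - sample_b.mean()

# Percentile bootstrap 95% confidence interval and its half-width.
ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])
margin_of_error = (ci_high - ci_low) / 2

print(f"bootstrap 95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"approximate margin of error: {margin_of_error:.4f}")
```

With samples this large, the bootstrap interval should land close to the Wald interval printed above. Agreement between the two is a reasonable sanity check on the normal approximation.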


## 4. Racial discrimination in job applications

The study was designed to assess whether there is racial discrimination in the job market in the US. This research question was investigated by randomly assigning black-sounding or white-sounding names to resumes that were submitted to job postings. The hypothesis was that black-sounding names would lead to a lower callback rate, indicating racial discrimination.

This hypothesis was assessed using a Chi-square statistical test. This test indicates whether the observed frequency of one variable, in our case callbacks, is independent of another variable, the racial connotation of the name. The p-value of the test was well below the significance level of 0.05, indicating that the callback rate is not independent of the racial connotation of the name. The estimated difference in callback rates is 0.032 (about 3.2 percentage points), with a 95% confidence interval of (0.0168, 0.0473).

This outcome is important because it shows that, on otherwise comparable resumes, a black-sounding name leads to fewer callbacks on job applications. It does not, however, show that the name is the most important factor in callback success, as the following section discusses.

## 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

The analysis does not mean that the racial connotation of a name is the most important factor. This is because our statistical model only included racial connotation. There are many other variables that are likely to be related to callback rate.

In [12]:
data.columns

Index(['id', 'ad', 'education', 'ofjobs', 'yearsexp', 'honors', 'volunteer',
       'military', 'empholes', 'occupspecific', 'occupbroad', 'workinschool',
       'email', 'computerskills', 'specialskills', 'firstname', 'sex', 'race',
       'h', 'l', 'call', 'city', 'kind', 'adid', 'fracblack', 'fracwhite',
       'lmedhhinc', 'fracdropout', 'fraccolp', 'linc', 'col', 'expminreq',
       'schoolreq', 'eoe', 'parent_sales', 'parent_emp', 'branch_sales',
       'branch_emp', 'fed', 'fracblack_empzip', 'fracwhite_empzip',
       'lmedhhinc_empzip', 'fracdropout_empzip', 'fraccolp_empzip',
       'linc_empzip', 'manager', 'supervisor', 'secretary', 'offsupport',
       'salesrep', 'retailsales', 'req', 'expreq', 'comreq', 'educreq',
       'compreq', 'orgreq', 'manuf', 'transcom', 'bankreal', 'trade',
       'busservice', 'othservice', 'missind', 'ownership'],
      dtype='object')

Some of these variables include level of education, years of experience, and skills related to the job. These are variables that would need to be factored into a statistical model, as they are likely to explain some of the variance in callback rate.

A more suitable analysis given the dataset is to develop a logistic regression model. These models estimate the likelihood of an outcome, here callback versus no callback, based on a set of predictor variables. The predictor variables would be inputs that are known to be related to job performance, such as education and years of experience. A logistic regression model estimates a coefficient for each predictor, along with a p-value indicating whether that predictor is statistically significant in the overall model. If racial connotation were statistically significant in the model, the interpretation would be that, in the presence of other variables that contribute to an applicant's callback rate, the racial connotation of a name influences how often an applicant receives a callback.
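To make the proposal concrete, the sketch below fits a logistic regression by Newton-Raphson using only NumPy and SciPy. It is a hedged illustration, not an analysis of the study: the data are simulated, the "true" coefficients are made up, and the names `is_white`, `yearsexp`, and `fit_logit` are hypothetical choices for this example. It shows how each predictor gets a coefficient, a standard error, and a p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only

# --- Simulated stand-in data (NOT the study's data) ---
n = 5000
is_white = rng.integers(0, 2, size=n)      # hypothetical 0/1 indicator for name type
yearsexp = rng.integers(1, 21, size=n)     # hypothetical years of experience
true_beta = np.array([-3.0, 0.4, 0.03])    # made-up coefficients, used only to simulate
X = np.column_stack([np.ones(n), is_white, yearsexp])  # intercept + predictors
call = rng.binomial(1, 1 / (1 + np.exp(-X @ true_beta)))


def fit_logit(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficient estimates and their standard errors.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1 / (1 + np.exp(-X @ beta))        # predicted probabilities
        weights = mu * (1 - mu)                 # Bernoulli variances
        hessian = X.T @ (X * weights[:, None])  # observed information matrix
        gradient = X.T @ (y - mu)               # score vector
        beta = beta + np.linalg.solve(hessian, gradient)
    mu = 1 / (1 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (mu * (1 - mu))[:, None]))
    return beta, np.sqrt(np.diag(cov))


beta, se = fit_logit(X, call)
p_values = 2 * stats.norm.sf(np.abs(beta / se))  # two-sided Wald tests
odds_ratios = np.exp(beta)

for name, b, s, p in zip(["intercept", "is_white", "yearsexp"], beta, se, p_values):
    print(f"{name:>10}: coef = {b:+.3f}, se = {s:.3f}, p = {p:.3g}")
```

On the real dataset, a statistics library would normally replace the hand-written fit, and the predictors would come from columns such as `education` and `yearsexp`. The exponentiated coefficient on the name indicator would then be read as the odds ratio of a callback for white-sounding versus black-sounding names, holding the other predictors fixed.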