
### Examining racial discrimination in the US job market

#### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

#### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in the 'race' column were assigned to the resumes at random.

#### Exercise
Perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

In [2]:
import pandas as pd
import numpy as np
from scipy import stats

In [3]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

#### Approach
We use a chi-square test of independence to test H0: 'race' and 'call' are independent (the callback rate is the same for both groups) against H1: 'race' and 'call' are not independent.

#### Building the contingency table

In [4]:
# Count the resumes in each (race, callback) cell of the 2x2 contingency table.
is_b, is_w = data['race'] == 'b', data['race'] == 'w'
called, not_called = data['call'] == 1.0, data['call'] == 0.0
df = pd.DataFrame({'No': [(is_b & not_called).sum(), (is_w & not_called).sum()],
                   'Yes': [(is_b & called).sum(), (is_w & called).sum()]},
                  index=('b', 'w'))
df

| | No | Yes |
|---|---|---|
| b | 2278 | 157 |
| w | 2200 | 235 |
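
The same table can also be built in a single call with `pd.crosstab`. The sketch below is an illustration rather than the notebook's own cell: so that it runs without the `.dta` file, it rebuilds a stand-in DataFrame (the name `toy` is hypothetical) from the four counts shown above, assuming only the 'race' and 'call' columns matter for this step.

```python
import pandas as pd

# Stand-in for `data`, rebuilt from the four cell counts in the table above
# (assumption: only the 'race' and 'call' columns are needed here).
counts = {('b', 0.0): 2278, ('b', 1.0): 157, ('w', 0.0): 2200, ('w', 1.0): 235}
toy = pd.DataFrame(
    [(race, call) for (race, call), n in counts.items() for _ in range(n)],
    columns=['race', 'call'],
)

# Cross-tabulate race against callback outcome and relabel the columns.
table = pd.crosstab(toy['race'], toy['call']).rename(columns={0.0: 'No', 1.0: 'Yes'})
print(table)
```

With the real dataset, the equivalent call would be `pd.crosstab(data['race'], data['call'])`.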


#### Computing the chi-square statistic and p-value for the hypothesis test of independence

In [5]:
stats.chi2_contingency(df)

(16.449028584189371, 4.9975783899632552e-05, 1, array([[ 2239.,   196.],
        [ 2239.,   196.]]))
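
To make the four returned values easier to read, the statistic can be recomputed by hand. This is a minimal sketch that hard-codes the observed counts from the contingency table; the variable names are illustrative. It applies Yates' continuity correction, which `stats.chi2_contingency` uses by default for a table with one degree of freedom.

```python
import numpy as np
from scipy import stats

# Observed counts from the contingency table (rows: b, w; columns: No, Yes).
observed = np.array([[2278, 157],
                     [2200, 235]])

# Expected counts under independence: row total * column total / grand total.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Yates' continuity correction: shrink each |observed - expected| by 0.5
# before squaring (scipy does this by default when dof == 1).
chi2 = ((np.abs(observed - expected) - 0.5) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = stats.chi2.sf(chi2, dof)

print(chi2, p_value, dof)
print(expected)
```

The returned tuple is therefore, in order, the chi-square statistic, the p-value, the degrees of freedom, and the array of expected counts under H0.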

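The p-value is about 5.0e-05 (0.005%), well below the conventional 0.05 significance level, so we reject H0 and conclude that there is a statistically significant relationship between 'race' and 'call'.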
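
The test shows that the two variables are related but not how large the gap is. As a rough complement, the sketch below computes the callback rate for each group directly from the counts in the contingency table; the variable names are illustrative.

```python
# Callback counts and group sizes taken from the contingency table above.
calls = {'b': 157, 'w': 235}
totals = {'b': 2278 + 157, 'w': 2200 + 235}

# Callback rate per group, and the gap between the two groups.
rates = {race: calls[race] / totals[race] for race in calls}
difference = rates['w'] - rates['b']

for race, rate in rates.items():
    print(f"{race}: {rate:.2%}")
print(f"difference: {difference:.2%}")
```

Resumes with white-sounding names received callbacks at a rate of about 9.65%, compared with about 6.45% for black-sounding names, a gap of roughly 3.2 percentage points.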