# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
df = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [3]:
df.shape

(4870, 65)

In [4]:
df.head()

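| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |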


In [5]:
df.columns

Index(['id', 'ad', 'education', 'ofjobs', 'yearsexp', 'honors', 'volunteer',
       'military', 'empholes', 'occupspecific', 'occupbroad', 'workinschool',
       'email', 'computerskills', 'specialskills', 'firstname', 'sex', 'race',
       'h', 'l', 'call', 'city', 'kind', 'adid', 'fracblack', 'fracwhite',
       'lmedhhinc', 'fracdropout', 'fraccolp', 'linc', 'col', 'expminreq',
       'schoolreq', 'eoe', 'parent_sales', 'parent_emp', 'branch_sales',
       'branch_emp', 'fed', 'fracblack_empzip', 'fracwhite_empzip',
       'lmedhhinc_empzip', 'fracdropout_empzip', 'fraccolp_empzip',
       'linc_empzip', 'manager', 'supervisor', 'secretary', 'offsupport',
       'salesrep', 'retailsales', 'req', 'expreq', 'comreq', 'educreq',
       'compreq', 'orgreq', 'manuf', 'transcom', 'bankreal', 'trade',
       'busservice', 'othservice', 'missind', 'ownership'],
      dtype='object')

<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

## Question 1. What test is appropriate for this problem? Does CLT apply?

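Each resume either receives a callback or does not, so the number of callbacks in each group follows a binomial distribution. A chi-square test of independence on the 2×2 table of race against callback is appropriate, as is the equivalent two-sample z-test for a difference in proportions. The CLT applies because names were assigned to resumes at random, the observations can be treated as independent, and each group is large (2,435 resumes, with far more than 10 callbacks and 10 non-callbacks), so the sampling distribution of each callback rate is approximately normal.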
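The large-sample condition behind the CLT can be checked directly. The sketch below uses the counts from the contingency table computed later in this notebook; the helper name `clt_ok` and the threshold of 10 are illustrative assumptions (a common rule of thumb), not part of the original exercise.

```python
# Counts taken from the contingency table in this notebook.
n_b, calls_b = 2435, 157   # black-sounding names: resumes sent, callbacks
n_w, calls_w = 2435, 235   # white-sounding names: resumes sent, callbacks


def clt_ok(n, successes, threshold=10):
    """Rule of thumb (assumed here): at least `threshold` successes and failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


print('CLT condition met for b:', clt_ok(n_b, calls_b))
print('CLT condition met for w:', clt_ok(n_w, calls_w))
```

Both groups clear the threshold by a wide margin, which is why the normal approximation is reasonable here.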

## Question 2. What are the null and alternate hypotheses?

__H0__: Race has no impact on the callback rate; resumes with black-sounding and white-sounding names are called back at the same rate. __H1__: Resumes with black-sounding names receive a lower callback rate than resumes with white-sounding names.
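The same hypotheses can be written in symbols. Here $p_b$ and $p_w$ denote the true callback rates for resumes with black-sounding and white-sounding names; this notation is introduced for illustration and does not appear in the original exercise.

```latex
H_0:\; p_w - p_b = 0
\qquad\qquad
H_1:\; p_w - p_b > 0
```

The alternative as stated is one-sided. A chi-square test of independence does not distinguish the direction of the difference, so its p-value corresponds to the two-sided alternative $p_w \neq p_b$.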

In [6]:
# Contingency table: callback counts (False/True) for each race
df = pd.DataFrame([df.race, df.call==1.0], ['race', 'call']).transpose()
df['values'] = 1
cont_table = pd.pivot_table(df, values='values', index='race', columns=['call'], aggfunc=np.sum)
cont_table

| race | call = False | call = True |
|---|---|---|
| b | 2278 | 157 |
| w | 2200 | 235 |


Pearson's chi-square test of independence, using significance level α = 0.05:

In [7]:
# chi2_contingency computes the chi-square statistic and p-value for the test of
# independence of the observed frequencies in the contingency table.
# For a 2x2 table (1 degree of freedom) it applies Yates' continuity correction by default.
chi2, p, dof, expected = stats.chi2_contingency(cont_table)
print('Chi-square statistic: ', chi2)
print('Degrees of freedom ', dof) # (rows - 1) * (columns - 1)
print('p value: ', p)


Chi-square statistic:  16.44902858418937
Degrees of freedom  1
p value:  4.997578389963255e-05


Since p < α (about 5.0e-05 versus 0.05), we reject the null hypothesis that callbacks are independent of race.
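To make the statistic less of a black box, the sketch below recomputes it by hand from the observed counts in the contingency table. The variable names are illustrative; the only library call relied on is `scipy.stats.chi2_contingency`, whose `correction` argument toggles Yates' continuity correction.

```python
import numpy as np
from scipy import stats

# Observed counts: rows are race (b, w); columns are (no callback, callback).
observed = np.array([[2278, 157],
                     [2200, 235]])

# Expected counts under independence: row total * column total / grand total.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Pearson statistic without and with Yates' continuity correction.
chi2_plain = ((observed - expected) ** 2 / expected).sum()
chi2_yates = ((np.abs(observed - expected) - 0.5) ** 2 / expected).sum()

# Cross-check the corrected value against SciPy.
chi2_scipy, p_scipy, dof, _ = stats.chi2_contingency(observed, correction=True)

print('Expected counts:\n', expected)
print('Chi-square (no correction):   ', chi2_plain)
print('Chi-square (Yates correction):', chi2_yates)
print('SciPy statistic and p-value:  ', chi2_scipy, p_scipy)
```

The Yates-corrected value matches the statistic reported above; the uncorrected value is slightly larger.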

## Question 3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

In [8]:
def my_func(x):
    """Print the mean, variance, standard error, and 95% confidence interval of x."""
    mu, s, n = x.mean(), x.std(), len(x)
    sem = stats.sem(x)
    # 95% interval from the normal approximation: mu +/- 1.96 * sem
    CI = stats.norm.interval(0.95, mu, sem)
    print('Sample mean: ', mu)
    print('Sample variance: ', s**2)
    print('Standard error of the mean: ', sem)
    print('95% confidence interval: ', CI)
    

In [9]:
my_func(df[df.race == 'b'].call)

Sample mean:  0.06447638603696099
Sample variance:  0.06034396359580818
Standard error of the mean:  0.004978143753892395
95% confidence interval:  (0.05471940356946887, 0.07423336850445311)


In [10]:
my_func(df[df.race == 'w'].call)

Sample mean:  0.09650924024640657
Sample variance:  0.08723103062534691
Standard error of the mean:  0.005985301318980728
95% confidence interval:  (0.08477826522458426, 0.10824021526822888)

The margin of error is the half-width of each 95% interval, 1.96 × the standard error: about 0.0098 for black-sounding names and about 0.0117 for white-sounding names.


In [11]:
# Two-sample t-test; with samples this large it is practically equivalent to a z-test

b_call_back = df[df.race == 'b'].call
w_call_back = df[df.race == 'w'].call
stats.ttest_ind(b_call_back, w_call_back)


Ttest_indResult(statistic=-4.114705266723109, pvalue=3.9408025140692845e-05)
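For comparison, here is a minimal sketch of the two-proportion z-test computed by hand from the counts in the contingency table, together with a margin of error and 95% confidence interval for the difference in callback rates. The variable names are illustrative. The test statistic uses the pooled proportion (appropriate under the null hypothesis), while the interval uses the unpooled standard error.

```python
import numpy as np
from scipy import stats

# Counts from the contingency table in this notebook.
n_b, calls_b = 2435, 157   # black-sounding names
n_w, calls_w = 2435, 235   # white-sounding names

p_b, p_w = calls_b / n_b, calls_w / n_w
diff = p_w - p_b

# Under H0 both groups share one callback rate, so the test pools the data.
p_pool = (calls_b + calls_w) / (n_b + n_w)
se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))
z = diff / se_pool
p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value

# The confidence interval does not assume H0, so it uses the unpooled standard error.
se = np.sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)
margin = stats.norm.ppf(0.975) * se   # 95% margin of error
ci = (diff - margin, diff + margin)

print('Difference in callback rates:', diff)
print('z statistic:', z)
print('p value:', p_value)
print('Margin of error:', margin)
print('95% confidence interval:', ci)
```

The interval excludes zero, which is consistent with the small p-value from the t-test above.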

### Using bootstrap


In [22]:
diff_means = np.mean(df[df['race'] == 'w'].call) - np.mean(df[df['race'] == 'b'].call)

In [23]:
print(diff_means)

0.032032854209445585


In [24]:
def bootstrap_replicate_1d(data, func):
    return func(np.random.choice(data, size=len(data)))

In [25]:
def draw_bs_reps(data, func, size=1):

    bs_replicates = np.empty(size)

    # Generate replicates
    for i in range(size):
        bs_replicates[i] = bootstrap_replicate_1d(data, func)
        
    return bs_replicates

In [27]:
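# Bootstrap each group from its own data, then subtract the observed difference
# (diff_means) so the replicates are centered at 0, as the null hypothesis assumes.
bs_replicates_w = draw_bs_reps(w_call_back, np.mean, size=10000)
bs_replicates_b = draw_bs_reps(b_call_back, np.mean, size=10000)
replicates_diff = (bs_replicates_w - bs_replicates_b) - diff_means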


In [28]:
p = np.sum(replicates_diff >= diff_means) / len(replicates_diff)
print(p)

0.0


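The bootstrap p-value prints as 0.0 because none of the 10,000 replicates produced a difference at least as large as the observed 0.032, so the true p-value is below the resolution of the simulation (1/10,000) rather than exactly zero. This agrees with the small p-values from the earlier tests, so we again reject the null hypothesis.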
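Question 3 also asks for a margin of error and confidence interval from the bootstrap. The sketch below is one way to get them, using a percentile interval for the difference in callback rates. It rebuilds the two 0/1 samples from the counts in the contingency table; the function name `bs_diff_of_means` and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable

# Rebuild the 0/1 callback samples from the contingency-table counts.
b = np.r_[np.ones(157), np.zeros(2278)]   # black-sounding names
w = np.r_[np.ones(235), np.zeros(2200)]   # white-sounding names


def bs_diff_of_means(x, y, size, rng):
    """Bootstrap replicates of mean(x) - mean(y), resampling each group with replacement."""
    reps = np.empty(size)
    for i in range(size):
        x_star = rng.choice(x, size=len(x))
        y_star = rng.choice(y, size=len(y))
        reps[i] = x_star.mean() - y_star.mean()
    return reps


reps = bs_diff_of_means(w, b, 10000, rng)

# Percentile bootstrap: the middle 95% of the replicates.
ci_low, ci_high = np.percentile(reps, [2.5, 97.5])
margin = (ci_high - ci_low) / 2

print('95% bootstrap confidence interval:', (ci_low, ci_high))
print('Bootstrap margin of error:', margin)
```

Unlike the hypothesis test, these replicates are not shifted to the null, because a confidence interval describes the observed difference rather than what would happen if there were none.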

<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

## Question 4. Write a story describing the statistical significance in the context of the original problem.

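Resumes with white-sounding names received callbacks 9.65% of the time, compared with 6.45% for otherwise identical resumes with black-sounding names, and the tests above show that a gap this large is very unlikely to arise by chance. Because names were assigned to resumes at random, this suggests that the name itself plays a role in whether an applicant is called back. Employers should not let names influence their hiring decisions.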
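Statistical significance says little about the size of the effect, so it can help to express the gap in practical terms. The short sketch below does this from the counts in the contingency table; the variable names are illustrative.

```python
# Counts from the contingency table in this notebook.
n_b, calls_b = 2435, 157   # black-sounding names
n_w, calls_w = 2435, 235   # white-sounding names

rate_b = calls_b / n_b
rate_w = calls_w / n_w

gap_points = 100 * (rate_w - rate_b)   # difference in percentage points
ratio = rate_w / rate_b                # relative callback rate, w versus b

print(f'Callback rate, b: {rate_b:.4f}')
print(f'Callback rate, w: {rate_w:.4f}')
print(f'Gap: {gap_points:.2f} percentage points')
print(f'Ratio w / b: {ratio:.3f}')
```

In this sample, resumes with white-sounding names received roughly 1.5 times as many callbacks as those with black-sounding names.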

## Question 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

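No. The statistical tests above are not sufficient to evaluate such a claim, because other factors such as previous work experience, gender, and education have not been considered. To amend the analysis, these factors could be included alongside race, for example in a logistic regression on callback, so that the size of each effect can be compared.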