<h2> <center> Resume Experiment Analysis </center> </h2>

<center> Preet Khowaja </center>

In [1]:
# Importing relevant packages
import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.formula.api as smf

In [2]:
resume_data = pd.read_stata("resume_experiment.dta")
resume_data.head(7)

Unnamed: 0,education,ofjobs,yearsexp,computerskills,call,female,black
0,4,2,6,1,0.0,1.0,0.0
1,3,3,6,1,0.0,1.0,0.0
2,4,1,6,1,0.0,1.0,1.0
3,3,4,6,1,0.0,1.0,1.0
4,3,3,22,1,0.0,1.0,0.0
5,4,2,6,0,0.0,0.0,0.0
6,4,2,5,1,0.0,1.0,0.0


In [3]:
# Checking dimensions/type of data
print(resume_data.shape)
print(resume_data.dtypes)

# Check for missing data
resume_data.isna().sum()

(4870, 7)
education            int8
ofjobs               int8
yearsexp             int8
computerskills       int8
call              float32
female            float32
black             float32
dtype: object


education         0
ofjobs            0
yearsexp          0
computerskills    0
call              0
female            0
black             0
dtype: int64

### Exercise 1

**Check for balance in terms of applicant gender (female), computer skills (computerskills), and years of experience (yearsexp) across the two arms of the experiment (i.e. by black). Calculate both the differences across treatment arms and test for statistical significance of these differences. Do gender and computer skills look balanced across race groups?** 

The t-test for gender gives a p-value of 0.377, so we cannot reject the null hypothesis that the share of female applicants is the same in both groups; gender looks balanced. Years of experience also looks balanced (p = 0.854). Computer skills does not: the p-value of 0.030 indicates a difference between the two groups that is statistically significant at the 5% level. The positive t-statistic (2.17) means that resumes with black-sounding names list computer skills slightly more often.

In [4]:
# Dividing data by black variable
black_data = resume_data[resume_data['black'] == 1]
non_black_data = resume_data[resume_data['black'] == 0]

In [5]:
# Checking for balance across gender, computer skills
# and yearsexp using a t-test
print("The t-test result for gender is:")
print(stats.ttest_ind(black_data['female'], non_black_data['female']))
print("The t-test result for computer skills is:")
print(stats.ttest_ind(black_data['computerskills'], non_black_data['computerskills']))
print("The t-test result for years of experience is:")
print(stats.ttest_ind(black_data['yearsexp'], non_black_data['yearsexp']))

The t-test result for gender is:
Ttest_indResult(statistic=0.8841321018026016, pvalue=0.37666856909823254)
The t-test result for computer skills is:
Ttest_indResult(statistic=2.1664271042751966, pvalue=0.030326933955391936)
The t-test result for years of experience is:
Ttest_indResult(statistic=-0.18461970685747395, pvalue=0.8535350182481283)
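
The t-tests above report significance but not the size of each gap. Below is a minimal sketch of how the raw differences in means across the two arms could be computed. The frame `toy` and all of its values are made up for illustration and are not the experiment data.

```python
import pandas as pd

# Hypothetical toy data standing in for resume_data (values are made up)
toy = pd.DataFrame({
    "black":          [1, 1, 1, 1, 0, 0, 0, 0],
    "female":         [1, 1, 0, 1, 1, 0, 1, 1],
    "computerskills": [1, 1, 1, 0, 1, 1, 0, 0],
    "yearsexp":       [6, 8, 4, 10, 5, 7, 9, 3],
})

covariates = ["female", "computerskills", "yearsexp"]

# Mean of each covariate within each arm (rows are indexed by the value of black)
group_means = toy.groupby("black")[covariates].mean()

# Difference in means: black-sounding names minus white-sounding names
mean_diff = group_means.loc[1] - group_means.loc[0]
print(mean_diff)
```

Running the same `groupby` on `resume_data` would give the differences that go with the t-statistics printed above.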


### Exercise 2

**Do a similar tabulation for education (education).** 

**Because these are categorical, you shouldn’t just calculate and compare means – you should compare share of observations with each value separately using a ttest, or do a chi-squared test (technically chi-squared is the correct test, but I’m ok with either).**

**Does education and the number of previous jobs look balanced across racial groups?**

In [6]:
# Computing frequencies by education
frequencies = pd.crosstab(resume_data.black,resume_data.education)

# Testing whether these frequencies are statistically similar
# in both racial categories
chi2, p, dof, ex = stats.chi2_contingency(frequencies)

print("The p-value of the test is {:.3f}".format(p))

The p-value of the test is 0.492


In [7]:
# Computing frequencies by number of previous jobs
frequencies2 = pd.crosstab(resume_data.black,resume_data.ofjobs)

# Testing whether these frequencies are statistically similar
# in both racial categories
chi22, p2, dof2, ex2 = stats.chi2_contingency(frequencies2)

print("The p-value of the test is {:.3f}".format(p2))

The p-value of the test is 0.741


Both p-values are well above 0.05, so we cannot reject the null hypothesis $H_0$ that the distributions of education and number of previous jobs are the same in the two groups. Failing to reject does not prove balance, but we see no evidence of imbalance on these two variables.
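
The chi-squared test works on counts, but the share of each arm falling in each category is often easier to read. Below is a small sketch using `pd.crosstab` with `normalize="index"`. The frame `toy` is a made-up stand-in for `resume_data`.

```python
import pandas as pd

# Hypothetical toy data (values are made up, not the experiment data)
toy = pd.DataFrame({
    "black":     [1, 1, 1, 1, 0, 0, 0, 0],
    "education": [4, 4, 3, 2, 4, 3, 3, 2],
})

# normalize="index" divides each row by its total, so each row sums to 1
# and shows the share of that arm in each education category
shares = pd.crosstab(toy["black"], toy["education"], normalize="index")
print(shares)
```

Comparing the two rows side by side shows which categories, if any, drive a difference that the chi-squared test flags.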

### Exercise 3

**What do you make of the overall results on resume characteristics? Why do we care about whether these variables look similar across the race groups?**

Overall, the groups look balanced on gender, education, number of previous jobs and years of experience. The one exception is computer skills, where the difference is significant at the 0.05 level but not at the 0.01 level. With five variables tested, one result of this kind is not surprising even when randomization worked properly.
We check balance on these variables in a causal study such as this one to ensure that any difference in callbacks can be attributed to the treatment rather than to these characteristics. For example, we want to be confident that it is the black-sounding name itself that lowers the likelihood of an interview call, not a difference in the share of female applicants between the two groups.

### Exercise 4

**The variable of interest in the data set is the variable call, which indicates a call back for an interview. Perform a two-sample t-test comparing applicants with black sounding names and white sounding names.**

In [8]:
stats.ttest_ind(black_data.call, non_black_data.call)

Ttest_indResult(statistic=-4.114705290861751, pvalue=3.940802103128886e-05)

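According to the t-test above, there is a statistically significant difference between the callbacks received by black-sounding names and non-black-sounding names. The p-value (about 0.00004) is far below 0.05, so we reject the null hypothesis of equal callback rates. The negative t-statistic indicates that black-sounding names receive fewer callbacks.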
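
Because `call` is a 0/1 outcome, the comparison above is a comparison of two callback rates. As a cross-check, a pooled two-proportion z-test can be computed by hand with the standard library. The helper `two_proportion_ztest` and the counts passed to it are hypothetical placeholders, not figures from the experiment.

```python
import math

def two_proportion_ztest(successes_1, n_1, successes_2, n_2):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 60 callbacks out of 1000 vs 100 callbacks out of 1000
z_stat, p_val = two_proportion_ztest(60, 1000, 100, 1000)
print(f"z = {z_stat:.3f}, p = {p_val:.5f}")
```

With large samples this test and the two-sample t-test usually lead to the same conclusion.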

### Exercise 5

**Now, use a regression model to estimate the differential likelihood of being called back by applicant race (i.e. the racial discrimination by employers). Is the difference statistically significant?**

In [9]:
# Fitting a regression model
model_1 = smf.ols("call ~ black", resume_data).fit()
model_1.get_robustcov_results(cov_type="HC3").summary()

0,1,2,3
Dep. Variable:,call,R-squared:,0.003
Model:,OLS,Adj. R-squared:,0.003
Method:,Least Squares,F-statistic:,16.92
Date:,"Thu, 27 Jan 2022",Prob (F-statistic):,3.96e-05
Time:,13:29:18,Log-Likelihood:,-562.24
No. Observations:,4870,AIC:,1128.0
Df Residuals:,4868,BIC:,1141.0
Df Model:,1,,
Covariance Type:,HC3,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,0.0965,0.006,16.121,0.000,0.085,0.108
black,-0.0320,0.008,-4.114,0.000,-0.047,-0.017

0,1,2,3
Omnibus:,2969.205,Durbin-Watson:,1.44
Prob(Omnibus):,0.0,Jarque-Bera (JB):,18927.068
Skew:,3.068,Prob(JB):,0.0
Kurtosis:,10.458,Cond. No.,2.62


According to the linear model above, the difference in callback rates by perceived race is highly statistically significant (p < 0.001). Resumes with black-sounding names are 3.2 percentage points less likely to receive a callback, against a baseline callback rate of 9.65% for the other resumes.
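
The coefficient on `black` can be read in two ways: as an absolute change in the callback probability (percentage points) or as a change relative to the baseline rate (percent). The short calculation below uses the two coefficients copied from the summary table above. The variable names are illustrative.

```python
# Coefficients copied from the model_1 summary table above
white_rate = 0.0965   # intercept: callback rate when black == 0
black_coef = -0.0320  # change in callback rate when black == 1

# Implied callback rate for black-sounding names
black_rate = white_rate + black_coef

# Absolute gap in percentage points and gap relative to the baseline rate
gap_pp = black_coef * 100
relative_change = black_coef / white_rate * 100

print(f"Callback rate, black-sounding names: {black_rate:.4f}")
print(f"Gap: {gap_pp:.1f} percentage points")
print(f"Relative change: {relative_change:.1f}%")
```

A gap of a few percentage points is large in relative terms here because the baseline callback rate is itself below 10%.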

### Exercise 6

**Now let’s see if we can improve our estimates by adding in other variables as controls. Add in education, yearsexp, female, and computerskills – be sure to treat education as a categorical variable!**

In [10]:
## Fitting a model with other predictors
model_2 = smf.ols("call ~ black + yearsexp + "
"female + computerskills + C(education)", resume_data).fit()
model_2.get_robustcov_results(cov_type="HC3").summary()

0,1,2,3
Dep. Variable:,call,R-squared:,0.008
Model:,OLS,Adj. R-squared:,0.006
Method:,Least Squares,F-statistic:,4.35
Date:,"Thu, 27 Jan 2022",Prob (F-statistic):,3.04e-05
Time:,13:29:19,Log-Likelihood:,-551.02
No. Observations:,4870,AIC:,1120.0
Df Residuals:,4861,BIC:,1178.0
Df Model:,8,,
Covariance Type:,HC3,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,0.0821,0.040,2.053,0.040,0.004,0.160
C(education)[T.1],-0.0017,0.057,-0.030,0.976,-0.113,0.110
C(education)[T.2],-8.953e-05,0.042,-0.002,0.998,-0.082,0.082
C(education)[T.3],-0.0025,0.039,-0.065,0.948,-0.079,0.074
C(education)[T.4],-0.0047,0.038,-0.124,0.901,-0.080,0.070
black,-0.0316,0.008,-4.076,0.000,-0.047,-0.016
yearsexp,0.0032,0.001,3.665,0.000,0.001,0.005
female,0.0112,0.010,1.165,0.244,-0.008,0.030
computerskills,-0.0186,0.011,-1.616,0.106,-0.041,0.004

0,1,2,3
Omnibus:,2950.646,Durbin-Watson:,1.448
Prob(Omnibus):,0.0,Jarque-Bera (JB):,18631.25
Skew:,3.047,Prob(JB):,0.0
Kurtosis:,10.395,Cond. No.,225.0
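
The rows labelled `C(education)[T.1]` through `C(education)[T.4]` appear because the formula expands the categorical variable into indicator columns, with the lowest level as the omitted baseline. The sketch below roughly mirrors that expansion with `pd.get_dummies`. The `education` values are made up, and the formula machinery builds its columns internally rather than through this call.

```python
import pandas as pd

# Hypothetical education codes (made up); the real column takes values 0-4
education = pd.Series([0, 1, 2, 3, 4, 4, 3], name="education")

# drop_first=True omits the lowest level, which then acts as the baseline
dummies = pd.get_dummies(education, prefix="education", drop_first=True)
print(dummies)
```

Each coefficient on an education indicator is then a difference relative to the omitted baseline level, not an effect per unit of education.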


### Exercise 7

**These effects are the average effects. Now let’s look for heterogeneous treatment effects.**

**Look only at candidates with high educations. Is there more or less racial discrimination among these highly educated candidates? Is the difference statistically significant?**

In [11]:
# first we create a new variable for higher ed
resume_data['high_ed'] = np.where(resume_data.education == 4,
 'col_degree', 'no_col_degree')

# Then we create a regression and add the interaction 
# between race and higher ed
model_3 = smf.ols("call ~ black + "
"black:high_ed + high_ed + yearsexp + "
"female + computerskills + C(education) ", resume_data).fit()
model_3.get_robustcov_results(cov_type="HC3").summary()

0,1,2,3
Dep. Variable:,call,R-squared:,0.008
Model:,OLS,Adj. R-squared:,0.006
Method:,Least Squares,F-statistic:,42.83
Date:,"Thu, 27 Jan 2022",Prob (F-statistic):,4.63e-82
Time:,13:29:19,Log-Likelihood:,-550.76
No. Observations:,4870,AIC:,1122.0
Df Residuals:,4860,BIC:,1186.0
Df Model:,9,,
Covariance Type:,HC3,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,0.0544,0.015,3.517,0.000,0.024,0.085
high_ed[T.no_col_degree],0.0331,0.026,1.281,0.200,-0.018,0.084
C(education)[T.1],-0.0023,0.057,-0.040,0.968,-0.114,0.110
C(education)[T.2],-0.0012,0.042,-0.030,0.976,-0.083,0.081
C(education)[T.3],-0.0036,0.039,-0.092,0.927,-0.080,0.073
C(education)[T.4],0.0212,0.014,1.501,0.133,-0.007,0.049
black,-0.0282,0.009,-3.091,0.002,-0.046,-0.010
black:high_ed[T.no_col_degree],-0.0123,0.017,-0.710,0.478,-0.046,0.022
yearsexp,0.0032,0.001,3.672,0.000,0.001,0.005

0,1,2,3
Omnibus:,2950.182,Durbin-Watson:,1.448
Prob(Omnibus):,0.0,Jarque-Bera (JB):,18623.859
Skew:,3.046,Prob(JB):,0.0
Kurtosis:,10.393,Cond. No.,5.82e+15


The coefficient on the interaction between black and not having a college degree is -0.0123. Because college graduates are the reference category, the coefficient on black (-0.0282) is the estimated callback gap among college graduates, and the interaction says the gap is about 1.2 percentage points larger among applicants without a college degree. In other words, the point estimates suggest slightly less discrimination among highly educated candidates. However, the interaction is not statistically significant (p = 0.478), so we cannot conclude that discrimination differs by education. As discussed in class, p-values can be misleading: the experiment may be underpowered for detecting this kind of heterogeneity, and a larger sample might be needed to reach a firm conclusion.
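
With `col_degree` as the reference level of `high_ed`, the implied callback gap for each education group can be read off the table by adding coefficients. The arithmetic below uses the two estimates copied from the summary above. The variable names are illustrative.

```python
# Coefficients copied from the model_3 summary table above
black_coef = -0.0282        # black: gap for the reference group (col_degree)
interaction_coef = -0.0123  # black:high_ed[T.no_col_degree]: additional gap

gap_college = black_coef
gap_no_college = black_coef + interaction_coef

print(f"Gap, college degree:    {gap_college * 100:.2f} percentage points")
print(f"Gap, no college degree: {gap_no_college * 100:.2f} percentage points")
```

These are point estimates only. The interaction term has a wide confidence interval, so the two gaps are not statistically distinguishable.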

### Exercise 8

**Now let’s compare men and women – is discrimination greater for Black men or Black women? Is the difference statistically significant?**

In [12]:
# Adding an interaction term for black
#  and female
model_4 = smf.ols("call ~ black + black:high_ed + "
"black:female + high_ed + yearsexp + "
"female + computerskills + C(education) ", resume_data).fit()
model_4.get_robustcov_results(cov_type="HC3").summary()

0,1,2,3
Dep. Variable:,call,R-squared:,0.008
Model:,OLS,Adj. R-squared:,0.006
Method:,Least Squares,F-statistic:,38.93
Date:,"Thu, 27 Jan 2022",Prob (F-statistic):,3.23e-81
Time:,13:29:19,Log-Likelihood:,-550.76
No. Observations:,4870,AIC:,1124.0
Df Residuals:,4859,BIC:,1195.0
Df Model:,10,,
Covariance Type:,HC3,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,0.0540,0.016,3.327,0.001,0.022,0.086
high_ed[T.no_col_degree],0.0329,0.026,1.264,0.206,-0.018,0.084
C(education)[T.1],-0.0024,0.057,-0.042,0.967,-0.115,0.110
C(education)[T.2],-0.0012,0.042,-0.030,0.976,-0.083,0.081
C(education)[T.3],-0.0036,0.039,-0.092,0.927,-0.080,0.073
C(education)[T.4],0.0211,0.014,1.478,0.140,-0.007,0.049
black,-0.0272,0.016,-1.748,0.080,-0.058,0.003
black:high_ed[T.no_col_degree],-0.0121,0.018,-0.671,0.502,-0.047,0.023
black:female,-0.0013,0.019,-0.070,0.944,-0.038,0.035

0,1,2,3
Omnibus:,2950.179,Durbin-Watson:,1.448
Prob(Omnibus):,0.0,Jarque-Bera (JB):,18623.866
Skew:,3.046,Prob(JB):,0.0
Kurtosis:,10.393,Cond. No.,5.94e+15


The coefficient on the interaction between black and female is -0.0013, which suggests that the callback penalty is marginally larger for Black women than for Black men. However, this coefficient is far from statistically significant (p = 0.944), so there is no evidence that discrimination differs between Black men and Black women.
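
The summary tables for the two interaction models report condition numbers on the order of 10^15, which usually signals an exactly collinear design matrix. One plausible cause, given the formulas, is that `high_ed` is a deterministic function of `education`, so including both `high_ed` and `C(education)` adds a redundant column. The sketch below checks this with made-up education codes. It only approximates the design matrix the formula interface builds.

```python
import numpy as np

# Hypothetical education codes (made up); every level 0-4 appears at least once
education = np.array([0, 1, 2, 3, 4, 4, 3, 2, 4, 1])

# Indicator columns for levels 1-4 (level 0 is the baseline), like C(education)
edu_dummies = np.column_stack(
    [(education == level).astype(float) for level in range(1, 5)]
)

# Indicator for not having a college degree, like high_ed[T.no_col_degree]
no_col_degree = (education != 4).astype(float)

intercept = np.ones(len(education))
design = np.column_stack([intercept, edu_dummies, no_col_degree])

# no_col_degree equals intercept minus the level-4 indicator,
# so the matrix has fewer independent columns than it has columns
rank = np.linalg.matrix_rank(design)
print(f"columns = {design.shape[1]}, rank = {rank}")
```

If this is what happens in the fitted models, the individual education and `high_ed` coefficients are not separately identified. One option would be to drop `C(education)` from the interaction models.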

### Exercise 9

**Calculate and/or lookup the following online:**

**What is the share of applicants in our dataset with college degrees?**



In [13]:
# Share of applicants whose resume lists a college degree
n_college = resume_data[resume_data.high_ed == 'col_degree'].shape[0]
n_total = resume_data.shape[0]
perc = (n_college / n_total) * 100
print(f"The percentage of people with a college degree is {perc:.2f}%")

The percentage of people with a college degree is 71.95%


**What share of Black adult Americans have college degrees (i.e. have completed a bachelors degree)?**

The statistic is 16.% for adult African Americans who have attained a bachelors degree or higher.

### Exercise 10

**What are the implications of your answers to Exercise 7 and to Exercise 9 to how you interpret the Average Treatment Effect you estimated in Exercise 6?**

The answers to Exercises 7 and 9 affect how we interpret the average treatment effect estimated in Exercise 6. The interaction between race and college education is *not* statistically significant, so we find no evidence that the callback penalty for black-sounding names differs by education; this is consistent with the estimated effect being driven by perceived race rather than by educational attainment. However, the share of applicants with college degrees in our dataset (about 72%) is much higher than the share of Black adults in the US who have completed college. The estimate from Exercise 6 is therefore an average over a sample that is far more educated than the population of interest, which limits the external validity of the study. If discrimination is in fact larger among applicants without a college degree, as the point estimate in Exercise 7 suggests, the sample average would understate the discrimination faced by the typical Black applicant.
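
One way to see why the education mix matters is to reweight the group-specific gaps from Exercise 7 by different college shares. The sketch below uses the point estimates and the sample share reported above. The helper `weighted_gap` and the value `hypothetical_share` are illustrative assumptions, not figures from the source, and the interaction estimate is not statistically significant, so this is an illustration rather than a corrected estimate.

```python
def weighted_gap(college_share, gap_college, gap_no_college):
    """Average callback gap for a population with the given college share."""
    return college_share * gap_college + (1 - college_share) * gap_no_college

# Point estimates from the Exercise 7 table
gap_college = -0.0282
gap_no_college = -0.0282 + -0.0123

# Share of college graduates in the experiment sample (Exercise 9)
sample_share = 0.7195
gap_sample = weighted_gap(sample_share, gap_college, gap_no_college)

# Placeholder share for a less educated population (NOT a sourced statistic)
hypothetical_share = 0.25
gap_reweighted = weighted_gap(hypothetical_share, gap_college, gap_no_college)

print(f"Gap at sample mix:       {gap_sample * 100:.2f} percentage points")
print(f"Gap at hypothetical mix: {gap_reweighted * 100:.2f} percentage points")
```

At the sample mix the weighted gap is close to the coefficient on `black` from Exercise 6, which is what one would expect from an average over the two groups.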