# Resume Experiment Analysis

In this exercise, we will be using a data set from a randomized experiment conducted by Marianne Bertrand and Sendhil Mullainathan, who sent 4,870 fictitious resumes out to employers in response to job adverts in Boston and Chicago in 2001. The resumes differ in various attributes, including the names of the applicants, and different resumes were randomly allocated to job openings. To manipulate perceived race, resumes were randomly assigned Black- or White-sounding names. The researchers collecting these data were interested in learning more about how racial bias impacts job market outcomes by testing whether Black-sounding names obtain fewer callbacks for interviews than White-sounding names.

You can access the original article [here](https://www.aeaweb.org/articles?id=10.1257/0002828042002561).

- Download the data set `resume_experiment.dta` from [GitHub here](https://github.com/nickeubank/MIDS_Data/tree/master/resume_experiment), or by going to `www.github.com/nickeubank/MIDS_Data` and opening the `resume_experiment` folder.
- For `python` users, use `read_stata` in `pandas` to load the data set; for `R` users, use `read_dta` in `haven`.
- `black` is the treatment variable in the data set (whether the resume has a Black-sounding name). 
- `call` is the dependent variable of interest (did the employer call the fictitious applicant for an interview)
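
For Python users, the loading step might look like the sketch below. The helper name `load_resume_data` is hypothetical, and the file path in the usage comment is an assumption (it should point to wherever you saved the download).

```python
import pandas as pd


def load_resume_data(path):
    """Read the Stata file at `path` into a DataFrame and report its size."""
    df = pd.read_stata(path)
    # Quick sanity check: the source describes 4,870 resumes, so the row
    # count printed here should be in that neighborhood for the real file.
    print(f"{len(df)} rows, {df.shape[1]} columns")
    return df


# Usage (assumes the file sits next to this script or notebook):
# resumes = load_resume_data("resume_experiment.dta")
# resumes[["black", "call"]].describe()
```

Note that `read_stata` converts Stata value labels to pandas categoricals by default, so it is worth checking `df.dtypes` before doing arithmetic on any column.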

In addition, the data include a number of variables that describe the other features of each fictitious resume, including the applicant's education level (`education`), years of experience (`yearsexp`), gender (`female`), computer skills (`computerskills`), and number of previous jobs (`ofjobs`). Each resume has a random selection of these attributes, so on average the Black-named fictitious applicant resumes have the same qualifications as the White-named applicant resumes.

## Checking for Balance


### Exercise 1

Check for balance in terms of applicant gender (`female`), computer skills (`computerskills`), and years of experience (`yearsexp`) across the two arms of the experiment (i.e. by `black`). Calculate both the differences across treatment arms *and* test for statistical significance of these differences. Do gender and computer skills look balanced across race groups? (1 point)
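
One possible shape for a balance check is sketched below: a difference in means plus a Welch two-sample t-test for each covariate. The helper `balance_table` is hypothetical, the sketch assumes `black` is coded numerically as 0/1, and the data are a synthetic stand-in rather than the real experiment.

```python
import numpy as np
import pandas as pd
from scipy import stats


def balance_table(df, covariates, treatment="black"):
    """Difference in means (treated minus control) and Welch t-test per covariate."""
    treated = df[df[treatment] == 1]
    control = df[df[treatment] == 0]
    rows = []
    for col in covariates:
        # equal_var=False requests Welch's t-test (no equal-variance assumption).
        t_stat, p_value = stats.ttest_ind(treated[col], control[col], equal_var=False)
        rows.append({
            "variable": col,
            "mean_treated": treated[col].mean(),
            "mean_control": control[col].mean(),
            "difference": treated[col].mean() - control[col].mean(),
            "t_stat": t_stat,
            "p_value": p_value,
        })
    return pd.DataFrame(rows).set_index("variable")


# Synthetic stand-in data, NOT the real experiment: covariates are drawn
# independently of the treatment indicator, as randomization would imply.
rng = np.random.default_rng(0)
n = 1000
toy = pd.DataFrame({
    "black": rng.integers(0, 2, size=n),
    "female": rng.integers(0, 2, size=n),
    "yearsexp": rng.integers(1, 20, size=n),
})
print(balance_table(toy, ["female", "yearsexp"]))
```

With the real data, you would pass the loaded DataFrame and the covariate names from this exercise in place of `toy`.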

### Exercise 2

Do a similar tabulation for education (`education`) and the number of jobs previously held (`ofjobs`). These variables take on 5 and 7 different values, respectively. Because these are categorical, you shouldn't just calculate and compare means -- you should compare the share of observations with each value.

Do education and the number of previous jobs look balanced across racial groups? (2 points)
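
For categorical variables, one way to compare shares is a column-normalized cross-tabulation, paired with a chi-squared test of independence. The sketch below is one option among several; the helper names are hypothetical, and the data are synthetic with five arbitrary education levels, not the real coding.

```python
import numpy as np
import pandas as pd
from scipy import stats


def share_table(df, column, treatment="black"):
    """Share of observations at each value of `column`, within each treatment arm."""
    return pd.crosstab(df[column], df[treatment], normalize="columns")


def chi2_balance_test(df, column, treatment="black"):
    """Chi-squared test of independence between `column` and the treatment."""
    counts = pd.crosstab(df[column], df[treatment])
    chi2, p_value, dof, _expected = stats.chi2_contingency(counts)
    return chi2, p_value, dof


# Synthetic stand-in data, NOT the real experiment.
rng = np.random.default_rng(0)
n = 1000
toy = pd.DataFrame({
    "black": rng.integers(0, 2, size=n),
    "education": rng.integers(0, 5, size=n),  # five made-up levels
})
shares = share_table(toy, "education")
chi2, p_value, dof = chi2_balance_test(toy, "education")
print(shares.round(3))
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
```

Each column of `shares` sums to 1, so the rows can be read directly as the share of each arm at a given level.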

### Exercise 3

What do you make of the overall results on resume characteristics? Why do we care about whether these variables look similar across the race groups? (1 point)


## Estimating Effect of Race

### Exercise 4

The outcome of interest in the data set is the variable `call`, which indicates a callback for an interview. Perform a two-sample t-test comparing callback rates for applicants with Black-sounding names and White-sounding names.

### Exercise 5

Now, use a regression model to estimate the differential likelihood of being called back by applicant race (i.e. the racial discrimination by employers).
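
One common way to set this up, offered here as a sketch rather than as the specification the original authors used, is a linear probability model with `call` as the outcome and `black` as the only regressor (assuming both are coded 0/1):

$$
\text{call}_i = \alpha + \beta \, \text{black}_i + \varepsilon_i
$$

With a single binary regressor, the OLS estimate of $\beta$ is exactly the difference in callback rates between the two groups:

$$
\hat{\beta} = \overline{\text{call}}_{\text{black}=1} - \overline{\text{call}}_{\text{black}=0}
$$

Because the outcome is binary, the error term is heteroskedastic by construction, so heteroskedasticity-robust standard errors are a reasonable choice.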

### Exercise 6

Now let's see if we can improve our estimates by adding in other variables as controls. Add in `education`, `yearsexp`, and `computerskills` -- be sure to treat education as a categorical variable!
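
A minimal sketch of such a regression, assuming you use the `statsmodels` formula interface (the exercise does not prescribe a library). Wrapping a variable in `C()` tells the formula parser to expand it into indicator variables. The data and the callback process below are made up for illustration and are NOT the real experiment.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data with the column names used in this exercise.
rng = np.random.default_rng(1)
n = 2000
toy = pd.DataFrame({
    "black": rng.integers(0, 2, size=n),
    "education": rng.integers(0, 5, size=n),  # five made-up levels
    "yearsexp": rng.integers(1, 20, size=n),
    "computerskills": rng.integers(0, 2, size=n),
})
# Made-up callback probabilities, for illustration only.
p_call = 0.10 - 0.03 * toy["black"] + 0.002 * toy["yearsexp"]
toy["call"] = (rng.random(n) < p_call).astype(int)

# C(education) expands education into dummies (one level is the omitted baseline).
# cov_type="HC1" requests heteroskedasticity-robust standard errors.
model = smf.ols(
    "call ~ black + C(education) + yearsexp + computerskills", data=toy
).fit(cov_type="HC1")
print(model.params)
```

In a randomized experiment, controls like these are not needed to remove bias; the main reason to add them is to try to reduce the standard error on the `black` coefficient.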

### Exercise 7

These effects are the *average* effects. Now let's look for heterogeneous treatment effects. 

Restrict the sample to candidates with high levels of education. Is there more or less racial discrimination among these highly educated candidates than in the full sample?

### Exercise 8

Now let's compare men and women -- is discrimination greater for Black men or Black women?
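
One simple way to look at this is to compute the callback gap (Black minus White) separately within each gender. The helper `callback_gap_by_group` is hypothetical, the sketch assumes `female`, `black`, and `call` are all coded 0/1, and the tiny data set is hand-built for illustration, NOT drawn from the real experiment.

```python
import pandas as pd


def callback_gap_by_group(df, group="female", treatment="black", outcome="call"):
    """Callback rate by treatment arm within each group, plus the treated-minus-control gap."""
    rates = df.groupby([group, treatment])[outcome].mean().unstack(treatment)
    rates["gap"] = rates[1] - rates[0]
    return rates


# Hand-built toy data: four resumes in each (female, black) cell.
toy = pd.DataFrame({
    "female": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    "black":  [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
    "call":   [1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
})
gaps = callback_gap_by_group(toy)
print(gaps)
```

Comparing the two gaps by eye does not tell you whether they differ significantly. One option for a formal test is a regression that includes `black`, `female`, and their interaction, where the interaction coefficient is the difference between the two gaps.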

### Exercise 9
    
What do you conclude from the results of the Bertrand and Mullainathan experiment? (1 point)