
## Welcome

This is material for the **Directed Acyclic Graphs** chapter in Scott Cunningham's book, [Causal Inference: The Mixtape](https://mixtape.scunning.com/).



### Packages needed

Before running anything, make sure the packages used in this notebook are installed. The install lines below are commented out; uncomment any that you still need:

In [14]:
#load_ext rpy2.ipython

In [17]:
#%%R
# install.packages("tidyverse")
# install.packages("cli")
# install.packages("haven")
# install.packages("estimatr")
# install.packages("stargazer")

### Load

In [18]:
#%%R

library(haven)
library(tidyverse)
library(estimatr)
library(stargazer)
library(cli)

# read_data function
read_data <- function(df) {
  full_path <- paste0("https://raw.github.com/scunning1975/mixtape/master/", df)
  return(haven::read_dta(full_path))
}

## Collider - Discrimination

In [19]:
#%%R

tb <- tibble(
  female = ifelse(runif(10000)>=0.5,1,0),
  ability = rnorm(10000),
  discrimination = female,
  occupation = 1 + 2*ability + 0*female - 2*discrimination + rnorm(10000),
  wage = 1 - 1*discrimination + 1*occupation + 2*ability + rnorm(10000) 
)
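To see what this simulation builds in, it can help to substitute the occupation equation into the wage equation. The symbols are introduced here for illustration and do not appear in the code: $D$ is discrimination, $A$ is ability, $O$ is occupation, $Y$ is wage, and $u$ and $\varepsilon$ are the two `rnorm` noise terms.

```latex
\begin{aligned}
O &= 1 + 2A - 2D + u \\
Y &= 1 - D + O + 2A + \varepsilon \\
  &= 1 - D + (1 + 2A - 2D + u) + 2A + \varepsilon \\
  &= 2 - 3D + 4A + u + \varepsilon
\end{aligned}
```

Read this way, the coefficient on $D$ is $-1$ when occupation and ability are held fixed, and $-3$ once the path through occupation ($-2 \times 1$) is added in.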

In [20]:
tb

| female | ability | discrimination | occupation | wage |
|---|---|---|---|---|
| 1 | -2.00669976 | 1 | -2.79326087 | -6.7632603 |
| 0 | -0.96568028 | 0 | -1.15079401 | -2.7494913 |
| 1 | 0.44773204 | 1 | 0.09439634 | 1.8530685 |
| 0 | -1.32795678 | 0 | -1.40321755 | -1.4905730 |
| 1 | -0.92988088 | 1 | -2.76094100 | -2.6900537 |
| 1 | 0.24441184 | 1 | 0.02443282 | 0.5952254 |
| 0 | 1.18735785 | 0 | 1.69827826 | 6.2884706 |
| 0 | -1.38682207 | 0 | -0.88872288 | -2.3038068 |
| 0 | -0.89171907 | 0 | -0.71691112 | -1.1142768 |
| 1 | 0.48589871 | 1 | -0.88934386 | -1.3370857 |


In [21]:
lm_1 <- lm(wage ~ female, tb)
lm_2 <- lm(wage ~ female + occupation, tb)
lm_3 <- lm(wage ~ female + occupation + ability, tb)

stargazer(lm_1,lm_2,lm_3, 
          type = "text", 
          column.labels = c("Biased Unconditional", "Biased", "Unbiased Conditional")
          )


                                                     Dependent variable:                                 
                    -------------------------------------------------------------------------------------
                                                            wage                                         
                       Biased Unconditional                Biased                Unbiased Conditional    
                                (1)                         (2)                          (3)             
---------------------------------------------------------------------------------------------------------
female                       -2.946***                    0.624***                    -0.946***          
                              (0.086)                     (0.029)                      (0.028)           
                                                                                                         
occupation                                   

#### QUESTIONS
- What is the true direct effect of discrimination on wages?  
- Explain the channels by which discrimination impacts wages.  
- What makes occupation a collider?
- What controls are necessary to eliminate this collider bias?
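One way to explore these questions is to rerun the simulation with a fixed seed and pull out only the coefficient on `female` from each regression. The sketch below uses base R only. The seed value and the `b_*` variable names are assumptions added here, so the numbers will not match the table above digit for digit. The coefficients written into the simulation code imply a direct effect of -1 and a total effect of -3, so those are the natural benchmarks for the estimates.

```r
# Sketch only: the seed is an arbitrary choice, not taken from the original notebook.
set.seed(1)

n <- 10000
female <- as.numeric(runif(n) >= 0.5)
ability <- rnorm(n)
discrimination <- female
occupation <- 1 + 2 * ability + 0 * female - 2 * discrimination + rnorm(n)
wage <- 1 - 1 * discrimination + 1 * occupation + 2 * ability + rnorm(n)

# Coefficient on female under three different sets of controls
b_unconditional <- unname(coef(lm(wage ~ female))["female"])
b_occupation    <- unname(coef(lm(wage ~ female + occupation))["female"])
b_full          <- unname(coef(lm(wage ~ female + occupation + ability))["female"])

print(round(c(unconditional  = b_unconditional,
              with_occupation = b_occupation,
              with_ability    = b_full), 3))
```

Controlling for occupation alone flips the sign of the estimate, because occupation is a collider between discrimination and ability. Adding ability as well brings the estimate back to the direct effect.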



## Movie Star

In [22]:
#%%R

set.seed(3444)

star_is_born <- tibble(
  beauty = rnorm(2500),
  talent = rnorm(2500),
  score = beauty + talent,
  c85 = quantile(score, .85),
  star = ifelse(score>=c85,1,0)
)

In [23]:
cli::cli_h1("Full Sample")
# Plot the data frame directly; piping a fitted lm() into ggplot() is unnecessary for a scatterplot
star_is_born %>% 
  ggplot(aes(x = talent, y = beauty)) +
  geom_point(size = 0.5, shape = 23) + xlim(-4, 4) + ylim(-4, 4)


-- Full Sample -----------------------------------------------------------------



In [None]:
#%%R

cli::cli_h1("Conditional on Being a Star")
# Keep only the stars, then plot the data frame directly (no lm() step is needed)
star_is_born %>% 
  filter(star == 1) %>% 
  ggplot(aes(x = talent, y = beauty)) +
  geom_point(size = 0.5, shape = 23) + xlim(-4, 4) + ylim(-4, 4)


In [None]:
#%%R

cli::cli_h1("Conditional on Not Being a Star")
# Keep only the non-stars, then plot the data frame directly (no lm() step is needed)
star_is_born %>% 
  filter(star == 0) %>% 
  ggplot(aes(x = talent, y = beauty)) +
  geom_point(size = 0.5, shape = 23) + xlim(-4, 4) + ylim(-4, 4)

#### QUESTIONS
- What is the correlation between talent and beauty among stars? Among non-stars?
- What is the correlation between talent and beauty in the full population?
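
The correlations these questions ask about can be computed directly rather than read off the scatterplots. The sketch below rebuilds the same simulation in base R so that it runs without the tidyverse. The `cor_*` variable names are assumptions added here; the seed, sample size and 85th-percentile cutoff follow the Movie Star cell above.

```r
# Sketch only: base-R version of the Movie Star simulation.
set.seed(3444)

n <- 2500
beauty <- rnorm(n)
talent <- rnorm(n)
score  <- beauty + talent
c85    <- quantile(score, 0.85)      # 85th-percentile cutoff for stardom
star   <- as.integer(score >= c85)

# Correlation in the full sample and within each subgroup
cor_full     <- cor(talent, beauty)
cor_star     <- cor(talent[star == 1], beauty[star == 1])
cor_non_star <- cor(talent[star == 0], beauty[star == 0])

print(round(c(full = cor_full, star = cor_star, non_star = cor_non_star), 3))
```

Because `beauty` and `talent` are drawn independently, the full-sample correlation should be close to zero. Selecting on `star`, which depends on their sum, induces a negative correlation within each subgroup.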