# Session 6: An overview of econometrics and machine learning

By *Andreas Bjerre-Nielsen*

*Note: Due to the sudden demand for remote learning, we are experimenting with different ways of creating a good
remote learning experience. Your feedback and suggestions are therefore encouraged; you can give feedback
by reaching out to a teacher. This notebook interweaves relevant questions with mini lectures. We will probably iterate on this format, so we ask for your patience when we mess up.*



#### Agenda

In the lecture we will cover the following points.

1. [Review of tree based methods for inference](#Part-1:-Review-of-tree-based-methods-for-inference)
1. [Linear machine learning for econometrics](#Part-2:-Linear-machine-learning-for-econometrics)
1. [Applications of machine learning in econometrics](#Part-3:-Applications-of-machine-learning-in-econometrics)




# Part 0: Welcome to teaching during the Corona pandemic

As I mentioned earlier, one lecture was planned to run as a flipped classroom. However, the changes came earlier than expected, and going forward **all** our lectures will take on a new format with two parts.
1. Notebook part: a "Do It Yourself" part
   - you will be guided through readings and video mini lectures, and you will try out new methods
1. Discussion part: ask me questions

The format is not final; you can hear me talk about it in the video below. To show the video, execute the following cell.

In [2]:
from IPython.display import YouTubeVideo
YouTubeVideo('1ohBB8gHQvg', width=640, height=360)

# Part 1: Review of tree based methods for inference

Before we dive into the material on Generalized Random Forests (GRF) and Causal Forests (CF), we review why we care about heterogeneity.

In [3]:
YouTubeVideo('h-FEqDhdlvM', width=640, height=360)

We now move on to reviewing the properties of GRF. Before starting, you may want to consult the questions below.

In [11]:
YouTubeVideo('INv9IcysfSo', width=640, height=360)

We finish with a short comparison of Causal Forests and Generalized Random Forests.

In [5]:
YouTubeVideo('PTLdzxj4cEo', width=640, height=360)

#### Exercises
Just a few questions to make sure you've watched the lectures and read the main parts of the texts on CF and GRF.

1. Does GRF use the trees for local matching or as local weights? What about CF?
1. What are some advantages of GRF over CF?
1. Describe in words the main steps for computing GRF (take a look at section 2 of [Athey, Tibshirani and Wager (2019)](https://doi.org/10.1214/18-aos1709)).
1. What are the assumptions of GRF for asymptotic analysis? Are some of them critical? (take a look at section 3 of [Athey, Tibshirani and Wager (2019)](https://doi.org/10.1214/18-aos1709))

Note: you may find it useful to watch [this video](https://www.youtube.com/watch?v=CPz0HdUM3dE) where Susan Athey explains tree based methods, from causal trees to generalized random forests.
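
To get a feel for how a forest can define a neighbourhood around a query point, here is a minimal sketch. It is not the GRF algorithm of the paper: it assumes scikit-learn's `RandomForestRegressor` fitted on the outcome as a stand-in for GRF's own splitting rule, mimics honesty with a simple sample split, and uses simulated data. The helper names `forest_weights` and `local_effect` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): randomized binary
# treatment w, with effect tau(x) = 1 if x0 > 0 and 0 otherwise.
n, p = 4000, 5
X = rng.normal(size=(n, p))
w = rng.integers(0, 2, size=n)
tau = (X[:, 0] > 0).astype(float)
y = tau * w + 0.5 * rng.normal(size=n)

# Sample split: grow the trees on one half, estimate on the other half.
half = n // 2
fit, est = slice(0, half), slice(half, n)

# Stand-in forest: a plain regression forest on y, NOT GRF's splitting rule.
forest = RandomForestRegressor(
    n_estimators=200, min_samples_leaf=50, random_state=0
).fit(X[fit], y[fit])
est_leaves = forest.apply(X[est])  # shape (n_est, n_trees): leaf id per tree


def forest_weights(x0):
    """Weight of each estimation point: average over trees of
    1{same leaf as x0} / (number of estimation points in that leaf)."""
    same = est_leaves == forest.apply(x0.reshape(1, -1))  # (n_est, n_trees)
    leaf_size = np.maximum(same.sum(axis=0), 1)  # guard against empty leaves
    a = (same / leaf_size).mean(axis=1)
    return a / a.sum()


def local_effect(x0):
    """Weighted regression slope of y on w, using the forest weights."""
    a = forest_weights(x0)
    w_c = w[est] - a @ w[est]
    y_c = y[est] - a @ y[est]
    return (a * w_c * y_c).sum() / (a * w_c ** 2).sum()


x_high = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # true effect is 1 here
x_low = np.array([-1.0, 0.0, 0.0, 0.0, 0.0])   # true effect is 0 here
alpha = forest_weights(x_high)
tau_high, tau_low = local_effect(x_high), local_effect(x_low)
print(alpha.sum(), tau_high, tau_low)
```

The weights are non-negative and sum to one, so the forest acts like an adaptive kernel: the treatment effect at a point is estimated by a weighted regression over the observations that share leaves with it.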

#### Alternatives to GRF (extra-curricular)
There are other frameworks, both established and new, for estimating heterogeneous treatment effects. For instance, BART is already quite established and often outperforms GRF.

- [Chipman, George and McCulloch (2010)](https://doi.org/10.1214/09-aoas285) develop Bayesian additive regression trees (BART)
  
- [Künzel et al. (2019)](https://doi.org/10.1073/pnas.1804597116) investigate a more general class of prediction tools for partitioning data using

  - Lower EMSE in many cases relative to causal forest and BART 
  
- [Nie and Wager (2017)](https://arxiv.org/pdf/1712.04912.pdf) investigate another class of methods called R-learners that leverage a smart representation of the conditional average treatment effect (CATE).
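
For reference, the R-learner representation is usually written as the objective below. This is a sketch in notation assumed here rather than taken from this notebook: $\hat{m}(x)$ estimates the conditional mean outcome $E[Y \mid X = x]$, $\hat{e}(x)$ estimates the propensity score $E[W \mid X = x]$ for treatment $W$, and $\Lambda_n$ is a regularizer on the complexity of $\tau$. See the paper for the precise statement, including the cross-fitting of $\hat{m}$ and $\hat{e}$.

$$
\hat{\tau}(\cdot) = \arg\min_{\tau} \left\{ \frac{1}{n} \sum_{i=1}^{n} \Big[ \big(Y_i - \hat{m}(X_i)\big) - \big(W_i - \hat{e}(X_i)\big)\,\tau(X_i) \Big]^2 + \Lambda_n(\tau) \right\}
$$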

# Part 2: Linear machine learning for econometrics

We move on to new econometric tools that leverage linear ML. We round off the mini lecture by pointing to new research that goes beyond linear ML while building on the same basic idea.

In the first video we outline the problem with using machine learning methods, in particular the LASSO, to estimate treatment effects, and show a simple fix.

In [9]:
YouTubeVideo('YsqLixauzMc', width=640, height=360)

In the next video we see a more general method than the Post-LASSO to remove bias from using machine learning.

In [7]:
YouTubeVideo('DCIPlCVIDGE', width=640, height=360)

#### Exercises
Below are some questions that can aid your reading and viewing of the lectures. The main reading is [Belloni et al. (2014)](https://doi.org/10.1257/jep.28.2.29), supplemented by [Chernozhukov et al. (2015)](https://doi.org/10.1257/aer.p20151022).

1. Account for the problem of estimating a treatment effect using OLS after running LASSO.
1. Explain how the Post-LASSO overcomes the issues.
1. Account for the steps in using double selection for the problem with many covariates. 
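
Once you have worked through the questions, the double selection steps can be tried out with a minimal sketch like the one below. It is an illustration on simulated data, not the procedure as implemented in the readings: the coordinate-descent helper `lasso_cd`, the hand-picked penalty `lam = 0.15`, and the data-generating process are all assumptions made for this example (the papers use data-driven penalty levels).

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """LASSO by coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1.
    Assumes X and y are centered, so no intercept is needed."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # equals y - X @ beta, since beta starts at zero
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


# Simulated data (assumption): X0 drives the treatment d strongly but
# enters the outcome equation only weakly; X1 affects the outcome only.
rng = np.random.default_rng(0)
n, p, alpha_true = 500, 50, 1.0
X = rng.normal(size=(n, p))
d = X[:, 0] + rng.normal(size=n)
y = alpha_true * d + 0.2 * X[:, 0] + X[:, 1] + rng.normal(size=n)

Xc, dc, yc = X - X.mean(axis=0), d - d.mean(), y - y.mean()
lam = 0.15  # fixed penalty, chosen by hand for this example

sel_y = np.flatnonzero(lasso_cd(Xc, yc, lam))  # step 1: controls predicting y
sel_d = np.flatnonzero(lasso_cd(Xc, dc, lam))  # step 2: controls predicting d
selected = np.union1d(sel_y, sel_d)

# Step 3: OLS of y on d and the union of the selected controls.
Z = np.column_stack([np.ones(n), d, X[:, selected]])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
alpha_hat = coef[1]
print(selected, alpha_hat)
```

A natural follow-up experiment is to rerun step 3 with only the controls from a single LASSO of `y` on `d` and `X`, and compare the resulting coefficient on `d` with `alpha_hat`.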

#### Beyond linear ML for econometrics (extra-curricular)

Note for the curious: you may find it useful to watch [this video](https://www.youtube.com/watch?v=eHOjmyoPCFU) where Victor Chernozhukov explains double debiased machine learning.


# Part 3: Applications of machine learning in econometrics

In [10]:
YouTubeVideo('83CE_jWCPTs', width=640, height=360)

#### Exercises
Below are some questions that can aid your reading and viewing of the lectures. The main reading is [Athey and Imbens (2019)](https://doi.org/10.1146/annurev-economics-080217-053433) as well as [Mullainathan and Spiess (2017)](https://doi.org/10.1257/jep.31.2.87).

1. Account for how machine learning can be used for econometric estimation.
1. Account for the use of machine learning to extract new data. Discuss the pitfalls of doing so.
1. Explain what distinguishes a prediction policy problem from a standard policy problem.