---
title: "Inference in Linear Models"
subtitle: "Hypothesis Testing and Confidence Intervals"
author: Vladislav Morozov  
format:
  revealjs:
    width: 1150
    slide-number: true
    sc-sb-title: true
    incremental: true   
    logo: ../../themes/favicon.ico
    footer: "A Deeper Look at Linear Regression: Inference"
    footer-logo-link: "https://vladislav-morozov.github.io/econometrics-2/"
    theme: ../../themes/slides_theme.scss
    toc: TRUE
    toc-depth: 2
    toc-title: Contents
    transition: convex
    transition-speed: fast
slide-level: 4
title-slide-attributes:
    data-background-color: "#045D5D"
    data-footer: " "
filters:
  - reveal-header  
include-in-header: ../../themes/mathjax.html 
highlight-style: tango
---




## Introduction {background="#00100F"}
  
### Lecture Info {background="#43464B" visibility="uncounted"}


#### Learning Outcomes

This lecture is about  

<br>

By the end, you should be able to

- Do

#### Textbook References
 

::: {.nonincremental}

 

- 8-2 in 
  
::: 



## Motivation {background="#00100F"}

### Motivating Empirical Example {background="#43464B" visibility="uncounted"}


#### Setting: Linear Causal Model

<br> 

We'll continue to work in the linear causal model with potential outcomes:
$$
Y_i^\bx = \bx'\bbeta + U_i
$$ {#eq-vector-distribution-potential}
 
#### Motivating Empirical Example: Variables

- $Y_i$ — hourly log wage
- $\bx$ — education and job experience in years
- $U_i$ — unobserved characteristics (skill, health, etc.), assumed to satisfy $\E[U_i|\bX_i]=0$
- Sample: some suitably homogeneous group (e.g., married white women)

#### Motivating Empirical Example: Potential Outcomes
 
$$
\begin{aligned}[]
& [\ln(\text{wage}_i)]^{\text{(education, experience)}} \\
&  =   \beta_1 + \beta_2 \times \text{education} \\
& \quad  + \beta_3 \times  \text{experience} + \beta_4 \times  \dfrac{\text{experience}^2}{100} + U_i
\end{aligned}
$$

. . . 
 
- Can write the model in terms of realized variables, but the form above emphasizes the causal assumption
- We divide experience$^2$ by 100 for numerical reasons
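
One plausible reading of "numerical reasons" (an assumption; the slides do not spell it out) is that rescaling keeps the regressor columns on comparable scales. The sketch below uses simulated data with hypothetical coefficient values. It checks that dividing the squared term by 100 multiplies its coefficient by 100 without changing the fit, and that it lowers the condition number of the design matrix.

```python
import numpy as np

# Simulated data: all numbers below are hypothetical, not CPS estimates
rng = np.random.default_rng(0)
n = 500
experience = rng.uniform(0, 40, size=n)
log_wage = (2.0 + 0.03 * experience - 0.0005 * experience**2
            + rng.normal(0, 0.3, size=n))

# Design matrices with the raw and the rescaled squared term
X_raw = np.column_stack([np.ones(n), experience, experience**2])
X_scaled = np.column_stack([np.ones(n), experience, experience**2 / 100])

beta_raw, *_ = np.linalg.lstsq(X_raw, log_wage, rcond=None)
beta_scaled, *_ = np.linalg.lstsq(X_scaled, log_wage, rcond=None)

# The coefficient on the squared term is rescaled by 100; fitted values coincide
coef_ratio = beta_scaled[2] / beta_raw[2]
same_fit = np.allclose(X_raw @ beta_raw, X_scaled @ beta_scaled)

# Condition numbers of the two design matrices
cond_raw = np.linalg.cond(X_raw)
cond_scaled = np.linalg.cond(X_scaled)

print(f"coefficient ratio: {coef_ratio:.4f}")
print(f"same fitted values: {same_fit}")
print(f"condition number, raw vs. rescaled: {cond_raw:.0f} vs. {cond_scaled:.0f}")
```

The rescaling changes only the units of $\beta_4$, so the model and its predictions are unaffected.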

#### Motivating Empirical Example: Parameters of Interest

 
<br>

Our parameters of interest: 

1. $100\beta_2$: (more or less) the average effect of an additional year of education, in percent
2. $100\beta_3 + 20 \beta_4$: the average effect of an additional year of experience for individuals with 10 years of experience, in percent
3. $-50\beta_3/\beta_4$: the experience level that maximizes expected log wage
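
The three expressions follow from differentiating the wage equation with respect to education and experience. A minimal sketch of the mapping from coefficients to parameters is given below; the function name `target_parameters` and the coefficient values are hypothetical and chosen only for illustration, not taken from any estimation.

```python
def target_parameters(beta):
    """Map (beta_1, beta_2, beta_3, beta_4) to the three parameters of interest."""
    _, b2, b3, b4 = beta
    # 1. Percent effect of an additional year of education
    education_effect = 100 * b2
    # 2. Percent effect of an additional year of experience at 10 years:
    #    the derivative of log wage in experience is b3 + 2 * b4 * experience / 100
    experience_effect_at_10 = 100 * (b3 + 2 * b4 * 10 / 100)
    # 3. Experience level where that derivative is zero (a maximum if b4 < 0)
    peak_experience = -50 * b3 / b4
    return education_effect, experience_effect_at_10, peak_experience


# Hypothetical coefficient values, for illustration only (not estimates)
beta_example = (1.0, 0.10, 0.04, -0.08)
education_effect, experience_effect_at_10, peak_experience = target_parameters(
    beta_example
)
print(education_effect, experience_effect_at_10, peak_experience)
```

The first two parameters are linear in the coefficients, while the third is a ratio and hence nonlinear.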

#### Motivating Empirical Example: Data {.scrollable}


- `cps09mar`: a selection from the March 2009 US Current Population Survey
- Can be obtained from the [website](https://users.ssc.wisc.edu/~bhansen/econometrics/) for @Hansen2022Econometrics
- Sample: married white women with present spouses

<br> 


```{python}
#| echo: true
#| code-fold: true
#| code-summary: "Expand for full data preparation code"
import numpy as np
import pandas as pd
import statsmodels.api as sm

from statsmodels.regression.linear_model import OLS

# Read in the data
data_path = ("https://github.com/pegeorge/Econ521_Datasets/"
             "raw/refs/heads/main/cps09mar.csv")
cps_data = pd.read_csv(data_path)

# Generate variables
cps_data["experience"] = cps_data["age"] - cps_data["education"] - 6
cps_data["experience_sq_div"] = cps_data["experience"]**2 / 100
cps_data["wage"] = cps_data["earnings"] / (cps_data["week"] * cps_data["hours"])
cps_data["log_wage"] = np.log(cps_data["wage"])

# Retain only married white women with present spouses
select_data = cps_data.loc[
    (cps_data["marital"] <= 2)
    & (cps_data["race"] == 1)
    & (cps_data["female"] == 1),
    :,
]

# Construct X and y for regression
exog = select_data.loc[:, ["education", "experience", "experience_sq_div"]]
exog = sm.add_constant(exog)
endog = select_data.loc[:, "log_wage"]
```

::: footer

:::

#### Motivating Empirical Example: Estimation Results 

```{.python code-line-numbers="0-1"}
results = OLS(endog, exog).fit(cov_type='HC0') # Robust covariance matrix estimator
print(results.summary())
```


```{python}
results = OLS(endog, exog).fit(cov_type='HC0')
print(results.summary())
```
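
For reference, `cov_type='HC0'` requests the heteroskedasticity-robust (White) sandwich covariance estimator. A minimal NumPy sketch of that formula on simulated data follows. The data-generating process and all variable names are hypothetical, and the code illustrates the formula rather than reproducing the statsmodels implementation.

```python
import numpy as np

# Simulated data with hypothetical coefficients (not the CPS sample)
rng = np.random.default_rng(1)
n = 1_000
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])
# Heteroskedastic errors: the error spread grows with x
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=n) * (0.5 + 0.2 * x)

# OLS coefficients and residuals
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# HC0 sandwich: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}
meat = (X * resid[:, None] ** 2).T @ X
cov_hc0 = XtX_inv @ meat @ XtX_inv
se_hc0 = np.sqrt(np.diag(cov_hc0))

# Classical covariance under homoskedasticity, for comparison
sigma2_hat = resid @ resid / (n - X.shape[1])
se_classical = np.sqrt(np.diag(sigma2_hat * XtX_inv))

print("HC0 standard errors:      ", se_hc0)
print("Classical standard errors:", se_classical)
```

In statsmodels, the corresponding quantities for the fitted model are available as `results.cov_params()` and `results.bse`.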

#### Empirical Questions


<br> 

1. How certain are we of our estimates of target parameters?
2. Does education matter at all? (up to our statistical confidence)
3. Is the wage-maximizing level of experience equal to 15 years? (up to our statistical confidence)
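
One way to phrase questions 2 and 3 as null hypotheses on the coefficients of the wage equation is sketched below. This is an illustrative formulation; a formal treatment may set the hypotheses up differently.

$$
\begin{aligned}
\text{Question 2:} \quad & H_0: \beta_2 = 0 \quad \text{vs.} \quad H_1: \beta_2 \neq 0 \\
\text{Question 3:} \quad & H_0: -50\dfrac{\beta_3}{\beta_4} = 15 \quad \text{vs.} \quad H_1: -50\dfrac{\beta_3}{\beta_4} \neq 15
\end{aligned}
$$

Question 1 corresponds to reporting standard errors or confidence intervals for the target parameters. The hypothesis in question 3 is nonlinear in the coefficients.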


### Translating to Theory {background="#43464B" visibility="uncounted"}

#### Goal: Inference
 


## Recap and Conclusions {background="#00100F"}
  
#### Recap

In this lecture we

1. Did
   
#### Next Questions

<br>

How 

#### References {.allowframebreaks visibility="uncounted"}

::: {#refs}
:::