Regression with Panel Data
===
---
* First-Differenced Estimator
* Reshape Data 
* Fixed Effects Regression
* Standard Errors for Fixed Effects Regression
---

In [2]:
library(tidyverse)
library(stargazer)

─ Attaching packages ──────────────────── tidyverse 1.2.1 ─
✔ ggplot2 2.2.1     ✔ purrr   0.2.5
✔ tibble  1.4.2     ✔ dplyr   0.7.6
✔ tidyr   0.8.1     ✔ stringr 1.4.0
✔ readr   1.1.1     ✔ forcats 0.3.0
“package ‘stringr’ was built under R version 3.5.2”
─ Conflicts ───────────────────── tidyverse_conflicts() ─
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Please cite as: 

 Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
 R package version 5.2.2. https://CRAN.R-project.org/package=stargazer 



In [24]:
traffic1 <- read.csv("/Users/tino/Desktop/TA-Econometrics-II/datasets/0507/traffic1.csv")

In [4]:
head(traffic1) # wide format

state,open90,open85,dthrte90,dthrte85
AL,0,0,2.6,2.9
AK,1,0,2.1,3.2
AZ,0,0,2.5,4.4
AR,0,0,2.9,3.4
CA,1,1,2.0,2.6
CO,0,0,1.9,2.4


---
## 1. First-Differenced Estimator
* use mutate() to create the change variables: copen (change in open) and cdthrte (change in dthrte) between 1985 and 1990
* regress cdthrte on copen (without intercept)

In [5]:
traffic1 <- traffic1 %>%
    mutate(copen = open90 - open85, cdthrte = dthrte90 - dthrte85)
# %>% is the pipe operator, which is quite useful for complicated data manipulation
# equivalent to: traffic1 <- mutate(traffic1, copen = open90 - open85, cdthrte = dthrte90 - dthrte85)

In [6]:
head(traffic1)

state,open90,open85,dthrte90,dthrte85,copen,cdthrte
AL,0,0,2.6,2.9,0,-0.3000002
AK,1,0,2.1,3.2,1,-1.1000001
AZ,0,0,2.5,4.4,0,-1.9000001
AR,0,0,2.9,3.4,0,-0.5
CA,1,1,2.0,2.6,0,-0.5999999
CO,0,0,1.9,2.4,0,-0.5000001


In [9]:
FD <- lm(cdthrte ~ copen - 1, data = traffic1)
# -1: without intercept

stargazer(FD, type = "text", title = "First-Differenced Estimator")
# use "stargazer" to present your results
# type = "text": print a plain-text table in RStudio instead of LaTeX code
# FD estimator: coefficient = -0.97


First-Differenced Estimator
                        Dependent variable:    
                    ---------------------------
                              cdthrte          
-----------------------------------------------
copen                        -0.967***         
                              (0.354)          
                                               
-----------------------------------------------
Observations                    51             
R2                             0.130           
Adjusted R2                    0.113           
Residual Std. Error       0.613 (df = 50)      
F Statistic            7.465*** (df = 1; 50)   
Note:               *p<0.1; **p<0.05; ***p<0.01
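
Why differencing removes the state effect can be sketched as follows. The model below is not written out in this notebook; it is an assumed unobserved-effects model with a time-constant state effect $a_i$ and no common time effect, which is what dropping the intercept in the regression above implies:

$$
\mathit{dthrte}_{it} = \beta\,\mathit{open}_{it} + a_i + u_{it}, \qquad t \in \{85, 90\}
$$

$$
\mathit{cdthrte}_i = \mathit{dthrte}_{i,90} - \mathit{dthrte}_{i,85}
= \beta\,(\mathit{open}_{i,90} - \mathit{open}_{i,85}) + (u_{i,90} - u_{i,85})
= \beta\,\mathit{copen}_i + \Delta u_i
$$

Under this assumption $a_i$ cancels in the difference, so regressing cdthrte on copen without an intercept estimates $\beta$.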


---
## 2. Reshape Data
* wide format --> long format
* long format: panel regression

In [25]:
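# reshape data from wide to long
traffic1 <- traffic1 %>%
  unite("85", open85, dthrte85) %>%   # glue the 1985 values into one column: "open_dthrte"
  unite("90", open90, dthrte90) %>%   # same for the 1990 values
  gather(key = "year", value = "open_dthrte", `85`, `90`) %>%   # stack the two year columns
  separate(open_dthrte, into = c("open", "dthrte"), sep = "_")  # split back into two variables
  
# year, open and dthrte are character after the reshape; convert them to numeric
traffic1$year <- as.numeric(traffic1$year)
traffic1$open <- as.numeric(traffic1$open)
traffic1$dthrte <- as.numeric(traffic1$dthrte)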

In [26]:
traffic1 # long format

state,year,open,dthrte
AL,85,0,2.9
AK,85,0,3.2
AZ,85,0,4.4
AR,85,0,3.4
CA,85,1,2.6
CO,85,0,2.4
CT,85,0,2.0
DE,85,0,2.2
DC,85,0,3.0
FL,85,0,3.4
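
The same wide-to-long reshape can probably be done in one step with `pivot_longer()`. This is only a sketch and assumes tidyr 1.0.0 or later, which is newer than the version loaded in this notebook. The data frame `wide` is a two-row excerpt copied from the `head(traffic1)` output above, and the names `wide` and `long` are made up for illustration.

```r
library(tidyr)

# Two rows copied from head(traffic1) above (wide format)
wide <- data.frame(
  state    = c("AL", "AK"),
  open90   = c(0, 1),
  open85   = c(0, 0),
  dthrte90 = c(2.6, 2.1),
  dthrte85 = c(2.9, 3.2)
)

# Split each column name into a variable part ("open", "dthrte") and a year part.
# ".value" tells pivot_longer() to use the first part as the output column name.
long <- pivot_longer(
  wide,
  cols          = -state,
  names_to      = c(".value", "year"),
  names_pattern = "([a-z]+)(\\d+)"
)

# The year is extracted from the column names as text, so convert it
long$year <- as.integer(long$year)

print(long)
```

Because the values never pass through a pasted string, `open` and `dthrte` stay numeric and need no conversion afterwards.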


---
## 3. Fixed Effects Regression
* Fixed Effects without binary variables

* Fixed Effects with binary variables

In [12]:
library(plm)

“package ‘plm’ was built under R version 3.5.2”
Loading required package: Formula

Attaching package: ‘plm’

The following objects are masked from ‘package:dplyr’:

    between, lag, lead



In [27]:
FE <- plm(dthrte ~ open, data = traffic1, index = c("state", "year"), model = "within")
stargazer(FE, type = "text", title = "Fixed Effects Regression")
# Fixed Effects Regression: coefficient = -0.97


Fixed Effects Regression
                 Dependent variable:    
             ---------------------------
                       dthrte           
----------------------------------------
open                  -0.967***         
                       (0.354)          
                                        
----------------------------------------
Observations             102            
R2                      0.130           
Adjusted R2            -0.758           
F Statistic     7.465*** (df = 1; 50)   
Note:        *p<0.1; **p<0.05; ***p<0.01
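
To see what the "within" model does, the sketch below demeans the variables by hand in base R. The data frame `toy` is a hypothetical four-unit, two-period panel with made-up values, not part of traffic1. With two periods, the within, dummy-variable and first-differenced slopes should all coincide.

```r
# Hypothetical toy panel (values made up for illustration), sorted by id and t
toy <- data.frame(
  id = rep(c("A", "B", "C", "D"), each = 2),
  t  = rep(c(1, 2), times = 4),
  x  = c(0, 1,  0, 0,  1, 1,  0, 1),
  y  = c(3.0, 2.1,  2.5, 2.4,  2.6, 2.0,  3.4, 2.9)
)

# Within transformation: subtract each unit's own mean (ave() uses mean by default)
toy$x_dm <- toy$x - ave(toy$x, toy$id)
toy$y_dm <- toy$y - ave(toy$y, toy$id)
b_within <- coef(lm(y_dm ~ x_dm - 1, data = toy))[["x_dm"]]

# Dummy-variable version: one binary variable per unit
b_lsdv <- coef(lm(y ~ x + factor(id), data = toy))[["x"]]

# First differences without intercept (rows are aligned because toy is sorted)
dx <- toy$x[toy$t == 2] - toy$x[toy$t == 1]
dy <- toy$y[toy$t == 2] - toy$y[toy$t == 1]
b_fd <- coef(lm(dy ~ dx - 1))[["dx"]]

print(c(within = b_within, lsdv = b_lsdv, fd = b_fd))
```

The hand-demeaned regression reproduces the slope only. Its default standard errors use the wrong degrees of freedom, because lm() does not know that the unit means were estimated.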


In [28]:
FE.dummy <- lm(dthrte ~ open + factor(state), data = traffic1)
stargazer(FE.dummy, type = "text", title = "Fixed Effects Regression (with Binary Variables)")
# Fixed Effects Regression (with Binary Variables): coefficient = -0.97


Fixed Effects Regression (with Binary Variables)
                        Dependent variable:    
                    ---------------------------
                              dthrte           
-----------------------------------------------
open                         -0.967***         
                              (0.354)          
                                               
factor(state)AL               -0.383           
                              (0.468)          
                                               
factor(state)AR                0.017           
                              (0.468)          
                                               
factor(state)AZ                0.317           
                              (0.468)          
                                               
factor(state)CA                0.133           
                              (0.468)          
                                               
factor(state)CO              -0.983** 

---
## 4. Standard Errors for Fixed Effects Regression
* heteroskedasticity-robust standard errors
* cluster-robust standard errors (clustered by state)

In [59]:
library(lmtest)
library(sandwich)
FE <- plm(dthrte ~ open, data = traffic1, index = c("state", "year"), model = "within")

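coeftest(FE, vcovHC(FE, type = 'HC1'))
# robust; note that vcovHC() for plm objects defaults to method = "arellano" with
# cluster = "group", so these standard errors are already clustered by state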


t test of coefficients:

     Estimate Std. Error t value  Pr(>|t|)    
open -0.96667    0.10940 -8.8358 8.721e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


In [58]:
coeftest(FE, vcovHC(FE, type = 'HC0', cluster = 'group'))
# clustered by state (group), without the HC1 small-sample correction


t test of coefficients:

     Estimate Std. Error t value  Pr(>|t|)    
open -0.96667    0.10887 -8.8794 7.489e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
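
The two standard errors reported above (0.10940 and 0.10887) are very close. A quick arithmetic check suggests they differ only by the HC1 small-sample factor sqrt(n / (n - k)). The check assumes n = 102 observations and k = 1 slope coefficient, as in the FE table, and the variable names are made up for illustration.

```r
# Numbers copied from the two coeftest() outputs above
se_hc1 <- 0.10940   # type = 'HC1'
se_hc0 <- 0.10887   # type = 'HC0', cluster = 'group'

n <- 102  # observations in the FE regression
k <- 1    # estimated slope coefficients (assumed to be the k used in the correction)

# HC1 rescales the HC0 variance by n / (n - k), so the s.e. scales by its square root
se_hc1_implied <- se_hc0 * sqrt(n / (n - k))

print(round(se_hc1_implied, 4))
print(abs(se_hc1_implied - se_hc1) < 1e-4)
```

If this reading is right, both calls cluster by state and differ only in the degrees-of-freedom correction. The small remaining gap would then come from rounding in the printed output.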
