## Pooled Cross Sections and Panel Data Regression

### Pooling Independent Cross Sections across Time

Example 13.2: Changes in the Return to Education and the Gender Wage Gap

In [3]:
library(foreign)
cps <- read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/cps78_85.dta")

# Detailed OLS results including interaction terms
summary( lm(lwage ~ y85*(educ+female) + exper + I(exper^2) + union,
            data=cps) )


Call:
lm(formula = lwage ~ y85 * (educ + female) + exper + I(exper^2) + 
    union, data = cps)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.56098 -0.25828  0.00864  0.26571  2.11669 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.589e-01  9.345e-02   4.911 1.05e-06 ***
y85          1.178e-01  1.238e-01   0.952   0.3415    
educ         7.472e-02  6.676e-03  11.192  < 2e-16 ***
female      -3.167e-01  3.662e-02  -8.648  < 2e-16 ***
exper        2.958e-02  3.567e-03   8.293 3.27e-16 ***
I(exper^2)  -3.994e-04  7.754e-05  -5.151 3.08e-07 ***
union        2.021e-01  3.029e-02   6.672 4.03e-11 ***
y85:educ     1.846e-02  9.354e-03   1.974   0.0487 *  
y85:female   8.505e-02  5.131e-02   1.658   0.0977 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4127 on 1075 degrees of freedom
Multiple R-squared:  0.4262,	Adjusted R-squared:  0.4219 
F-statistic:  99.8 on 8 and 1075 DF,  p-value: < 
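
As a worked reading of the output above: because `y85` is interacted with `educ` and `female`, the main effects describe 1978 and the interaction terms measure the change by 1985. Using the printed estimates (rounded to four decimals), the implied 1985 values are approximately:

$$\hat{\beta}_{educ,1985} = 0.0747 + 0.0185 = 0.0932$$

$$\hat{\beta}_{female,1985} = -0.3167 + 0.0851 = -0.2316$$

So the estimated return to a year of education rises from about 7.5% to about 9.3%, and the estimated gender gap in log wages narrows from about $-0.317$ to about $-0.232$. In the table, the `y85:educ` change is significant at the 5% level ($p = 0.0487$), while the `y85:female` change is significant only at the 10% level ($p = 0.0977$).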

### Policy Analysis with Pooled Cross Sections

$$y = \beta_{0} + \delta_{0}d2 + \beta_{1}dT + \delta_{1}d2 \cdot dT + \text{other factors}$$
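
To make the role of $\delta_{1}$ concrete, here is a minimal sketch on simulated data (the data, sample size, and coefficient values are hypothetical and are not taken from any data set used in this section). It reads `d2` as the second-period dummy and `dT` as the treatment-group dummy, and checks that, without further controls, the interaction coefficient equals the difference-in-differences of the four cell means.

```r
# Simulated data: assumption for illustration only
set.seed(1)
n  <- 400
d2 <- rep(c(0, 1), each  = n / 2)   # 1 = second period
dT <- rep(c(0, 1), times = n / 2)   # 1 = treatment group
y  <- 10 + 2 * d2 - 3 * dT - 1.5 * d2 * dT + rnorm(n)

# Regression with the interaction term d2:dT
fit        <- lm(y ~ d2 * dT)
delta1_reg <- unname(coef(fit)["d2:dT"])

# Same quantity from the four group means (rows: d2, columns: dT)
m <- tapply(y, list(d2, dT), mean)
delta1_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

# The two numbers agree up to floating-point error
print(c(regression = delta1_reg, cell_means = delta1_means))
```

Once other regressors are added, the interaction coefficient no longer reduces to raw cell means, but it keeps the same interpretation conditional on those controls.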

Example 13.3: Effect of a Garbage Incinerator's Location on Housing Prices

In [5]:
library(foreign)
kielmc <- read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/kielmc.dta")

In [8]:
# Separate regressions for 1978 and 1981: report coefficients only
coef( lm(rprice~nearinc, data=kielmc, subset=(year==1978)) )
coef( lm(rprice~nearinc, data=kielmc, subset=(year==1981)) )

# Joint regression including an interaction term 
library(lmtest)
coeftest( lm(rprice~nearinc*y81, data=kielmc) )


t test of coefficients:

            Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  82517.2     2726.9 30.2603 < 2.2e-16 ***
nearinc     -18824.4     4875.3 -3.8612 0.0001368 ***
y81          18790.3     4050.1  4.6395 5.117e-06 ***
nearinc:y81 -11863.9     7456.6 -1.5911 0.1125948    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
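
The joint regression above can be unpacked with simple arithmetic on the printed coefficients. Setting `y81` to 0 or 1 gives the implied `nearinc` slope in each year:

$$\hat{\beta}_{nearinc,1978} = -18824.4$$

$$\hat{\beta}_{nearinc,1981} = -18824.4 + (-11863.9) = -30688.3$$

The difference between the two, $-11863.9$, is the difference-in-differences estimate reported on the `nearinc:y81` line. With a $t$ statistic of $-1.59$ ($p \approx 0.113$), it is not statistically significant at the 10% level in this specification without controls.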


In [10]:
DiD      <- lm(log(rprice)~nearinc*y81                     , data=kielmc)
DiDcontr <- lm(log(rprice)~nearinc*y81+age+I(age^2)+log(intst)+
                            log(land)+log(area)+rooms+baths, data=kielmc)
library(stargazer)
stargazer(DiD,DiDcontr,type="text")


                                  Dependent variable:               
                    ------------------------------------------------
                                      log(rprice)                   
                              (1)                     (2)           
--------------------------------------------------------------------
nearinc                    -0.340***                 0.032          
                            (0.055)                 (0.047)         
                                                                    
y81                        0.193***                 0.162***        
                            (0.045)                 (0.028)         
                                                                    
age                                                -0.008***        
                                                    (0.001)         
                                                                    
I(age2)                          

### Panel Data Regression

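Consider the panel data model
$$y_{it} = \beta_{1}x_{it} + \alpha_{i} + \mu_{it}$$

where $\alpha_{i}$ is an unobserved, time-constant individual effect and $\mu_{it}$ is the idiosyncratic error. Three estimators are commonly used:

- First-difference estimator
- Fixed effects estimator
- Random effects estimator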
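
As a brief sketch of how the three estimators listed above treat the individual effect $\alpha_{i}$ (the bar notation for individual time averages, the composite error $v_{it} = \alpha_{i} + \mu_{it}$, and the variance symbols $\sigma_{\mu}^{2}$ and $\sigma_{\alpha}^{2}$ are introduced here and are not part of the original text):

First differencing subtracts the previous period, so $\alpha_{i}$ cancels:

$$\Delta y_{it} = \beta_{1}\Delta x_{it} + \Delta\mu_{it}$$

The fixed effects (within) transformation subtracts individual time averages, which also removes $\alpha_{i}$:

$$y_{it} - \bar{y}_{i} = \beta_{1}(x_{it} - \bar{x}_{i}) + (\mu_{it} - \bar{\mu}_{i})$$

Random effects subtracts only a fraction $\theta$ of the time averages and relies on $\alpha_{i}$ being uncorrelated with $x_{it}$:

$$y_{it} - \theta\bar{y}_{i} = \beta_{1}(x_{it} - \theta\bar{x}_{i}) + (v_{it} - \theta\bar{v}_{i}), \qquad \theta = 1 - \sqrt{\frac{\sigma_{\mu}^{2}}{\sigma_{\mu}^{2} + T\sigma_{\alpha}^{2}}}$$

With $\theta = 0$ this reduces to pooled OLS, and as $\theta \to 1$ it approaches the fixed effects estimator.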

Example 14.4: A Wage Equation Using Panel Data

In [13]:
library(foreign);library(plm);library(stargazer)
wagepan<-read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/wagepan.dta")

In [14]:
# Generate pdata.frame:
wagepan.p <- pdata.frame(wagepan, index=c("nr","year") )

pdim(wagepan.p)

# Check variation of variables within individuals
pvar(wagepan.p)

Balanced Panel: n = 545, T = 8, N = 4360

no time variation:       nr black hisp educ 
no individual variation: year d81 d82 d83 d84 d85 d86 d87 
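
The `pvar` result matters for fixed effects estimation: variables with no time variation are wiped out by the within transformation. The following base-R sketch illustrates this on a small simulated panel (all variable names, sizes, and coefficients are hypothetical and unrelated to `wagepan`). It demeans by individual with `ave()` and compares the resulting slope with a dummy-variable regression.

```r
# Simulated balanced panel: assumption for illustration only
set.seed(42)
n_id <- 50
n_t  <- 4
id    <- rep(seq_len(n_id), each = n_t)
alpha <- rep(rnorm(n_id), each = n_t)                  # unobserved individual effect
educ  <- rep(as.numeric(sample(8:16, n_id, replace = TRUE)),
             each = n_t)                               # constant over time
x     <- 0.5 * alpha + rnorm(n_id * n_t)               # varies over time
y     <- 1.0 * x + 0.1 * educ + alpha + rnorm(n_id * n_t)

# Within transformation: subtract each individual's time average
demean <- function(v, g) v - ave(v, g)

# A time-constant regressor is identically zero after demeaning
educ_within <- demean(educ, id)
print(range(educ_within))

# Slope from the demeaned regression vs. the dummy-variable (LSDV) regression
b_within <- unname(coef(lm(demean(y, id) ~ 0 + demean(x, id)))[1])
b_lsdv   <- unname(coef(lm(y ~ x + factor(id)))["x"])
print(c(within = b_within, lsdv = b_lsdv))
```

The two slopes coincide (up to floating-point error) because demeaning by individual is algebraically equivalent to including one dummy per individual. The standard errors from the demeaned `lm` are not corrected for the lost degrees of freedom, which is one reason to use a dedicated routine such as `plm` in practice.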

In [15]:
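# Estimate pooled OLS, random effects (RE), and fixed effects (FE) models.
# The FE formula omits educ, black, and hisp (no time variation, see pvar above)
# and exper (it rises by one each year, so it is collinear with the year dummies).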
wagepan.p$yr<-factor(wagepan.p$year)

reg.ols<- (plm(lwage~educ+black+hisp+exper+I(exper^2)+married+union+yr, 
                                      data=wagepan.p, model="pooling") )
reg.re <- (plm(lwage~educ+black+hisp+exper+I(exper^2)+married+union+yr, 
                                      data=wagepan.p, model="random") )
reg.fe <- (plm(lwage~                      I(exper^2)+married+union+yr, 
                                      data=wagepan.p, model="within") )

# Pretty table of selected results (not reporting year dummies)
stargazer(reg.ols,reg.re,reg.fe, type="text", 
          column.labels=c("OLS","RE","FE"),keep.stat=c("n","rsq"),
          keep=c("ed","bl","hi","exp","mar","un"))


                  Dependent variable:     
             -----------------------------
                         lwage            
                OLS       RE        FE    
                (1)       (2)       (3)   
------------------------------------------
educ         0.091***  0.092***           
              (0.005)   (0.011)           
                                          
black        -0.139*** -0.139***          
              (0.024)   (0.048)           
                                          
hisp           0.016     0.022            
              (0.021)   (0.043)           
                                          
exper        0.067***  0.106***           
              (0.014)   (0.015)           
                                          
I(exper2)    -0.002*** -0.005*** -0.005***
              (0.001)   (0.001)   (0.001) 
                                          
married      0.108***  0.064***   0.047** 
              (0.016)   (0.017)   (0.018) 
          