In [1]:
library(data.table)
library(lmtest)
library(broom)

Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric



In [17]:
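n <- 100
# Simulate a numeric covariate x (integers 18 to 39), a 3-level factor edu, and a binary treatment
d <- data.table(x = sample(18:39, n, replace = TRUE),
                edu = sample(1:3, n, replace = TRUE),
                treat = sample(0:1, n, replace = TRUE))
d$edu <- as.factor(d$edu)
# Outcome: 1000 per unit of x, a base treatment effect of 5000, and noise with sd = 100
d <- d[, y := x * 1000 + treat * 5000 + rnorm(n, mean = 0, sd = 100)]
# Additional treatment effect that depends on the edu level
d <- d[, y := y + (edu == 1) * 50 * treat]
d <- d[, y := y + (edu == 2) * 250 * treat]
d <- d[, y := y + (edu == 3) * 1000 * treat]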
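
Written out, the simulation above draws y from the model below. The indicator notation and the symbol for the noise term are introduced here for illustration and do not appear in the notebook.

```latex
y = 1000\,x
  + \bigl(5000 + 50\,\mathbb{1}[\mathrm{edu}=1] + 250\,\mathbb{1}[\mathrm{edu}=2] + 1000\,\mathbb{1}[\mathrm{edu}=3]\bigr)\,\mathrm{treat}
  + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0,\ 100^2)
```

The true treatment effects are therefore 5050, 5250, and 6000 for edu = 1, 2, 3. In a fit of `y ~ x + treat + treat:edu` with edu = 1 as the reference level, the `treat` coefficient targets 5050, `treat:edu2` targets 250 - 50 = 200, and `treat:edu3` targets 1000 - 50 = 950.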

In [26]:
m <- lm(y ~ x + treat + treat:edu, data = d)
coeftest(m)


t test of coefficients:

             Estimate Std. Error  t value  Pr(>|t|)    
(Intercept) -115.5171    41.9394  -2.7544  0.007047 ** 
x           1003.6679     1.4022 715.7829 < 2.2e-16 ***
treat       5071.6736    30.0748 168.6352 < 2.2e-16 ***
treat:edu2   184.7470    34.1149   5.4154 4.593e-07 ***
treat:edu3   910.0683    35.1973  25.8562 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


In [25]:
coefs <- data.table(tidy(coeftest(m)))


In [20]:
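# Effect of 'treat' when edu == 1 (the reference level of edu)
# Should be 5000 + 50
# Look the coefficient up by name rather than by row position
baseline <- coefs[term == "treat", estimate]
baseline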

In [21]:
# Effect of 'treat' when edu == 2
# Should be 5000 + 250

# The treat:edu2 coefficient is the additional effect of treat when edu is 2
# rather than the reference level 1, so the total effect is treat + treat:edu2.
coefs[term == "treat:edu2", estimate] + baseline

In [22]:
# Effect of 'treat' when edu == 3
# Should be 5000 + 1000
coefs[term == "treat:edu3", estimate] + baseline
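
The three cells above can be collapsed into one lookup by coefficient name. The following is a minimal base-R sketch, not the notebook's own code: it re-simulates data with the same structure, uses a `data.frame` instead of a `data.table`, and adds a seed (an assumption made only so the sketch is reproducible). The helper vector `extra` and the result vector `effects` are names introduced for illustration.

```r
# Self-contained sketch: simulate data like the notebook's, then read off
# the treatment effect at each edu level by coefficient name.
set.seed(1)  # assumption: seed added only so this sketch is reproducible
n <- 100
d <- data.frame(
  x     = sample(18:39, n, replace = TRUE),
  edu   = factor(sample(1:3, n, replace = TRUE)),
  treat = sample(0:1, n, replace = TRUE)
)
# edu-specific add-on to the treatment effect (50, 250, 1000 for edu 1, 2, 3)
extra <- c(50, 250, 1000)[as.integer(d$edu)]
d$y <- d$x * 1000 + d$treat * (5000 + extra) + rnorm(n, mean = 0, sd = 100)

m <- lm(y ~ x + treat + treat:edu, data = d)
b <- coef(m)

# edu == 1 is the reference level, so its effect is the treat coefficient alone;
# the other levels add their interaction coefficient on top of it.
effects <- c(
  edu1 = unname(b["treat"]),
  edu2 = unname(b["treat"] + b["treat:edu2"]),
  edu3 = unname(b["treat"] + b["treat:edu3"])
)
effects
```

Indexing by name keeps the calculation correct if the formula changes and the coefficients move to different rows.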

### Are the results close to those from fitting lm on each subset of the data?

In [24]:
coeftest(lm(y ~ x + treat, data = d[edu=='1']))
coeftest(lm(y ~ x + treat, data = d[edu=='2']))
coeftest(lm(y ~ x + treat, data = d[edu=='3']))


t test of coefficients:

             Estimate Std. Error  t value Pr(>|t|)    
(Intercept) -167.9726    68.0266  -2.4692  0.02104 *  
x           1005.1414     2.1693 463.3445  < 2e-16 ***
treat       5083.5414    30.8110 164.9912  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1



t test of coefficients:

             Estimate Std. Error  t value Pr(>|t|)    
(Intercept)  -95.8199    72.5525  -1.3207   0.1954    
x           1003.5371     2.3297 430.7546   <2e-16 ***
treat       5240.6089    27.1466 193.0483   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1



t test of coefficients:

             Estimate Std. Error  t value Pr(>|t|)    
(Intercept)  -82.7017    78.4047  -1.0548   0.2992    
x           1002.1270     2.8581 350.6241   <2e-16 ***
treat       5995.1538    38.1381 157.1959   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
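
The subset estimates are close to, but not identical to, the combined effects from the pooled model, because the pooled model forces all edu levels to share one intercept and one slope for x. As a hedged illustration, the base-R sketch below re-simulates similar data (with a seed added as an assumption) and fits a fully interacted model, `y ~ edu * (x + treat)`, which is not used in the notebook. Giving each edu level its own intercept, x slope, and treat effect should reproduce the subset point estimates; standard errors can still differ because the interacted model pools the residual variance.

```r
# Self-contained sketch: a fully interacted model versus per-subset fits.
set.seed(1)  # assumption: seed added only so this sketch is reproducible
n <- 100
d <- data.frame(
  x     = sample(18:39, n, replace = TRUE),
  edu   = factor(sample(1:3, n, replace = TRUE)),
  treat = sample(0:1, n, replace = TRUE)
)
extra <- c(50, 250, 1000)[as.integer(d$edu)]
d$y <- d$x * 1000 + d$treat * (5000 + extra) + rnorm(n, mean = 0, sd = 100)

# Each edu level gets its own intercept, x slope, and treat effect
full <- lm(y ~ edu * (x + treat), data = d)
b <- coef(full)
from_full <- c(
  edu1 = unname(b["treat"]),
  edu2 = unname(b["treat"] + b["edu2:treat"]),
  edu3 = unname(b["treat"] + b["edu3:treat"])
)

# Separate regressions on each edu subset, as in the cell above
from_subsets <- sapply(levels(d$edu), function(l) {
  unname(coef(lm(y ~ x + treat, data = d[d$edu == l, ]))["treat"])
})
names(from_subsets) <- paste0("edu", levels(d$edu))

rbind(from_full, from_subsets)
```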


## Use the multcomp package to calculate the combined effect and SE of treat at a specific edu level

In [27]:
library(multcomp)

# See https://rpubs.com/djcava/lincom


Loading required package: mvtnorm
Loading required package: survival
Loading required package: TH.data
Loading required package: MASS

Attaching package: ‘TH.data’

The following object is masked from ‘package:MASS’:

    geyser



In [28]:
names(coef(m))  # Coefficient names, needed to write the linear hypothesis by name


In [30]:
m.lh <- glht(m, linfct = c("treat + treat:edu2 = 0"))
summary(m.lh)



	 Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = y ~ x + treat + treat:edu, data = d)

Linear Hypotheses:
                        Estimate Std. Error t value Pr(>|t|)    
treat + treat:edu2 == 0  5256.42      23.87   220.2   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)


In [31]:
confint(m.lh)


	 Simultaneous Confidence Intervals

Fit: lm(formula = y ~ x + treat + treat:edu, data = d)

Quantile = 1.9853
95% family-wise confidence level
 

Linear Hypotheses:
                        Estimate  lwr       upr      
treat + treat:edu2 == 0 5256.4205 5209.0244 5303.8167
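
The estimate, standard error, and interval for a linear combination can also be computed by hand from the coefficient covariance matrix, using SE = sqrt(k' V k) for a contrast vector k. The sketch below is a base-R illustration under stated assumptions: it re-simulates similar data with an added seed, so its numbers will not match the output above, and the names `k`, `est`, `se`, `crit`, and `ci` are introduced here. With a single hypothesis, the simultaneous interval should coincide with the ordinary t interval, which is consistent with the reported quantile of 1.9853 for 95 residual degrees of freedom.

```r
# Self-contained sketch: combined effect treat + treat:edu2 computed by hand.
set.seed(1)  # assumption: seed added only so this sketch is reproducible
n <- 100
d <- data.frame(
  x     = sample(18:39, n, replace = TRUE),
  edu   = factor(sample(1:3, n, replace = TRUE)),
  treat = sample(0:1, n, replace = TRUE)
)
extra <- c(50, 250, 1000)[as.integer(d$edu)]
d$y <- d$x * 1000 + d$treat * (5000 + extra) + rnorm(n, mean = 0, sd = 100)

m <- lm(y ~ x + treat + treat:edu, data = d)

# Contrast vector: 1 for the coefficients in the combination, 0 elsewhere
k <- setNames(numeric(length(coef(m))), names(coef(m)))
k[c("treat", "treat:edu2")] <- 1

est  <- sum(k * coef(m))                    # point estimate of the combination
se   <- sqrt(drop(t(k) %*% vcov(m) %*% k))  # sqrt(k' V k)
crit <- qt(0.975, df.residual(m))           # two-sided 95% t quantile
ci   <- c(lwr = est - crit * se, upr = est + crit * se)

round(c(estimate = est, se = se, quantile = crit, ci), 4)
```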
