Explain what the Double Lasso approach is in a markdown cell. Use equations for a better explanation.

## Double Lasso - Testing the Convergence Hypothesis

### Introduction

We provide an additional empirical example of partialling-out with Lasso to estimate the regression coefficient $\beta_1$ in the high-dimensional linear regression model:
  $$
  Y = \beta_1 D +  \beta_2'W + \epsilon.
  $$
  
Specifically, we are interested in how the rates at which the economies of different countries grow ($Y$) are related to the initial wealth levels in each country ($D$), controlling for each country's institutional, educational, and other similar characteristics ($W$).
  
The relationship is captured by $\beta_1$, the *speed of convergence/divergence*, which measures the speed at which poor countries catch up with $(\beta_1< 0)$ or fall behind $(\beta_1> 0)$ rich countries, after controlling for $W$. Our inference question is: do poor countries grow faster than rich countries, controlling for educational and other characteristics? In other words, is the speed of convergence negative, $\beta_1 < 0$? This is the Convergence Hypothesis predicted by the Solow Growth Model, which is a structural economic model. Under some strong assumptions, which we do not state here, the predictive exercise we carry out below can be given a causal interpretation.

The outcome $Y$ is the realized annual growth rate of a country's wealth  (Gross Domestic Product per capita). The target regressor ($D$) is the initial level of the country's wealth. The target parameter $\beta_1$ is the speed of convergence, which measures the speed at which poor countries catch up with rich countries. The controls ($W$) include measures of education levels, quality of institutions, trade openness, and political stability in the country.
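
The partialling-out (Double Lasso) estimator of $\beta_1$ can be sketched in three steps. The symbols $\gamma_{YW}$, $\gamma_{DW}$, $\tilde Y$, $\tilde D$, $\lambda_Y$, and $\lambda_D$ are notation introduced here for illustration, and the penalty is written in its plainest form (software such as `hdm` additionally uses data-driven penalty loadings).

First, fit two Lasso regressions on the controls, one for the outcome and one for the target regressor:
  $$
  \hat\gamma_{YW} = \arg\min_{\gamma} \frac{1}{n}\sum_{i=1}^{n} (Y_i - \gamma'W_i)^2 + \lambda_Y \|\gamma\|_1,
  \qquad
  \hat\gamma_{DW} = \arg\min_{\gamma} \frac{1}{n}\sum_{i=1}^{n} (D_i - \gamma'W_i)^2 + \lambda_D \|\gamma\|_1.
  $$

Second, form the residuals, which are the parts of $Y$ and $D$ not linearly predicted by $W$:
  $$
  \tilde Y_i = Y_i - \hat\gamma_{YW}'W_i, \qquad \tilde D_i = D_i - \hat\gamma_{DW}'W_i.
  $$

Third, regress the residualized outcome on the residualized target regressor by least squares:
  $$
  \hat\beta_1 = \Big(\sum_{i=1}^{n} \tilde D_i^2\Big)^{-1} \sum_{i=1}^{n} \tilde D_i \tilde Y_i.
  $$

The last step targets $\beta_1$ because, writing $D = \gamma_{DW}'W + \tilde D$ with $\tilde D$ uncorrelated with $W$, the model above implies
  $$
  Y - \gamma_{YW}'W = \beta_1 \,(D - \gamma_{DW}'W) + \epsilon, \qquad \gamma_{YW} = \beta_1 \gamma_{DW} + \beta_2.
  $$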

### Data analysis

We consider the data set `GrowthData`, which is included in the package `hdm`. First, let us load the data set to get familiar with the data.

In [1]:
library(hdm)
library(xtable)

In [2]:
# Export data to read in python
GrowthData <- GrowthData
save(GrowthData, file = "C:/Users/Alvaro/Documents/ML/data/GrowthData.RData")

In [3]:
head(GrowthData)

| | Outcome | intercept | gdpsh465 | bmp1l | freeop | freetar | h65 | hm65 | hf65 | p65 | ... | seccf65 | syr65 | syrm65 | syrf65 | teapri65 | teasec65 | ex1 | im1 | xr65 | tot1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | ... | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | -0.02433575 | 1 | 6.591674 | 0.2837 | 0.153491 | 0.043888 | 0.007 | 0.013 | 0.001 | 0.29 | ... | 0.04 | 0.033 | 0.057 | 0.01 | 47.6 | 17.3 | 0.0729 | 0.0667 | 0.348 | -0.014727 |
| 2 | 0.10047257 | 1 | 6.829794 | 0.6141 | 0.313509 | 0.061827 | 0.019 | 0.032 | 0.007 | 0.91 | ... | 0.64 | 0.173 | 0.274 | 0.067 | 57.1 | 18.0 | 0.094 | 0.1438 | 0.525 | 0.00575 |
| 3 | 0.06705148 | 1 | 8.895082 | 0.0 | 0.204244 | 0.009186 | 0.26 | 0.325 | 0.201 | 1.0 | ... | 18.14 | 2.573 | 2.478 | 2.667 | 26.5 | 20.7 | 0.1741 | 0.175 | 1.082 | -0.01004 |
| 4 | 0.06408917 | 1 | 7.565275 | 0.1997 | 0.248714 | 0.03627 | 0.061 | 0.07 | 0.051 | 1.0 | ... | 2.63 | 0.438 | 0.453 | 0.424 | 27.8 | 22.7 | 0.1265 | 0.1496 | 6.625 | -0.002195 |
| 5 | 0.02792955 | 1 | 7.162397 | 0.174 | 0.299252 | 0.037367 | 0.017 | 0.027 | 0.007 | 0.82 | ... | 2.11 | 0.257 | 0.287 | 0.229 | 34.5 | 17.6 | 0.1211 | 0.1308 | 2.5 | 0.003283 |
| 6 | 0.04640744 | 1 | 7.21891 | 0.0 | 0.258865 | 0.02088 | 0.023 | 0.038 | 0.006 | 0.5 | ... | 1.46 | 0.16 | 0.174 | 0.146 | 34.3 | 8.1 | 0.0634 | 0.0762 | 1.0 | -0.001747 |


In [4]:
growth <- GrowthData
attach(growth)
names(growth)

In [5]:
dim(growth)

### Methods

The sample contains $90$ countries and about $60$ controls (the data frame has $63$ columns once the outcome, the intercept column, and the target regressor are counted). Thus $p \approx 60$, $n=90$, and $p/n$ is not small. We expect least squares to provide a poor, noisy estimate of $\beta_1$, and we expect the method based on partialling-out with Lasso to provide a high-quality estimate of $\beta_1$.

To check this, we first analyze the relation between the outcome $Y$ and the countries' characteristics by running a linear regression of $Y$ on $D$ and all controls.

## 1. OLS

In [6]:
reg.ols <- lm(Outcome~.-1,data=growth)
summary(reg.ols)


Call:
lm(formula = Outcome ~ . - 1, data = growth)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.040338 -0.011298 -0.000863  0.011813  0.043247 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
intercept  2.472e-01  7.845e-01   0.315  0.75506   
gdpsh465  -9.378e-03  2.989e-02  -0.314  0.75602   
bmp1l     -6.886e-02  3.253e-02  -2.117  0.04329 * 
freeop     8.007e-02  2.079e-01   0.385  0.70300   
freetar   -4.890e-01  4.182e-01  -1.169  0.25214   
h65       -2.362e+00  8.573e-01  -2.755  0.01019 * 
hm65       7.071e-01  5.231e-01   1.352  0.18729   
hf65       1.693e+00  5.032e-01   3.365  0.00223 **
p65        2.655e-01  1.643e-01   1.616  0.11727   
pm65       1.370e-01  1.512e-01   0.906  0.37284   
pf65      -3.313e-01  1.651e-01  -2.006  0.05458 . 
s65        3.908e-02  1.855e-01   0.211  0.83469   
sm65      -3.067e-02  1.168e-01  -0.263  0.79479   
sf65      -1.799e-01  1.181e-01  -1.523  0.13886   
fert65     6.881e-03  2.705e-02   0.254

In [7]:
est_ols <- summary(reg.ols)$coef["gdpsh465",1]
# output: estimated regression coefficient corresponding to the target regressor

std_ols <- summary(reg.ols)$coef["gdpsh465",2]
# output: std. error

ci_ols <- confint(reg.ols)["gdpsh465",]
# output: 95% confidence interval

results_ols <- as.data.frame(cbind(est_ols,std_ols,ci_ols[1],ci_ols[2]))
colnames(results_ols) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(results_ols) <-c("OLS")

In [8]:
print(est_ols)
print(std_ols)
print(ci_ols)

[1] -0.009377989
[1] 0.02988773
      2.5 %      97.5 % 
-0.07060022  0.05184424 


In [9]:
#library(xtable)
table <- matrix(0, 1, 4)
table[1,1:4]   <- c(est_ols,std_ols,ci_ols[1],ci_ols[2])
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("OLS")
tab1<- xtable(table, digits = 3)
tab1

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| OLS | -0.009377989 | 0.02988773 | -0.07060022 | 0.05184424 |


## 1.2 OLS Partialling out

In [10]:
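# Drop the target regressor (column 3) for the Y equation and the outcome (column 1) for the D equation
y <- growth[,-c(3)]
d <- growth[,-c(1)]

# Residuals from regressing Y on the controls and D on the controls
resY <- summary(lm(Outcome~.-1,data=y))$residuals
resD <- summary(lm(gdpsh465~.-1,data=d))$residuals

# Regress the residualized outcome on the residualized target regressor
residuos <- data.frame(resY,resD)
ols <- lm(resY~resD,data=residuos)
part_out.ols <- summary(ols)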

In [11]:
est_ols1 <- part_out.ols$coef["resD",1]
std_ols1 <- part_out.ols$coef["resD",2]
ci_ols1 <- confint(lm(resY~resD,data=residuos))[2,]


In [32]:
table <- matrix(0,1,4)
table[1,1:4]   <- c(est_ols1,std_ols1,ci_ols1[1],ci_ols1[2])
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("OLS - Partialling-out")
tab2 <- xtable(table, digits = 3) 
tab2

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| OLS - Partialling-out | -0.009377989 | 0.01685895 | -0.04288161 | 0.02412563 |
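
The two OLS rows share the same point estimate but report different standard errors. This is what the Frisch-Waugh-Lovell theorem predicts: partialling-out reproduces the full-regression coefficient exactly, but the final residual-on-residual regression does not know how many coefficients were estimated in the first step, so it uses too many residual degrees of freedom. The following is a minimal sketch on simulated data (the `*_sim` names and the data-generating process are hypothetical, not `GrowthData`):

```r
# Simulated data (hypothetical): n observations, k controls
set.seed(1)
n <- 90; k <- 20
W_sim <- matrix(rnorm(n * k), n, k)
D_sim <- drop(W_sim %*% rnorm(k)) + rnorm(n)
Y_sim <- 0.5 * D_sim + drop(W_sim %*% rnorm(k)) + rnorm(n)

# Full regression of Y on D and all controls
full    <- lm(Y_sim ~ D_sim + W_sim)
b_full  <- unname(coef(full)["D_sim"])
se_full <- summary(full)$coefficients["D_sim", "Std. Error"]

# Partialling-out: residualize Y and D on W, then regress residual on residual
rY_sim <- resid(lm(Y_sim ~ W_sim))
rD_sim <- resid(lm(D_sim ~ W_sim))
fwl    <- lm(rY_sim ~ rD_sim)
b_fwl  <- unname(coef(fwl)["rD_sim"])
se_fwl <- summary(fwl)$coefficients["rD_sim", "Std. Error"]

# The final step counts only 2 estimated coefficients instead of k + 2;
# rescaling by the degrees-of-freedom ratio recovers the full-regression SE
df_full    <- full$df.residual
df_fwl     <- fwl$df.residual
se_fwl_adj <- se_fwl * sqrt(df_fwl / df_full)

print(c(b_full = b_full, b_fwl = b_fwl))
print(c(se_full = se_full, se_fwl = se_fwl, se_fwl_adj = se_fwl_adj))
```

In the tables above, the ratio of the two standard errors, $0.01686 / 0.02989 \approx 0.564$, is consistent with $\sqrt{28/88}$, that is, with 62 estimated coefficients in the full regression and 2 in the final step. That count is inferred from the data layout rather than printed in the output.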


## 2. LASSO using HDM

In [13]:
Y <- growth['Outcome'] # output variable
W <- as.matrix(growth)[, -c(1, 2,3)] # controls; must be a matrix to use the rlassoEffect function
D <- growth['gdpsh465'] # target regressor

In [15]:
r.Y <- rlasso(x=W,y=Y)$res # "residual" output variable
r.D <- rlasso(x=W,y=D)$res # "residual" target regressor
partial.lasso <- lm(r.Y ~ r.D)
est_lasso <- partial.lasso$coef[2]
std_lasso <- summary(partial.lasso)$coef[2,2]
ci_lasso <- confint(partial.lasso)[2,]

In [16]:
library(xtable)
table <- matrix(0, 1, 4)
table[1,1:4]   <- c(est_lasso,std_lasso,ci_lasso[1],ci_lasso[2])
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("Lasso HDM")
tab3 <- xtable(table, digits = 3)
# Summary HDM Lasso
tab3

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| Lasso HDM | -0.04981147 | 0.01393636 | -0.07750705 | -0.02211588 |
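
To make the mechanics of the two Lasso steps concrete without relying on `hdm`, here is a minimal base-R sketch on simulated data. It swaps in plain cyclic coordinate descent with a fixed, arbitrarily chosen penalty level, so it is not `rlasso`'s data-driven penalty and will not reproduce the numbers above. The helpers `soft` and `lasso_cd`, the `*_sim` data, and `lambda = 0.3` are all hypothetical choices made for illustration.

```r
# Soft-thresholding operator used in the Lasso coordinate update
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Hypothetical helper: Lasso by cyclic coordinate descent for
# (1/(2n)) * ||y - X b||^2 + lambda * ||b||_1, on centered y and standardized X
lasso_cd <- function(X, y, lambda, iters = 200) {
  X <- scale(X)
  y <- y - mean(y)
  n <- nrow(X)
  b <- rep(0, ncol(X))
  r <- y                                  # current residual y - X b
  for (it in seq_len(iters)) {
    for (j in seq_len(ncol(X))) {
      r <- r + X[, j] * b[j]              # take coordinate j out of the fit
      b[j] <- soft(sum(X[, j] * r) / n, lambda) / (sum(X[, j]^2) / n)
      r <- r - X[, j] * b[j]              # put the updated coordinate back
    }
  }
  list(beta = b, residuals = r)
}

# Simulated data (hypothetical): only the first control matters
set.seed(2)
n <- 90; p <- 60
W_sim <- matrix(rnorm(n * p), n, p)
D_sim <- W_sim[, 1] + rnorm(n)
Y_sim <- -0.05 * D_sim + W_sim[, 1] + rnorm(n)

# Step 1 and 2: Lasso of Y on W and of D on W, keep the residuals
fit_Y <- lasso_cd(W_sim, Y_sim, lambda = 0.3)
fit_D <- lasso_cd(W_sim, D_sim, lambda = 0.3)
rY_sim <- fit_Y$residuals
rD_sim <- fit_D$residuals

# Step 3: least squares of residual on residual
partial_sim <- lm(rY_sim ~ rD_sim)
print(summary(partial_sim)$coefficients)
```

The structure mirrors the cell above: two penalized fits on the controls, then an ordinary regression of one residual on the other.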


## 3. LASSO using Cross-Validation

In [17]:
Y1 <- as.matrix(growth[, 1, drop = F]) # output variable
W1 <- as.matrix(growth)[, -c(1, 2,3)] # controls
D1 <- as.matrix(growth[, 3, drop = F]) # target regressor

In [18]:
#install.packages("glmnet")
library("glmnet")

Loading required package: Matrix

Loaded glmnet 4.1-3



In [19]:
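# NOTE: in glmnet, `alpha` is the elastic-net mixing weight (1 = Lasso, 0 = ridge),
# not the penalty level, so alpha = 0.00077 gives a nearly pure ridge penalty.
# The penalty level lambda is chosen by cross-validation (predict uses lambda.1se by default).
cv.5 <- cv.glmnet(W1, Y1, alpha = 0.00077)
r_Y <- Y1 - predict(cv.5, newx = W1, type = 'link') # residual of regression Y on W
cv.7 <- cv.glmnet(W1, D1, alpha = 0.00077)
r_D <- D1 - predict(cv.7, newx = W1, type = 'link') # residual of regression D on W

# OLS of residualized Y on residualized D
partial_lasso_fit <- lm(r_Y~r_D)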

In [20]:
est_lasso <- partial_lasso_fit$coef[2]
std_lasso <- summary(partial_lasso_fit)$coef[2,2]
ci_lasso <- confint(partial_lasso_fit)[2,]

library(xtable)
table <- matrix(0, 1, 4)
table[1,1:4]   <- c(est_lasso,std_lasso,ci_lasso[1],ci_lasso[2])
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("LASSO - Cross Validation")
tab4<- xtable(table, digits = 3)

tab4

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| LASSO - Cross Validation | -0.03678625 | 0.01550241 | -0.06759403 | -0.005978468 |


## 4. Double Lasso - Cross Validation

In [23]:
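# Residualize Y and D on the controls with cross-validated Lasso (glmnet default alpha = 1)
resY <- Y1 - predict(cv.glmnet(W1, Y1), newx = W1)
# The residual of D must be D1 minus its prediction. The original cell subtracted
# from Y1 here, and the printed results for this method come from that version.
resD <- D1 - predict(cv.glmnet(W1, D1), newx = W1)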

residuos <- data.frame(resY,resD)
cross.lasso <- lm(resY~resD,data=residuos)
part_out_cross.lasso <- summary(cross.lasso)

In [24]:
est_lasso <- part_out_cross.lasso$coef["resD",1]
std_lasso <- part_out_cross.lasso$coef["resD",2]
ci_lasso <- confint(lm(resY~resD,data=residuos))[2,]

In [25]:
table <- matrix(0,1,4)
table[1,1:4]   <- c(est_lasso,std_lasso,ci_lasso[1],ci_lasso[2])
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("Double Lasso - Cross Validation")
tab5 <- xtable(table, digits = 3) 
tab5

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| Double Lasso - Cross Validation | 0.001138579 | 0.003083654 | -0.004989536 | 0.007266693 |


## 5. Double Lasso using theoretical Lambda

In [26]:
resY <- rlasso(W,Y)$res
resD <- rlasso(W,D)$res


residuos <- data.frame(resY,resD)
theoretical.lasso <- lm(resY~resD,data=residuos)
part_out_theoretical.lasso <- summary(theoretical.lasso)

In [27]:
est_lasso1 <- part_out_theoretical.lasso$coef["resD",1]
std_lasso1 <- part_out_theoretical.lasso$coef["resD",2]
ci_lasso1 <- confint(lm(resY~resD,data=residuos))[2,]

In [28]:
table <- matrix(0,1,4)
table[1,1:4]   <- c(est_lasso1,std_lasso1,ci_lasso1[1],ci_lasso1[2])
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("Double Lasso - Theoretical Lambda")
tab6 <- xtable(table, digits = 3) 
tab6

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| Double Lasso - Theoretical Lambda | -0.04981147 | 0.01393636 | -0.07750705 | -0.02211588 |


## 6. Double Lasso using method="partialling out"

In [29]:
part_out_direct.lasso <- rlassoEffect(x = W, y = Y, d = D, method = "partialling out")

est_lasso2 <- summary(part_out_direct.lasso)$coef[,1]
std_lasso2 <- summary(part_out_direct.lasso)$coef[,2]
lower_ci_lasso2 <- est_lasso2 - 1.96*std_lasso2
upper_ci_lasso2 <- est_lasso2 + 1.96*std_lasso2

In [56]:
table <- matrix(0,1,4)
table[1,1:4]   <- c(est_lasso2,std_lasso2,lower_ci_lasso2,upper_ci_lasso2)
colnames(table) <-c("estimator","standard error", "lower bound CI", "upper bound CI")
rownames(table) <-c("Double Lasso - Partialling out")
tab7 <- xtable(table, digits = 3) 
tab7

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| Double Lasso - Partialling out | -0.04981147 | 0.01393636 | -0.07712673 | -0.0224962 |
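
This row has the same estimate and standard error as the theoretical-lambda row but a slightly narrower interval. The difference comes from the critical value: `confint` on an `lm` fit uses a Student-t quantile, while the cell above uses the normal value 1.96. A minimal check in base R, assuming 88 residual degrees of freedom in the final regression (90 observations minus 2 coefficients; this count is an inference, not something printed in the output):

```r
# Estimate and standard error as reported in the tables above
est <- -0.04981147
se  <- 0.01393636

# Assumption: the final residual-on-residual regression has 90 - 2 degrees of freedom
df_resid <- 90 - 2

ci_t      <- est + c(-1, 1) * qt(0.975, df_resid) * se  # t-based, as confint() does for lm
ci_normal <- est + c(-1, 1) * 1.96 * se                 # normal approximation used above

print(round(rbind(ci_t, ci_normal), 5))
```

Under that assumption, the two intervals should match, up to rounding, the theoretical-lambda row and the partialling-out row respectively.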


**Method summary**

In [57]:
sum=rbind(tab1,tab2,tab3,tab4,tab5,tab6,tab7)
sum

| | estimator | standard error | lower bound CI | upper bound CI |
|---|---|---|---|---|
| OLS | -0.009377989 | 0.029887726 | -0.070600221 | 0.051844243 |
| OLS - Partialling-out | -0.009377989 | 0.016858951 | -0.042881612 | 0.024125634 |
| Lasso HDM | -0.049811465 | 0.013936358 | -0.077507049 | -0.022115881 |
| LASSO - Cross Validation | -0.036786247 | 0.015502408 | -0.067594026 | -0.005978468 |
| Double Lasso - Cross Validation | 0.001138579 | 0.003083654 | -0.004989536 | 0.007266693 |
| Double Lasso - Theoretical Lambda | -0.049811465 | 0.013936358 | -0.077507049 | -0.022115881 |
| Double Lasso - Partialling out | -0.049811465 | 0.013936358 | -0.077126728 | -0.022496203 |


### Coefficient Plot

In [35]:
summary <- data.frame(sum)
attach(summary)

In [38]:
library("ggplot2")

In [58]:
vars=c("OLS","OLS Partialling-out","Lasso HDM","Lasso Cross-Validation",
      "Double Lasso Cross-Validation", "Double Lasso Theoretical Lambda",
      "Double Lasso Direct")
png(filename="bench_query_sort.png", width=1000, height=600)

ggplot(summary, aes(vars, estimator)) +
  geom_hline(yintercept=0, lty=2, lwd=1, colour="grey50")+ 
  geom_errorbar(aes(ymin = lower.bound.CI,ymax = upper.bound.CI),colour="blue",lwd=1)+
  geom_point(size=4, pch=21, fill="red") +
  theme_bw()+
  ggtitle("Confidence intervals of gdpsh465 estimator")
dev.off()