<a href="https://colab.research.google.com/github/POLSEAN/XTDML/blob/main/examples/01_xtdml_pliv_fd.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **DML for panel data with IV: FD (exact) approach**

---

*Description*

This notebook estimates the structural parameter of a partially linear regression (PLR) model with instrumental variables (IV) by double machine learning (DML), for panel data with fixed effects.

The package `XTDML` estimates the nuisance functions with machine learning methods and computes the Neyman-orthogonal score functions. `XTDML` is built on the CRAN package `DoubleML` (Bach et al., 2024), which uses the `mlr3` ecosystem and the `R6` package.

**References**

[1] Bach, P., Chernozhukov, V., Kurz, M. S., Spindler, M. and Klaassen, S. (2024), DoubleML - An Object-Oriented Implementation of Double Machine Learning in R, *Journal of Statistical Software*, 108(3):1-56.

[2] Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. *The Econometrics Journal*, 21(1):C1-C68.

[3] Clarke, P. and Polselli, A. (2023). Double machine learning for static panel models with fixed effects. *arXiv preprint*, arXiv:2312.08174.

[4] Mundlak, Y. (1978). On the pooling of time series and cross section data. *Econometrica*, pages 69-85.

*Overview Code*

1. Installation of XTDML and other R packages
2. Loading the data
3. Data management with FD transformation
4. Set up of DML data environment
5. Set up of DML estimation environment
6. Extraction of DML estimates


### **Installation of the `XTDML` package**

The `XTDML` package can be installed using either of the options below:

1. **Installation directly from GitHub:**
  ```
    #install.packages("devtools")
    library(devtools)

    install_github("POLSEAN/XTDML")
    library(XTDML)
  ```
  *Note this code works **ONLY with RStudio (desktop)**, but not with online platforms such as Google Colab or Kaggle.*


2. **Download all folders in `XTDML`** from `https://github.com/POLSEAN/XTDML` by pressing `<> CODE > Download ZIP`. Rename the downloaded .zip file to `XTDML.zip` and upload it to Google Colab. Note its path, run `!unzip XTDML.zip` in Python, then change the RUNTIME to R and run
   ```
    #install.packages("devtools")
    library(devtools)

    wd = "~ your-directory/XTDML"
    devtools::load_all(wd)
   ```

For illustration purposes on Google Colab, we follow the second approach, but the first is recommended with RStudio (desktop).


**Set RUNTIME > CHANGE RUNTIME TYPE > Python 3**

The code below unzips the XTDML.zip folder that you have previously uploaded.

In [1]:
!unzip XTDML.zip

Archive:  XTDML.zip
 extracting: XTDML/.gitignore        
  inflating: XTDML/.Rbuildignore     
  inflating: XTDML/.RData            
  inflating: XTDML/.Rhistory         
   creating: XTDML/.Rproj.user/
   creating: XTDML/.Rproj.user/22C44D20/
   creating: XTDML/.Rproj.user/22C44D20/bibliography-index/
 extracting: XTDML/.Rproj.user/22C44D20/cpp-definition-cache  
   creating: XTDML/.Rproj.user/22C44D20/ctx/
   creating: XTDML/.Rproj.user/22C44D20/explorer-cache/
   creating: XTDML/.Rproj.user/22C44D20/pcs/
  inflating: XTDML/.Rproj.user/22C44D20/pcs/files-pane.pper  
 extracting: XTDML/.Rproj.user/22C44D20/pcs/source-pane.pper  
  inflating: XTDML/.Rproj.user/22C44D20/pcs/windowlayoutstate.pper  
  inflating: XTDML/.Rproj.user/22C44D20/pcs/workbench-pane.pper  
   creating: XTDML/.Rproj.user/22C44D20/presentation/
   creating: XTDML/.Rproj.user/22C44D20/profiles-cache/
 extracting: XTDML/.Rproj.user/22C44D20/rmd-outputs  
 extracting: XTDML/.Rproj.user/22C44D20/saved_source_markers  

**From now on, set RUNTIME > CHANGE RUNTIME TYPE > R**

In [2]:
# 1. Install and import R packages
# Install packages
list.of.packages <- c("datawizard","mlr3","mlr3learners","mlr3tuning","paradox","xgboost","ranger","MLmetrics","devtools","tidyverse")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, repos = "http://cran.us.r-project.org")

# Load general packages
library(devtools)
library(tidyverse)
library(checkmate)
library(dplyr)    # provides the %>% pipe (already attached by tidyverse)
library(tibble)   # for add_column()
library(datawizard)
library(data.table)
# ML packages
library(mlr3)
library(mlr3learners)
library(rpart)
library(xgboost)
library(ranger)
# Packages for HP tuning
library(mlr3misc)
library(mlr3tuning)
library(paradox)
library(MLmetrics)

# Show only warnings and errors from the ML packages (suppresses info-level log messages)
lgr::get_logger("bbotk")$set_threshold("warn")
lgr::get_logger("mlr3")$set_threshold("warn")

Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

also installing the dependencies ‘bitops’, ‘gtools’, ‘caTools’, ‘globals’, ‘listenv’, ‘PRROC’, ‘gplots’, ‘insight’, ‘checkmate’, ‘future’, ‘future.apply’, ‘lgr’, ‘mlbench’, ‘mlr3measures’, ‘mlr3misc’, ‘parallelly’, ‘palmerpenguins’, ‘bbotk’, ‘RcppEigen’, ‘ROCR’


Loading required package: usethis

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::fil

In [3]:
# Additional packages required to install XTDML (not always necessary, depending on the R version)
list.of.packages <- c("mvtnorm","clusterGeneration","readstata13")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, repos = "http://cran.us.r-project.org")

library(mvtnorm)
library(clusterGeneration)
library(readstata13)

Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)


Attaching package: ‘mvtnorm’


The following object is masked from ‘package:datawizard’:

    standardize


Loading required package: MASS


Attaching package: ‘MASS’


The following object is masked from ‘package:dplyr’:

    select




In [4]:
# Load XTDML from the unzipped folder
wd = "/content/XTDML"
devtools::load_all(wd)

ℹ Loading XTDML


### **The Data**

We use data simulated from DGP3, which is defined as follows.
\begin{align}
  & y_{it}  = d_{it}\cdot 0.5 + l_0(x_{it}) + \alpha_i + u_{it}\\
  & d_{it}  = z_{it}\cdot 0.8 + r_0(x_{it}) + \eta_i + w_{it}\\
  & z_{it}  = m_0(x_{it}) + \xi_i + v_{it}\\
  & x_{it}\sim N(1,5),\quad u_{it}\sim N(0,1),\quad v_{it}\sim N(0,1),\quad w_{it}\sim N(0,1) \\
  & \alpha_i\sim N(0,0.4),\quad \eta_i\sim N(0,0.4),\quad \xi_i\sim N(0,0.4)
\end{align}


In this dataset, the nuisance functions are generated as follows

\begin{align*}
    l_0(x_{it}) & = \sum_{j=1}^{5} 0.5^j (x_{it,j}\cdot 1[x_{it,j}>0]) + \sum_{j=1}^{5} 0.5^j (x_{it,j+5}\cdot x_{it,j+4})\\
    r_0(x_{it}) & = \sum_{j=1}^{5} 0.5^j (x_{it,j+10}\cdot 1[x_{it,j+10}>0]) + \sum_{j=1}^{5} 0.5^j (x_{it,j+15}\cdot x_{it,j+14})\\
    m_0(x_{it}) & = \sum_{j=1}^{5} 0.5^j (x_{it,j+20}\cdot 1[x_{it,j+20}>0]) + \sum_{j=1}^{5} 0.5^j (x_{it,j+25}\cdot x_{it,j+24})
\end{align*}

The true structural effect is 0.5. The number of control variables is $p=30$, but each of the nuisance functions $(l_0, m_0, r_0)$ depends on only $s=10$ of them, with no overlap between the functions.

The original dataset comprises $N=5{,}000$ cross-sectional units observed over $T=10$ periods each; in this example we use a smaller dataset with the first 1,000 units.
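For readers who want to generate data of this kind themselves, the following is a minimal base-R sketch of the DGP above. It is not the authors' generator. It assumes that the second argument of each normal distribution is a variance, that the weight $0.5^j$ applies to each term inside the sums, and it uses a small, arbitrary sample size; the helper `nuisance()` and the object `sim` are hypothetical names.

```r
set.seed(1)
N <- 50; n_t <- 10; p <- 30          # small panel for illustration only
n <- N * n_t
id   <- rep(seq_len(N), each = n_t)
time <- rep(seq_len(n_t), times = N)

# Controls x ~ N(1, 5), reading 5 as the variance (assumption)
X <- matrix(rnorm(n * p, mean = 1, sd = sqrt(5)), nrow = n, ncol = p)
colnames(X) <- paste0("X", seq_len(p))

# Nuisance function on the block of 10 columns starting after column `start`;
# the weight 0.5^j is applied inside the sum (assumption)
nuisance <- function(X, start) {
  out <- numeric(nrow(X))
  for (j in 1:5) {
    a <- X[, start + j]
    out <- out + 0.5^j * a * (a > 0) +
      0.5^j * X[, start + j + 5] * X[, start + j + 4]
  }
  out
}

# Unit-specific effects ~ N(0, 0.4), reading 0.4 as the variance (assumption)
alpha <- rnorm(N, 0, sqrt(0.4))[id]
eta   <- rnorm(N, 0, sqrt(0.4))[id]
xi    <- rnorm(N, 0, sqrt(0.4))[id]

z <- nuisance(X, 20) + xi + rnorm(n)               # instrument
d <- 0.8 * z + nuisance(X, 10) + eta + rnorm(n)    # treatment
y <- 0.5 * d + nuisance(X, 0) + alpha + rnorm(n)   # outcome

sim <- data.frame(id, time, y, d, z, X)
dim(sim)
```

The three calls to `nuisance()` use the disjoint column blocks 1 to 10, 11 to 20, and 21 to 30, which matches the statement that the nuisance functions do not share covariates.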


Note that the FD (exact) approach requires using **transformed** data:
* $\Delta y_{it}$ is the transformed outcome variable
* $\Delta d_{it}$ is the transformed treatment variable
* $\Delta z_{it}$ is the transformed instrument
* $\mathbf{x}_{it} = (x_{it,1}, \dots, x_{it,p}, x_{it-1,1}, \dots, x_{it-1,p})'$ is the set of control variables: the $p=30$ controls at time $t$ and their first lags, where $x_{it-1,k}$ is the lag of variable $k$; as stated above, only $s=10$ of the 30 controls enter each nuisance function

where $\Delta y_{it} = y_{it}-y_{it-1}$, and similarly for $d_{it}$ and $z_{it}$.
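
To see why both the current and the lagged controls are needed, it helps to first-difference the three equations of the DGP above. The sketch below follows directly from the model as stated: the unit-specific effects $\alpha_i$, $\eta_i$ and $\xi_i$ cancel, while each nuisance function appears once at $x_{it}$ and once at $x_{it-1}$.

\begin{align*}
  \Delta y_{it} & = \Delta d_{it}\cdot 0.5 + l_0(x_{it}) - l_0(x_{it-1}) + \Delta u_{it}\\
  \Delta d_{it} & = \Delta z_{it}\cdot 0.8 + r_0(x_{it}) - r_0(x_{it-1}) + \Delta w_{it}\\
  \Delta z_{it} & = m_0(x_{it}) - m_0(x_{it-1}) + \Delta v_{it}
\end{align*}

The learners therefore have to approximate functions of $(x_{it}, x_{it-1})$ jointly, which is why the lags are passed as additional covariates.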



In [5]:
# 2. Load simulated data from GitHub
df = read.csv("https://raw.githubusercontent.com/POLSEAN/XTDML/main/data/dgp3_fd_iv_short.csv")
names(df)
head(df)

|  | id | time | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | ⋯ | L.X21 | L.X22 | L.X23 | L.X24 | L.X25 | L.X26 | L.X27 | L.X28 | L.X29 | L.X30 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | `<int>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | ⋯ | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 1 | 2 | 7.4777367 | -5.446002 | 4.39199968 | -3.2461127 | 3.794314 | -4.426094 | 16.8874061 | -10.2341614 | ⋯ | 1.6905339 | 0.6430322 | 4.396504 | 4.2673437 | 2.489757 | -1.961203 | -0.128617 | 3.083191 | 0.9333028 | -5.310279 |
| 2 | 1 | 3 | -6.7781311 | 3.315232 | 3.75811919 | 0.7260812 | -8.26278 | -1.700011 | -1.9899528 | -4.6079425 | ⋯ | 0.8922307 | 7.6914944 | 2.13975 | -5.56261 | 14.355202 | -6.1346808 | 3.114928 | 7.1137866 | 8.9281011 | 5.349266 |
| 3 | 1 | 4 | -0.3668897 | -1.859044 | -0.6554911 | 2.1514831 | 3.723106 | -2.69531 | -5.1409043 | -0.4343938 | ⋯ | 6.7881877 | 9.8232725 | 4.151105 | 1.8051773 | -1.609558 | -2.2304554 | 3.288991 | 1.7898408 | 2.0453694 | 7.180541 |
| 4 | 1 | 5 | 9.0546562 | -1.089742 | 0.04849561 | 5.658067 | -1.364368 | 13.046264 | -3.2134139 | 2.0300912 | ⋯ | -3.5370691 | -2.002287 | -2.49846 | -0.2531446 | -7.821912 | -0.2242888 | 5.375563 | -0.1968998 | -1.6277743 | -2.938278 |
| 5 | 1 | 6 | -5.834885 | 7.151937 | -3.45005203 | 2.6761822 | 1.932599 | -6.833742 | -2.2764379 | 5.112579 | ⋯ | 1.4772189 | -5.7426569 | -1.964576 | 7.914386 | 11.071879 | -2.701182 | -5.433439 | 10.9298117 | -4.5348186 | -2.796325 |
| 6 | 1 | 7 | 1.9302234 | -10.55283 | -9.71377632 | 3.1437344 | 2.304083 | 5.816848 | -0.4529628 | -6.4481636 | ⋯ | 5.7826088 | 7.143322 | 2.134571 | 5.1742646 | -1.099541 | -1.4696217 | -2.349695 | -5.0704696 | 4.1020889 | -2.526907 |


### **3. Transform variables**

If necessary, transform the variables in the dataset into the form that the FD approach requires. Sample code is given below.

```
# Create (a) lags of X and (b) first differences of y, d and z
# (assumes a raw panel with columns id, time, y, d, z and X1, ..., X30)
xvars = paste0("X", 1:30)

df.fd = df %>%
  arrange(id, time) %>%                 # lags and differences need time order
  group_by(id) %>%
  mutate(across(all_of(xvars), ~ dplyr::lag(.x), .names = "L.{.col}")) %>%
  mutate(across(all_of(c("y", "d", "z")), ~ c(NA, diff(.x)))) %>%
  ungroup()

# Drop the rows with missing values created by lagging and differencing
# (the first period of each unit)
df.fd = as.data.frame(df.fd[complete.cases(df.fd), ])
names(df.fd)

```

The loaded dataset does not require any transformation because $(Y,D,Z)$ are already first-differenced, and the matrix $X$ contains the $p$ control variables at time $t$ and $t-1$.
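
To make the transformation concrete, here is a minimal base-R sketch on a made-up two-unit panel. The data frame `toy` and its values are purely illustrative and are not taken from the dataset above. It lags one control and first-differences the outcome within each unit, then drops the first period of every unit.

```r
# Toy panel: 2 units observed for 3 periods (illustrative values only)
toy <- data.frame(
  id   = rep(1:2, each = 3),
  time = rep(1:3, times = 2),
  y    = c(1, 3, 6, 10, 10, 7),
  X1   = c(5, 6, 7, 1, 2, 3)
)
toy <- toy[order(toy$id, toy$time), ]

# Lag of X1 within each unit: shift values down by one period
toy$L.X1 <- ave(toy$X1, toy$id, FUN = function(v) c(NA, head(v, -1)))

# First difference of y within each unit: y_t - y_{t-1}
toy$y <- ave(toy$y, toy$id, FUN = function(v) c(NA, diff(v)))

# The first period of each unit has no lag and is dropped
toy.fd <- toy[complete.cases(toy), ]
toy.fd
```

Each unit loses one observation, so a panel with $T$ periods contributes $T-1$ rows per unit after the transformation.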

## **Estimation and inference with DML for FD**

The sections below set up the DML data and estimation environments and then run the estimation.

### **4. Set up DML data environment**
Initialization of `dml_approx_data` from a `data.frame`. Arguments to pass:

```
dml_approx_data_from_data_frame(data,
                  x_cols = NULL,
                  y_col = NULL,
                  d_cols = NULL,
                  z_cols = NULL,
                  cluster_cols = NULL
                  )

```       

In [9]:
# 4. Set up DML data environment
# Get the names of the variables
x_cols  = paste0("X", 1:30)
Lx_cols = paste0("L.X", 1:30)
xvars = c(x_cols, Lx_cols)

# set up data for DML procedure
obj_dml_data = dml_approx_data_from_data_frame(df,
                            x_cols = xvars,  y_col = "y", d_cols = "d", z_cols = "z",
                            cluster_cols = "id")
obj_dml_data$print()



------------------ Data summary ------------------
Outcome variable: y
Treatment variable(s): d
Cluster variable(s): id
Covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23, X24, X25, X26, X27, X28, X29, X30, L.X1, L.X2, L.X3, L.X4, L.X5, L.X6, L.X7, L.X8, L.X9, L.X10, L.X11, L.X12, L.X13, L.X14, L.X15, L.X16, L.X17, L.X18, L.X19, L.X20, L.X21, L.X22, L.X23, L.X24, L.X25, L.X26, L.X27, L.X28, L.X29, L.X30
Instrument(s): z
No. Observations: 10000
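
A quick sanity check on the column names can save a confusing error at this step. The helper below is a hypothetical sketch and not part of `XTDML`: it takes the same column arguments as the call above, stops if any named column is absent from the data frame or if the selected columns contain missing values, and is demonstrated on a made-up data frame `toy`.

```r
# Hypothetical helper (not an XTDML function): validate columns before
# passing them to the data constructor
check_dml_columns <- function(data, x_cols, y_col, d_cols, z_cols, cluster_cols) {
  needed <- c(x_cols, y_col, d_cols, z_cols, cluster_cols)
  absent <- setdiff(needed, names(data))
  if (length(absent) > 0) {
    stop("Columns not found in data: ", paste(absent, collapse = ", "))
  }
  n_incomplete <- sum(!complete.cases(data[, needed, drop = FALSE]))
  if (n_incomplete > 0) {
    stop(n_incomplete, " row(s) contain missing values in the selected columns")
  }
  invisible(TRUE)
}

# Made-up data for illustration
set.seed(1)
toy <- data.frame(id = c(1, 1, 2, 2), y = rnorm(4), d = rnorm(4), z = rnorm(4),
                  X1 = rnorm(4), L.X1 = rnorm(4))

check_dml_columns(toy, x_cols = c("X1", "L.X1"), y_col = "y",
                  d_cols = "d", z_cols = "z", cluster_cols = "id")
```

With the notebook's data, the same call would use `df`, `xvars`, `"y"`, `"d"`, `"z"` and `"id"`.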


### **5. Set up DML estimation environment**

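Arguments to pass to the `$new()` method, which creates a new instance of the R6 class. The signature below is shown for `dml_approx_plr`; the cells that follow use the IV counterpart `dml_approx_pliv`, which is called with the additional arguments `ml_r`, `partialX` and `partialZ`.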

```
 dml_approx_plr$new(data,
      ml_l,
      ml_m,
      ml_g = NULL,
      n_folds = 5,
      n_rep = 1,
      score = "partialling out",               # or "IV-type"
      dml_procedure = "dml2",          # or "dml1"
      draw_sample_splitting = TRUE,
      apply_cross_fitting = TRUE
      )

```
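
The roles of `n_folds` and of the `"partialling out"` score can be illustrated with a small stand-alone sketch. It is not `XTDML`'s implementation. It assumes the usual DoubleML convention for the IV model, in which `ml_l`, `ml_r` and `ml_m` predict the outcome, the treatment and the instrument from the covariates; it uses `lm()` as a stand-in learner and made-up clustered data; and it splits the sample by unit, so that all observations of a unit fall in the same fold.

```r
set.seed(1408)

# Made-up clustered data (not the notebook's dataset); true effect is 0.5
N <- 300; n_t <- 5; n <- N * n_t
id <- rep(seq_len(N), each = n_t)
x  <- rnorm(n)
e  <- rnorm(n)                        # unobserved confounder of d and y
z  <- 0.5 * x + rnorm(n)              # instrument
d  <- 0.8 * z + x + e + rnorm(n)      # endogenous treatment
y  <- 0.5 * d + x + e + rnorm(n)      # outcome
dat <- data.frame(id, y, d, z, x)

# Sample splitting at the cluster level: each unit belongs to exactly one fold
n_folds <- 3
fold_of_id <- sample(rep(seq_len(n_folds), length.out = N))
fold <- fold_of_id[dat$id]

# Cross-fitting: fit on the other folds, residualise the held-out fold
res <- matrix(NA_real_, n, 3, dimnames = list(NULL, c("y", "d", "z")))
for (k in seq_len(n_folds)) {
  train <- dat[fold != k, ]
  test  <- dat[fold == k, ]
  for (v in colnames(res)) {
    fit <- lm(reformulate("x", response = v), data = train)  # stand-in learner
    res[fold == k, v] <- test[[v]] - predict(fit, newdata = test)
  }
}

# Partialling-out IV estimate: solves sum((y_res - theta * d_res) * z_res) = 0
theta_hat <- sum(res[, "z"] * res[, "y"]) / sum(res[, "z"] * res[, "d"])
theta_hat
```

Replacing `lm()` with a flexible learner and the toy data with the first-differenced panel gives the structure that the estimation cells below rely on; standard errors and repeated sample splitting are left out of this sketch.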

In [12]:
install.packages("glmnet")
install.packages("rpart")
install.packages("xgboost")
library(glmnet)
library(rpart)
library(xgboost)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

also installing the dependencies ‘iterators’, ‘foreach’, ‘shape’


Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Loading required package: Matrix


Attaching package: ‘Matrix’


The following objects are masked from ‘package:tidyr’:

    expand, pack, unpack


Loaded glmnet 4.1-8



In [14]:
# 5. Set up DML estimation environment
# Lasso without dictionary for fast computing
set.seed(1408)
learner = lrn("regr.cv_glmnet", s="lambda.min")
ml_l = learner$clone()
ml_m = learner$clone()
ml_r = learner$clone()

dml_lasso = dml_approx_pliv$new(obj_dml_data,
               ml_l = ml_l, ml_m = ml_m, ml_r = ml_r,
               partialX = TRUE, partialZ = FALSE,
               n_folds = 3,
               score = "partialling out")

# Estimate target/causal parameter
dml_lasso$fit()
dml_lasso$print()

No parameters provided for learners. Default values are used.

No parameters provided for learners. Default values are used.

No parameters provided for learners. Default values are used.






------------------ Data summary ------------------
Outcome variable: y
Treatment variable: d
Covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23, X24, X25, X26, X27, X28, X29, X30, L.X1, L.X2, L.X3, L.X4, L.X5, L.X6, L.X7, L.X8, L.X9, L.X10, L.X11, L.X12, L.X13, L.X14, L.X15, L.X16, L.X17, L.X18, L.X19, L.X20, L.X21, L.X22, L.X23, L.X24, L.X25, L.X26, L.X27, L.X28, L.X29, L.X30
Instrument(s): z
Cluster variables: id
No. Observations: 10000
No. Groups: 1112

------------------ Score & algorithm ------------------
Score function: partialling out
DML algorithm: dml2
DML approach: transformed variables 

------------------ Machine learner ------------------
Learner of ml_l: regr.cv_glmnet
Learner of ml_m: regr.cv_glmnet
Learner of ml_r: regr.cv_glmnet
RMSE of ml_l : 2.120
RMSE of ml_m : 1.759
RMSE of ml_r : 2.307
Model RMSE: 5.357

------------------ Resampling ------------------
No. folds: 3
No. folds per cluster: 3
No. r

In [18]:
# 5. Set up DML estimation environment
# Regression tree
set.seed(1408)
learner = lrn("regr.rpart")
ml_l = learner$clone()
ml_m = learner$clone()
ml_r = learner$clone()

dml_rpart = dml_approx_pliv$new(obj_dml_data,
               ml_l = ml_l, ml_m = ml_m, ml_r = ml_r,
               partialX = TRUE, partialZ = FALSE,
               n_folds = 3,
               score = "partialling out")

# set up a list of parameter grids
param_grid = list("ml_l" = ps(cp = p_dbl(lower = 0.001, upper = 0.02),
                              maxdepth = p_int(lower = 2, upper = 10)),
                  "ml_m" = ps(cp = p_dbl(lower = 0.001, upper = 0.02),
                              maxdepth = p_int(lower = 2, upper = 10)),
                  "ml_r" = ps(cp = p_dbl(lower = 0.001, upper = 0.02),
                              maxdepth = p_int(lower = 2, upper = 10)))

tune_settings = list(terminator = mlr3tuning::trm("evals", n_evals = 5),
                      algorithm = tnr("grid_search"), resolution = 5)

dml_rpart$tune(param_set = param_grid, tune_settings = tune_settings)

# Estimate target/causal parameter
dml_rpart$fit()
dml_rpart$print()
print(dml_rpart$params)

TuningInstanceSingleCrit is deprecated. Use TuningInstanceBatchSingleCrit instead.

TuningInstanceSingleCrit is deprecated. Use TuningInstanceBatchSingleCrit instead.

TuningInstanceSingleCrit is deprecated. Use TuningInstanceBatchSingleCrit instead.






------------------ Data summary ------------------
Outcome variable: y
Treatment variable: d
Covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23, X24, X25, X26, X27, X28, X29, X30, L.X1, L.X2, L.X3, L.X4, L.X5, L.X6, L.X7, L.X8, L.X9, L.X10, L.X11, L.X12, L.X13, L.X14, L.X15, L.X16, L.X17, L.X18, L.X19, L.X20, L.X21, L.X22, L.X23, L.X24, L.X25, L.X26, L.X27, L.X28, L.X29, L.X30
Instrument(s): z
Cluster variables: id
No. Observations: 10000
No. Groups: 1112

------------------ Score & algorithm ------------------
Score function: partialling out
DML algorithm: dml2
DML approach: transformed variables 

------------------ Machine learner ------------------
Learner of ml_l: regr.rpart
Learner of ml_m: regr.rpart
Learner of ml_r: regr.rpart
RMSE of ml_l : 2.088
RMSE of ml_m : 1.647
RMSE of ml_r : 2.273
Model RMSE: 5.472

------------------ Resampling ------------------
No. folds: 3
No. folds per cluster: 3
No. repeated samp

In [23]:
# 5. Set up DML estimation environment
# Boosted trees; just 10 trees for fast computing
set.seed(1408)
learner = lrn("regr.xgboost", nrounds = 10)
ml_l = learner$clone()
ml_m = learner$clone()
ml_r = learner$clone()

dml_xgboost = dml_approx_pliv$new(obj_dml_data,
               ml_l = ml_l, ml_m = ml_m, ml_r = ml_r,
               partialX = TRUE, partialZ = FALSE,
               n_folds = 3,
               score = "partialling out")

# set up a list of parameter grids
param_grid = list("ml_l" = ps(max_depth = p_int(lower = 2, upper = 10),
                              lambda = p_dbl(lower = 0, upper = 2)),
                  "ml_m" = ps(max_depth = p_int(lower = 2, upper = 10),
                              lambda = p_dbl(lower = 0, upper = 2)),
                  "ml_r" = ps(max_depth = p_int(lower = 2, upper = 10),
                              lambda = p_dbl(lower = 0, upper = 2)))

tune_settings = list(terminator = mlr3tuning::trm("evals", n_evals = 5),
                      algorithm = tnr("grid_search"), resolution= 5)


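# Note: $tune() is not called in this cell, so param_grid and tune_settings
# are unused and the learners are fit with nrounds = 10 and default values otherwise
# Estimate target/causal parameter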
dml_xgboost$fit()
dml_xgboost$print()
print(dml_xgboost$params)

No parameters provided for learners. Default values are used.

No parameters provided for learners. Default values are used.

No parameters provided for learners. Default values are used.






------------------ Data summary ------------------
Outcome variable: y
Treatment variable: d
Covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23, X24, X25, X26, X27, X28, X29, X30, L.X1, L.X2, L.X3, L.X4, L.X5, L.X6, L.X7, L.X8, L.X9, L.X10, L.X11, L.X12, L.X13, L.X14, L.X15, L.X16, L.X17, L.X18, L.X19, L.X20, L.X21, L.X22, L.X23, L.X24, L.X25, L.X26, L.X27, L.X28, L.X29, L.X30
Instrument(s): z
Cluster variables: id
No. Observations: 10000
No. Groups: 1112

------------------ Score & algorithm ------------------
Score function: partialling out
DML algorithm: dml2
DML approach: transformed variables 

------------------ Machine learner ------------------
Learner of ml_l: regr.xgboost
Learner of ml_m: regr.xgboost
Learner of ml_r: regr.xgboost
RMSE of ml_l : 1.942
RMSE of ml_m : 1.508
RMSE of ml_r : 2.077
Model RMSE: 4.917

------------------ Resampling ------------------
No. folds: 3
No. folds per cluster: 3
No. repeate

In [25]:
# Display table with results
library(xtable)

table = matrix(0, 3, 7)
table[1,] = cbind(dml_lasso$coef_theta,dml_lasso$se_theta,dml_lasso$pval_theta,dml_lasso$model_rmse,
                  as.numeric(dml_lasso$rmses["ml_l"]),as.numeric(dml_lasso$rmses["ml_m"]),as.numeric(dml_lasso$rmses["ml_r"]))
table[2,] = cbind(dml_rpart$coef_theta,dml_rpart$se_theta,dml_rpart$pval_theta,dml_rpart$model_rmse,
                  as.numeric(dml_rpart$rmses["ml_l"]),as.numeric(dml_rpart$rmses["ml_m"]),as.numeric(dml_rpart$rmses["ml_r"]))
table[3,] = cbind(dml_xgboost$coef_theta,dml_xgboost$se_theta,dml_xgboost$pval_theta,dml_xgboost$model_rmse,
                  as.numeric(dml_xgboost$rmses["ml_l"]),as.numeric(dml_xgboost$rmses["ml_m"]),as.numeric(dml_xgboost$rmses["ml_r"]))

colnames(table)= c("Estimate", "Std. Error", "P-value", "Model RMSE", "RMSE of l", "RMSE of m", "RMSE of r")
rownames(table)= c("DML-Lasso","DML-CART","DML-XGBOOST")
tab = xtable(table)
tab

|  | Estimate | Std. Error | P-value | Model RMSE | RMSE of l | RMSE of m | RMSE of r |
|---|---|---|---|---|---|---|---|
| DML-Lasso | 0.5184393 | 0.01485917 | 1.046908e-266 | 5.356925 | 2.120211 | 1.759345 | 2.306805 |
| DML-CART | 0.5167774 | 0.01526903 | 4.328994e-251 | 5.4722 | 2.088471 | 1.647033 | 2.273001 |
| DML-XGBOOST | 0.5390583 | 0.0155588 | 5.038701e-263 | 4.917116 | 1.941636 | 1.508097 | 2.076735 |
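
Since the data are simulated with a true effect of 0.5, a natural follow-up is to check whether each estimate is compatible with that value. The sketch below copies the estimates and standard errors from the output above and builds normal-approximation 95% confidence intervals; using the normal critical value is an assumption made here, not something reported by `XTDML`.

```r
# Estimates and standard errors copied from the results above
est <- c("DML-Lasso" = 0.5184393, "DML-CART" = 0.5167774, "DML-XGBOOST" = 0.5390583)
se  <- c(0.01485917, 0.01526903, 0.0155588)

crit <- qnorm(0.975)                       # two-sided 95% critical value
ci <- cbind(lower = est - crit * se, upper = est + crit * se)

# Does each interval contain the true effect 0.5?
covers_truth <- ci[, "lower"] <= 0.5 & 0.5 <= ci[, "upper"]
data.frame(round(ci, 4), covers_truth)
```

With these inputs the Lasso and CART intervals contain 0.5, while the XGBoost interval, obtained with only 10 boosting rounds and untuned hyperparameters, starts just above it.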
