# DoubleML + MakeTables Integration Demo

This notebook shows how DoubleML models integrate with MakeTables to produce publication-ready regression tables in HTML and LaTeX.

## Setup

In [1]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression
import doubleml as dml
from maketables import ETable

# Set random seed for reproducibility
np.random.seed(42)

## Example 1: Basic PLR Model

Let's start with a simple Partially Linear Regression (PLR) model estimating the effect of education on income.

In [2]:
# Generate synthetic data
n = 1000
p = 10

X = np.random.normal(size=(n, p))
education = 0.5 * X[:, 0] + 0.3 * X[:, 1] + np.random.normal(size=n)
income = 0.8 * education + X[:, 2] + 0.5 * X[:, 3] + np.random.normal(size=n)

df = pd.DataFrame(
    np.column_stack((X, income, education)),
    columns=[f"X{i+1}" for i in range(p)] + ["income", "education"]
)

print(f"Data shape: {df.shape}")
print("\nFirst few rows:")
df.head()

Data shape: (1000, 12)

First few rows:


          X1         X2         X3         X4         X5         X6         X7         X8         X9        X10     income  education
0   0.496714  -0.138264   0.647689    1.52303  -0.234153  -0.234137   1.579213   0.767435  -0.469474    0.54256   -0.14627  -0.471617
1  -0.463418   -0.46573   0.241962   -1.91328  -1.724918  -0.562288  -1.012831   0.314247  -0.908024  -1.412304  -0.823369  -0.676927
2   1.465649  -0.225776   0.067528  -1.424748  -0.544383   0.110923  -1.150994   0.375698  -0.600639  -0.291694   0.520076    0.06771
3  -0.601707   1.852278  -0.013497  -1.057711   0.822545  -1.220844   0.208864   -1.95967  -1.328186   0.196861   0.505031   0.365248
4   0.738467   0.171368  -0.115648  -0.301104  -1.478522  -0.719844  -0.460639   1.057122   0.343618   -1.76304   2.112535   1.617822


In [3]:
# Prepare data for DoubleML
dml_data = dml.DoubleMLData(df, "income", "education")

# Fit PLR model
ml_l = LinearRegression()
ml_m = LinearRegression()

dml_plr = dml.DoubleMLPLR(dml_data, ml_l, ml_m, n_folds=5, score="partialling out")
dml_plr.fit()

# Show standard DoubleML summary
print("DoubleML Summary:")
print(dml_plr.summary)

DoubleML Summary:
               coef   std err          t          P>|t|     2.5 %    97.5 %
education  0.830282  0.032441  25.593745  1.790892e-144  0.766699  0.893865
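
For intuition about what the `partialling out` score computes, the estimate can be approximated by hand: residualize both the outcome and the treatment on the covariates using cross-fitting, then regress one set of residuals on the other. The following is a minimal NumPy-only sketch under stated assumptions, not DoubleML's implementation. The helper names `ols_fit_predict` and `plr_partialling_out` are hypothetical, plain OLS stands in for the learners, and the data are drawn with a different random generator than the notebook uses, so the numbers will not match the summary above exactly.

```python
import numpy as np


def ols_fit_predict(X_train, y_train, X_test):
    """Fit OLS with an intercept on the training fold, predict on the test fold."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ beta


def plr_partialling_out(y, d, X, n_folds=5, seed=0):
    """Hypothetical helper: cross-fitted partialling-out estimate and std. error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_res = np.empty(len(y))
    d_res = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Out-of-fold residuals of the outcome and of the treatment
        y_res[test_idx] = y[test_idx] - ols_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ols_fit_predict(X[train], d[train], X[test_idx])
    # Residual-on-residual regression without intercept
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Standard error from the empirical score function
    psi = (y_res - theta * d_res) * d_res
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / len(y))
    return theta, se


# Same data-generating process as in the cell above (true effect: 0.8)
rng = np.random.default_rng(42)
n, n_features = 1000, 10
X = rng.normal(size=(n, n_features))
education = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)
income = 0.8 * education + X[:, 2] + 0.5 * X[:, 3] + rng.normal(size=n)

theta_hat, se_hat = plr_partialling_out(income, education, X)
print(f"theta_hat = {theta_hat:.3f}, se_hat = {se_hat:.3f}")
```

The estimate should land close to the true effect of 0.8, with a standard error of similar size to the one DoubleML reports.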


### Inspect MakeTables Attributes

DoubleML models expose special `__maketables_*` attributes and methods, which MakeTables reads to build its tables:

In [4]:
# Coefficient table
print("Coefficient Table (__maketables_coef_table__):")
print(dml_plr.__maketables_coef_table__)

print(f"\nSample Size: {dml_plr.__maketables_stat__('N')}")
print(f"Dependent Variable: {dml_plr.__maketables_depvar__}")
print(f"Default Statistics: {dml_plr.__maketables_default_stat_keys__}")

Coefficient Table (__maketables_coef_table__):
                  b        se              p          t     ci95l     ci95u
education  0.830282  0.032441  1.790892e-144  25.593745  0.766699  0.893865

Sample Size: 1000
Dependent Variable: income
Default Statistics: ['N']
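
To illustrate how an attribute-based interface like this can work, here is a sketch of a hypothetical `ToyModel` class that exposes the four names printed above. The attribute names and coefficient-table columns are taken from the output in this notebook. Everything else is an assumption: the class, the `looks_table_ready` helper, and the normal-approximation formulas for `t`, `p`, and the confidence bounds. Whether MakeTables would accept such an object as-is has not been verified here.

```python
import math

import pandas as pd


class ToyModel:
    """Hypothetical stand-in that mimics the attributes printed above."""

    def __init__(self, term, b, se, depvar, n_obs):
        t = b / se
        p = math.erfc(abs(t) / math.sqrt(2))  # two-sided normal p-value
        z = 1.959964  # 97.5% quantile of the standard normal distribution
        self._coef_table = pd.DataFrame(
            {"b": [b], "se": [se], "p": [p], "t": [t],
             "ci95l": [b - z * se], "ci95u": [b + z * se]},
            index=[term],
        )
        self._depvar = depvar
        self._n_obs = n_obs

    @property
    def __maketables_coef_table__(self):
        return self._coef_table

    @property
    def __maketables_depvar__(self):
        return self._depvar

    @property
    def __maketables_default_stat_keys__(self):
        return ["N"]

    def __maketables_stat__(self, key):
        # Return None for statistics this toy model does not provide
        return {"N": self._n_obs}.get(key)


REQUIRED_NAMES = (
    "__maketables_coef_table__",
    "__maketables_depvar__",
    "__maketables_default_stat_keys__",
    "__maketables_stat__",
)


def looks_table_ready(obj):
    """Duck-typing check: does the object expose all four names?"""
    return all(hasattr(obj, name) for name in REQUIRED_NAMES)


toy = ToyModel("education", b=0.830282, se=0.032441, depvar="income", n_obs=1000)
print(looks_table_ready(toy))
print(toy.__maketables_coef_table__)
```

Because the check relies only on attribute names, the modelling library needs no import of the table library, which is the decoupling this design aims for.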


### Create Table with MakeTables

Now let's create a publication-ready table using MakeTables:

In [6]:
# Create table
table = ETable([dml_plr], show_se=True, model_stats=['N'])

table

                          income
                             (1)
coef
education       0.830*** (0.032)
stats
Observations                1000
Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)

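The coefficient cell above combines three columns of the coefficient table: the estimate, significance stars derived from the p-value, and the standard error in parentheses. The sketch below reproduces that formatting using the thresholds stated in the table note. The functions `significance_stars` and `coef_cell` are hypothetical illustrations, not MakeTables code.

```python
def significance_stars(p):
    """Map a p-value to stars using the thresholds from the table note."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


def coef_cell(b, se, p, digits=3):
    """Format one cell as 'Coefficient (Std. Error)' with stars."""
    return f"{b:.{digits}f}{significance_stars(p)} ({se:.{digits}f})"


# Values taken from the coefficient table printed earlier
cell = coef_cell(b=0.830282, se=0.032441, p=1.790892e-144)
print(cell)
```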



In [7]:
# Display LaTeX output
print("LaTeX Table Code:")
print(table.make('tex'))

LaTeX Table Code:
\begin{threeparttable}
\begingroup
\renewcommand\cellalign{t}
\renewcommand\arraystretch{1}
\setlength{\tabcolsep}{3pt}
\begin{tabularx}{\linewidth}{@{}>{\raggedright\arraybackslash}l>{\centering\arraybackslash}X}
\toprule
 & \multicolumn{1}{c}{income} \\
\cmidrule(lr){2-2}
 & (1) \\
\midrule
\addlinespace[1ex]
education & \makecell{0.830*** \\ (0.032)} \\
\addlinespace[0.5ex]
\midrule
\addlinespace[1ex]
Observations & 1,000 \\
\addlinespace[0.5ex]
\bottomrule
\end{tabularx}
\endgroup
\noindent\begin{minipage}{\linewidth}\smallskip\footnotesize
Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient   (Std. Error)\end{minipage}

\end{threeparttable}
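
To compile this snippet, a LaTeX document must load the packages that define its commands: `booktabs` (for `\toprule`, `\cmidrule`, `\addlinespace`), `tabularx`, `threeparttable`, and `makecell`. This list is inferred from the commands visible in the output above, not taken from the MakeTables documentation, which may recommend additional packages. The helper `wrap_in_document` below is a hypothetical convenience for building a minimal standalone file around the table code.

```python
# Packages inferred from the commands in the printed LaTeX (an assumption)
REQUIRED_PACKAGES = ["booktabs", "tabularx", "threeparttable", "makecell"]


def wrap_in_document(table_tex):
    """Hypothetical helper: embed table code in a minimal LaTeX document."""
    preamble = "\n".join(f"\\usepackage{{{pkg}}}" for pkg in REQUIRED_PACKAGES)
    return (
        "\\documentclass{article}\n"
        + preamble
        + "\n\\begin{document}\n"
        + table_tex.strip()
        + "\n\\end{document}\n"
    )


# Placeholder standing in for the string printed by table.make('tex')
table_tex = "% paste the LaTeX table code here"
document = wrap_in_document(table_tex)
print(document)
```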


## Example 2: Comparing Multiple Models

MakeTables can place several models side by side in a single table, which makes comparisons straightforward.

In [8]:
# Generate data with two treatments
np.random.seed(43)
n = 1000
p = 8

X = np.random.normal(size=(n, p))
education = 0.5 * X[:, 0] + 0.2 * X[:, 1] + np.random.normal(size=n)
experience = 0.3 * X[:, 2] + 0.4 * X[:, 3] + np.random.normal(size=n)
income = 0.6 * education + 0.4 * experience + X[:, 4] + np.random.normal(size=n)

df2 = pd.DataFrame(
    np.column_stack((X, income, education, experience)),
    columns=[f"X{i+1}" for i in range(p)] + ["income", "education", "experience"]
)

# Fit separate models for each treatment
dml_data_edu = dml.DoubleMLData(df2, "income", "education")
dml_data_exp = dml.DoubleMLData(df2, "income", "experience")

dml_edu = dml.DoubleMLPLR(dml_data_edu, LinearRegression(), LinearRegression(), n_folds=5)
dml_exp = dml.DoubleMLPLR(dml_data_exp, LinearRegression(), LinearRegression(), n_folds=5)

dml_edu.fit()
dml_exp.fit()

print("Model 1 (Education effect):")
print(dml_edu.summary)
print("\nModel 2 (Experience effect):")
print(dml_exp.summary)

Model 1 (Education effect):
               coef   std err         t         P>|t|     2.5 %    97.5 %
education  0.620113  0.032109  19.31289  4.184858e-83  0.557181  0.683045

Model 2 (Experience effect):
               coef   std err          t         P>|t|     2.5 %    97.5 %
experience  0.42868  0.032385  13.237065  5.360313e-40  0.365207  0.492153


In [9]:
# Create comparison table
comparison_table = ETable(
    [dml_edu, dml_exp],
    show_se=True,
    model_stats=['N'],
    model_heads=['Education Model', 'Experience Model'],
    caption='Comparison of Treatment Effects on Income'
)

comparison_table

Comparison of Treatment Effects on Income
                          income              income
                 Education Model    Experience Model
                             (1)                 (2)
coef
education       0.620*** (0.032)
experience                          0.429*** (0.032)
stats
Observations                1000                1000
Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)

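In the comparison above, each row is a variable name and a cell stays empty when a model has no estimate for that variable. The alignment logic can be sketched as a union of row names across models. The function `align_models` is a hypothetical illustration, not the MakeTables implementation, and it takes already formatted cell strings as input.

```python
def align_models(models):
    """Align per-model {variable: cell} dicts into rows, one column per model."""
    row_names = []
    for cells in models:
        for name in cells:
            if name not in row_names:  # keep first-seen order
                row_names.append(name)
    # Missing estimates become empty strings
    return {name: [cells.get(name, "") for cells in models] for name in row_names}


# Cell strings copied from the comparison table above
education_model = {"education": "0.620*** (0.032)"}
experience_model = {"experience": "0.429*** (0.032)"}

rows = align_models([education_model, experience_model])
for name, cells in rows.items():
    print(f"{name:<12}" + "".join(f"{cell:>20}" for cell in cells))
```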



## Example 3: Binary Treatment (IRM Model)

Next, we estimate the effect of a binary treatment using the Interactive Regression Model (IRM).

In [10]:
# Generate data with binary treatment
np.random.seed(44)
n = 1000
p = 8

X = np.random.normal(size=(n, p))
propensity = 1 / (1 + np.exp(-0.5 * X[:, 0] - 0.3 * X[:, 1]))
treatment = (np.random.uniform(size=n) < propensity).astype(float)
outcome = 0.7 * treatment + X[:, 2] + 0.5 * X[:, 3] + np.random.normal(size=n)

df_irm = pd.DataFrame(
    np.column_stack((X, outcome, treatment)),
    columns=[f"X{i+1}" for i in range(p)] + ["outcome", "treatment"]
)

# Fit IRM model
dml_data_irm = dml.DoubleMLData(df_irm, "outcome", "treatment")

dml_irm = dml.DoubleMLIRM(
    dml_data_irm,
    LinearRegression(),
    LogisticRegression(max_iter=1000),
    n_folds=5,
    score="ATE"
)
dml_irm.fit()

print("IRM Summary:")
print(dml_irm.summary)

IRM Summary:
               coef   std err         t         P>|t|     2.5 %    97.5 %
treatment  0.635458  0.070924  8.959683  3.256131e-19  0.496449  0.774467
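
For intuition about the quantity reported above, the average treatment effect can be approximated by hand with a doubly robust (augmented inverse-probability-weighted) estimator. It combines separate outcome regressions for treated and control units with an estimated propensity score. The sketch below makes several simplifying assumptions and is not DoubleML's implementation: it omits cross-fitting, fits an unregularized logistic regression by Newton's method instead of using scikit-learn, and draws the data with a different random generator, so the numbers will differ from the summary above. All helper names are hypothetical.

```python
import numpy as np


def add_intercept(X):
    return np.column_stack([np.ones(len(X)), X])


def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    beta, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return beta


def fit_logit(X, t, n_iter=25):
    """Unregularized logistic regression fitted by Newton's method."""
    A = add_intercept(X)
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-A @ beta))
        gradient = A.T @ (t - prob)
        hessian = (A * (prob * (1.0 - prob))[:, None]).T @ A
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


# Same data-generating process as in the cell above (true ATE: 0.7)
rng = np.random.default_rng(44)
n, n_features = 1000, 8
X = rng.normal(size=(n, n_features))
propensity = 1 / (1 + np.exp(-0.5 * X[:, 0] - 0.3 * X[:, 1]))
treatment = (rng.uniform(size=n) < propensity).astype(float)
outcome = 0.7 * treatment + X[:, 2] + 0.5 * X[:, 3] + rng.normal(size=n)

A = add_intercept(X)
g1 = A @ fit_ols(X[treatment == 1], outcome[treatment == 1])  # E[Y | D=1, X]
g0 = A @ fit_ols(X[treatment == 0], outcome[treatment == 0])  # E[Y | D=0, X]
m = 1.0 / (1.0 + np.exp(-A @ fit_logit(X, treatment)))        # P(D=1 | X)

# Doubly robust score: regression difference plus weighted residual corrections
psi = (g1 - g0
       + treatment * (outcome - g1) / m
       - (1 - treatment) * (outcome - g0) / (1 - m))
ate_hat = psi.mean()
se_hat = psi.std(ddof=1) / np.sqrt(n)
print(f"ATE estimate = {ate_hat:.3f}, std. error = {se_hat:.3f}")
```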


In [12]:
# Create table for IRM model
irm_table = ETable(
    [dml_irm],
    show_se=True,
    model_stats=['N'],
    caption='Average Treatment Effect (ATE) Estimation'
)

irm_table

Average Treatment Effect (ATE) Estimation
                         outcome
                             (1)
coef
treatment       0.635*** (0.071)
stats
Observations                1000
Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)




## Example 4: Customized Table Formatting

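MakeTables lets you customize what each table shows. In the cells below, `coef_fmt` sets the contents of each coefficient cell, while `caption` and `notes` set the title and the footnote.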

In [None]:
# Create table with custom formatting
custom_table = ETable(
    [dml_plr],
    coef_fmt="b:.3f \n [ci95l:.3f, ci95u:.3f]",  # Show CI instead of SE
    model_stats=['N'],
    caption='Custom Formatted Table with Confidence Intervals',
    notes='95% confidence intervals shown in brackets.'
)

# Display HTML with custom styling
display(custom_table.make('html', gt_style={'table_font_size': '14px'}))

In [13]:
# Another example: showing t-statistics
t_stat_table = ETable(
    [dml_plr],
    coef_fmt="b:.4f \n (t:.2f)",  # Show t-stat instead of SE
    model_stats=['N'],
    caption='Table with t-statistics',
    notes='t-statistics shown in parentheses.'
)

t_stat_table

Table with t-statistics
                          income
                             (1)
coef
education         0.8303 (25.59)
stats
Observations                1000
t-statistics shown in parentheses.
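
The `coef_fmt` strings used in this example pair column names from the coefficient table (`b`, `t`, `ci95l`, `ci95u`) with a number format such as `.3f`. As a rough illustration of how such a template can be expanded, the sketch below substitutes each `name:format` token with the formatted value from one row. The function `render_cell` and its regular expression are hypothetical and cover only the tokens seen in this notebook. MakeTables' own parser may support more tokens and behave differently.

```python
import re

# Column names from the coefficient table, optionally followed by ':<format>'
TOKEN = re.compile(r"\b(ci95l|ci95u|se|b|t|p)\b(?::(\.\d+f))?")


def render_cell(template, row, default_spec=".3f"):
    """Replace each token in the template with the formatted row value."""
    def substitute(match):
        name = match.group(1)
        spec = match.group(2) or default_spec
        return format(row[name], spec)

    return TOKEN.sub(substitute, template)


# Values from the coefficient table printed in Example 1
row = {"b": 0.830282, "se": 0.032441, "t": 25.593745,
       "p": 1.790892e-144, "ci95l": 0.766699, "ci95u": 0.893865}

t_cell = render_cell("b:.4f \n (t:.2f)", row)
ci_cell = render_cell("b:.3f \n [ci95l:.3f, ci95u:.3f]", row)
print(t_cell)
print(ci_cell)
```

The newline in the template is kept as-is here. In the rendered table above it separates the estimate from the statistic in parentheses.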




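## Example 5: Multiple Treatments in One Model

A single `DoubleMLPLR` model can also estimate several treatment effects at once. This example reuses `df2` from Example 2 and passes both `education` and `experience` as treatment variables.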

In [None]:
# Fit model with multiple treatments
dml_data_multi = dml.DoubleMLData(df2, "income", ["education", "experience"])

dml_multi = dml.DoubleMLPLR(
    dml_data_multi,
    LinearRegression(),
    LinearRegression(),
    n_folds=5
)
dml_multi.fit()

print("Multi-treatment Summary:")
print(dml_multi.summary)

In [None]:
# Create table
multi_table = ETable(
    [dml_multi],
    show_se=True,
    model_stats=['N'],
    caption='Joint Estimation of Multiple Treatment Effects',
    labels={'education': 'Years of Education', 'experience': 'Years of Experience'}
)

multi_table.make('html')

## Saving Tables

You can save tables to files for use in papers and presentations:

In [None]:
# Save as LaTeX
table.save('tex', 'table_results.tex')
print("✅ Saved to table_results.tex")

# Save as HTML
table.save('html', 'table_results.html')
print("✅ Saved to table_results.html")

# Save as Word document
table.save('docx', 'table_results.docx')
print("✅ Saved to table_results.docx")
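
If you need more control over where a file is written, for example to create an output directory first, the table code can also be written with the standard library. This sketch assumes that `table.make('tex')` returns a string, as the printed LaTeX output earlier suggests. The helper `write_table` is hypothetical, and a placeholder string stands in for the real table code so that the example runs on its own.

```python
import tempfile
from pathlib import Path


def write_table(text, path):
    """Hypothetical helper: write table code to a file, creating parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


# Placeholder for the string returned by table.make('tex') (an assumption)
table_code = "% LaTeX table code goes here\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    target = write_table(table_code, Path(tmp_dir) / "tables" / "table_results.tex")
    saved_name = target.name
    saved_text = target.read_text(encoding="utf-8")

print(f"Wrote {len(saved_text)} characters to {saved_name}")
```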

## Summary

This notebook demonstrated:

1. **Basic Integration**: DoubleML models automatically work with MakeTables
2. **Model Comparison**: Easy side-by-side comparison of multiple models
3. **Different Model Types**: Works with PLR, IRM, and other DoubleML models
4. **Customization**: Flexible formatting options for coefficients and statistics
5. **Multiple Treatments**: Handles models with multiple treatment variables
6. **Export Options**: Save to LaTeX, HTML, Word, or Typst formats

### Key Advantages

- **Zero Coupling**: DoubleML doesn't depend on MakeTables
- **Automatic Detection**: MakeTables finds the special attributes automatically
- **Publication Ready**: Beautiful tables suitable for papers and presentations
- **Flexible**: Extensive customization options available