# How to generate publication quality tables


Estimagic can create publication quality tables of parameter estimates in LaTeX or HTML. It works with the results from `estimate_ml` and `estimate_msm` but also supports statsmodels results out of the box. 

For maximum flexibility, split the table generation into two steps: the first generates a DataFrame that you can customize to your liking, and the second renders that DataFrame in LaTeX or HTML.

In [1]:
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm
from IPython.display import HTML, Latex

from estimagic import estimation_table, render_html, render_latex
from estimagic.config import EXAMPLE_DIR

## Create tables from statsmodels results

In [2]:
df = pd.read_csv(EXAMPLE_DIR / "diabetes.csv", index_col=0)
df.rename({"S1": "S_1"}, inplace=True, axis=1)
mod1 = sm.ols("target ~ Age + Sex + S_1", data=df).fit()
mod2 = sm.ols("target ~ Age + Sex + BMI + ABP", data=df).fit()
models = [mod1, mod2]

In [3]:
HTML(estimation_table(models, return_type="html"))

| | target | target |
|---|---|---|
| | (1) | (2) |
| Intercept | 152.00$^{*** }$ | 152.00$^{*** }$ |
| | (3.56) | (2.85) |
| Age | 227.00$^{*** }$ | 37.20$^{ }$ |
| | (78.70) | (64.10) |
| Sex | 20.30$^{ }$ | -107.00$^{* }$ |
| | (76.00) | (62.10) |
| S_1 | 284.00$^{*** }$ | |
| | (77.50) | |
| BMI | | 787.00$^{*** }$ |
| | | (65.40) |


## Adding estimagic results

`estimate_ml` and `estimate_msm` can both generate summaries of estimation results. Those summaries are either DataFrames with the columns `"value"`, `"standard_error"`, `"p_value"` and `"stars"` or pytrees containing such DataFrames. 

For examples, check out our tutorials on [`estimate_ml`](../../getting_started/first_likelihood_estimation_with_estimagic.ipynb) and [`estimate_msm`](../../getting_started/first_msm_estimation_with_estimagic.ipynb).
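
To make the relation between the `"p_value"` and `"stars"` columns concrete, here is a minimal sketch of how significance stars can be derived from p-values. The thresholds 0.1, 0.05 and 0.01 and the helper name `significance_stars` are assumptions chosen for illustration, and the p-values are made up; this is not estimagic's own implementation.

```python
def significance_stars(p_value, levels=(0.1, 0.05, 0.01)):
    """Return one star for every threshold in `levels` that p_value is below.

    The default thresholds are an assumed convention, not taken from estimagic.
    """
    return "*" * sum(p_value < level for level in levels)


# Made-up p-values, for illustration only.
p_values = {"Intercept": 1e-8, "Age": 0.03, "Sex": 0.4}
stars = {name: significance_stars(p) for name, p in p_values.items()}
print(stars)  # {'Intercept': '***', 'Age': '**', 'Sex': ''}
```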


Assume we got the following DataFrame from an estimation summary:

In [4]:
params = pd.DataFrame(
    {
        "value": [142525262.123, 51.456, -33.789],
        "standard_error": [3.1415, 2.71828, 1.6180],
        "p_value": [1e-8] * 3,
    },
    index=["Intercept", "Age", "Sex"],
)
params

| | value | standard_error | p_value |
|---|---|---|---|
| Intercept | 142525300.0 | 3.1415 | 1e-08 |
| Age | 51.456 | 2.71828 | 1e-08 |
| Sex | -33.789 | 1.618 | 1e-08 |


You can pass either the params DataFrame itself or a dictionary containing `"params"` and additional information to `estimation_table`.

In [5]:
mod3 = {"params": params, "name": "target", "info": {"n_obs": 445}}
models = [mod1, mod2, mod3]

In [9]:
Latex(estimation_table(models, return_type="latex", render_options={"siunitx": True}))

                   \sisetup{
                        group-digits             = false,
                        input-symbols            = (),
                        table-align-text-post    = false
                    }
                    to your main tex file. To turn


<IPython.core.display.Latex object>

In [21]:
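import re

# Extract the numeric part of a formatted table cell.
# The raw string avoids invalid escape sequences such as "\d".
re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", "45.5$***")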

['45.5']

In [24]:
"45.5$***".split("45.5")

['', '$***']
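
The two cells above can be combined into one small helper that separates the number in a formatted cell from whatever follows it, for example the significance stars. This is only a sketch: the function name `split_number_and_suffix` is hypothetical, and the pattern is the one used above, written as a raw string.

```python
import re

# Same pattern as above: optional sign, digits, optional thousands groups,
# optional decimal part and optional exponent.
NUMBER = re.compile(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?")


def split_number_and_suffix(cell):
    """Split a formatted cell into (number, trailing text).

    Returns ("", cell) if the cell contains no number.
    """
    match = NUMBER.search(cell)
    if match is None:
        return "", cell
    return match.group(0), cell[match.end():]


print(split_number_and_suffix("45.5$***"))  # ('45.5', '$***')
```

Slicing at `match.end()` instead of calling `str.split` avoids surprises when the same digits appear a second time later in the cell.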


## Selecting the right return_type

The following return types are supported:
- `"latex"`: Returns a string that you can save and import into a LaTeX document.
- `"html"`: Returns a string that you can save and import into an HTML document.
- `"render_inputs"`: Returns a dictionary with the following entries:
    - `"body"`: A DataFrame containing the main table.
    - `"footer"`: A DataFrame containing the statistics.
    - Other entries that are passed on to the rendering functions and that you can ignore.
- `"dataframe"`: Returns a DataFrame you can look at in a notebook.
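
As a minimal sketch of the "save and import" workflow for the string return types: the snippet below writes a table string to a `.tex` file. The string `table_tex` is a hand-written placeholder standing in for the output of `estimation_table(models, return_type="latex")`, and the file name `regression_table.tex` is an arbitrary choice.

```python
import tempfile
from pathlib import Path

# Placeholder for the string returned by
# estimation_table(models, return_type="latex"); not real estimagic output.
table_tex = "\\begin{tabular}{lS}\nIntercept & 152.00 \\\\\n\\end{tabular}\n"

# Write to a temporary directory here; in a project you would use a
# directory that your main LaTeX file can reach.
out_dir = Path(tempfile.mkdtemp())
table_path = out_dir / "regression_table.tex"
table_path.write_text(table_tex, encoding="utf-8")

print(table_path.name)  # regression_table.tex
```

In the main LaTeX file the saved table can then be included with `\input{regression_table}`. The same pattern works for `return_type="html"` with an `.html` file.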

## Use `render_inputs` for maximum flexibility

As an example, let's assume we want to remove a few rows from the footer.

Let's first recall the list of models and then look at the footer we get from `estimation_table`:

In [7]:
models

[<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x7f6e16c70f10>,
 <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x7f6db30d7160>,
 {'params':                   value  standard_error       p_value
  Intercept  1.425253e+08         3.14150  1.000000e-08
  Age        5.145600e+01         2.71828  1.000000e-08
  Sex       -3.378900e+01         1.61800  1.000000e-08,
  'name': 'target',
  'info': {'n_obs': 445}}]

In [8]:
render_inputs = estimation_table(
    models, return_type="render_inputs", add_trailing_zeros=True
)
footer = render_inputs["footer"]
footer

| | target | target | target |
|---|---|---|---|
| | (1) | (2) | (3) |
| R$^2$ | 0.06 | 0.40 | |
| Adj. R$^2$ | 0.06 | 0.40 | |
| Residual Std. Error | 74.80 | 60.00 | |
| F Statistic | 9.98$^{***}$ | 72.90$^{***}$ | |
| Observations | 442 | 442 | 445.0 |


Now we can remove the rows we don't need and render the table to HTML.

In [9]:
render_inputs["footer"] = footer.loc[["R$^2$", "Observations"]]
HTML(render_html(**render_inputs))

| | target | target | target |
|---|---|---|---|
| | (1) | (2) | (3) |
| Intercept | 152.00$^{*** }$ | 152.00$^{*** }$ | 1.43e+08$^{*** }$ |
| | (3.56) | (2.85) | (3.14) |
| Age | 227.00$^{*** }$ | 37.20$^{ }$ | 51.50$^{*** }$ |
| | (78.70) | (64.10) | (2.72) |
| Sex | 20.30$^{ }$ | -107.00$^{* }$ | -33.80$^{*** }$ |
| | (76.00) | (62.10) | (1.62) |
| S_1 | 284.00$^{*** }$ | | |
| | (77.50) | | |
| BMI | | 787.00$^{*** }$ | |
| | | (65.40) | |


## LaTeX peculiarities

- describe the warning and how to silence it
- describe what needs to go into the preamble
- show one example
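
A minimal sketch covering the preamble and the warning, under stated assumptions: the `\sisetup` block is copied from the warning message shown earlier in this guide, loading `booktabs` is an assumption (it is commonly needed for horizontal rules), and the helper names `wrap_in_document` and `call_quietly` are hypothetical. Silencing is done with Python's standard `warnings` module, which suppresses all warnings raised during the call, not only the siunitx one.

```python
import warnings

# Settings copied from the warning message shown earlier in this guide.
SISETUP = r"""\sisetup{
    group-digits             = false,
    input-symbols            = (),
    table-align-text-post    = false
}"""


def wrap_in_document(table_tex):
    """Embed a LaTeX table string in a minimal standalone document."""
    lines = [
        r"\documentclass{article}",
        r"\usepackage{siunitx}",
        r"\usepackage{booktabs}",  # assumption: needed for \toprule etc.
        SISETUP,
        r"\begin{document}",
        table_tex,
        r"\end{document}",
    ]
    return "\n".join(lines)


def call_quietly(func, *args, **kwargs):
    """Call func while suppressing every Python warning it raises."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return func(*args, **kwargs)
```

With estimagic available, `call_quietly(estimation_table, models, return_type="latex")` would return the table string without printing the warning, and `wrap_in_document` would turn that string into a document that compiles on its own.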

## Advanced options 

Show one example where many optional arguments are used at once:

- options dictionary with all standard entries
- custom param names
- custom col names
- custom col groups
- custom number format
- title

Everything that is not in this list is left to the docstring.

In [10]:
stats_dict = {
    "Observations": "n_obs",
    "R$^2$": "rsquared",
    "Adj. R$^2$": "rsquared_adj",
    "Residual Std. Error": "resid_std_err",
    "F Statistic": "fvalue",
    "show_dof": True,
}

In [11]:
df = render_inputs["body"].copy(deep=True)
# Rename index and column labels so that they contain underscores,
# which have to be escaped in LaTeX.
df = df.rename(index={"Intercept": "Inter_cept"}, columns={"target": "tar_get"})

In [12]:
s = df.style
s = s.hide(names=True)
s = s.format_index(escape="latex")
s = s.format_index(escape="latex", axis=1)

In [14]:
Latex(
    s.to_latex(
        siunitx=True,
        environment="table",
        column_format="lSSS",
        multicol_align="c",
        hrules=True,
    )
)

<IPython.core.display.Latex object>

