# How to generate publication quality tables


Estimagic can create publication quality tables of parameter estimates in LaTeX or HTML. It works with the results from `estimate_ml` and `estimate_msm` but also supports statsmodels results out of the box. 

You can get almost limitless flexibility if you split the table generation into two steps. The first step generates a DataFrame that you can customize to your liking; the second renders that DataFrame in LaTeX or HTML.

In [1]:
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm
from IPython.display import HTML, Latex

from estimagic import estimation_table, render_html, render_latex
from estimagic.config import EXAMPLE_DIR

## Create tables from statsmodels results

In [2]:
df = pd.read_csv(EXAMPLE_DIR / "diabetes.csv", index_col=0)
mod1 = sm.ols("target ~ Age + Sex", data=df).fit()
mod2 = sm.ols("target ~ Age + Sex + BMI + ABP", data=df).fit()
models = [mod1, mod2]

In [3]:
HTML(estimation_table(models, return_type="html"))

| | target | target |
|---|---|---|
| | (1) | (2) |
| Intercept | 152.00$^{*** }$ | 152.00$^{*** }$ |
| | (3.61) | (2.85) |
| Age | 301.00$^{*** }$ | 37.20$^{ }$ |
| | (77.10) | (64.10) |
| Sex | 17.40$^{ }$ | -107.00$^{* }$ |
| | (77.10) | (62.10) |
| BMI | | 787.00$^{*** }$ |
| | | (65.40) |
| ABP | | 417.00$^{*** }$ |
| | | (69.50) |


## Adding estimagic results

`estimate_ml` and `estimate_msm` can both generate summaries of estimation results. Those summaries are either DataFrames with the columns `"value"`, `"standard_error"`, `"p_value"` and `"stars"` or pytrees containing such DataFrames. 

For examples, check out our tutorials on [`estimate_ml`](../../getting_started/first_likelihood_estimation_with_estimagic.ipynb) and [`estimate_msm`](../../getting_started/first_msm_estimation_with_estimagic.ipynb).


Assume we got the following DataFrame from an estimation summary:

In [4]:
params = pd.DataFrame(
    {
        "value": [142.123, 51.456, -33.789],
        "standard_error": [3.1415, 2.71828, 1.6180],
        "p_value": [1e-8] * 3,
    },
    index=["Intercept", "Age", "Sex"],
)
params

| | value | standard_error | p_value |
|---|---|---|---|
| Intercept | 142.123 | 3.1415 | 1e-08 |
| Age | 51.456 | 2.71828 | 1e-08 |
| Sex | -33.789 | 1.618 | 1e-08 |


You can pass either the params DataFrame on its own or a dictionary that contains `"params"` plus additional information to `estimation_table`.

In [5]:
mod3 = {"params": params, "name": "target", "info": {"n_obs": 445}}
models = [mod1, mod2, mod3]

In [6]:
HTML(estimation_table(models, return_type="html"))

| | target | target | target |
|---|---|---|---|
| | (1) | (2) | (3) |
| Intercept | 152.00$^{*** }$ | 152.00$^{*** }$ | 142.00$^{*** }$ |
| | (3.61) | (2.85) | (3.14) |
| Age | 301.00$^{*** }$ | 37.20$^{ }$ | 51.50$^{*** }$ |
| | (77.10) | (64.10) | (2.72) |
| Sex | 17.40$^{ }$ | -107.00$^{* }$ | -33.80$^{*** }$ |
| | (77.10) | (62.10) | (1.62) |
| BMI | | 787.00$^{*** }$ | |
| | | (65.40) | |
| ABP | | 417.00$^{*** }$ | |
| | | (69.50) | |
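
The summaries described in this section carry a `"stars"` column, while the hand-built `params` DataFrame does not. If you want to add such a column yourself, the following is a minimal sketch that uses only pandas. The helper `add_stars` and its thresholds (0.1, 0.05, 0.01) are assumptions made for illustration and are not part of estimagic; consult the docstring of `estimation_table` for how significance levels are actually handled.

```python
import pandas as pd


def add_stars(params, thresholds=(0.1, 0.05, 0.01)):
    """Return a copy of params with a "stars" column derived from "p_value".

    Hypothetical helper; the thresholds are an assumed convention.
    """
    out = params.copy()
    # Count how many thresholds each p-value falls below (0 to 3 stars).
    n_stars = sum((out["p_value"] < t).astype(int) for t in thresholds)
    out["stars"] = n_stars.map(lambda n: "*" * int(n))
    return out


# Illustrative p-values chosen so that each row gets a different result.
params = pd.DataFrame(
    {
        "value": [142.123, 51.456, -33.789],
        "standard_error": [3.1415, 2.71828, 1.6180],
        "p_value": [1e-8, 0.03, 0.5],
    },
    index=["Intercept", "Age", "Sex"],
)
with_stars = add_stars(params)
```

The function works on a copy, so the original `params` DataFrame keeps its three columns.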


## Selecting the right return_type

The following return types are supported:
- `"latex"`: Returns a string that you can save and import into a LaTeX document.
- `"html"`: Returns a string that you can save and import into an HTML document.
- `"render_inputs"`: Returns a dictionary with the following entries:
    - `"body"`: A DataFrame containing the main table.
    - `"footer"`: A DataFrame containing the statistics.
    - Further entries that you can ignore.
- `"dataframe"`: Returns a DataFrame you can look at in a notebook.
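
Since the `"latex"` and `"html"` return types are plain strings, saving them is ordinary file handling. The sketch below shows one way to do it. The helper `save_table`, the file name, and the stand-in string are assumptions for illustration; in practice the string would come from `estimation_table(models, return_type="latex")`.

```python
import tempfile
from pathlib import Path


def save_table(table_str, path):
    """Write a rendered table string to disk and return the path.

    Hypothetical helper, not part of estimagic.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(table_str, encoding="utf-8")
    return path


# Stand-in for the string returned by estimation_table.
latex_str = "\\begin{tabular}{lc}\nIntercept & 152.00 \\\\\n\\end{tabular}\n"

# A temporary directory keeps the example free of side effects.
out_dir = Path(tempfile.mkdtemp())
tex_path = save_table(latex_str, out_dir / "tables" / "regression.tex")
```

A file saved this way can then be pulled into a LaTeX document with `\input{tables/regression.tex}`.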

## Use `render_inputs` for maximum flexibility

As an example, let's assume we want to remove a few rows from the footer.

Let's first look at the footer we get from `estimation_table`:

In [11]:
render_inputs = estimation_table(models, return_type="render_inputs")
footer = render_inputs["footer"]
footer

| | target | target | target |
|---|---|---|---|
| | (1) | (2) | (3) |
| R$^2$ | 0.04 | 0.40 | |
| Adj. R$^2$ | 0.03 | 0.40 | |
| Residual Std. Error | 75.90 | 60.00 | |
| F Statistic | 8.06$^{***}$ | 72.90$^{***}$ | |
| Observations | 442 | 442 | 445.0 |


Now we can remove the rows we don't need and render the table to HTML.

In [12]:
render_inputs["footer"] = footer.loc[["R$^2$", "Observations"]]
HTML(render_html(**render_inputs))

| | target | target | target |
|---|---|---|---|
| | (1) | (2) | (3) |
| Intercept | 152.00$^{*** }$ | 152.00$^{*** }$ | 142.00$^{*** }$ |
| | (3.61) | (2.85) | (3.14) |
| Age | 301.00$^{*** }$ | 37.20$^{ }$ | 51.50$^{*** }$ |
| | (77.10) | (64.10) | (2.72) |
| Sex | 17.40$^{ }$ | -107.00$^{* }$ | -33.80$^{*** }$ |
| | (77.10) | (62.10) | (1.62) |
| BMI | | 787.00$^{*** }$ | |
| | | (65.40) | |
| ABP | | 417.00$^{*** }$ | |
| | | (69.50) | |
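
Because the footer is an ordinary DataFrame, other pandas operations work just as well as `.loc`. The sketch below drops rows by label and renames a remaining row. It runs on a stand-in DataFrame whose values mirror the footer printed earlier; the plain row index and the new label are assumptions, and the real `render_inputs["footer"]` may be structured differently.

```python
import pandas as pd

# Stand-in for render_inputs["footer"]; not produced by estimagic here.
columns = pd.MultiIndex.from_tuples(
    [("target", "(1)"), ("target", "(2)"), ("target", "(3)")]
)
footer = pd.DataFrame(
    [
        ["0.04", "0.40", ""],
        ["0.03", "0.40", ""],
        ["75.90", "60.00", ""],
        ["8.06$^{***}$", "72.90$^{***}$", ""],
        ["442", "442", "445.0"],
    ],
    index=[
        "R$^2$",
        "Adj. R$^2$",
        "Residual Std. Error",
        "F Statistic",
        "Observations",
    ],
    columns=columns,
)

# Drop rows by label instead of listing the rows to keep.
trimmed = footer.drop(index=["Adj. R$^2$", "Residual Std. Error", "F Statistic"])
# Give a remaining row a more descriptive (hypothetical) label.
trimmed = trimmed.rename(index={"Observations": "Number of observations"})
```

Dropping by label is convenient when you want to keep most rows; selecting with `.loc` is shorter when you keep only a few.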


## LaTeX peculiarities

- describe the warning and how to silence it
- describe what needs to go into the preamble
- show one example

## Advanced options 

Show one example where many optional arguments are used at once:

- options dictionary with all standard entries
- custom param names
- custom col names
- custom col groups
- custom number format
- title

Everything that is not in this list is left to the docstring.

In [13]:
stats_dict = {
    "Observations": "n_obs",
    "R$^2$": "rsquared",
    "Adj. R$^2$": "rsquared_adj",
    "Residual Std. Error": "resid_std_err",
    "F Statistic": "fvalue",
    "show_dof": True,
}