# How to generate publication quality tables


Estimagic can create publication quality tables of parameter estimates in LaTeX or HTML. It works with the results from `estimate_ml` and `estimate_msm` but also supports statsmodels results out of the box. 

You get the most flexibility if you split the table generation into two steps. The first generates a DataFrame that you can customize to your liking; the second renders that DataFrame in LaTeX or HTML.
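
The two-step idea can be illustrated with plain pandas. This is only a minimal sketch: the cell values are made up, and `DataFrame.to_html` stands in for the rendering step, so nothing here is estimagic's own API.

```python
import pandas as pd

# Step 1: a DataFrame holding the formatted table cells.
# (Hypothetical values, used only for illustration.)
body = pd.DataFrame(
    {"(1)": ["1.50", "(0.20)"], "(2)": ["2.10", "(0.35)"]},
    index=["Age", ""],
)

# Customize with ordinary pandas operations before rendering.
body = body.rename(index={"Age": "Age (years)"})

# Step 2: render the customized DataFrame to an HTML string.
html = body.to_html()
```

Because the intermediate object is an ordinary DataFrame, any pandas operation (renaming, dropping or reordering rows) can be applied between the two steps.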

In [32]:
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm
from IPython.display import HTML, Latex

from estimagic import estimation_table, render_html, render_latex
from estimagic.config import EXAMPLE_DIR

## Create tables from statsmodels results

In [33]:
df = pd.read_csv(EXAMPLE_DIR / "diabetes.csv", index_col=0)
df.rename({"S1": "S_1"}, inplace=True, axis=1)
mod1 = sm.ols("target ~ Age + Sex + S_1", data=df).fit()
mod2 = sm.ols("target ~ Age + Sex + BMI + ABP", data=df).fit()
models = [mod1, mod2]

In [49]:
estimation_table(models, return_type="html")

'<table border="1" class="dataframe">\n  <thead>\n    <tr>\n      <th></th>\n      <th colspan="3" halign="left">target</th>\n    </tr>\n    <tr>\n      <th></th>\n      <th>(1)</th>\n      <th>(2)</th>\n      <th>(3)</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>Intercept</th>\n      <td>152.00$^{*** }$</td>\n      <td>152.00$^{*** }$</td>\n      <td>1.43e+08$^{*** }$</td>\n    </tr>\n    <tr>\n      <th></th>\n      <td>(3.56)</td>\n      <td>(2.85)</td>\n      <td>(3.14)</td>\n    </tr>\n    <tr>\n      <th>Age</th>\n      <td>227.00$^{*** }$</td>\n      <td>37.20$^{ }$</td>\n      <td>51.50$^{*** }$</td>\n    </tr>\n    <tr>\n      <th></th>\n      <td>(78.70)</td>\n      <td>(64.10)</td>\n      <td>(2.72)</td>\n    </tr>\n    <tr>\n      <th>Sex</th>\n      <td>20.30$^{ }$</td>\n      <td>-107.00$^{* }$</td>\n      <td>-33.80$^{*** }$</td>\n    </tr>\n    <tr>\n      <th></th>\n      <td>(76.00)</td>\n      <td>(62.10)</td>\n      <td>(1.62)</td>\n    </tr>\n    <tr>

## Adding estimagic results

`estimate_ml` and `estimate_msm` can both generate summaries of estimation results. Those summaries are either DataFrames with the columns `"value"`, `"standard_error"`, `"p_value"` and `"stars"` or pytrees containing such DataFrames. 

For examples, check out our tutorials on [`estimate_ml`](../../getting_started/first_likelihood_estimation_with_estimagic.ipynb) and [`estimate_msm`](../../getting_started/first_msm_estimation_with_estimagic.ipynb).


Assume we got the following DataFrame from an estimation summary:

In [35]:
params = pd.DataFrame(
    {
        "value": [142525262.123, 51.456, -33.789],
        "standard_error": [3.1415, 2.71828, 1.6180],
        "p_value": [1e-8] * 3,
    },
    index=["Intercept", "Age", "Sex"],
)
params

| | value | standard_error | p_value |
|---|---|---|---|
| Intercept | 142525300.0 | 3.1415 | 1e-08 |
| Age | 51.456 | 2.71828 | 1e-08 |
| Sex | -33.789 | 1.618 | 1e-08 |
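
The summary columns listed earlier include `"stars"`, which the example DataFrame does not contain. The following is a minimal sketch of how such a column could be derived from p-values. The helper `add_stars` is hypothetical, and the cutoffs 0.1, 0.05 and 0.01 are an assumption based on common convention, not taken from estimagic.

```python
import pandas as pd


def add_stars(params, cutoffs=(0.1, 0.05, 0.01)):
    """Return a copy of params with a "stars" column (hypothetical helper)."""
    out = params.copy()
    # Count how many of the assumed cutoffs each p-value falls at or below.
    n_stars = sum((out["p_value"] <= c).astype(int) for c in cutoffs)
    out["stars"] = n_stars.map(lambda n: "*" * int(n))
    return out


# Made-up example values, used only for illustration.
example = pd.DataFrame(
    {
        "value": [51.456, 0.3],
        "standard_error": [2.71828, 0.25],
        "p_value": [1e-8, 0.23],
    },
    index=["Age", "Noise"],
)
with_stars = add_stars(example)
```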


You can pass either the params DataFrame on its own or a dictionary containing `"params"` and additional information to `estimation_table`.

In [97]:
mod3 = {"params": params, "name": "hello", "info": {"n_obs": 445}}
models = [mod1, mod2, mod3]

## Selecting the right return_type

The following return types are supported:
- `"latex"`: Returns a string that you can save and import into a LaTeX document.
- `"html"`: Returns a string that you can save and import into an HTML document.
- `"render_inputs"`: Returns a dictionary with the following entries:
    - `"body"`: A DataFrame containing the main table
    - `"footer"`: A DataFrame containing the statistics
    - further entries that you can ignore
- `"dataframe"`: Returns a DataFrame you can look at in a notebook
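
Saving a returned string takes only the standard library. A minimal sketch, assuming a placeholder string in place of the real `return_type="latex"` output and a temporary directory as the target:

```python
import tempfile
from pathlib import Path

# Placeholder standing in for the string returned with return_type="latex".
latex_str = "\\begin{tabular}{lr}\nAge & 51.5 \\\\\n\\end{tabular}\n"

# Write the string to a .tex file (a temporary directory is used here).
out_dir = Path(tempfile.mkdtemp())
table_path = out_dir / "table.tex"
table_path.write_text(latex_str, encoding="utf-8")
```

In the main LaTeX document, a file saved this way could then be pulled in with `\input{table.tex}`.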

## Use `render_inputs` for maximum flexibility

As an example, let's assume we want to remove a few rows from the footer.

Let's first look at the footer we get from `estimation_table`:

In [147]:
render_inputs = estimation_table(
    models, return_type="render_inputs", add_trailing_zeros=True
)
footer = render_inputs["footer"]
footer

| | target | target | hello |
|---|---|---|---|
| | (1) | (2) | (3) |
| Observations | 442.00 | 442.00 | 445.0 |
| R$^2$ | 0.06 | 0.40 | |
| Adj. R$^2$ | 0.06 | 0.40 | |
| Residual Std. Error | 74.80 | 60.00 | |
| F Statistic | 9.98$^{***}$ | 72.90$^{***}$ | |


In [157]:
body = render_inputs["body"].copy(deep=True)
footer = render_inputs["footer"].copy(deep=True)

In [159]:
s = body.style
sf = footer.style
htmlstr = (s.to_html(exclude_styles=True)).split("</tbody>\n</table>")[0]
stats_str = """<tr><td colspan="{}" style="border-bottom: 1px solid black">
    </td></tr>""".format(
    1 + 3
)
stats_str += (
    sf.to_html(exclude_styles=True)
    .split("</thead>\n")[1]
    .split("</tbody>\n</table>")[0]
)
htmlstr += stats_str
htmlstr += "</tbody>\n</table>"
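
The cell above stitches two rendered tables together through string splitting. The same idea can be tried on small hand-written HTML strings. This is a sketch with made-up markup; unlike the cell above, it splits on `<tbody>` so that the merged table contains only one opening `<tbody>` tag.

```python
# Hand-written stand-ins for the rendered body and footer tables.
body_html = (
    "<table>\n<thead>\n<tr><th></th><th>(1)</th></tr>\n</thead>\n"
    "<tbody>\n<tr><th>Age</th><td>51.50</td></tr>\n</tbody>\n</table>"
)
footer_html = (
    "<table>\n<thead>\n<tr><th></th><th>(1)</th></tr>\n</thead>\n"
    "<tbody>\n<tr><th>Observations</th><td>445</td></tr>\n</tbody>\n</table>"
)

closing = "</tbody>\n</table>"
n_cols = 2  # one index column plus one model column

# Drop the closing tags of the body table so rows can be appended.
merged = body_html.split(closing)[0]
# Add an empty row with a bottom border as a separator.
merged += (
    f'<tr><td colspan="{n_cols}" '
    'style="border-bottom: 1px solid black"></td></tr>\n'
)
# Append only the footer rows, without the footer's header or tbody tag.
merged += footer_html.split("<tbody>\n")[1].split(closing)[0]
merged += closing
```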

In [170]:
htmlstr

'<table>\n  <thead>\n    <tr>\n      <th >&nbsp;</th>\n      <th colspan="2">target</th>\n      <th >hello</th>\n    </tr>\n    <tr>\n      <th >&nbsp;</th>\n      <th >(1)</th>\n      <th >(2)</th>\n      <th >(3)</th>\n    </tr>\n    <tr>\n      <th >index</th>\n      <th >&nbsp;</th>\n      <th >&nbsp;</th>\n      <th >&nbsp;</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th >Intercept</th>\n      <td >152.00$^{*** }$</td>\n      <td >152.00$^{*** }$</td>\n      <td >1.43e+08$^{*** }$</td>\n    </tr>\n    <tr>\n      <th ></th>\n      <td >(3.56)</td>\n      <td >(2.85)</td>\n      <td >(3.14)</td>\n    </tr>\n    <tr>\n      <th >Age</th>\n      <td >227.00$^{*** }$</td>\n      <td >37.20$^{ }$</td>\n      <td >51.50$^{*** }$</td>\n    </tr>\n    <tr>\n      <th ></th>\n      <td >(78.70)</td>\n      <td >(64.10)</td>\n      <td >(2.72)</td>\n    </tr>\n    <tr>\n      <th >Sex</th>\n      <td >20.30$^{ }$</td>\n      <td >-107.00$^{* }$</td>\n      <td >-33.80$^{*** }$</

Now we can remove the rows we don't need and render the table as HTML.

In [171]:
render_inputs["footer"] = footer.loc[["R$^2$", "Observations"]]
HTML(render_html(**render_inputs))

| | target | target | hello |
|---|---|---|---|
| | (1) | (2) | (3) |
| Intercept | 152.00$^{*** }$ | 152.00$^{*** }$ | 1.43e+08$^{*** }$ |
| | (3.56) | (2.85) | (3.14) |
| Age | 227.00$^{*** }$ | 37.20$^{ }$ | 51.50$^{*** }$ |
| | (78.70) | (64.10) | (2.72) |
| Sex | 20.30$^{ }$ | -107.00$^{* }$ | -33.80$^{*** }$ |
| | (76.00) | (62.10) | (1.62) |
| S_1 | 284.00$^{*** }$ | | |
| | (77.50) | | |
| BMI | | 787.00$^{*** }$ | |
| | | (65.40) | |
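
The footer is an ordinary DataFrame, so rows can be kept or removed with standard pandas indexing. A minimal sketch on a small stand-in footer (single-level columns and made-up entries, unlike the real footer):

```python
import pandas as pd

# Stand-in footer with a few statistics rows.
footer = pd.DataFrame(
    {"(1)": ["442", "0.06", "0.06"], "(2)": ["442", "0.40", "0.40"]},
    index=["Observations", "R$^2$", "Adj. R$^2$"],
)

# .loc with a list keeps only the named rows, in the order given.
selected = footer.loc[["R$^2$", "Observations"]]

# .drop removes the named rows and keeps the original order of the rest.
dropped = footer.drop(index=["Adj. R$^2$"])
```

Selecting with `.loc` also reorders the rows, which is why `R$^2$` would appear above `Observations` after the selection used in this section.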


## LaTeX peculiarities

- describe the warning and how to silence it
- describe what needs to go into the preamble
- show one example

## Advanced options 

Show one example where many optional arguments are used at once:

- options dictionary with all standard entries
- custom param names
- custom col names
- custom col groups
- custom number format
- title

Everything that is not in this list will be left to the docstring.

In [10]:
stats_dict = {
    "Observations": "n_obs",
    "R$^2$": "rsquared",
    "Adj. R$^2$": "rsquared_adj",
    "Residual Std. Error": "resid_std_err",
    "F Statistic": "fvalue",
    "show_dof": True,
}

In [11]:
df = render_inputs["body"].copy(deep=True)
df = df.rename({"Intercept": "Inter_cept"})
df.rename({"target": "tar_get"}, axis=1, inplace=True)

In [12]:
s = df.style
s = s.hide(names=True)
s = s.format_index(escape="latex")
s = s.format_index(escape="latex", axis=1)

In [14]:
Latex(
    s.to_latex(
        siunitx=True,
        environment="table",
        column_format="lSSS",
        multicol_align="c",
        hrules=True,
    )
)

<IPython.core.display.Latex object>
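
The `to_latex` call above sets `siunitx=True` and `hrules=True`. According to the pandas documentation, the former produces `S` columns that need the `siunitx` package, and the latter produces `\toprule`, `\midrule` and `\bottomrule`, which need `booktabs`. The following is a minimal sketch of wrapping a table string in a standalone document that loads both packages. The helper `wrap_in_document` and the placeholder table are hypothetical, and any further preamble settings a particular table may need are not covered.

```python
def wrap_in_document(table_str):
    """Wrap a LaTeX table string in a minimal document (hypothetical helper)."""
    preamble = "\n".join(
        [
            r"\documentclass{article}",
            r"\usepackage{booktabs}",  # provides \toprule, \midrule, \bottomrule
            r"\usepackage{siunitx}",  # provides the S column type
        ]
    )
    return preamble + "\n\\begin{document}\n" + table_str + "\n\\end{document}\n"


# Placeholder table, standing in for the output of Styler.to_latex.
table_str = "\n".join(
    [
        r"\begin{table}",
        r"\begin{tabular}{lS}",
        r"\toprule",
        r" & {(1)} \\",
        r"\midrule",
        r"Age & 51.50 \\",
        r"\bottomrule",
        r"\end{tabular}",
        r"\end{table}",
    ]
)
document = wrap_in_document(table_str)
```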
