### `great_tables`

The Great Tables package is all about making it simple to produce nice-looking display tables. Display tables? Well yes, we are trying to distinguish between data tables (i.e., DataFrames) and the tables you’d find on a web page, in a journal article, or in a magazine. Such tables are also called presentation tables, summary tables, or just tables really.

We can think of display tables as output only: once a table is rendered, we’d not want to use it as input ever again. Display tables also carry features that data tables lack, such as annotations, table element styling, and text transformations, all of which serve to communicate the subject matter more clearly.

<div style="max-width:700px;margin-left: auto; margin-right: auto;">
<img src="Great_Tables - Table Componenets.png" width="700"/>
</div>

The components of a display table (roughly from top to bottom) are:
- the **Table Header** (optional; with a **title** and possibly a **subtitle**)
- the **Stub** and the **Stub Head** (optional; contains row labels, optionally within row groups having row group labels)
- the **Column Labels** (contains column labels, optionally under spanner labels)
- the **Table Body** (contains columns and rows of cells)
- the **Table Footer** (optional; possibly with one or more **source notes**)

In [1]:
from great_tables import GT, md, html
from great_tables.data import islands, airquality, sp500

In [2]:
islands_mini = islands.head(10)

In [3]:
# Create a display table from the first ten rows of the `islands` dataset
gt_tbl = GT(islands_mini)

# Show the output table
gt_tbl

name,size
Africa,11506
Antarctica,5500
Asia,16988
Australia,2968
Axel Heiberg,16
Baffin,184
Banks,23
Borneo,280
Britain,84
Celebes,73


### Polars DataFrame support

In [4]:
import polars as pl

df_polars = pl.from_pandas(islands_mini)

# Approach 1: call GT ----
GT(df_polars)

# Approach 2: Polars style property ----
df_polars.style

name,size
Africa,11506
Antarctica,5500
Asia,16988
Australia,2968
Axel Heiberg,16
Baffin,184
Banks,23
Borneo,280
Britain,84
Celebes,73


### Some Beautiful Examples

In [5]:
islands_mini = islands.head(10)

(
    GT(islands_mini, rowname_col="name")
    .tab_header(
        title="Large Landmasses of the World",
        subtitle="The first ten, in alphabetical order, are presented"
    )
    .tab_source_note(
        source_note="Source: The World Almanac and Book of Facts, 1975, page 406."
    )
    .tab_source_note(
        source_note=md("Reference: McNeil, D. R. (1977) *Interactive Data Analysis*. Wiley.")
    )
    .tab_stubhead(label="landmass")
)

Large Landmasses of the World
The first ten, in alphabetical order, are presented
landmass,size
Africa,11506
Antarctica,5500
Asia,16988
Australia,2968
Axel Heiberg,16
Baffin,184
Banks,23
Borneo,280
Britain,84
Celebes,73


In [6]:
airquality_m = airquality.head(10).assign(Year=1973)

gt_airquality = (
    GT(airquality_m)
    .tab_header(
        title="New York Air Quality Measurements",
        subtitle="Daily measurements in New York City (May 1-10, 1973)",
    )
    .tab_spanner(label="Time", columns=["Year", "Month", "Day"])
    .tab_spanner(label="Measurement", columns=["Ozone", "Solar_R", "Wind", "Temp"])
    .cols_move_to_start(columns=["Year", "Month", "Day"])
    .cols_label(
        Ozone=html("Ozone,<br>ppbV"),
        Solar_R=html("Solar R.,<br>cal/m<sup>2</sup>"),
        Wind=html("Wind,<br>mph"),
        Temp=html("Temp,<br>&deg;F"),
    )
)

gt_airquality

New York Air Quality Measurements
Daily measurements in New York City (May 1-10, 1973)
Time,Time,Time,Measurement,Measurement,Measurement,Measurement
Year,Month,Day,"Ozone, ppbV","Solar R., cal/m2","Wind, mph","Temp, °F"
1973,5,1,41.0,190.0,7.4,67
1973,5,2,36.0,118.0,8.0,72
1973,5,3,12.0,149.0,12.6,74
1973,5,4,18.0,313.0,11.5,62
1973,5,5,,,14.3,56
1973,5,6,28.0,,14.9,66
1973,5,7,23.0,299.0,8.6,65
1973,5,8,19.0,99.0,13.8,59
1973,5,9,8.0,19.0,20.1,61
1973,5,10,,194.0,8.6,69


In [7]:
# Define the start and end dates for the data range
start_date = "2010-06-07"
end_date = "2010-06-14"

# Filter sp500 using Pandas to dates between `start_date` and `end_date`
sp500_mini = sp500[(sp500["date"] >= start_date) & (sp500["date"] <= end_date)]

# Create a gt table based on the `sp500_mini` table data
(
    GT(sp500_mini)
    .tab_header(title="S&P 500", subtitle=f"{start_date} to {end_date}")
    .fmt_currency(columns=["open", "high", "low", "close"])
    .fmt_date(columns="date", date_style="wd_m_day_year")
    .fmt_number(columns="volume", compact=True)
    .cols_hide(columns="adj_close")
)

S&P 500
2010-06-07 to 2010-06-14
date,open,high,low,close,volume
"Mon, Jun 14, 2010","$1,095.00","$1,105.91","$1,089.03","$1,089.63",4.43B
"Fri, Jun 11, 2010","$1,082.65","$1,092.25","$1,077.12","$1,091.60",4.06B
"Thu, Jun 10, 2010","$1,058.77","$1,087.85","$1,058.77","$1,086.84",5.14B
"Wed, Jun 9, 2010","$1,062.75","$1,077.74","$1,052.25","$1,055.69",5.98B
"Tue, Jun 8, 2010","$1,050.81","$1,063.15","$1,042.17","$1,062.00",6.19B
"Mon, Jun 7, 2010","$1,065.84","$1,071.36","$1,049.86","$1,050.47",5.47B
