In [1]:
print("""
@File         : Table Visualization.ipynb
@Author(s)    : Stephen CUI
@LastEditor(s): Stephen CUI
@CreatedTime  : 2024-01-29 19:16:38
@Email        : cuixuanstephen@gmail.com
@Description  : 
""")





In [2]:
import numpy as np
import pandas as pd

## Formatting the Display

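Manipulating the display with `Styler` is useful because the index and data values stay intact for other purposes, which gives you greater control. You do not have to overwrite your `DataFrame` to display it the way you like.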

In [3]:
weather_df = pd.DataFrame(
    np.random.rand(10, 2) * 5,
    index=pd.date_range(start="2021-01-01", periods=10),
    columns=["Tokyo", "Beijing"],
)


def rain_condition(v):
    if v < 1.75:
        return "Dry"
    elif v < 2.75:
        return "Rain"
    return "Heavy Rain"

def make_pretty(styler):
    styler.set_caption("Weather Conditions")
    styler.format(rain_condition)
    styler.format_index(lambda v: v.strftime("%A"))
    styler.background_gradient(axis=None, vmin=1, vmax=5, cmap='YlGnBu')
    return styler

weather_df

Unnamed: 0,Tokyo,Beijing
2021-01-01,0.183693,1.80755
2021-01-02,4.254492,4.578946
2021-01-03,2.666541,1.17872
2021-01-04,1.890841,3.237639
2021-01-05,3.425302,2.352733
2021-01-06,3.517742,2.610334
2021-01-07,2.699073,3.762311
2021-01-08,0.385958,3.792741
2021-01-09,4.473581,0.71239
2021-01-10,3.444049,4.496181


In [4]:
weather_df.loc["2021-01-04":"2021-01-08"].style.pipe(make_pretty).to_html('test.html')

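## Methods to Add Styles

There are three primary methods of adding custom CSS styles to a `Styler`:

1. Use `.set_table_styles()` to control broader areas of the table with specified internal CSS. Although table styles allow the flexibility to add CSS selectors and properties controlling all individual parts of the table, they are unwieldy for individual cell specifications. Also note that table styles cannot be exported to Excel.
2. Use `.set_td_classes()` to link external CSS classes directly to your data cells, or to link the internal CSS classes created by `.set_table_styles()`. These cannot be used on column header rows or indexes, and they will not export to Excel either.
3. Use the `.apply()` and `.map()` functions to add direct internal CSS to specific data cells. Since v1.4.0 there are also methods that act directly on column header rows or indexes: `.apply_index()` and `.map_index()` (named `.applymap_index()` before pandas 2.1). Note that only these methods add styles that will export to Excel. These methods work in a similar way to `DataFrame.apply()` and `DataFrame.map()`.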

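## Table Styles

Table styles are flexible enough to control all individual parts of the table, including column headers and indexes. However, they can be unwieldy to type for individual data cells or for any kind of conditional formatting, so we recommend using table styles for broad styling, such as entire rows or columns at a time.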

## `Styler` Functions

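### Acting on Data

We use the following methods to pass your style functions. Both methods take a function (and some other keyword arguments) and apply it to the `DataFrame` in a certain way, rendering CSS styles.

- `.map()` (elementwise): accepts a function that takes a single value and returns a string with a CSS attribute-value pair.

- `.apply()` (column-/row-/table-wise): accepts a function that takes a Series or DataFrame and returns a Series, DataFrame, or numpy array with an identical shape, where each element is a string with a CSS attribute-value pair. This method passes each column or row of your DataFrame one at a time, or the entire table at once, depending on the `axis` keyword argument. Use `axis=0` for column-wise, `axis=1` for row-wise, and `axis=None` for the entire table at once.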

<div class="alert alert-warning">
    <strong>Warning!</strong> On pandas versions earlier than 2.1, use applymap() instead of map().
</div>

In [5]:
np.random.seed(0)
df2 = pd.DataFrame(np.random.randn(10,4), columns=['A','B','C','D'])
df2.style

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


In [6]:
def style_negative(v, props=''):
    return props if v < 0 else None
s2 = (
    df2.style
    .applymap(style_negative, props='color:red;')
    .applymap(lambda v: 'opacity: 20%;' if (v < .3) and (v > -.3) else None)
)
s2

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


In [7]:
(
    df2.style
    .applymap(lambda v: 'opacity: 90%; color: black;')
    .applymap(lambda v: 'background: #bbb666;' if (v < 0) else 'background:#666fff;')
)

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


We can also build a function that highlights the maximum value across rows, columns, and the whole DataFrame.

In [8]:
def highlight_max(s, props=""):
    return np.where(s == np.nanmax(s.values), props, '')
s2.apply(highlight_max, props='color:white;background-color:darkblue', axis=0)

# s2.apply(highlight_max, props='color:white;background-color:darkblue', axis=0).to_html('test.html')

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


We can use the same function across different axes, here highlighting the `DataFrame` maximum in purple and the row maximums in pink.

In [31]:
(
    s2.apply(highlight_max, props="color:white;background-color:pink;", axis=1).apply(
        highlight_max, props="color:white;background-color:purple;", axis=None
    )
)
# This shows how some styles are overridden by others

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


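### Acting on the Index and Column Headers

Similar application is achieved for headers by using the following methods:
- `.map_index()` (elementwise): accepts a function that takes a single value and returns a string with a CSS attribute-value pair.
- `.apply_index()` (level-wise): accepts a function that takes a Series and returns a Series or numpy array with an identical shape, where each element is a string with a CSS attribute-value pair. This method passes each level of your Index one at a time. To style the index use `axis=0`, and to style the column headers use `axis=1`.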

You can select a `level` of a `MultiIndex` but currently no similar `subset` application is available for these methods.

<div class="alert alert-warning">
    <strong>Warning!</strong> On pandas versions earlier than 2.1, use applymap_index() instead of map_index().

All of the styles applied earlier remain in effect.
</div>

In [32]:
s2

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


In [33]:
s2.applymap_index(lambda v: 'color:yellow;' if v > 4 else 'color:darkblue;', axis=0)

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


In [34]:
s2.applymap_index(lambda s: "color:pink;" if s in ["A", "B"] else "color:darkblue;", axis=1)

Unnamed: 0,A,B,C,D
0,1.764052,0.400157,0.978738,2.240893
1,1.867558,-0.977278,0.950088,-0.151357
2,-0.103219,0.410599,0.144044,1.454274
3,0.761038,0.121675,0.443863,0.333674
4,1.494079,-0.205158,0.313068,-0.854096
5,-2.55299,0.653619,0.864436,-0.742165
6,2.269755,-1.454366,0.045759,-0.187184
7,1.532779,1.469359,0.154947,0.378163
8,-0.887786,-1.980796,-0.347912,0.156349
9,1.230291,1.20238,-0.387327,-0.302303


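## Finer Control with Slicing

The examples shown so far for the `Styler.apply` and `Styler.map` functions have not demonstrated the `subset` argument. It is a useful argument that permits a lot of flexibility: it lets you apply styles to specific rows or columns without having to code that logic into your `style` function.

The value passed to `subset` behaves similarly to slicing a `DataFrame`:
- A scalar is treated as a column label
- A list (or Series or NumPy array) is treated as multiple column labels
- A tuple is treated as `(row_indexer, column_indexer)`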

In [35]:
df3 = pd.DataFrame(
    np.random.randn(4, 4),
    pd.MultiIndex.from_product([["A", "B"], ["r1", "r2"]]),
    columns=["c1", "c2", "c3", "c4"],
)
df3

Unnamed: 0,Unnamed: 1,c1,c2,c3,c4
A,r1,-1.070753,1.054452,-0.403177,1.222445
A,r2,0.208275,0.976639,0.356366,0.706573
B,r1,0.0105,1.78587,0.126912,0.401989
B,r2,1.883151,-1.347759,-1.270485,0.969397


In [36]:
slice_ = ['c3', 'c4']
(
    df3.style.apply(
        highlight_max, props="color:red;", axis=0, subset=slice_
    ).set_properties(**{"background-color": "#ffffb3"}, subset=slice_)
)

Unnamed: 0,Unnamed: 1,c1,c2,c3,c4
A,r1,-1.070753,1.054452,-0.403177,1.222445
A,r2,0.208275,0.976639,0.356366,0.706573
B,r1,0.0105,1.78587,0.126912,0.401989
B,r2,1.883151,-1.347759,-1.270485,0.969397


Combined with `IndexSlice`, it can index across both dimensions with greater flexibility.

In [37]:
idx = pd.IndexSlice
slice_ = idx[idx[:, 'r1'], idx['c2':'c4']]
(
    df3.style.apply(
        highlight_max, props="color:red;", axis=0, subset=slice_
    ).set_properties(**{"background-color": "#ffffb3"}, subset=slice_)
)

Unnamed: 0,Unnamed: 1,c1,c2,c3,c4
A,r1,-1.070753,1.054452,-0.403177,1.222445
A,r2,0.208275,0.976639,0.356366,0.706573
B,r1,0.0105,1.78587,0.126912,0.401989
B,r2,1.883151,-1.347759,-1.270485,0.969397


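Currently only label-based slicing is supported; positional slicing and callable slicing are not.

If your style function uses a `subset` or `axis` keyword argument, consider wrapping your function in a `functools.partial`, partialing out that keyword.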

## Optimization