1 change: 1 addition & 0 deletions TOC.md
@@ -240,6 +240,7 @@
- [Optimizer Hints](/optimizer-hints.md)
- [TiKV Tuning](/tune-tikv-performance.md)
- [TiDB Best Practices](https://pingcap.com/blog-cn/tidb-best-practice/)
- [Column Pruning](/column-pruning.md)
+ Monitoring Metrics
- [Overview Dashboard](/grafana-overview-dashboard.md)
- [TiDB Dashboard](/grafana-tidb-dashboard.md)
20 changes: 20 additions & 0 deletions column-pruning.md
@@ -0,0 +1,20 @@
---
title: Column Pruning
category: performance
---

# Column Pruning

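The basic idea behind column pruning is that the optimizer has no need to keep columns that an operator never actually uses. Removing these columns reduces I/O resource usage and makes subsequent optimizations easier. The following example shows a query in which some columns are redundant:

Suppose table t has four columns a, b, c, and d, and you execute the following statement: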

{{< copyable "sql" >}}

```sql
select a from t where b > 5
```

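In this query, only columns a and b of table t are actually used; the data in c and d is redundant. In the query plan for this statement, the Selection operator uses column b, and the DataSource operator beneath it uses columns a and b. The remaining columns c and d can be pruned, so the DataSource operator does not need to read them.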

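For this reason, TiDB performs a top-down scan during the logical optimization phase and prunes unneeded columns to reduce wasted resources. This scan is called "column pruning" and corresponds to the `columnPruner` rule among the logical optimization rules.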
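
The top-down pass can be illustrated with a minimal Python sketch. This is not TiDB's actual `columnPruner` implementation (which is written in Go); the classes `Projection`, `Selection`, and `DataSource` and their `prune` methods are hypothetical stand-ins that borrow the operator names used above. Each operator receives the set of columns its parent requires, adds the columns it needs itself, and passes the result down to its child.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class DataSource:
    """Hypothetical leaf operator: reads columns from a table."""
    table: str
    columns: List[str]

    def prune(self, required: Set[str]) -> None:
        # Keep only the columns that some operator above asked for.
        self.columns = [c for c in self.columns if c in required]


@dataclass
class Selection:
    """Hypothetical filter operator: evaluates a predicate on child rows."""
    condition_columns: Set[str]
    child: DataSource

    def prune(self, required: Set[str]) -> None:
        # The child must supply the parent's columns plus the predicate's.
        self.child.prune(required | self.condition_columns)


@dataclass
class Projection:
    """Hypothetical root operator: the columns in the select list."""
    output_columns: List[str]
    child: Selection

    def prune(self) -> None:
        # The root starts the top-down pass with the select-list columns.
        self.child.prune(set(self.output_columns))


# Plan for: select a from t where b > 5
source = DataSource(table="t", columns=["a", "b", "c", "d"])
plan = Projection(output_columns=["a"],
                  child=Selection(condition_columns={"b"}, child=source))
plan.prune()
print(source.columns)
```

After the pass, the `DataSource` in this sketch retains only columns a and b, which matches the description of the example query above.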