From 834b0f15f85ac02f8a17050aa1f7b85b27febaf2 Mon Sep 17 00:00:00 2001 From: toutdesuite Date: Fri, 17 Jul 2020 14:46:21 +0800 Subject: [PATCH 1/4] cherry pick #3246 to release-2.1 Signed-off-by: ti-srebot --- TOC.md | 136 ++++++++++++++++++++++++++++++++++++++++++++++ column-pruning.md | 20 +++++++ 2 files changed, 156 insertions(+) create mode 100644 column-pruning.md diff --git a/TOC.md b/TOC.md index c902c6f54f2ff..fa9b19dff25af 100644 --- a/TOC.md +++ b/TOC.md @@ -82,6 +82,7 @@ - [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md) - [Identify Slow Queries](/identify-slow-queries.md) + Scale +<<<<<<< HEAD - [Scale using Ansible](/scale-tidb-using-ansible.md) - [Scale a TiDB Cluster](/horizontal-scale.md) + Upgrade @@ -90,6 +91,141 @@ - [TiDB Troubleshooting Map](/tidb-troubleshooting-map.md) - [Troubleshoot Cluster Setup](/troubleshoot-tidb-cluster.md) - [Troubleshoot TiDB Lightning](/troubleshoot-tidb-lightning.md) +======= + + [Use TiUP (Recommended)](/scale-tidb-using-tiup.md) + + [Use TiDB Ansible](/scale-tidb-using-ansible.md) + + [Use TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/v1.1/scale-a-tidb-cluster) + + Backup and Restore + + [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md) + + [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md) + + Use BR Tool + + [Use BR Tool](/br/backup-and-restore-tool.md) + + [BR Use Cases](/br/backup-and-restore-use-cases.md) + + [BR storages](/br/backup-and-restore-storages.md) + + [Configure Time Zone](/configure-time-zone.md) + + [Daily Checklist](/daily-check.md) + + [Manage TiCDC Cluster and Replication Tasks](/ticdc/manage-ticdc.md) + + [Maintain TiFlash](/tiflash/maintain-tiflash.md) + + [Maintain TiDB Using TiUP](/maintain-tidb-using-tiup.md) + + [Maintain TiDB Using Ansible](/maintain-tidb-using-ansible.md) ++ Monitor and Alert + + [Monitoring Framework Overview](/tidb-monitoring-framework.md) + + [Monitoring 
API](/tidb-monitoring-api.md) + + [Deploy Monitoring Services](/deploy-monitoring-services.md) + + [TiDB Cluster Alert Rules](/alert-rules.md) + + [TiFlash Alert Rules](/tiflash/tiflash-alert-rules.md) ++ Troubleshoot + + [TiDB Troubleshooting Map](/tidb-troubleshooting-map.md) + + [Identify Slow Queries](/identify-slow-queries.md) + + [SQL Diagnostics](/system-tables/system-table-sql-diagnostics.md) + + [Identify Expensive Queries](/identify-expensive-queries.md) + + [Statement Summary Tables](/statement-summary-tables.md) + + [Troubleshoot Hotspot Issues](/troubleshoot-hot-spot-issues.md) + + [Troubleshoot Cluster Setup](/troubleshoot-tidb-cluster.md) + + [Troubleshoot TiCDC](/ticdc/troubleshoot-ticdc.md) + + [Troubleshoot TiFlash](/tiflash/troubleshoot-tiflash.md) + + [Troubleshoot Write Conflicts in Optimistic Transactions](/troubleshoot-write-conflicts.md) ++ Performance Tuning + + System Tuning + + [Operating System Tuning](/tune-operating-system.md) + + Software Tuning + + Configuration + + [Tune TiDB Memory](/configure-memory-usage.md) + + [Tune TiKV Threads](/tune-tikv-thread-performance.md) + + [Tune TiKV Memory](/tune-tikv-memory-performance.md) + + [TiKV Follower Read](/follower-read.md) + + [TiFlash Tuning](/tiflash/tune-tiflash-performance.md) + + [Coprocessor Cache](/coprocessor-cache.md) + + SQL Tuning + + [SQL Tuning with `EXPLAIN`](/query-execution-plan.md) + + SQL Optimization + + [SQL Optimization Process](/sql-optimization-concepts.md) + + Logic Optimization + + [Subquery Related Optimizations](/subquery-optimization.md) + + [Column Pruning](/column-pruning.md) + + [Decorrelation of Correlated Subquery](/correlated-subquery-optimization.md) + + [Predicates Push Down](/predicates-push-down.md) + + [TopN and Limit Push Down](/topn-limit-push-down.md) + + [Join Reorder](/join-reorder.md) + + Physical Optimization + + [Index Selection](/choose-index.md) + + [Statistics](/statistics.md) + + [Distinct Optimization](/agg-distinct-optimization.md) + + 
[Prepare Execution Plan Cache](/sql-prepare-plan-cache.md) + + Control Execution Plan + + [Optimizer Hints](/optimizer-hints.md) + + [SQL Plan Management](/sql-plan-management.md) + + [Access Tables Using `IndexMerge`](/index-merge.md) + + [The Blocklist of Optimization Rules and Expression Pushdown](/blocklist-control-plan.md) ++ Tutorials + + [Multiple Data Centers in One City Deployment](/multi-data-centers-in-one-city-deployment.md) + + [Three Data Centers in Two Cities Deployment](/three-data-centers-in-two-cities-deployment.md) + + Best Practices + + [Use TiDB](/tidb-best-practices.md) + + [Java Application Development](/best-practices/java-app-best-practices.md) + + [Use HAProxy](/best-practices/haproxy-best-practices.md) + + [Highly Concurrent Write](/best-practices/high-concurrency-best-practices.md) + + [Grafana Monitoring](/best-practices/grafana-monitor-best-practices.md) + + [PD Scheduling](/best-practices/pd-scheduling-best-practices.md) + + [TiKV Performance Tuning with Massive Regions](/best-practices/massive-regions-best-practices.md) + + [Use Placement Rules](/configure-placement-rules.md) + + [Use Load Base Split](/configure-load-base-split.md) + + [Use Store Limit](/configure-store-limit.md) ++ TiDB Ecosystem Tools + + [Overview](/ecosystem-tool-user-guide.md) + + [Use Cases](/ecosystem-tool-user-case.md) + + [Download](/download-ecosystem-tools.md) + + Backup & Restore (BR) + + [BR FAQ](/br/backup-and-restore-faq.md) + + [Use BR Tool](/br/backup-and-restore-tool.md) + + [BR Use Cases](/br/backup-and-restore-use-cases.md) + + TiDB Binlog + + [Overview](/tidb-binlog/tidb-binlog-overview.md) + + [Deploy](/tidb-binlog/deploy-tidb-binlog.md) + + [Maintain](/tidb-binlog/maintain-tidb-binlog-cluster.md) + + [Configure](/tidb-binlog/tidb-binlog-configuration-file.md) + + [Pump](/tidb-binlog/tidb-binlog-configuration-file.md#pump) + + [Drainer](/tidb-binlog/tidb-binlog-configuration-file.md#drainer) + + [Upgrade](/tidb-binlog/upgrade-tidb-binlog.md) + + 
[Monitor](/tidb-binlog/monitor-tidb-binlog-cluster.md) + + [Reparo](/tidb-binlog/tidb-binlog-reparo.md) + + [binlogctl](/tidb-binlog/binlog-control.md) + + [Binlog Slave Client](/tidb-binlog/binlog-slave-client.md) + + [TiDB Binlog Relay Log](/tidb-binlog/tidb-binlog-relay-log.md) + + [Bidirectional Replication Between TiDB Clusters](/tidb-binlog/bidirectional-replication-between-tidb-clusters.md) + + [Glossary](/tidb-binlog/tidb-binlog-glossary.md) + + Troubleshoot + + [Troubleshoot](/tidb-binlog/troubleshoot-tidb-binlog.md) + + [Handle Errors](/tidb-binlog/handle-tidb-binlog-errors.md) + + [FAQ](/tidb-binlog/tidb-binlog-faq.md) + + TiDB Lightning + + [Overview](/tidb-lightning/tidb-lightning-overview.md) + + [Tutorial](/get-started-with-tidb-lightning.md) + + [Deploy](/tidb-lightning/deploy-tidb-lightning.md) + + [Configure](/tidb-lightning/tidb-lightning-configuration.md) + + Key Features + + [Checkpoints](/tidb-lightning/tidb-lightning-checkpoints.md) + + [Table Filter](/table-filter.md) + + [CSV Support](/tidb-lightning/migrate-from-csv-using-tidb-lightning.md) + + [TiDB-backend](/tidb-lightning/tidb-lightning-tidb-backend.md) + + [Web Interface](/tidb-lightning/tidb-lightning-web-interface.md) + + [Monitor](/tidb-lightning/monitor-tidb-lightning.md) + + [Troubleshoot](/troubleshoot-tidb-lightning.md) + + [FAQ](/tidb-lightning/tidb-lightning-faq.md) + + [Glossary](/tidb-lightning/tidb-lightning-glossary.md) + + [TiCDC](/ticdc/ticdc-overview.md) + + sync-diff-inspector + + [Overview](/sync-diff-inspector/sync-diff-inspector-overview.md) + + [Data Check for Tables with Different Schema/Table Names](/sync-diff-inspector/route-diff.md) + + [Data Check in Sharding Scenarios](/sync-diff-inspector/shard-diff.md) + + [Data Check for TiDB Upstream/Downstream Clusters](/sync-diff-inspector/upstream-downstream-diff.md) + + [Loader](/loader-overview.md) + + [Mydumper](/mydumper-overview.md) + + [Syncer](/syncer-overview.md) + + TiSpark + + [Quick 
Start](/get-started-with-tispark.md) + + [User Guide](/tispark-overview.md) +>>>>>>> 15bdb6a... perf-tuning: add column-prune.md (#3246) + Reference + SQL - [MySQL Compatibility](/mysql-compatibility.md) diff --git a/column-pruning.md b/column-pruning.md new file mode 100644 index 0000000000000..d018ab15084a6 --- /dev/null +++ b/column-pruning.md @@ -0,0 +1,20 @@ +--- +title: Column Pruning +summary: Learn about the usage of column pruning in TiDB. +--- + +# Column Pruning + +The basic idea of column pruning is that for columns not used in the operator, the optimizer does not need to retain them during optimization. Removing these columns reduces the use of I/O resources and facilitates the subsequent optimization. The following is an example of column pruning: + +Suppose there are four columns (a, b, c, and d) in table t. You can execute the following statement: + +{{< copyable "sql" >}} + +```sql +select a from t where b> 5 +``` + +In this query, only column a and column b are used, and column c and column d are redundant. Regarding the query plan of this statement, the `Selection` operator uses column b. Then the `DataSource` operator uses columns a and column b. Columns c and column d can be pruned because the `DataSource` operator does not read them. + +Therefore, when TiDB performs a top-down scanning during the logic optimization phase, redundant columns are pruned to reduce waste of resources. This scanning process is called "Column Pruning", corresponding to the `columnPruner` rule. If you want to disable this rule, refer to [The Blocklist of Optimization Rules and Expression Pushdown](/blocklist-control-plan.md).
From c7d013a70fbd13dbb6608f56d6426ca5580f41e3 Mon Sep 17 00:00:00 2001 From: toutdesuite Date: Mon, 20 Jul 2020 16:30:11 +0800 Subject: [PATCH 2/4] resolve conflict --- TOC.md | 137 +-------------------------------------------------------- 1 file changed, 1 insertion(+), 136 deletions(-) diff --git a/TOC.md b/TOC.md index fa9b19dff25af..d56a52a250740 100644 --- a/TOC.md +++ b/TOC.md @@ -82,7 +82,6 @@ - [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md) - [Identify Slow Queries](/identify-slow-queries.md) + Scale -<<<<<<< HEAD - [Scale using Ansible](/scale-tidb-using-ansible.md) - [Scale a TiDB Cluster](/horizontal-scale.md) + Upgrade @@ -91,141 +90,6 @@ - [TiDB Troubleshooting Map](/tidb-troubleshooting-map.md) - [Troubleshoot Cluster Setup](/troubleshoot-tidb-cluster.md) - [Troubleshoot TiDB Lightning](/troubleshoot-tidb-lightning.md) -======= - + [Use TiUP (Recommended)](/scale-tidb-using-tiup.md) - + [Use TiDB Ansible](/scale-tidb-using-ansible.md) - + [Use TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/v1.1/scale-a-tidb-cluster) - + Backup and Restore - + [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md) - + [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md) - + Use BR Tool - + [Use BR Tool](/br/backup-and-restore-tool.md) - + [BR Use Cases](/br/backup-and-restore-use-cases.md) - + [BR storages](/br/backup-and-restore-storages.md) - + [Configure Time Zone](/configure-time-zone.md) - + [Daily Checklist](/daily-check.md) - + [Manage TiCDC Cluster and Replication Tasks](/ticdc/manage-ticdc.md) - + [Maintain TiFlash](/tiflash/maintain-tiflash.md) - + [Maintain TiDB Using TiUP](/maintain-tidb-using-tiup.md) - + [Maintain TiDB Using Ansible](/maintain-tidb-using-ansible.md) -+ Monitor and Alert - + [Monitoring Framework Overview](/tidb-monitoring-framework.md) - + [Monitoring API](/tidb-monitoring-api.md) - + [Deploy Monitoring Services](/deploy-monitoring-services.md) - + [TiDB 
Cluster Alert Rules](/alert-rules.md) - + [TiFlash Alert Rules](/tiflash/tiflash-alert-rules.md) -+ Troubleshoot - + [TiDB Troubleshooting Map](/tidb-troubleshooting-map.md) - + [Identify Slow Queries](/identify-slow-queries.md) - + [SQL Diagnostics](/system-tables/system-table-sql-diagnostics.md) - + [Identify Expensive Queries](/identify-expensive-queries.md) - + [Statement Summary Tables](/statement-summary-tables.md) - + [Troubleshoot Hotspot Issues](/troubleshoot-hot-spot-issues.md) - + [Troubleshoot Cluster Setup](/troubleshoot-tidb-cluster.md) - + [Troubleshoot TiCDC](/ticdc/troubleshoot-ticdc.md) - + [Troubleshoot TiFlash](/tiflash/troubleshoot-tiflash.md) - + [Troubleshoot Write Conflicts in Optimistic Transactions](/troubleshoot-write-conflicts.md) -+ Performance Tuning - + System Tuning - + [Operating System Tuning](/tune-operating-system.md) - + Software Tuning - + Configuration - + [Tune TiDB Memory](/configure-memory-usage.md) - + [Tune TiKV Threads](/tune-tikv-thread-performance.md) - + [Tune TiKV Memory](/tune-tikv-memory-performance.md) - + [TiKV Follower Read](/follower-read.md) - + [TiFlash Tuning](/tiflash/tune-tiflash-performance.md) - + [Coprocessor Cache](/coprocessor-cache.md) - + SQL Tuning - + [SQL Tuning with `EXPLAIN`](/query-execution-plan.md) - + SQL Optimization - + [SQL Optimization Process](/sql-optimization-concepts.md) - + Logic Optimization - + [Subquery Related Optimizations](/subquery-optimization.md) - + [Column Pruning](/column-pruning.md) - + [Decorrelation of Correlated Subquery](/correlated-subquery-optimization.md) - + [Predicates Push Down](/predicates-push-down.md) - + [TopN and Limit Push Down](/topn-limit-push-down.md) - + [Join Reorder](/join-reorder.md) - + Physical Optimization - + [Index Selection](/choose-index.md) - + [Statistics](/statistics.md) - + [Distinct Optimization](/agg-distinct-optimization.md) - + [Prepare Execution Plan Cache](/sql-prepare-plan-cache.md) - + Control Execution Plan - + [Optimizer 
Hints](/optimizer-hints.md) - + [SQL Plan Management](/sql-plan-management.md) - + [Access Tables Using `IndexMerge`](/index-merge.md) - + [The Blocklist of Optimization Rules and Expression Pushdown](/blocklist-control-plan.md) -+ Tutorials - + [Multiple Data Centers in One City Deployment](/multi-data-centers-in-one-city-deployment.md) - + [Three Data Centers in Two Cities Deployment](/three-data-centers-in-two-cities-deployment.md) - + Best Practices - + [Use TiDB](/tidb-best-practices.md) - + [Java Application Development](/best-practices/java-app-best-practices.md) - + [Use HAProxy](/best-practices/haproxy-best-practices.md) - + [Highly Concurrent Write](/best-practices/high-concurrency-best-practices.md) - + [Grafana Monitoring](/best-practices/grafana-monitor-best-practices.md) - + [PD Scheduling](/best-practices/pd-scheduling-best-practices.md) - + [TiKV Performance Tuning with Massive Regions](/best-practices/massive-regions-best-practices.md) - + [Use Placement Rules](/configure-placement-rules.md) - + [Use Load Base Split](/configure-load-base-split.md) - + [Use Store Limit](/configure-store-limit.md) -+ TiDB Ecosystem Tools - + [Overview](/ecosystem-tool-user-guide.md) - + [Use Cases](/ecosystem-tool-user-case.md) - + [Download](/download-ecosystem-tools.md) - + Backup & Restore (BR) - + [BR FAQ](/br/backup-and-restore-faq.md) - + [Use BR Tool](/br/backup-and-restore-tool.md) - + [BR Use Cases](/br/backup-and-restore-use-cases.md) - + TiDB Binlog - + [Overview](/tidb-binlog/tidb-binlog-overview.md) - + [Deploy](/tidb-binlog/deploy-tidb-binlog.md) - + [Maintain](/tidb-binlog/maintain-tidb-binlog-cluster.md) - + [Configure](/tidb-binlog/tidb-binlog-configuration-file.md) - + [Pump](/tidb-binlog/tidb-binlog-configuration-file.md#pump) - + [Drainer](/tidb-binlog/tidb-binlog-configuration-file.md#drainer) - + [Upgrade](/tidb-binlog/upgrade-tidb-binlog.md) - + [Monitor](/tidb-binlog/monitor-tidb-binlog-cluster.md) - + 
[Reparo](/tidb-binlog/tidb-binlog-reparo.md) - + [binlogctl](/tidb-binlog/binlog-control.md) - + [Binlog Slave Client](/tidb-binlog/binlog-slave-client.md) - + [TiDB Binlog Relay Log](/tidb-binlog/tidb-binlog-relay-log.md) - + [Bidirectional Replication Between TiDB Clusters](/tidb-binlog/bidirectional-replication-between-tidb-clusters.md) - + [Glossary](/tidb-binlog/tidb-binlog-glossary.md) - + Troubleshoot - + [Troubleshoot](/tidb-binlog/troubleshoot-tidb-binlog.md) - + [Handle Errors](/tidb-binlog/handle-tidb-binlog-errors.md) - + [FAQ](/tidb-binlog/tidb-binlog-faq.md) - + TiDB Lightning - + [Overview](/tidb-lightning/tidb-lightning-overview.md) - + [Tutorial](/get-started-with-tidb-lightning.md) - + [Deploy](/tidb-lightning/deploy-tidb-lightning.md) - + [Configure](/tidb-lightning/tidb-lightning-configuration.md) - + Key Features - + [Checkpoints](/tidb-lightning/tidb-lightning-checkpoints.md) - + [Table Filter](/table-filter.md) - + [CSV Support](/tidb-lightning/migrate-from-csv-using-tidb-lightning.md) - + [TiDB-backend](/tidb-lightning/tidb-lightning-tidb-backend.md) - + [Web Interface](/tidb-lightning/tidb-lightning-web-interface.md) - + [Monitor](/tidb-lightning/monitor-tidb-lightning.md) - + [Troubleshoot](/troubleshoot-tidb-lightning.md) - + [FAQ](/tidb-lightning/tidb-lightning-faq.md) - + [Glossary](/tidb-lightning/tidb-lightning-glossary.md) - + [TiCDC](/ticdc/ticdc-overview.md) - + sync-diff-inspector - + [Overview](/sync-diff-inspector/sync-diff-inspector-overview.md) - + [Data Check for Tables with Different Schema/Table Names](/sync-diff-inspector/route-diff.md) - + [Data Check in Sharding Scenarios](/sync-diff-inspector/shard-diff.md) - + [Data Check for TiDB Upstream/Downstream Clusters](/sync-diff-inspector/upstream-downstream-diff.md) - + [Loader](/loader-overview.md) - + [Mydumper](/mydumper-overview.md) - + [Syncer](/syncer-overview.md) - + TiSpark - + [Quick Start](/get-started-with-tispark.md) - + [User Guide](/tispark-overview.md) ->>>>>>> 
15bdb6a... perf-tuning: add column-prune.md (#3246) + Reference + SQL - [MySQL Compatibility](/mysql-compatibility.md) @@ -397,6 +261,7 @@ - [Optimizer Hints](/optimizer-hints.md) - [Tune TiKV](/tune-tikv-performance.md) - [Operating System Tuning](/tune-operating-system.md) + - [Column Pruning](/column-pruning.md) + Key Monitoring Metrics - [Overview](/grafana-overview-dashboard.md) - [TiDB](/grafana-tidb-dashboard.md) From 65cb546d4d80990ef89f37500f420445ddb4adff Mon Sep 17 00:00:00 2001 From: toutdesuite Date: Mon, 20 Jul 2020 16:41:09 +0800 Subject: [PATCH 3/4] another conflict --- column-pruning.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/column-pruning.md b/column-pruning.md index d018ab15084a6..14d8aae6c6c77 100644 --- a/column-pruning.md +++ b/column-pruning.md @@ -17,4 +17,4 @@ select a from t where b> 5 In this query, only column a and column b are used, and column c and column d are redundant. Regarding the query plan of this statement, the `Selection` operator uses column b. Then the `DataSource` operator uses columns a and column b. Columns c and column d can be pruned because the `DataSource` operator does not read them. -Therefore, when TiDB performs a top-down scanning during the logic optimization phase, redundant columns are pruned to reduce waste of resources. This scanning process is called "Column Pruning", corresponding to the `columnPruner` rule. If you want to disable this rule, refer to [The Blocklist of Optimization Rules and Expression Pushdown](/blocklist-control-plan.md). +Therefore, when TiDB performs a top-down scanning during the logic optimization phase, redundant columns are pruned to reduce waste of resources. This scanning process is called "Column Pruning", corresponding to the `columnPruner` rule. 
From 3317244ddbcf25eef399dc4a7bf2ab73c5778717 Mon Sep 17 00:00:00 2001 From: toutdesuite Date: Mon, 20 Jul 2020 16:49:32 +0800 Subject: [PATCH 4/4] Update column-pruning.md --- column-pruning.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/column-pruning.md b/column-pruning.md index 14d8aae6c6c77..b36f2b9907404 100644 --- a/column-pruning.md +++ b/column-pruning.md @@ -17,4 +17,4 @@ select a from t where b> 5 In this query, only column a and column b are used, and column c and column d are redundant. Regarding the query plan of this statement, the `Selection` operator uses column b. Then the `DataSource` operator uses columns a and column b. Columns c and column d can be pruned because the `DataSource` operator does not read them. -Therefore, when TiDB performs a top-down scanning during the logic optimization phase, redundant columns are pruned to reduce waste of resources. This scanning process is called "Column Pruning", corresponding to the `columnPruner` rule. +Therefore, when TiDB performs a top-down scanning during the logic optimization phase, redundant columns are pruned to reduce waste of resources. This scanning process is called "Column Pruning", corresponding to the `columnPruner` rule. If you want to disable this rule, refer to [The Blocklist of Optimization Rules and Expression Pushdown](https://docs.pingcap.com/tidb/dev/blocklist-control-plan).
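Reviewer note on `column-pruning.md`: the top-down pass that the new page describes can be illustrated with a toy model. This is a minimal sketch, not TiDB's `columnPruner` implementation; the `Projection`, `Selection`, and `DataSource` classes and their `prune` methods are hypothetical names written only for this illustration. Each operator receives the set of columns its parent requires, adds the columns it needs itself, and passes the result down, so the `DataSource` ends up reading only what some operator above it uses.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Hypothetical leaf operator that reads columns from a table."""
    table_columns: list
    output: list = field(default_factory=list)

    def prune(self, required):
        # Keep only the columns some parent asked for, in table order.
        self.output = [c for c in self.table_columns if c in required]


@dataclass
class Selection:
    """Hypothetical filter operator, e.g. `b > 5`."""
    condition_columns: set
    child: object

    def prune(self, required):
        # The filter needs its own columns on top of what the parent needs.
        self.child.prune(set(required) | self.condition_columns)


@dataclass
class Projection:
    """Hypothetical operator that produces the select list."""
    columns: list
    child: object

    def prune(self, required):
        # Drop output columns nobody above uses, then ask the child for the rest.
        self.columns = [c for c in self.columns if c in required]
        self.child.prune(set(self.columns))


# Toy plan for: select a from t where b > 5
source = DataSource(table_columns=["a", "b", "c", "d"])
plan = Projection(columns=["a"],
                  child=Selection(condition_columns={"b"}, child=source))

plan.prune({"a"})      # the query result only needs column a
print(source.output)   # ['a', 'b']
```

In this toy plan, columns c and d never reach the `DataSource` output because no operator above it requests them, which matches the example in the page.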