From e9aa20ac1b4039f34d518118a1ce0233e526d9ab Mon Sep 17 00:00:00 2001
From: lilin90
Date: Fri, 27 Apr 2018 19:34:41 +0800
Subject: [PATCH 1/2] releases/readme: add the release notes for 2.0

---
 README.md         |   1 +
 releases/2.0ga.md | 155 ++++++++++++++++++++++++++++++++++++++++++++++
 releases/rn.md    |   1 +
 3 files changed, 157 insertions(+)
 create mode 100644 releases/2.0ga.md

diff --git a/README.md b/README.md
index e555af260094b..888377d7e968c 100644
--- a/README.md
+++ b/README.md
@@ -116,6 +116,7 @@
 - [Frequently Asked Questions (FAQ)](FAQ.md)
 - [TiDB Best Practices](https://pingcap.github.io/blog/2017/07/24/tidbbestpractice/)
 + [Releases](releases/rn.md)
+  - [2.0](releases/2.0ga.md)
   - [2.0 RC5](releases/2rc5.md)
   - [2.0 RC4](releases/2rc4.md)
   - [2.0 RC3](releases/2rc3.md)
diff --git a/releases/2.0ga.md b/releases/2.0ga.md
new file mode 100644
index 0000000000000..642dc6511e733
--- /dev/null
+++ b/releases/2.0ga.md
@@ -0,0 +1,155 @@
+---
+title: TiDB 2.0 Release Notes
+category: Releases
+---
+
+# TiDB 2.0 Release Notes
+
+On April 27, 2018, TiDB 2.0 GA is released! Compared with TiDB 1.0, this release has great improvement in MySQL compatibility, SQL optimizer, executor, and stability.
+
+## TiDB
+
+- SQL Optimizer
+  - Use more compact data structure to reduce the memory usage of statistics information
+  - Speed up the loading statistics information when starting a tidb-server process
+  - Support updating statistics information dynamically [experimental]
+  - Optimize the cost model to provide more accurate query cost evaluation
+  - Use `Count-Min Sketch` to estimate the cost of point queries more accurately
+  - Support analyzing more complex conditions to make full use of indexes
+  - Support manually specifying the `Join` order using the `STRAIGHT_JOIN` syntax
+  - Use the Stream Aggregation operator when the `GROUP BY` clause is empty to improve the performance
+  - Support using indexes for the `MAX/MIN` function
+  - Optimize the processing algorithms for correlated subqueries to support decorrelating more types of correlated subqueries and transform them to `Left Outer Join`
+  - Extend `IndexLookupJoin` to be used in matching the index prefix
+- SQL Execution Engine
+  - Refactor all operators using the Chunk architecture, improve the execution performance of analytical queries, and reduce memory usage.There is a significant improvement in the TPC-H benchmark result.
+  - Support the Streaming Aggregation operators pushdown
+  - Optimize the `Insert Into Ignore` statement to improve the performance by over 10 times
+  - Optimize the `Insert On Duplicate Key Update` statement to improve the performance by over 10 times
+  - Optimize `Load Data` to improve the performance by over 10 times
+  - Push down more data types and functions to TiKV
+  - Support computing the memory usage of physical operators, and specifying the processing behavior in the configuration file and system variables when the memory usage exceeds the threshold
+  - Support limiting the memory usage by a single SQL statement to reduce the risk of OOM
+  - Support using implicit RowID in CRUD operations
+  - Improve the performance of point queries
+- Server
+  - Support the Proxy Protocol
+  - Add more monitoring metrics and refine the log
+  - Support validating the configuration files
+  - Support obtaining the information of TiDB parameters through HTTP API
+  - Resolve Lock in the Batch mode to speed up garbage collection
+  - Support multi-threaded garbage collection
+  - Support TLS
+- Compatibility
+  - Support more MySQL syntaxes
+  - Support modifying the `lower_case_table_names` system variable in the configuration file to support the OGG data synchronization tool
+  - Improve compatibility with the Navicat management tool
+  - Support displaying the table creating time in `Information_Schema`
+  - Fix the issue that the return types of some functions/expressions differ from MySQL
+  - Improve compatibility with JDBC
+  - Support more SQL Modes
+- DDL
+  - Optimize the `Add Index` operation to greatly improve the execution speed in some scenarios
+  - Attach a lower priority to the `Add Index` operation to reduce the impact on online business
+  - Output more detailed status information of the DDL jobs in `Admin Show DDL Jobs`
+  - Support querying the original statements of currently running DDL jobs using `Admin Show DDL Job Queries JobID`
+  - Support recovering the index data using `Admin Recover Index` for disaster recovery
+  - Support modifying Table Options using the `Alter` statement
+
+## PD
+
+- Support `Region Merge`, to merge empty Regions after deleting data [experimental]
+- Support `Raft Learner` [experimental]
+- Optimize the scheduler
+  - Make the scheduler adapt to different Region sizes
+  - Improve the priority and speed of restoring data during TiKV outage
+  - Speed up data transferring when removing a TiKV node
+  - Optimize the scheduling policies to prevent the disks from becoming full when the space of TiKV nodes is insufficient
+  - Improve the scheduling efficiency of the balance-leader scheduler
+  - Reduce the scheduling overhead of the balance-region scheduler
+  - Optimize the execution efficiency of the hot-region scheduler
+- Operations interface and configuration
+  - Support TLS
+  - Support prioritizing the PD leaders
+  - Support configuring the scheduling policies based on labels
+  - Support configuring stores with a specific label not to schedule the Raft leader
+  - Support splitting Region manually to handle the hotspot in a single Region
+  - Support scattering a specified Region to manually adjust Region distribution in some cases
+  - Add check rules for configuration parameters and improve validity check of the configuration items
+- Debugging interface
+  - Add the `Drop Region` debugging interface
+  - Add the interfaces to enumerate the health status of each PD
+- Statistics
+  - Add statistics about abnormal Regions
+  - Add statistics about Region isolation level
+  - Add scheduling related metrics
+- Performance
+  - Keep the PD leader and the etcd leader together in the same node to improve write performance
+  - Optimize the performance of Region heartbeat
+
+## TiKV
+
+- Features
+  - Protect critical configuration from incorrect modification
+  - Support `Region Merge` [experimental]
+  - Add the `Raw DeleteRange` API
+  - Add the `GetMetric` API
+  - Add `Raw Batch Put`, `Raw Batch Get`, `Raw Batch Delete` and `Raw Batch Scan`
+  - Add Column Family options for the RawKV API and support executing operation on a specific Column Family
+  - Support Streaming and Streaming Aggregation in Coprocessor
+  - Support configuring the request timeout of Coprocessor
+  - Carry timestamps with Region heartbeats
+  - Support modifying some RocksDB parameters online, such as `block-cache-size`
+  - Support configuring the behavior of Coprocessor when it encounters some warnings or errors
+  - Support starting in the importing data mode to reduce write amplification during the data importing process
+  - Support manually splitting Region in halves
+  - Improve the data recovery tool `tikv-ctl`
+  - Return more statistics in Coprocessor to guide the behavior of TiDB
+  - Support the `ImportSST` API to import SST files [experimental]
+  - Add the TiKV Importer binary to integrate with TiDB Lightning to import data quickly [experimental]
+- Performance
+  - Optimize read performance using `ReadPool` and increase the `raw_get/get/batch_get` by 30%
+  - Improve metrics performance
+  - Inform PD immediately once the Raft snapshot process is completed to speed up balancing
+  - Solve performance jitter caused by RocksDB flushing
+  - Optimize the space reclaiming mechanism after deleting data
+  - Speed up garbage cleaning while starting the server
+  - Reduce the I/O overhead during replica migration using `DeleteFilesInRanges`
+- Stability
+  - Fix the issue that gRPC call does not returned when the PD leader switches
+  - Fix the issue that it is slow to offline nodes caused by snapshots
+  - Limit the temporary space usage consumed by migrating replicas
+  - Report the Regions that cannot elect a leader for a long time
+  - Update the Region size information in time according to compaction events
+  - Limit the size of scan lock to avoid request timeout
+  - Limit the memory usage when receiving snapshots to avoid OOM
+  - Increase the speed of CI test
+  - Fix the OOM issue caused by too many snapshots
+  - Configure `keepalive` of gRPC
+  - Fix the OOM issue caused by an increase of the Region number
+
+## TiSpark
+
+TiSpark uses a separate version number. The current TiSpark version is 1.0 GA. The components of TiSpark 1.0 provide distributed computing of TiDB data using Apache Spark.
+
+- Provide a gRPC communication framework to read data from TiKV
+- Provide encoding and decoding of TiKV component data and communication protocol
+- Provide calculation pushdown, which includes:
+  - Aggregate pushdown
+  - Predicate pushdown
+  - TopN pushdown
+  - Limit pushdown
+- Provide index related support
+  - Transform predicate into Region key range or secondary index
+  - Optimize `Index Only` queries
+  - Optimize table scan when runtime index degenerates
+- Provide cost-based optimization
+  - Support statistics
+  - Select index
+  - Estimate broadcast table cost
+- Provide support for multiple Spark interfaces
+  - Support Spark Shell
+  - Support ThriftServer/JDBC
+  - Support Spark-SQL interaction
+  - Support PySpark Shell
+  - Support SparkR
\ No newline at end of file
diff --git a/releases/rn.md b/releases/rn.md
index 111288a9ca953..b6ce75ae1054e 100644
--- a/releases/rn.md
+++ b/releases/rn.md
@@ -5,6 +5,7 @@ category: release
 
 # TiDB Release Notes
 
+ - [2.0](2.0ga.md)
  - [2.0 RC5](2rc5.md)
  - [2.0 RC4](2rc4.md)
  - [2.0 RC3](2rc3.md)

From c676637c6c0b5368e0f3fc70ba6b18c0df92fe29 Mon Sep 17 00:00:00 2001
From: lilin90
Date: Fri, 27 Apr 2018 19:56:06 +0800
Subject: [PATCH 2/2] releases: address comments

---
 releases/2.0ga.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/releases/2.0ga.md b/releases/2.0ga.md
index 642dc6511e733..6b458757a5519 100644
--- a/releases/2.0ga.md
+++ b/releases/2.0ga.md
@@ -11,7 +11,7 @@ On April 27, 2018, TiDB 2.0 GA is released! Compared with TiDB 1.0, this release
 
 - SQL Optimizer
   - Use more compact data structure to reduce the memory usage of statistics information
-  - Speed up the loading statistics information when starting a tidb-server process
+  - Speed up loading statistics information when starting a tidb-server process
   - Support updating statistics information dynamically [experimental]
   - Optimize the cost model to provide more accurate query cost evaluation
   - Use `Count-Min Sketch` to estimate the cost of point queries more accurately
@@ -22,7 +22,7 @@ On April 27, 2018, TiDB 2.0 GA is released! Compared with TiDB 1.0, this release
   - Optimize the processing algorithms for correlated subqueries to support decorrelating more types of correlated subqueries and transform them to `Left Outer Join`
   - Extend `IndexLookupJoin` to be used in matching the index prefix
 - SQL Execution Engine
-  - Refactor all operators using the Chunk architecture, improve the execution performance of analytical queries, and reduce memory usage.There is a significant improvement in the TPC-H benchmark result.
+  - Refactor all operators using the Chunk architecture, improve the execution performance of analytical queries, and reduce memory usage. There is a significant improvement in the TPC-H benchmark result.
   - Support the Streaming Aggregation operators pushdown
   - Optimize the `Insert Into Ignore` statement to improve the performance by over 10 times
   - Optimize the `Insert On Duplicate Key Update` statement to improve the performance by over 10 times
@@ -116,7 +116,7 @@ On April 27, 2018, TiDB 2.0 GA is released! Compared with TiDB 1.0, this release
   - Speed up garbage cleaning while starting the server
   - Reduce the I/O overhead during replica migration using `DeleteFilesInRanges`
 - Stability
-  - Fix the issue that gRPC call does not returned when the PD leader switches
+  - Fix the issue that gRPC call does not get returned when the PD leader switches
   - Fix the issue that it is slow to offline nodes caused by snapshots
   - Limit the temporary space usage consumed by migrating replicas
   - Report the Regions that cannot elect a leader for a long time