docs: update roadmap description #38253

Merged
merged 13 commits into from Sep 30, 2022
110 changes: 36 additions & 74 deletions roadmap.md
@@ -11,91 +11,62 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
<table>
<thead>
<tr>
<th>Scenario</th>
<th>Domain</th>
<th>Feature</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Support JSON</td>
<td>Support JSON function.</td>
<td>In business scenarios that require flexible schema definitions, the application can use JSON to store information for ODS, transaction indicators, commodities, game characters, and props.</td>
</tr>
<tr>
<td><ul><li>Support expression indexes.</li><li>Support generated columns.</li></ul></td>
<td>Provide query acceleration for specific field indexes in JSON scenarios.</td>
<td rowspan="3">Scalability &amp; Stability</td>
<td>Support resource management framework.</td>
<td><ul><li>Provide a basic resource management and control framework that keeps background tasks from squeezing the resources of front-end tasks (user operations), which improves cluster stability.</li><li>Refine resource management in the multi-service aggregation scenario.</li></ul></td>
</tr>
<tr>
<td>Flashback</td>
<td>Support cluster-level flashback.</td>
<td>In game rollback scenarios, the flashback can be used to achieve a fast rollback of the current cluster. This solves the common problems in the gaming industry such as version errors and bugs.</td>
<td>Enhance the plan cache feature.</td>
<td><ul><li>Support in-session subquery, expression index, and prepared plan cache for partitions, which expands the usage scenarios of plan cache.</li><li>Support plan cache for general SQL statements in a session to save cache resources, improve the hit rate of general execution plans, and improve SQL performance.</li><li>Support cross-session plan cache to save cache resources, improve the hit rate of general execution plans, and improve SQL performance. In general scenarios, reusing execution plans can improve memory utilization and achieve higher throughput.</li></ul></td>
</tr>
<tr>
<td>TiFlash result write-back (supports <code>INSERT INTO SELECT</code>)</td>
<td><ul><li>Easily write the analysis results in TiFlash back to TiDB.</li><li>Provide complete ACID transactions, more convenient and reliable than general ETL solutions.</li><li>Set a hard limit on the threshold of intermediate result size, and report an error if the threshold is exceeded.</li><li>Support fully distributed transactions, and remove or relax the limit on the intermediate result size.</li></ul></td>
<td>These features combined enable a way to materialize intermediate results. The analysis results can be easily reused, which reduces unnecessary ad-hoc queries, improves the performance of BI and other applications (by pulling results directly) and reduces system load (by avoiding duplicated computation), thereby improving the overall data pipeline efficiency and reducing costs. It will make TiFlash an online service.</td>
<td>Support dynamic region.</td>
<td>Support dynamic region size adjustment (heterogeneous) and huge region size for scenarios with fast business growth and a large amount of data.</td>
</tr>
<tr>
<td>Time to live (TTL)</td>
<td>Support automatically deleting expired table data based on custom rules. </td>
<td>This feature enables automatic data cleanup in limited data archiving scenarios.</td>
<td rowspan="4">SQL</td>
<td>Support the JSON function.<ul><li>Expression index</li><li>Multi-value index</li><li>Partial index</li>
</ul></td>
<td>In business scenarios that require flexible schema definitions, the application can use JSON to store information for ODS, transaction indicators, commodities, game characters, and props.</td>
</tr>
<tr>
<td>Multi-value Index</td>
<td>Support array index.</td>
<td>Array is one of the commonly used data types in JSON scenarios. For inclusive queries in arrays, multi-value indexes can efficiently improve the query speed. </td>
<td>Support cluster-level flashback.</td>
<td>In game rollback scenarios, the flashback can be used to achieve a fast rollback of the current cluster. This solves the common problems in the gaming industry such as version errors and bugs.</td>
</tr>
<tr>
<td>TiFlash kernel optimization</td>
<td><ul><li>FastScan provides weak consistency but faster table scan capability.</li><li>Further optimize the join order, shuffle, and exchange algorithms to improve computing efficiency and boost performance for complex queries.</li><li>Add a fine-grained data sharding mechanism to optimize the <code>COUNT(DISTINCT)</code> function and high cardinality aggregation.</li></ul></td>
<td>Improve the basic computing capability of TiFlash, and optimize the performance and reliability of the underlying algorithms of the columnar storage and MPP engine.</td>
<td>Support time to live (TTL).</td>
<td>This feature enables automatic data cleanup in limited data archiving scenarios.</td>
</tr>
<tr>
<td>TiDB proxy</td>
<td>Implement automatic load balancing so that upgrading a cluster or modifying configurations does not affect the application. After scaling out or scaling in the cluster, the application can automatically rebalance the connection without reconnecting.</td>
<td>In scenarios such as upgrades and configuration changes, TiDB proxy is more business-friendly.</td>
<td>Implement a DDL parallel execution framework.</td>
<td>Implement a distributed parallel DDL execution framework, so that DDL tasks executed by only one TiDB Owner node can be coordinated and executed by all TiDB nodes in the cluster. Improve the execution speed of DDL tasks and cluster resource utilization.<br>By converting the execution of DDL tasks to distributed mode, this feature accelerates the execution speed of DDL tasks and improves the utilization of computing resources in the entire cluster. At present, DDL tasks that need to improve the speed include large table indexing and lossy column type modification tasks.</td>
</tr>
<tr>
<td>PB-level scalability</td>
<td>Support huge region size.</td>
<td>Scenarios with fast business growth and a large amount of data</td>
<td rowspan="2">Hybrid Transactional and Analytical Processing (HTAP)</td>
<td>Support TiFlash result write-back.</td>
<td><p>Support <code>INSERT INTO SELECT</code>.</p><ul><li>Easily write analysis results in TiFlash back to TiDB.</li><li>Provide complete ACID transactions, more convenient and reliable than general ETL solutions.</li><li>Set a hard limit on the threshold of intermediate result size, and report an error if the threshold is exceeded.</li><li>Support fully distributed transactions, and remove or relax the limit on the intermediate result size.</li></ul><p>These features combined enable a way to materialize intermediate results. The analysis results can be easily reused, which reduces unnecessary ad-hoc queries, improves the performance of BI and other applications (by pulling results directly) and reduces system load (by avoiding duplicated computation), thereby improving the overall data pipeline efficiency and reducing costs. It will make TiFlash an online service.</p></td>
</tr>
<tr>
<td>Distributed DDL parallel framework</td>
<td>Implement a distributed parallel DDL execution framework, so that DDL tasks executed by only one TiDB Owner node can be coordinated and executed by all TiDB nodes in the cluster. Improve the execution speed of DDL tasks and cluster resource utilization.</td>
<td>By converting the execution of DDL tasks to distributed mode, this feature accelerates the execution speed of DDL tasks and improves the utilization of computing resources in the entire cluster. At present, DDL tasks that need to improve the speed include large table indexing and lossy column type modification tasks.</td>
<td>Support FastScan for TiFlash.</td>
<td><ul><li>FastScan provides weak consistency but faster table scan capability.</li><li>Further optimize the join order, shuffle, and exchange algorithms to improve computing efficiency and boost performance for complex queries.</li><li>Add a fine-grained data sharding mechanism to optimize the <code>COUNT(DISTINCT)</code> function and high cardinality aggregation.</li></ul><p>This feature improves the basic computing capability of TiFlash, and optimizes the performance and reliability of the underlying algorithms of the columnar storage and MPP engine.</p></td>
</tr>
<tr>
<td>Non-prepared Plan Cache</td>
<td>Support plan cache for general SQL statements in a session to save cache resources, improve the hit rate of general execution plans, and improve SQL performance.</td>
<td>Non-prepared plan cache. Improve the real-time performance and throughput of OLTP in general scenarios.</td>
<td>Proxy</td>
<td>Support TiDB proxy.</td>
<td>Implement automatic load balancing so that upgrading a cluster or modifying configurations does not affect the application. After scaling out or scaling in the cluster, the application can automatically rebalance the connection without reconnecting.<br>In scenarios such as upgrades and configuration changes, TiDB proxy is more business-friendly.</td>
</tr>
<tr>
<td>SQL blocklist</td>
<td>Support a rule-based SQL blocklist mechanism.</td>
<td>Maintenance</td>
<td>Support rule-based SQL blocklist.</td>
<td>In multi-service aggregation scenarios, provide SQL management and control capabilities, and improve cluster stability by prohibiting high-resource-consuming SQL statements.</td>
</tr>
<tr>
<td>Resource management</td>
<td>Provide a basic resource management and control framework that keeps background tasks from squeezing the resources of front-end tasks (user operations), which improves cluster stability.</td>
<td>Refine resource management in the multi-service aggregation scenario.</td>
</tr>
<tr>
<td>Prepared Plan Cache</td>
<td>Support in-session subquery, expression index, and prepared plan cache for Partition.</td>
<td>Expand the usage scenarios of plan cache.</td>
</tr>
<tr>
<td>PB-level scalability</td>
<td>Support dynamic region size adjustment (heterogeneous).</td>
<td>For scenarios with fast business growth and a large amount of data.</td>
</tr>
<tr>
<td>Instance plan cache</td>
<td>Support cross-session plan cache, save cache resources, improve the hit rate of general execution plans, and improve SQL performance.</td>
<td>In general scenarios, reuse execution plans to improve memory utilization and to achieve higher throughputs.</td>
</tr>
</tbody>
</table>
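
The TiFlash result write-back rows in the table above describe materializing analysis results with `INSERT INTO SELECT` inside a transaction, with an error reported when the intermediate result exceeds a limit. The following is a minimal sketch of that pattern only. It uses Python's built-in SQLite driver as a stand-in rather than TiDB or TiFlash, and the table names, the `write_back_summary` helper, and the row-count `max_rows` limit are hypothetical choices made for illustration, not part of the roadmap.

```python
import sqlite3


def write_back_summary(conn, max_rows):
    """Materialize an aggregate with INSERT INTO ... SELECT in one transaction.

    max_rows is a hypothetical stand-in for the intermediate-result size limit.
    """
    select_sql = "SELECT region, SUM(amount) FROM orders GROUP BY region"

    # Check the size of the intermediate result before writing anything.
    (n,) = conn.execute(f"SELECT COUNT(*) FROM ({select_sql})").fetchone()
    if n > max_rows:
        raise RuntimeError(f"intermediate result has {n} rows, limit is {max_rows}")

    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute("DELETE FROM sales_summary")
        conn.execute(f"INSERT INTO sales_summary (region, total) {select_sql}")
    return n


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.execute(
        "CREATE TABLE sales_summary (region TEXT PRIMARY KEY, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 10), ("east", 5), ("west", 7)],
    )
    conn.commit()
    write_back_summary(conn, max_rows=10)
    return conn.execute(
        "SELECT region, total FROM sales_summary ORDER BY region"
    ).fetchall()
```

Applications that read from `sales_summary` reuse the stored result instead of recomputing the aggregate, which is the reuse benefit the description refers to.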

@@ -104,7 +75,7 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
<table>
<thead>
<tr>
<th>Scenario</th>
<th>Domain</th>
<th>Feature</th>
<th>Description</th>
</tr>
@@ -123,15 +94,15 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
<table>
<thead>
<tr>
<th>Scenario</th>
<th>Domain</th>
<th>Feature</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Backup and restore</td>
<td>EBS snapshot-based backup and restore</td>
<td>AWS EBS or GCP persistent disk snapshot-based backup and restore.</td>
<td>Support backup and restore based on AWS EBS or GCP persistent disk snapshots.</td>
</tr>
<tr>
@@ -141,7 +112,7 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
</tr>
<tr>
<td rowspan="2">Data replication to downstream systems via TiCDC</td>
<td>Reduce TiCDC replication latency in planned offline scenarios.</td>
<td>Reduce TiCDC replication latency in daily operations.</td>
<td>When TiKV, TiDB, PD, or TiCDC nodes are offline in a planned maintenance window, the replication latency of TiCDC can be reduced to less than 10 seconds.</td>
</tr>
<tr>
@@ -150,7 +121,7 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
</tr>
<tr>
<td>Data migration</td>
<td>TiDB Lightning supports table-level and partition-level incremental data import.</td>
<td>TiDB Lightning supports table-level and partition-level online data import.</td>
<td>TiDB Lightning provides comprehensive table-level and partition-level data import capabilities.</td>
</tr>
</tbody>
@@ -161,24 +132,15 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
<table>
<thead>
<tr>
<th>Scenario</th>
<th>Domain</th>
<th>Feature</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">ShangMi (SM) algorithms</td>
<td>Encryption-at-rest (TiKV and TiFlash) supports the SM4 algorithm.</td>
<td>Supports encrypting data stored in TiKV and TiFlash based on the SM4 algorithm.</td>
</tr>
<tr>
<td>TiDB authentication supports the SM3 algorithm. </td>
<td>Provide a user authentication plugin based on the SM3 algorithm, which encrypts the password using the SM3 algorithm.</td>
</tr>
<tr>
<td>Log redaction</td>
<td><ul><li>Data redaction in execution plans in TiDB Dashboard.</li><li>Data redaction in TiDB-related logs.</li></ul></td>
<td><ul><li>Support data redaction in execution plans in TiDB Dashboard.</li><li>Enhance data redaction in TiDB-related logs.</li></ul></td>
<td>Redact sensitive information in execution plans and various logs to enhance the security of user data.</td>
</tr>
<tr>
@@ -202,7 +164,7 @@ This roadmap brings you what's coming in the 1-year future, so you can see the n
<td>TiDB already supports cluster-level, database-level, and table-level privilege management. On top of that, TiDB will support column-level privilege management to meet the principle of least privilege and provide fine-grained data access control.</td>
</tr>
<tr>
<td>Audit logging capability refactor</td>
<td>Audit logging capability enhancement</td>
<td>Support configurable audit log policies, configurable audit filters (filter by objects, users, and operation types), and visual access to audit logs.</td>
<td>Improve the completeness and usability of the audit log feature.</td>
</tr>