[docs] Improve POC guide with bucket sizing rules and heading cleanup #3445
- Replace bucket guidance with clear four-rule approach
- Rename section headings (Sort Key, Example Templates, Performance Pitfalls)
- Merge sparse partition section into single paragraph
- Remove unnecessary Fixing Mistakes section
- Fix broken link and rule count reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Demote sections 1-4 and Important Notes from h2 to h3, nested under a new 'Table Design' (建表设计) h2 parent heading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Improves the POC table-design guide by clarifying bucket sizing guidance (four-rule approach), cleaning up headings, and simplifying/streamlining sections in both EN and ZH docs.
Changes:
- Replaces prior bucket-count guidance with a clearer four-rule approach and updates related references.
- Renames several headings for clarity and consistency (e.g., Key Columns → Sort Key).
- Consolidates/simplifies partitioning guidance and removes the “choose wrong” remediation section.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/must-read-before-poc.md | Updates ZH headings/wording to match the new bucket rules and streamlined layout. |
| docs/gettingStarted/must-read-before-poc.md | Updates EN headings/wording to match the new bucket rules and streamlined layout. |
  For a POC, **Duplicate Key works for most scenarios**. Switch only if you have a clear need for upsert or pre-aggregation. For a detailed comparison, see [Data Model Overview](../table-design/data-model/overview).
- ## 2. Key Columns
+ ### 2. Sort Key
The section heading was renamed to “Sort Key”, but this paragraph still refers to “Key columns” throughout. To avoid confusing readers, update the terminology to consistently use “Sort Key” (or “sort key columns”) in this explanation (including “first 36 bytes of …”).
### 2. Sort Key
**Why it matters:** The sort key determines the **physical sort order** on disk. Doris builds a [prefix index](../table-design/index/prefix-index) on the first 36 bytes of the sort key columns, so queries that filter on these columns run significantly faster. However, when a `VARCHAR` column is encountered, the prefix index stops immediately, and no subsequent columns are included. Place fixed-size columns (INT, BIGINT, DATE) before VARCHAR to maximize index coverage.
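
To make the ordering advice concrete, here is a minimal sketch of a table whose sort key puts fixed-size columns ahead of the `VARCHAR` column. The table name, column names, hash column, bucket count, and `replication_num` value are all hypothetical placeholders chosen for illustration; they are not taken from the guide and should be adapted to the actual workload and to the guide's bucket sizing rules.

```sql
-- Hypothetical example table: all names and numbers below are illustrative assumptions.
CREATE TABLE example_events
(
    event_date DATE         NOT NULL,  -- fixed-size, first in the sort key
    user_id    BIGINT       NOT NULL,  -- fixed-size (8 bytes)
    event_type INT          NOT NULL,  -- fixed-size (4 bytes)
    url        VARCHAR(256),           -- variable-length: prefix index coverage ends at this column
    payload    STRING                  -- non-key column
)
DUPLICATE KEY(event_date, user_id, event_type, url)  -- fixed-size columns listed before VARCHAR
DISTRIBUTED BY HASH(user_id) BUCKETS 8               -- placeholder bucket count
PROPERTIES ("replication_num" = "1");                -- assumes a single-node POC cluster
```

If `url` were listed first in `DUPLICATE KEY`, the prefix index would end at that column and filters on `event_date` or `user_id` would not benefit from it.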
@@ -41,27 +43,21 @@ CREATE TABLE my_table
  POC 阶段,**Duplicate Key 适用于大多数场景**。只有在明确需要更新或预聚合时才切换。详细对比见[数据模型概述](../table-design/data-model/overview)。
In the ZH doc, most headings are primarily Chinese, but this one is English-first (“Sort Key(排序键)”). For consistency/readability in the localized page, consider switching to a Chinese-first form like “排序键(Sort Key)” (or fully Chinese if that matches the rest of the page style).
POC 阶段,**Duplicate Key 适用于大多数场景**。只有在明确需要更新或预聚合时才切换。详细对比见[数据模型概述](../table-design/data-model/overview)。
## 2. 排序键(Sort Key)
Significantly shorten Table Design section: remove verbose explanations, keep only actionable guidance, and link to existing docs for details. Also trim Example Templates descriptions and Performance Pitfalls to one-liners with references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix "performance can sustain" → "performance holds up"
- Merge competing intros under Table Design
- Replace em dashes with periods/commas throughout
- Remove "small tablets" bullet (overlaps with bucket rule 2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove repeated "data model cannot be changed" (already in intro)
- Replace repeated sort key advice with anchor link

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ify load_to_single_tablet Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Test plan
🤖 Generated with Claude Code