[fix](doc) trim/field/substring: bidirectional en/zh sync by boluor · Pull Request #3839 · apache/doris-website

boluor · 2026-05-28T16:02:38Z

Summary

Careful bidirectional sync for three pages where EN and ZH had each grown distinct non-redundant examples. Each addition was verified end-to-end against Doris 4.1.1; expected outputs are pasted from the cluster, not transcribed.

`trim.md`

- EN ← ZH: add example 9, a UTF-8 string with a UTF-8 trim pattern. It demonstrates that `trim('ṭṛì ḍḍumai+++', 'ṭṛì')` strips only the leading `ṭṛì` (the trailing `+++` doesn't match the pattern, so trim stops there). EN previously had only a default-space UTF-8 example.
- ZH ← EN: add examples 8/9/10: no-match returns the original, repeated-pattern strips until exhausted, and asymmetric removal (multi-char pattern from both ends). EN had these corner cases as 4/7/8; ZH didn't.
- ZH fix: trailing `;;` (double semicolon) on example 7 → single `;`.
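
The two-argument trim behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not Doris code: the name `trim_pattern` is hypothetical, and the empty-pattern guard is an assumption made so the loop terminates. The model treats the pattern as a whole substring and removes it repeatedly from each end, which is consistent with the outputs quoted in this PR and its backport.

```python
def trim_pattern(s: str, pattern: str) -> str:
    """Illustrative model: strip `pattern` as a whole substring from both ends, repeatedly."""
    if not pattern:
        # Assumption: an empty pattern removes nothing (guard avoids an infinite loop).
        return s
    # Remove every leading occurrence of the pattern.
    while s.startswith(pattern):
        s = s[len(pattern):]
    # Remove every trailing occurrence of the pattern.
    while s.endswith(pattern):
        s = s[:-len(pattern)]
    return s


# Leading match is stripped; the trailing '+++' does not match, so it stays.
print(repr(trim_pattern('ṭṛì ḍḍumai+++', 'ṭṛì')))  # ' ḍḍumai+++'
# Repeated pattern: stripped from both ends until exhausted.
print(trim_pattern('ababccaab', 'ab'))  # cca
# No match: the original string is returned unchanged.
print(trim_pattern('hello', 'xyz'))  # hello
```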

`field.md`

- EN ← ZH: extend the `class_test` setup with a NULL row, then add two examples that exercise NULL handling under FIELD-based custom ordering: a DESC variant (NULL ends up last because FIELD = 0) and NULLS FIRST (NULL placement independent of sort direction). Updated the existing ASC example's expected output to include the NULL row that the new setup row produces.
- ZH ← EN: add the simple `SELECT FIELD(2, 3, 1, 2, 5)` example that introduces the 1-based position semantics before the ORDER BY use case.
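
As a rough illustration of why a NULL row sorts last under DESC, here is a small Python model of the FIELD semantics described above. It is a sketch under stated assumptions: `field` and `order_by_field` are hypothetical helper names, the sample names are invented, and the ordering is assumed to use the FIELD value as the only sort key. The NULLS FIRST variant is not modeled.

```python
from typing import Any, List


def field(expr: Any, *values: Any) -> int:
    """Illustrative model: 1-based position of expr in values, 0 if absent or NULL."""
    if expr is None:
        return 0  # NULL never matches, so it maps to 0
    for position, value in enumerate(values, start=1):
        if value == expr:
            return position
    return 0


def order_by_field(rows: List[Any], *values: Any, desc: bool = False) -> List[Any]:
    """Sort rows by their FIELD value only (assumed single sort key)."""
    return sorted(rows, key=lambda row: field(row, *values), reverse=desc)


print(field(2, 3, 1, 2, 5))  # 3: the value 2 sits at 1-based position 3

names = ["Ben", None, "Suzi", "Henry"]  # invented sample rows, None stands in for NULL
print(order_by_field(names, "Suzi", "Ben", "Henry", desc=True))
# ['Henry', 'Ben', 'Suzi', None]: the NULL row has FIELD = 0, the smallest key
```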

`substring.md`

- EN ← ZH: add example 12, `substring(NULL, 1, 3)` (NULL passed to the function directly; the MID alias variant was already example 8).
- ZH ← EN: add example 12, `SUBSTR('Hello World', 7, 5)`, showing the SUBSTR alias.
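
The two added substring cases can be mirrored with a minimal Python model. This is a sketch, not the Doris implementation: `substring` here is a hypothetical helper that covers only positive 1-based positions and NULL input, and it deliberately rejects zero or negative positions rather than guess at their behavior.

```python
from typing import Optional


def substring(s: Optional[str], pos: int, length: Optional[int] = None) -> Optional[str]:
    """Illustrative model: 1-based substring; NULL (None) input yields NULL (None)."""
    if s is None:
        return None
    if pos < 1:
        # Zero and negative positions are intentionally not modeled in this sketch.
        raise ValueError("only positive 1-based positions are modeled")
    start = pos - 1  # convert the 1-based position to a 0-based index
    end = len(s) if length is None else start + length
    return s[start:end]


print(substring('Hello World', 7, 5))  # World
print(substring(None, 1, 3))  # None
```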

Scope note

Two other P1-both-extra pages remain intentionally not in this PR:

- `strright.md`: EN and ZH are already at parity; the only diff is a cosmetic "syntax block lists the RIGHT alias twice" formatting difference. Not worth a doc change.
- `array-count.md`: EN and ZH each have substantially different example sets covering the same teaching points with different idioms (e.g. EN uses `size(array_filter(...)) > 0` to show the "wrap inner higher-order in a scalar" pattern; ZH uses recursive `array_count(... array_count(...) ...)`). Both are valid; mechanically merging them risks a confusing doc.

🤖 Generated with Claude Code

…atch)

trim.md
- EN <- ZH: add example 9, a UTF-8 string with a UTF-8 trim pattern, showing leading-side strip while trailing non-match prevents further removal ('ṭṛì ḍḍumai+++', 'ṭṛì' -> ' ḍḍumai+++').
- ZH <- EN: add examples 8/9/10: no-match (returns original), repeated-pattern (strips until exhausted), and asymmetric removal (multi-char pattern from both ends).
- Plus a small ZH fix: trailing ';;' typo on example 7 -> single ';'.

field.md
- EN <- ZH: extend the class_test setup with a NULL row, then add two examples that exercise NULL handling under FIELD-based custom ordering: DESC variant (NULL last because FIELD = 0) and NULLS FIRST (NULL placement independent of sort direction). Updated the existing ASC example's expected output to include the NULL row.
- ZH <- EN: add the simple 'SELECT FIELD(2, 3, 1, 2, 5)' example that introduces the 1-based position semantics before the ORDER BY use case.

substring.md
- EN <- ZH: add example 12, NULL passed directly to substring (the MID alias variant was already example 8).
- ZH <- EN: add example 12, the SUBSTR alias variant ('Hello World', 7, 5) -> 'World'.

All added examples verified end-to-end against Doris 4.1.1; expected outputs were captured from the cluster, not transcribed across.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…orner-case adds (en+zh) (#3848)

## Summary

Mechanical port of the 4.x fixes in #3833, #3837, #3838, #3839 to dev/master. Verified against today's master build (selectdb-qa-test tarball).

**Skipped** (already applied on dev):
- strleft.md ZH dedup (#3837)
- milliseconds-add.md EN BIGINT-range example (#3834 EN-side; the ZH duplicate-removal piece is in the sibling PR)

## Files (13)

### EN string-functions
- **from-base64.md, instr.md, length.md, locate.md, lpad.md**: backport corner-case examples (NULL / empty / multi-byte UTF-8 / numeric / etc.) added in #3833
- **rtrim.md**: add the LENGTH-based 'default-only-strips-ASCII-space' example (#3837)
- **substring.md**: add the missing 'empty source string' example + 'NULL passed directly' (#3839)
- **trim.md**: replace example 2's prose + expected output (`trim('ababccaab', 'ab')` is `cca`, not `ababcca`; trim repeatedly strips from both ends), plus the UTF-8 multi-byte-pattern example (#3839)

### EN other-functions
- **field.md**: full replace with 4.x post-fix version: adds CREATE TABLE setup for `baseall` and `class_test` (which the page references but never created), adds a NULL row to `class_test`, adds DESC and NULLS FIRST examples that exercise NULL handling (#3789 + #3839 combined)

### ZH
- **field.md**: add the simple `SELECT FIELD(2, 3, 1, 2, 5)` example (#3839)
- **ltrim.md**: rewrite example 3 with correct expected output; LTRIM strips ASCII space only, NOT `\t`/`\n`, so the result still contains the tab/newline. Switched to a LENGTH() comparison for clarity (#3838)
- **substring.md**: add the SUBSTR alias example (#3839)
- **trim.md**: add three examples (no-match returns original; repeated pattern strips until exhausted; asymmetric removal with multi-char pattern). Also drop a trailing `;;` typo (#3839)

## Verification

Verified end-to-end against today's master cluster: all added / modified examples behave identically on master as on 4.1.1, so the doc fixes apply unchanged.

## Related 4.x PRs

#3833 #3837 #3838 #3839 (and #3789 for the field.md setup blocks).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
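
The ltrim.md note in the backport description (default LTRIM strips ASCII space only, leaving `\t` and `\n` in place) can be illustrated with a short Python model. This is a sketch only: `ltrim_ascii_space` is a hypothetical name, and Python's `len` counts characters, which matches a byte-length comparison only for ASCII input like the sample used here.

```python
def ltrim_ascii_space(s: str) -> str:
    """Illustrative model: remove leading ASCII spaces only, not tabs or newlines."""
    return s.lstrip(' ')


sample = '  \thello'  # two spaces, one tab, then text
result = ltrim_ascii_space(sample)

# Comparing lengths makes the surviving tab visible without printing whitespace.
print(len(sample), len(result))  # 8 6
print(result.startswith('\t'))  # True
```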

boluor mentioned this pull request May 29, 2026

[fix](doc) dev: backport 4.x string-function example expansions and corner-case adds (en+zh) #3848

Merged

boluor temporarily deployed to Production May 30, 2026 02:44 — with GitHub Actions Inactive

morningman merged commit 7f3b1a7 into apache:master May 30, 2026
3 checks passed

boluor deleted the fix/trim-field-substring-bidir-sync-deferred branch May 30, 2026 08:25

morningman merged 1 commit into apache:master from boluor:fix/trim-field-substring-bidir-sync-deferred
