[fix](doc) trim/field/substring: bidirectional en/zh sync #3839
Merged
morningman merged 1 commit on May 30, 2026
…atch)
trim.md
EN <- ZH: add example 9 — UTF-8 string with a UTF-8 trim pattern,
showing leading-side strip while trailing non-match prevents further
removal ('ṭṛì ḍḍumai+++', 'ṭṛì' -> ' ḍḍumai+++').
ZH <- EN: add examples 8/9/10 — no-match (returns original),
repeated-pattern (strips until exhausted), and asymmetric removal
(multi-char pattern from both ends). Plus a small ZH fix: trailing
';;' typo on example 7 -> single ';'.
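
The trim behaviour these examples cover can be modelled in a few lines of Python. This is only a sketch of the semantics described above (strip the pattern repeatedly from both ends, stop at the first non-match), not Doris's implementation; `trim_pattern` is a hypothetical helper name, and the empty-pattern branch is an assumption.

```python
def trim_pattern(text: str, pattern: str) -> str:
    """Model of two-argument trim as described in this PR (not Doris code)."""
    if not pattern:
        # Assumption: an empty pattern removes nothing.
        return text
    # Strip from the leading side until the prefix no longer matches.
    while text.startswith(pattern):
        text = text[len(pattern):]
    # Strip from the trailing side until the suffix no longer matches.
    while text.endswith(pattern):
        text = text[:-len(pattern)]
    return text


# UTF-8 pattern: the leading match is removed, the trailing '+++' is not a match.
print(repr(trim_pattern("ṭṛì ḍḍumai+++", "ṭṛì")))  # ' ḍḍumai+++'
# No match at either end: the original string comes back unchanged.
print(repr(trim_pattern("hello", "xyz")))  # 'hello'
# Repeated pattern: stripping continues until the pattern is exhausted.
print(repr(trim_pattern("ababccaab", "ab")))  # 'cca'
```

The last call mirrors the `trim('ababccaab', 'ab')` case, where the result is `cca` because the pattern is removed from both ends more than once.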
field.md
EN <- ZH: extend the class_test setup with a NULL row, then add two
examples that exercise NULL handling under FIELD-based custom
ordering — DESC variant (NULL last because FIELD = 0) and
NULLS FIRST (NULL placement independent of sort direction). Updated
the existing ASC example's expected output to include the NULL row.
ZH <- EN: add the simple 'SELECT FIELD(2, 3, 1, 2, 5)' example that
introduces the 1-based position semantics before the ORDER BY use
case.
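
The FIELD semantics referenced here (1-based position, 0 for no match or NULL) and their effect on custom ordering can be sketched in Python. This is a model under stated assumptions, not the Doris implementation: `field` is a hypothetical stand-in, the rows are made up for illustration rather than taken from `class_test`, and the NULLS FIRST variant is not modelled.

```python
def field(target, *candidates):
    """Return the 1-based position of target in candidates, or 0 if absent or NULL."""
    if target is None:
        # NULL never matches, so it maps to 0 (as the DESC example relies on).
        return 0
    for position, candidate in enumerate(candidates, start=1):
        if candidate == target:
            return position
    return 0


print(field(2, 3, 1, 2, 5))  # 3: the value 2 is the third candidate

# Hypothetical rows, including one NULL (None), ordered by a custom list.
rows = ["Ben", None, "Suzi", "Henry"]


def sort_key(name):
    return field(name, "Suzi", "Ben", "Henry")


print(sorted(rows, key=sort_key))                # [None, 'Suzi', 'Ben', 'Henry']
print(sorted(rows, key=sort_key, reverse=True))  # ['Henry', 'Ben', 'Suzi', None]
```

In the descending sort the NULL row lands last purely because its key is 0, which is the point the DESC example makes.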
substring.md
EN <- ZH: add example 12 — NULL passed directly to substring (the MID
alias variant was already example 8).
ZH <- EN: add example 12 — the SUBSTR alias variant
('Hello World', 7, 5) -> 'World'.
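
For readers less familiar with 1-based substring positions, the two example-12 cases can be modelled in Python. This is a sketch only: `substring` is a hypothetical helper, it covers just positive positions and non-negative lengths, and it makes no claim about how Doris handles other inputs.

```python
from typing import Optional


def substring(text: Optional[str], pos: int, length: int) -> Optional[str]:
    """Model of substring(text, pos, length) with a 1-based pos (not Doris code)."""
    if text is None:
        # NULL in, NULL out.
        return None
    if pos < 1 or length < 0:
        # Other inputs are deliberately left out of this sketch.
        raise ValueError("only positive pos and non-negative length are modelled")
    start = pos - 1  # convert the 1-based position to a 0-based index
    return text[start:start + length]


print(substring("Hello World", 7, 5))  # World
print(substring(None, 1, 3))           # None
```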
All added examples verified end-to-end against Doris 4.1.1; expected
outputs were captured from the cluster, not transcribed across.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
morningman pushed a commit that referenced this pull request on May 30, 2026
…orner-case adds (en+zh) (#3848)

## Summary

Mechanical port of the 4.x fixes in #3833, #3837, #3838, #3839 to dev/master. Verified against today's master build (selectdb-qa-test tarball).

**Skipped** (already applied on dev):
- strleft.md ZH dedup (#3837)
- milliseconds-add.md EN BIGINT-range example (#3834 EN-side; the ZH duplicate-removal piece is in the sibling PR)

## Files (13)

### EN string-functions
- **from-base64.md, instr.md, length.md, locate.md, lpad.md**: backport corner-case examples (NULL / empty / multi-byte UTF-8 / numeric / etc.) added in #3833
- **rtrim.md**: add the LENGTH-based 'default-only-strips-ASCII-space' example (#3837)
- **substring.md**: add the missing 'empty source string' example + 'NULL passed directly' (#3839)
- **trim.md**: replace example 2's prose + expected output (`trim('ababccaab', 'ab')` is `cca`, not `ababcca`, because trim repeatedly strips from both ends), plus the UTF-8 multi-byte-pattern example (#3839)

### EN other-functions
- **field.md**: full replace with 4.x post-fix version: adds CREATE TABLE setup for `baseall` and `class_test` (which the page references but never created), adds a NULL row to `class_test`, adds DESC and NULLS FIRST examples that exercise NULL handling (#3789 + #3839 combined)

### ZH
- **field.md**: add the simple `SELECT FIELD(2, 3, 1, 2, 5)` example (#3839)
- **ltrim.md**: rewrite example 3 with correct expected output; LTRIM strips ASCII space only, NOT `\t`/`\n`, so the result still contains the tab/newline. Switched to a LENGTH() comparison for clarity (#3838)
- **substring.md**: add the SUBSTR alias example (#3839)
- **trim.md**: add three examples (no-match returns original; repeated pattern strips until exhausted; asymmetric removal with multi-char pattern). Also drop a trailing `;;` typo (#3839)

## Verification

Verified end-to-end against today's master cluster. All added / modified examples behave identically on master as on 4.1.1, so the doc fixes apply unchanged.

## Related 4.x PRs

#3833 #3837 #3838 #3839 (and #3789 for the field.md setup blocks).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Careful bidirectional sync for three pages where EN and ZH had each grown distinct non-redundant examples. Each addition was verified end-to-end against Doris 4.1.1; expected outputs are pasted from the cluster, not transcribed.
trim.md
- EN <- ZH: add example 9, where `trim('ṭṛì ḍḍumai+++', 'ṭṛì')` strips only the leading `ṭṛì` (the trailing `+++` doesn't match the pattern, so trim stops there). EN previously had only a default-space UTF-8 example.
- ZH <- EN: add examples 8/9/10: no-match (returns original), repeated-pattern (strips until exhausted), and asymmetric removal (multi-char pattern from both ends). Also a small ZH fix: `;;` (double semicolon) on example 7 → single `;`.

field.md
- EN <- ZH: extend the `class_test` setup with a `NULL` row, then add two examples that exercise NULL handling under FIELD-based custom ordering: a DESC variant (NULL ends up last because FIELD = 0) and NULLS FIRST (NULL placement independent of sort direction). Updated the existing ASC example's expected output to include the NULL row that the new setup row produces.
- ZH <- EN: add the simple `SELECT FIELD(2, 3, 1, 2, 5)` example that introduces the 1-based position semantics before the ORDER BY use case.

substring.md
- EN <- ZH: add example 12, `substring(NULL, 1, 3)` (NULL passed to the function directly; the `MID` alias variant was already example 8).
- ZH <- EN: add example 12, `SUBSTR('Hello World', 7, 5)`, showing the SUBSTR alias.

Scope note
Two other P1-both-extra pages remain intentionally not in this PR:
- strright.md: EN and ZH are already at parity; the only diff is a cosmetic "syntax block lists the RIGHT alias twice" formatting difference. Not worth a doc change.
- array-count.md: EN and ZH each have substantially different example sets covering the same teaching points with different idioms (e.g. EN uses `size(array_filter(...)) > 0` to show the "wrap inner higher-order in a scalar" pattern; ZH uses recursive `array_count(... array_count(...) ...)`). Both are valid; mechanically merging them risks a confusing doc.

🤖 Generated with Claude Code