fix(docs): unblock markdown --append on Thai/CJK/emoji content #560
Open
ninyawee wants to merge 1 commit into openclaw:main
…ing markdown append

`gog docs write --markdown --append` looped forever on any markdown content ending in a non-ASCII rune (Thai, CJK, emoji, etc.). The trigger was `nextRune` in docs_markdown.go: for a string containing exactly one multi-byte rune, the range-loop implementation never returned a non-zero size, and the caller in `ParseInlineFormatting` spun on `currentByte += 0`.

`--markdown --replace` was unaffected because that path uploads the markdown to Drive's server-side converter; only `--append` and the find-replace/sed paths invoke the local parser.

Replace the hand-rolled iteration with `utf8.DecodeRuneInString`, which consistently handles empty input, single-byte ASCII, multi-byte runes, and invalid UTF-8 (returns `utf8.RuneError` with a width of 1).

Adds three regression tests covering the unit (`nextRune`), the immediate caller (`ParseInlineFormatting` on Thai/emoji), and the end-to-end markdown→Docs-API path used by `--append` (`MarkdownToDocsRequests` at a non-zero base index). Each is wrapped in a 2–5s timeout so a future infinite-loop regression fails fast instead of stalling CI.

Fixes openclaw#559
Summary
Fixes #559. `gog docs write --markdown --append` hung indefinitely (CPU-bound infinite loop) on any markdown content ending in a non-ASCII rune. Thai content reliably triggered it because Thai paragraphs/headings naturally end with multi-byte runes; the same bug applies to CJK and emoji content.

The hang was in `nextRune` (`internal/cmd/docs_markdown.go`). For a string containing exactly one multi-byte rune, the previous range-based implementation returned size 0, and the caller in `ParseInlineFormatting` advanced `currentByte += 0` forever inside its `for currentByte < len(text)` loop.

`--markdown --replace` is unaffected because that path uploads to Drive's server-side markdown converter (`Media(strings.NewReader(cleaned), ContentType(mimeTextMarkdown))` in `docs_edit.go`). Only `--append` and the find-replace/sed paths invoke the local parser, so this issue affects:

- `docs write --markdown --append`
- `docs find-replace --format markdown` (when replacement text ends in a multi-byte rune)
- `docs sed` family commands using markdown replacement content

Fix
Replace the hand-rolled rune iteration with `utf8.DecodeRuneInString`. It handles every case correctly: empty input, ASCII, multi-byte runes, and invalid UTF-8 (`RuneError` with a width of 1, so the caller still advances).

Test plan
- Unit tests for `nextRune` covering the previously-broken paths (single Thai rune, single emoji, empty input, mixed strings).
- `ParseInlineFormatting` on Thai headings, FAQ-style mixed text, bold/italic/code mixed with Thai, and trailing emoji. Each call is wrapped in a 2s timeout so any future infinite-loop regression fails fast in CI rather than stalling it.
- `TestMarkdownToDocsRequests_ThaiAppend` exercising the exact path used by `docs write --markdown --append` (`ParseMarkdown` → `MarkdownToDocsRequests` at a non-zero base index) on a multi-element Thai sample (heading + paragraph + bullet list + blockquote).
- The new tests fail on `main` HEAD (each times out and the goroutine stack confirms it's stuck inside `ParseInlineFormatting` at line 434) and pass with the fix.
- `make lint` clean. Full `go test ./internal/cmd/...` passes apart from a pre-existing failure in `TestRequireAccount_Missing` that is unrelated to this change (also fails on stock `main`).
- End-to-end run (`--markdown --append` on a single-line Thai heading) used to time out at 30s; now completes in ~1.4s.

Notes
`mise` (`github:steipete/gogcli@0.14.0`); the bug reproduces identically on current `main` (`56755e9`), which is what this branch targets.