docs write --markdown --append hangs forever on Thai content (infinite loop in nextRune) #559

@ninyawee

Summary

gog docs write <docId> --markdown --append --file <md> hangs indefinitely (CPU-bound, no progress, no error) when the markdown content ends with a non-ASCII rune. Thai content reliably triggers it because Thai paragraphs/headings naturally end with a multi-byte rune. The same content works with --markdown --replace and with --append alone (no --markdown).

Version: v0.14.0 (build 469f4b4); also reproducible on current main HEAD (56755e9).

Reproduction

DOC_ID=$(gog docs create "repro" -j --results-only | jq -r .id)

# A single Thai heading is enough to trigger the hang:
printf '## ส่วนคำถาม\n' > /tmp/test-thai.md

# Works (~3s) — replace path goes through Drive's server-side markdown converter
gog docs write "$DOC_ID" --markdown --replace --file /tmp/test-thai.md

# Hangs forever — append path goes through gogcli's local markdown parser
gog docs write "$DOC_ID" --markdown --append --file /tmp/test-thai.md

# Works (<2s) — plain text path bypasses the markdown parser entirely
gog docs write "$DOC_ID" --append --file /tmp/test-thai.md

Empirically the hang burns ~29s of user CPU per 30s wall clock — it is a CPU-bound infinite loop, not a stuck network call.

Root cause

In internal/cmd/docs_markdown.go:459, nextRune returns ("", 0) when its input is a string consisting of a single multi-byte rune:

func nextRune(s string) (string, int) {
    for i, r := range s {
        if i > 0 {
            return s[:i], i
        }
        if len(s) == 1 {  // len() is bytes, not runes
            return s, 1
        }
        _ = r
    }
    return "", 0
}

For input "ก" (one Thai rune, 3 bytes):

  1. Iteration 1: i=0, r='ก'. Neither branch fires (i>0 false, len(s)==1 false because len is 3 bytes).
  2. The range loop ends.
  3. Function returns ("", 0).

The caller in ParseInlineFormatting then runs currentByte += size where size == 0, so currentByte never advances and the outer for currentByte < len(text) loop spins forever.
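The stall can be observed without hanging a terminal. The sketch below copies the nextRune logic quoted above under the name buggyNextRune and drives it from scan, a hypothetical stand-in for the loop in ParseInlineFormatting (the real caller does more work per step). The maxSteps cap is an addition for illustration only, so the program terminates and reports where the offset got stuck.

```go
package main

import "fmt"

// buggyNextRune mirrors the nextRune quoted in this issue (same behavior).
func buggyNextRune(s string) (string, int) {
	for i := range s {
		if i > 0 {
			return s[:i], i
		}
		if len(s) == 1 { // len() counts bytes, so this is false for "ก"
			return s, 1
		}
	}
	return "", 0
}

// scan is a hypothetical stand-in for the caller's loop. It returns the
// final byte offset and whether the whole text was consumed. maxSteps is
// an illustrative guard that the real code does not have.
func scan(text string, maxSteps int) (int, bool) {
	currentByte := 0
	for steps := 0; currentByte < len(text); steps++ {
		if steps >= maxSteps {
			return currentByte, false // no progress: size was 0
		}
		_, size := buggyNextRune(text[currentByte:])
		currentByte += size
	}
	return currentByte, true
}

func main() {
	for _, s := range []string{"a", "ab", "ก", "กข"} {
		r, n := buggyNextRune(s)
		fmt.Printf("buggyNextRune(%q) = (%q, %d)\n", s, r, n)
	}
	off, done := scan("## ส่วนคำถาม", 1000)
	fmt.Printf("offset=%d of %d bytes, finished=%v\n", off, len("## ส่วนคำถาม"), done)
}
```

Every input whose last rune is multi-byte stops one rune short of the end, which matches the observation that a single Thai heading is enough to trigger the hang.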

The condition is reached for any text ending in a multi-byte rune: Thai, Chinese, Japanese, Korean, emoji, etc. The reason --markdown --replace is unaffected is that it uses Drive's native markdown upload (Media(strings.NewReader(cleaned), ContentType(mimeTextMarkdown)) in docs_edit.go:218), bypassing gogcli's local parser entirely. The --append path calls insertDocsMarkdownAt → MarkdownToDocsRequests → ParseInlineFormatting → nextRune.

Affected paths

ParseInlineFormatting is also called from docs_formatter.go for headings, list items, blockquotes, and paragraphs, so any markdown-handling path that runs through the local parser is affected — not just docs write --append --markdown but also docs find-replace --format markdown and docs sed-family commands when the replacement content ends in a multi-byte rune.

Suggested fix

Replace the hand-rolled rune iteration with unicode/utf8.DecodeRuneInString:

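// requires: import "unicode/utf8"
func nextRune(s string) (string, int) {
    if s == "" {
        return "", 0
    }
    // DecodeRuneInString reports the byte width of the first rune (1 for invalid UTF-8).
    _, size := utf8.DecodeRuneInString(s)
    return s[:size], size
}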
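To check that the suggested nextRune always makes progress, here is a small self-contained sketch. splitRunes is a hypothetical helper written for this example (it is not part of gogcli); it uses the same currentByte += size pattern as the caller. Because utf8.DecodeRuneInString returns a width of at least 1 for any non-empty string, including invalid UTF-8, the loop cannot stall.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// nextRune is the suggested fix from this issue.
func nextRune(s string) (string, int) {
	if s == "" {
		return "", 0
	}
	_, size := utf8.DecodeRuneInString(s)
	return s[:size], size
}

// splitRunes is a hypothetical helper: it walks text with nextRune and
// collects each rune as a string.
func splitRunes(text string) []string {
	var out []string
	for currentByte := 0; currentByte < len(text); {
		r, size := nextRune(text[currentByte:])
		out = append(out, r)
		currentByte += size // size >= 1 here, so the loop terminates
	}
	return out
}

func main() {
	fmt.Printf("%q\n", splitRunes("## ก"))    // text ending in a Thai rune
	fmt.Printf("%q\n", splitRunes("ok 👍"))    // text ending in a 4-byte emoji
	fmt.Printf("%q\n", splitRunes("a\xffb")) // invalid byte is consumed as width 1
}
```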

I'm opening a PR with this fix plus regression tests (unit tests for nextRune, timeout-guarded tests for ParseInlineFormatting and MarkdownToDocsRequests on Thai/emoji content).

Environment

  • gog v0.14.0 (build 469f4b4 2026-04-28)
  • Linux 6.17, Go 1.26.2
  • Auth backend: keyring + age secret store (irrelevant — bug is purely local CPU)
