[follow-up] Truncate-safe doesn't fully clean nested-bracket labels #17

@justi

Context

Surfaced during codex review of #14. After that PR, _truncate_safe correctly cleans flat unclosed markdown links at the truncation boundary, but a label with one level of CommonMark bracket nesting can leave a fragment of the outer label behind:

[outer [inner](u1)](url-cut-here

The flat strip regex \[[^]]*\]\([^)]*$ needs a [ followed by a label containing no ] directly before the cut-off ](…. Here the text between [outer and the final ]( contains the ] that closes [inner], and [^]]* cannot cross it, so the outer pair is invisible to the regex. Result: a leftover [outer prefix in the truncated bullet.
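To make the matching behaviour concrete, here is a minimal check of that one pattern in isolation. It uses Python's re module purely for illustration; the project itself is bash/jq, where the regex dialect may differ, and the sample bullets and the example.invalid URL are made up. It does not reproduce the rest of _truncate_safe, so it shows only whether the pattern matches, not the final truncated output.

```python
import re

# Pattern quoted in the issue. Python's `re` reads `[^]]` as
# "any character except ]" because `]` directly follows `^`.
FLAT_LINK_TAIL = re.compile(r"\[[^]]*\]\([^)]*$")

# Hypothetical bullets, cut off in the middle of a link destination.
flat = "Fix crash [docs](https://example.invalid/cut-he"
nested = "Fix crash [outer [inner](u1)](url-cut-here"

flat_match = FLAT_LINK_TAIL.search(flat)
nested_match = FLAT_LINK_TAIL.search(nested)

# Flat link: the whole unclosed link is matched and can be stripped.
print(flat_match.group(0))   # [docs](https://example.invalid/cut-he

# Nested label: the only `](` followed by an unclosed destination belongs
# to the outer link, and the `]` of [inner] sits between it and any `[`.
print(nested_match)          # None
```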

Why not now

CommonMark allows arbitrary bracket nesting. A regex that handles depth-1 is meaningfully more complex; depth-2+ requires a real parser. Release-note bullets in practice use flat links, so the failure mode is rare. PR #14 marked this as a known limitation in lib/core/github.sh.

Suggested approach

  • Replace the regex pair with a small bracket/paren-balance scanner that strips from the earliest unbalanced [ or ( at the tail
  • Or pull in a minimal markdown link tokenizer (still keeping the bash/jq parity intact)

Tests should cover: one-level nesting, two-level nesting, and mixed adjacency such as [ref] [a](u).
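The first option above (a bracket/paren-balance scanner) could look roughly like the sketch below. It is written in Python only to show the logic; a real fix would have to be ported to bash/jq to keep parity. The function name strip_unbalanced_tail is hypothetical, and two behaviours are assumptions rather than anything the issue specifies: backslash-escaped characters are skipped, and trailing whitespace is trimmed after the cut.

```python
def strip_unbalanced_tail(text: str) -> str:
    """Drop the trailing fragment that starts at the earliest unclosed '[' or '('."""
    pairs = {"]": "[", ")": "("}
    stack = []        # (index, char) of openers still waiting for a closer
    opener_of = {}    # index of a matched ']' -> index of its '['
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2    # assumption: a backslash escapes the next character
            continue
        if ch in "[(":
            stack.append((i, ch))
        elif ch in pairs and stack and stack[-1][1] == pairs[ch]:
            start, _ = stack.pop()
            if ch == "]":
                opener_of[i] = start
        # closers that do not match the top of the stack are ignored
        i += 1
    if not stack:
        return text   # everything is balanced, nothing to strip
    cut, kind = stack[0]
    # An unclosed '(' right after a matched ']' is a cut link destination:
    # widen the cut back to the '[' that opened the label.
    if kind == "(" and (cut - 1) in opener_of:
        cut = opener_of[cut - 1]
    return text[:cut].rstrip()  # assumption: trim whitespace left by the cut


# The three cases listed above, plus an already balanced bullet.
print(strip_unbalanced_tail("Fix bug [outer [inner](u1)](url-cut-here"))  # Fix bug
print(strip_unbalanced_tail("x [a [b [c](u1)](u2)](u3"))                  # x
print(strip_unbalanced_tail("see [ref] [a](u"))                           # see [ref]
print(strip_unbalanced_tail("done [a](u)"))                               # done [a](u)
```

Because the scanner tracks balance rather than matching a pattern, nesting depth does not matter. The trade-off is that any unclosed [ or ( in ordinary prose also triggers a cut, so a bullet such as "note (see" would lose its tail as well.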

Severity

Low. Cosmetic: output is still readable for an LLM consumer, and the fragment doesn't change semantics. Revisit when someone hits a real-world example.
