Skip to content

fix(doc): post-process docs +fetch output to improve round-trip fidelity#214

Merged
fangshuyu-768 merged 12 commits intomainfrom
feat/docs-export-import-fix
Apr 9, 2026
Merged

fix(doc): post-process docs +fetch output to improve round-trip fidelity#214
fangshuyu-768 merged 12 commits intomainfrom
feat/docs-export-import-fix

Conversation

@herbertliu
Copy link
Copy Markdown
Collaborator

@herbertliu herbertliu commented Apr 2, 2026

Summary

fetch-doc (MCP) exports Markdown with several spec violations — missing blank lines between blocks, trailing spaces in bold markers, blockquote lines collapsing, etc. — that cause ~80% of document structure to be lost on re-import via create-doc. This PR adds a fixExportedMarkdown() post-processing pipeline to docs +fetch that corrects these issues at the CLI layer.

Changes

  • shortcuts/doc/docs_fetch.go: Call fixExportedMarkdown() on the markdown field returned by fetch-doc before outputting.
  • shortcuts/doc/markdown_fix.go (new): Post-processing pipeline with five fix functions:
    • fixBoldSpacing: removes trailing spaces before **/* closing delimiters; strips redundant ** inside ATX headings
    • fixSetextAmbiguity: inserts a blank line before --- that immediately follows a non-blank line, preventing Setext H2 misparse
    • fixBlockquoteHardBreaks: inserts an empty > separator between consecutive blockquote lines so each becomes its own paragraph on re-import
    • fixTopLevelSoftbreaks: inserts blank lines between adjacent top-level content blocks, and between content lines inside <callout>, <quote-container>, and <lark-td> containers; code fences are left untouched
    • fixCalloutEmoji: replaces named emoji aliases (e.g. emoji="warning") with Unicode emoji that create-doc recognises
    • fixCodeBlockTrailingBlanks: removes spurious blank lines that create-doc/fetch-doc inserts before each closing ``` fence
  • shortcuts/doc/markdown_fix_test.go (new): Table-driven unit tests for all fix functions.

Test Plan

  • Unit tests pass (go test ./shortcuts/doc/...)
  • Round-trip verified on three documents of increasing complexity:
    • Harness Engineering (427 lines): diff 166 lines → 1 line (≈99.8% fidelity)

Related Issues

  • None

Summary by CodeRabbit

  • Bug Fixes

    • Improved exported Markdown post-processing when fetching docs: fixes for bold/italic spacing and redundant heading markers, Setext ambiguity handling, blockquote hard-break insertion, callout emoji alias conversion, top-level/container softbreak normalization, and collapsing excessive blank lines while preserving fenced code blocks.
  • Tests

    • Added comprehensive tests covering all Markdown transformation behaviors, edge cases, and ensuring changes are only applied outside fenced code regions.

herbertliu and others added 6 commits April 2, 2026 15:06
Port fixExportedMarkdown() from internal cli to improve round-trip
fidelity when fetched Markdown is re-imported via +update / +create.

Five fixes applied to MCP fetch-doc raw output:
1. fixBoldSpacing: remove trailing spaces before ** / *, strip redundant
   ** from ATX headings (CommonMark compliance)
2. fixSetextAmbiguity: insert blank line before "---" following a
   paragraph to prevent Setext H2 misparse
3. fixBlockquoteHardBreaks: insert bare ">" between consecutive "> "
   lines so create-doc preserves each line as its own paragraph
4. fixTopLevelSoftbreaks: insert blank line between adjacent top-level
   content lines (Lark exports blocks with single \n; parsers collapse
   them into one paragraph without the blank line)
5. Same blank-line insertion inside <lark-td> cells for table content

Result: round-trip diff 166 lines → 1 line (image token regeneration,
unavoidable by design). Accuracy ≈ 99.8%.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cover all five fix functions with table-driven tests:
- fixBoldSpacing: trailing space removal and heading bold stripping
- fixSetextAmbiguity: blank line before --- to prevent Setext H2
- fixBlockquoteHardBreaks: bare > inserted between consecutive blockquote lines
- fixTopLevelSoftbreaks: blank line between top-level blocks, lark-td cells
- fixExportedMarkdown: end-to-end integration across all fixes

New code patch coverage for markdown_fix.go: 87-100% per function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ut emoji aliases

Three improvements to fixExportedMarkdown post-processing:

1. fixTopLevelSoftbreaks: promote <callout> and <quote-container> from
   opaque blocks to content containers, same as <lark-td>. Adjacent
   content lines inside these containers now get a blank line inserted
   between them so create-doc preserves paragraph structure instead of
   collapsing all lines into one.

2. fixCalloutEmoji: new function that replaces named emoji aliases
   (e.g. emoji="warning") with actual Unicode emoji characters
   (e.g. emoji="⚠️"). fetch-doc sometimes emits named aliases that
   create-doc does not recognize, causing it to fall back to an
   unrelated emoji.

3. Tests updated to reflect new callout/quote-container behaviour and
   cover fixCalloutEmoji with table-driven cases.

Round-trip diff on test doc (839 lines): 24 lines → 19 lines.
Remaining diff: whiteboard token loss (create-doc design limitation,
4 whiteboards × ~2 lines each) + 1 emoji mapped by create-doc server.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… tracking

fetch-doc / create-doc adds a spurious blank line before each closing
code fence, e.g.:

  ```
  last line
             <- blank added by server
  ```

Add fixCodeBlockTrailingBlanks() to remove these blanks after fetch.

Also fix fixTopLevelSoftbreaks() to use the same simple toggle rule:
any ```lang line opens a code block; only a plain ``` closes it.
Previously the function toggled on every ``` line, causing lines inside
a nested fence (e.g. ```go inside ```markdown) to be treated as
top-level content and receive spurious blank-line insertions.

Round-trip diff for the Lark CLI dev spec: 200 lines → 156 lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added domain/ccm PR touches the ccm domain size/L Large or sensitive change across domains or core paths labels Apr 2, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a Markdown post-processing module and tests; the docs fetch handler now applies this fixer to MCP tool results when result["markdown"] is a string. Also includes a minor .gitignore blank-line change.

Changes

Cohort / File(s) Summary
Gitignore tweak
/.gitignore
Inserted an extra blank line before the “Generated / test artifacts” section; no ignore rules added or removed.
Markdown post-processing
shortcuts/doc/markdown_fix.go
New file implementing fixExportedMarkdown(md string) string and helpers (fixBoldSpacing, fixSetextAmbiguity, fixBlockquoteHardBreaks, fixTopLevelSoftbreaks, fixCalloutEmoji, applyOutsideCodeFences, etc.) that apply ordered transformations outside fenced code blocks and normalize newlines.
Docs fetch integration
shortcuts/doc/docs_fetch.go
Now conditionally calls fixExportedMarkdown(...) when result["markdown"] exists and is a string before formatting/outputting the result.
Tests
shortcuts/doc/markdown_fix_test.go
New unit tests covering helper behaviors and end-to-end fixExportedMarkdown (bold/italic spacing, setext ambiguity, blockquote breaks, top-level softbreaks, callout emoji mapping, fenced-code preservation, newline collapse).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant DocsFetch as DocsFetchHandler
    participant MCP as MCPTool
    participant Fixer as MarkdownFixer

    Client->>DocsFetch: request docs
    DocsFetch->>MCP: invoke tool
    MCP-->>DocsFetch: returns result (may include "markdown")
    alt "markdown" is a string
        DocsFetch->>Fixer: fixExportedMarkdown(result["markdown"])
        Fixer-->>DocsFetch: cleaned markdown
        DocsFetch-->>Client: return result with cleaned markdown
    else no markdown string
        DocsFetch-->>Client: return result unchanged
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • LuckyTerry
  • liangshuo-1

Poem

🐇 I hopped through exported text tonight,

smoothed bold edges, nudged setext to right,
split blockquotes with a gentle cue,
kept fences safe and trimmed the trailing glue —
now docs bounce back, tidy, snug, and bright ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.37% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately and concisely describes the main change: post-processing exported Markdown to improve round-trip fidelity in the docs fetch command.
Description check ✅ Passed The PR description follows the required template with complete Summary, Changes, Test Plan, and Related Issues sections; all required information is present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/docs-export-import-fix

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
shortcuts/doc/markdown_fix.go (1)

148-167: Duplicate doc comment block should be removed.

The fixCodeBlockTrailingBlanks comment is duplicated verbatim, which adds noise and risks drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix.go` around lines 148 - 167, The file contains a
duplicated doc comment for fixCodeBlockTrailingBlanks; remove the redundant copy
so only one explanatory comment remains immediately above the
fixCodeBlockTrailingBlanks declaration, keeping the original wording, formatting
and examples intact and ensuring surrounding blank lines/paragraph breaks stay
consistent with Go doc comment style for that function.
shortcuts/doc/markdown_fix_test.go (1)

90-125: Add fenced-code regression cases for early fixers.

Please add explicit tests asserting that fixBoldSpacing, fixSetextAmbiguity, and fixBlockquoteHardBreaks do not modify content inside fenced code blocks.

Example tests to add
+func TestFixersDoNotModifyInsideFences(t *testing.T) {
+	input := "```md\n**x **\n> a\n> b\nline\n---\n```"
+
+	if got := fixBoldSpacing(input); got != input {
+		t.Fatalf("fixBoldSpacing modified fenced code: %q", got)
+	}
+	if got := fixSetextAmbiguity(input); got != input {
+		t.Fatalf("fixSetextAmbiguity modified fenced code: %q", got)
+	}
+	if got := fixBlockquoteHardBreaks(input); got != input {
+		t.Fatalf("fixBlockquoteHardBreaks modified fenced code: %q", got)
+	}
+}

Also applies to: 127-173, 175-203

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix_test.go` around lines 90 - 125, Add regression
tests in shortcuts/doc/markdown_fix_test.go to ensure the early-fixer functions
do not alter fenced code blocks: create inputs containing a fenced code block
(triple backticks) with markdown that would otherwise trigger fixes (e.g., bold
spacing, setext-like lines, blockquote lines) and call fixBoldSpacing,
fixSetextAmbiguity, and fixBlockquoteHardBreaks with those inputs, asserting
each returns the original input unchanged; reference the existing test structure
and functions fixBoldSpacing, fixSetextAmbiguity, and fixBlockquoteHardBreaks
when adding these assertions so they run alongside the other TestFix... cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 30-33: The three transforms (fixBoldSpacing, fixSetextAmbiguity,
fixBlockquoteHardBreaks) currently run over entire markdown and mutate content
inside fenced code blocks; update each function so it scans lines, tracks a
boolean inCode toggled on/off when encountering fenced-code delimiters (e.g. a
trimmed line == "```" or other fence variants), and only apply the transform
logic to chunks outside fenced blocks while appending fenced-block lines
verbatim (use a flush mechanism to process accumulated non-code chunk with
existing transform and then clear it); ensure the fence detection correctly
toggles inCode and preserves fence lines and inner code unchanged to maintain
round-trip fidelity.

---

Nitpick comments:
In `@shortcuts/doc/markdown_fix_test.go`:
- Around line 90-125: Add regression tests in shortcuts/doc/markdown_fix_test.go
to ensure the early-fixer functions do not alter fenced code blocks: create
inputs containing a fenced code block (triple backticks) with markdown that
would otherwise trigger fixes (e.g., bold spacing, setext-like lines, blockquote
lines) and call fixBoldSpacing, fixSetextAmbiguity, and fixBlockquoteHardBreaks
with those inputs, asserting each returns the original input unchanged;
reference the existing test structure and functions fixBoldSpacing,
fixSetextAmbiguity, and fixBlockquoteHardBreaks when adding these assertions so
they run alongside the other TestFix... cases.

In `@shortcuts/doc/markdown_fix.go`:
- Around line 148-167: The file contains a duplicated doc comment for
fixCodeBlockTrailingBlanks; remove the redundant copy so only one explanatory
comment remains immediately above the fixCodeBlockTrailingBlanks declaration,
keeping the original wording, formatting and examples intact and ensuring
surrounding blank lines/paragraph breaks stay consistent with Go doc comment
style for that function.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 90a5d8be-dcb4-4422-9fbc-1e021a9cab75

📥 Commits

Reviewing files that changed from the base of the PR and between 0f96bdf and 57b0303.

📒 Files selected for processing (4)
  • .gitignore
  • shortcuts/doc/docs_fetch.go
  • shortcuts/doc/markdown_fix.go
  • shortcuts/doc/markdown_fix_test.go

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 2, 2026

🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@5a0185a37704f4fcc3d6d61d46105595b61fd6d2

🧩 Skill update

npx skills add larksuite/cli#feat/docs-export-import-fix -y -g

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 2, 2026

Greptile Summary

This PR adds a fixExportedMarkdown() post-processing pipeline to the docs +fetch shortcut, correcting several Markdown spec violations in Lark's fetch-doc output (bold trailing spaces, Setext ambiguity, blockquote collapsing, missing blank lines between blocks, and emoji alias replacement) to improve round-trip fidelity when re-importing via create-doc. The implementation is well-structured with a clean per-pass architecture, thorough unit tests, and guarded code-fence bypass logic.

Confidence Score: 5/5

Safe to merge; both remaining findings are P2 edge cases that don't affect normal Lark exports.

All prior P0/P1 concerns from previous review threads have been addressed. The two new findings — 4-backtick fence detection and the Contains/HasPrefix inconsistency — are P2 robustness gaps that require atypical input to trigger. The pipeline logic is sound, tests are comprehensive, and the integration hook in docs_fetch.go is minimal and correct.

shortcuts/doc/markdown_fix.go — two minor robustness gaps in fence-close detection and container-depth closing-tag matching.

Vulnerabilities

No security concerns identified. The changes are entirely within string post-processing of document content already retrieved from the Lark MCP endpoint; no new network calls, file I/O, or user-controlled code paths are introduced.

Important Files Changed

Filename Overview
shortcuts/doc/markdown_fix.go New file with the full post-processing pipeline; two minor robustness gaps: non-3-backtick fences are never closed, and container-depth closing-tag detection uses Contains vs HasPrefix inconsistently.
shortcuts/doc/markdown_fix_test.go Comprehensive table-driven tests for all fix functions; covers edge cases including multiple bold spans in headings, inline code preservation, code-fence bypass, adjacent callout blocks, and list continuation detection.
shortcuts/doc/docs_fetch.go Minimal integration change: calls fixExportedMarkdown on the markdown field before formatting; type assertion is safe and only applies when the key exists.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["fetch-doc MCP result (raw markdown)"] --> B["fixBoldSpacing"]
    B --> C["fixSetextAmbiguity"]
    C --> D["fixBlockquoteHardBreaks"]
    D --> E["fixTopLevelSoftbreaks"]
    E --> F["fixCalloutEmoji"]
    F --> G["Collapse 3+ newlines to 2"]
    G --> H["Fixed markdown ready for create-doc"]
Loading

Reviews (4): Last reviewed commit: "chore(doc): fix gofmt formatting in scan..." | Re-trigger Greptile

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 2, 2026

CLA assistant check
All committers have signed the CLA.

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 8, 2026

Tip:

Greploops — Automatically fix all review issues by running /greploops in Claude Code. It iterates: fix, push, re-review, repeat until 5/5 confidence.

Use the Greptile plugin for Claude Code to query reviews, search comments, and manage custom context directly from your terminal.

herbertliu and others added 2 commits April 8, 2026 14:14
- Wrap fixBoldSpacing, fixSetextAmbiguity, fixBlockquoteHardBreaks,
  fixCalloutEmoji and the triple-newline collapse in applyOutsideCodeFences()
  so fenced code block content is never modified
- Skip inline code spans in fixBoldSpacing to avoid corrupting literal code
- Skip ATX heading lines in line-level bold fix; headingBoldRe handles them
- Change headingBoldRe from .+? to [^*]+ to avoid mismatching headings with
  multiple disjoint bold spans (e.g. "# **foo** and **bar**")
- Preserve tight lists in fixTopLevelSoftbreaks: consecutive list items and
  continuation lines are no longer separated by blank lines
- Fix adjacent container blocks (callout, quote-container) getting no blank
  line between them: only closing tags suppress insertion, not opening tags
- Remove duplicate doc comment on fixCodeBlockTrailingBlanks; add note that
  the trailing-blank removal is a heuristic and may affect edge cases
- Add fixCodeBlockTrailingBlanks as item 6 in fixExportedMarkdown top-level
  comment (it was previously omitted)
- Remove .tmp/ from .gitignore (unrelated to this PR)
- Add tests: inline code preservation, fence-aware transforms, tight lists,
  list continuation, adjacent callout blocks, applyOutsideCodeFences

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… edits

Removing a blank line immediately before a closing ``` fence is inherently
lossy: that blank line may be intentional code content. The heuristic was
only cosmetic (reducing round-trip diff noise), not a correctness fix, so
it does not meet the bar of the other transforms in this pipeline.

The correct long-term fix belongs on the create-doc import side, where the
server-inserted trailing blank should be tolerated rather than stripped on
the fetch side.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
shortcuts/doc/markdown_fix.go (1)

115-116: ⚠️ Potential issue | 🟠 Major

Single-backtick matching still corrupts some valid inline code spans.

inlineCodeRe only shields one-backtick spans. If the input uses a longer backtick delimiter to carry a literal backtick inside the code span, part of that span falls back into a “non-code” segment and fixBoldSpacingLine can rewrite literal content. This needs a backtick-run scanner rather than a regex that assumes a delimiter length of 1.

Also applies to: 155-180

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix.go` around lines 115 - 116, The regex inlineCodeRe
= regexp.MustCompile("`[^`]+`") only matches single-backtick spans and breaks
multi-backtick delimiters; replace usages (the inlineCodeRe variable and any
code path in fixBoldSpacingLine) with a scanner that finds backtick runs by
scanning the string for a starting run of N backticks, then searching for the
next identical run to form the code span (treating the content between as
verbatim), and skip processing of those spans in fixBoldSpacingLine; update all
occurrences of inlineCodeRe and the logic between lines referenced (including
the block around fixBoldSpacingLine) to use this run-length backtick scanner so
longer delimiters and embedded backticks are preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 103-110: fixBlockquoteHardBreaks currently inserts a blank ">"
between any two lines starting with "> ", which breaks quoted lists and
headings; update the conditional inside fixBlockquoteHardBreaks (the loop over
lines and the check that appends ">") to first strip the "> " prefix into
variables (e.g., cur := strings.TrimPrefix(line, "> ") and next :=
strings.TrimPrefix(lines[i+1], "> ")) and only append the extra ">" when both
cur and next are plain paragraph text — explicitly skip when next (or cur)
matches list/item markers (ordered or unordered like "1. " or "-/+/* "), ATX
headings ("#"), fence/opening markers ("```" or "~~~"), thematic breaks, or
other block-level markers — this preserves quoted lists/headings/fences while
still inserting hard-break placeholders for plain quoted text.
- Around line 75-84: Replace the narrow triple-backtick checks with a single
reusable fence-detection helper (e.g., detectFence(line string) (isFence bool,
prefix string, fenceChar rune, count int)) and use it in the three places that
iterate lines (the loop using variables lines/trimmed/inCode/flush/out and the
blocks around the commented ranges). The helper must parse an optional
blockquote/prefix (leading '>' and spaces), detect a fence made of at least
three identical fence runes (backtick or tilde), and return the fence character
and length; toggle entering code when you see any fence and toggle exiting only
when you see a fence with the same fence character and count >= opening count
and the same optional prefix. Replace the inline strings.HasPrefix(trimmed,
"```") and trimmed == "```" checks with calls to this helper so quoted fences
and 4+ tick fences open/close correctly and the logic is reused in all three
locations.

---

Duplicate comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 115-116: The regex inlineCodeRe = regexp.MustCompile("`[^`]+`")
only matches single-backtick spans and breaks multi-backtick delimiters; replace
usages (the inlineCodeRe variable and any code path in fixBoldSpacingLine) with
a scanner that finds backtick runs by scanning the string for a starting run of
N backticks, then searching for the next identical run to form the code span
(treating the content between as verbatim), and skip processing of those spans
in fixBoldSpacingLine; update all occurrences of inlineCodeRe and the logic
between lines referenced (including the block around fixBoldSpacingLine) to use
this run-length backtick scanner so longer delimiters and embedded backticks are
preserved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9336ec51-8d06-44d3-9d05-ebb568f81e74

📥 Commits

Reviewing files that changed from the base of the PR and between 5067b0d and c7ad872.

📒 Files selected for processing (3)
  • .gitignore
  • shortcuts/doc/markdown_fix.go
  • shortcuts/doc/markdown_fix_test.go
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
shortcuts/doc/markdown_fix.go (1)

97-107: ⚠️ Potential issue | 🟠 Major

fixBlockquoteHardBreaks still rewrites quoted lists and headings.

The function inserts a blank > between any two consecutive > lines. This changes the structure of:

  • Quoted lists: > - a\n> - b> - a\n>\n> - b (loose list)
  • Quoted headings followed by content
  • Quoted code fences

Per past review feedback, this should skip list markers, headings, and fence markers after stripping the > prefix.

Suggested refinement
 func fixBlockquoteHardBreaks(md string) string {
 	lines := strings.Split(md, "\n")
 	out := make([]string, 0, len(lines)*2)
+	blockMarkerRe := regexp.MustCompile(`^([-*+]|\d+[.)]) |^#{1,6} |^[`+"```"+`~]{3}`)
 	for i, line := range lines {
 		out = append(out, line)
-		if strings.HasPrefix(line, "> ") && i+1 < len(lines) && strings.HasPrefix(lines[i+1], "> ") {
-			out = append(out, ">")
+		if strings.HasPrefix(line, "> ") && i+1 < len(lines) && strings.HasPrefix(lines[i+1], "> ") {
+			nextContent := strings.TrimPrefix(lines[i+1], "> ")
+			// Skip if next line is a block-level marker (list, heading, fence)
+			if !blockMarkerRe.MatchString(nextContent) {
+				out = append(out, ">")
+			}
 		}
 	}
 	return strings.Join(out, "\n")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix.go` around lines 97 - 107, fixBlockquoteHardBreaks
currently inserts a blank ">" between any two consecutive quoted lines which
rewrites quoted lists, headings, and fences; change the logic in
fixBlockquoteHardBreaks to first compute nextContent :=
strings.TrimPrefix(lines[i+1], "> ") and only insert the extra ">" when
nextContent is not a block-level marker by matching it against a blockMarkerRe
(a regexp that detects list markers, ATX headings, and fence markers); add or
reuse a blockMarkerRe (e.g., matching list bullets/numeric lists, ^#{1,6}\s, and
fence starts like ``` ) and use it in the existing conditional so
lists/headings/fences are skipped.
🧹 Nitpick comments (4)
shortcuts/doc/markdown_fix.go (2)

293-303: Fence toggle in fixTopLevelSoftbreaks has the same narrowness issue.

This duplicates the fence detection logic from applyOutsideCodeFences with the same limitations (no tilde fences, no longer fences, no blockquote-prefixed fences). Consider extracting shared fence-detection logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix.go` around lines 293 - 303, The fence detection in
fixTopLevelSoftbreaks duplicates the narrow logic from applyOutsideCodeFences
(it only matches exactly "```"), so extract a shared helper (e.g.,
isFenceStartOrEnd or detectFence) used by both fixTopLevelSoftbreaks and
applyOutsideCodeFences that correctly recognizes backtick and tilde fences,
supports longer/variable-length fences (e.g., "```", "````", "~~~"), and accepts
optional blockquote prefixes ("> " or repeated) before the fence; update the
code in fixTopLevelSoftbreaks to call that helper instead of comparing trimmed
== "```" and ensure it preserves inCodeBlock toggling and appends the original
line to out as before.

237-242: isTableStructuralTag is overly broad.

strings.HasPrefix(s, "<lark-t") matches any tag starting with <lark-t, including potential future tags like <lark-text>. Consider matching the exact known tags explicitly for clarity and safety.

Suggested fix
 func isTableStructuralTag(s string) bool {
-	return strings.HasPrefix(s, "<lark-t") ||
-		strings.HasPrefix(s, "</lark-t")
+	for _, prefix := range []string{
+		"<lark-table", "</lark-table>",
+		"<lark-tr", "</lark-tr>",
+		"<lark-td", "</lark-td>",
+	} {
+		if strings.HasPrefix(s, prefix) {
+			return true
+		}
+	}
+	return false
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix.go` around lines 237 - 242, The
isTableStructuralTag function is too broad by checking strings.HasPrefix(s,
"<lark-t"), which will match unintended tags (e.g., <lark-text>); update the
predicate to explicitly check the known structural tags instead—use explicit
prefix checks for "<lark-table", "</lark-table", "<lark-tr", "</lark-tr",
"<lark-td", and "</lark-td" (or equivalent exact tag checks) inside
isTableStructuralTag to limit matches to only table, tr and td open/close tags.
shortcuts/doc/markdown_fix_test.go (2)

105-140: Add test for blockquote containing list items.

Given the concern that fixBlockquoteHardBreaks rewrites quoted lists, a test case would help document expected behavior:

{
	name:  "blockquote list items",
	input: "> - item1\n> - item2",
	want:  "> - item1\n> - item2", // or document if separator is expected
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix_test.go` around lines 105 - 140, Add a test case
to TestFixBlockquoteHardBreaks to cover blockquotes that contain list items:
insert a struct entry with name "blockquote list items", input "> - item1\n> -
item2", and want "> - item1\n> - item2" (i.e., expect no inserted blank quoted
separator); place it alongside the other test cases so fixBlockquoteHardBreaks
is exercised for quoted list lines and the expected behavior is documented.

287-313: Test coverage for applyOutsideCodeFences should include edge cases.

The test only covers triple-backtick fences. Given the fence-detection limitations flagged in the implementation, consider adding tests for:

  1. Tilde fences (~~~)
  2. Longer fences (````````)
  3. Blockquote-prefixed fences (`> ````)

These tests would document expected behavior and catch regressions if fence detection is improved.

Suggested additional test cases
func TestApplyOutsideCodeFencesTilde(t *testing.T) {
	// Currently expected to fail if tilde fences aren't supported
	input := "~~~\n**x **\n~~~"
	got := applyOutsideCodeFences(input, fixBoldSpacing)
	if got != input {
		t.Errorf("tilde fence content modified: got %q", got)
	}
}

func TestApplyOutsideCodeFencesBlockquote(t *testing.T) {
	input := "> ```\n> **x **\n> ```"
	got := applyOutsideCodeFences(input, fixBoldSpacing)
	if got != input {
		t.Errorf("blockquote fence content modified: got %q", got)
	}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/markdown_fix_test.go` around lines 287 - 313, The current tests
only exercise triple-backtick fences; add unit tests that cover tilde fences,
longer fence delimiters, and blockquote-prefixed fences to document current
behavior and prevent regressions: create tests named like
TestApplyOutsideCodeFencesTilde, TestApplyOutsideCodeFencesLongerFences, and
TestApplyOutsideCodeFencesBlockquote that call applyOutsideCodeFences with
inputs using ~~~ fences, a longer fence (e.g., 8 backticks), and a
blockquote-prefixed fence ("> ```" style), then assert that content inside those
fences remains unchanged while content outside is transformed by fixBoldSpacing
(or other transformer) as in the existing TestApplyOutsideCodeFences assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 260-266: isListItemOrContinuation currently treats any line
starting with two spaces or a tab as a list continuation, which misclassifies
indented code blocks; change the logic so continuation detection requires that
the previous line was an actual list item. Update the call sites to pass context
(e.g., add a boolean parameter prevWasList to isListItemOrContinuation or
consult a surrounding state variable like lastWasListItem) and only return true
for indentation-based continuations when prevWasList is true; keep the existing
listItemRe.MatchString check unchanged.

---

Duplicate comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 97-107: fixBlockquoteHardBreaks currently inserts a blank ">"
between any two consecutive quoted lines which rewrites quoted lists, headings,
and fences; change the logic in fixBlockquoteHardBreaks to first compute
nextContent := strings.TrimPrefix(lines[i+1], "> ") and only insert the extra
">" when nextContent is not a block-level marker by matching it against a
blockMarkerRe (a regexp that detects list markers, ATX headings, and fence
markers); add or reuse a blockMarkerRe (e.g., matching list bullets/numeric
lists, ^#{1,6}\s, and fence starts like ``` ) and use it in the existing
conditional so lists/headings/fences are skipped.

---

Nitpick comments:
In `@shortcuts/doc/markdown_fix_test.go`:
- Around line 105-140: Add a test case to TestFixBlockquoteHardBreaks to cover
blockquotes that contain list items: insert a struct entry with name "blockquote
list items", input "> - item1\n> - item2", and want "> - item1\n> - item2"
(i.e., expect no inserted blank quoted separator); place it alongside the other
test cases so fixBlockquoteHardBreaks is exercised for quoted list lines and the
expected behavior is documented.
- Around line 287-313: The current tests only exercise triple-backtick fences;
add unit tests that cover tilde fences, longer fence delimiters, and
blockquote-prefixed fences to document current behavior and prevent regressions:
create tests named like TestApplyOutsideCodeFencesTilde,
TestApplyOutsideCodeFencesLongerFences, and TestApplyOutsideCodeFencesBlockquote
that call applyOutsideCodeFences with inputs using ~~~ fences, a longer fence
(e.g., 8 backticks), and a blockquote-prefixed fence ("> ```" style), then
assert that content inside those fences remains unchanged while content outside
is transformed by fixBoldSpacing (or other transformer) as in the existing
TestApplyOutsideCodeFences assertions.

In `@shortcuts/doc/markdown_fix.go`:
- Around line 293-303: The fence detection in fixTopLevelSoftbreaks duplicates
the narrow logic from applyOutsideCodeFences (it only matches exactly "```"), so
extract a shared helper (e.g., isFenceStartOrEnd or detectFence) used by both
fixTopLevelSoftbreaks and applyOutsideCodeFences that correctly recognizes
backtick and tilde fences, supports longer/variable-length fences (e.g., "```",
"````", "~~~"), and accepts optional blockquote prefixes ("> " or repeated)
before the fence; update the code in fixTopLevelSoftbreaks to call that helper
instead of comparing trimmed == "```" and ensure it preserves inCodeBlock
toggling and appends the original line to out as before.
- Around line 237-242: The isTableStructuralTag function is too broad by
checking strings.HasPrefix(s, "<lark-t"), which will match unintended tags
(e.g., <lark-text>); update the predicate to explicitly check the known
structural tags instead—use explicit prefix checks for "<lark-table",
"</lark-table", "<lark-tr", "</lark-tr", "<lark-td", and "</lark-td" (or
equivalent exact tag checks) inside isTableStructuralTag to limit matches to
only table, tr and td open/close tags.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: be5b48e9-786e-4869-b1da-fc847562e345

📥 Commits

Reviewing files that changed from the base of the PR and between c7ad872 and aef0089.

📒 Files selected for processing (2)
  • shortcuts/doc/markdown_fix.go
  • shortcuts/doc/markdown_fix_test.go

herbertliu and others added 2 commits April 8, 2026 18:21
The previous regex "`[^`]+`" only matched single-backtick inline code
spans. Any span delimited by two or more backticks (e.g. `` `foo` ``)
would be split at the outer boundary, causing fixBoldSpacingLine to
treat the inner content as regular text and potentially rewrite bold/
italic markers inside what should be a verbatim code span.

Replace inlineCodeRe and its FindAllStringIndex usage with
scanInlineCodeSpans(), a hand-written scanner that follows CommonMark
§6.1: find an opening run of N backticks, then search forward for the
next identical-length run to close the span. This correctly handles
all delimiter lengths and embedded single backticks.

Add two regression tests for double-backtick spans in TestFixBoldSpacing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 141-145: Run gofmt on the file to fix CI formatting errors: apply
`gofmt -w` to shortcuts/doc/markdown_fix.go (which contains the function
scanInlineCodeSpans) so the function declaration and surrounding code match Go
formatting rules; after formatting, re-run the tests/CI to confirm the
formatting issue is resolved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a850c073-c232-4781-90c3-a87e5f98ee9f

📥 Commits

Reviewing files that changed from the base of the PR and between aef0089 and 5f0fc16.

📒 Files selected for processing (2)
  • shortcuts/doc/markdown_fix.go
  • shortcuts/doc/markdown_fix_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • shortcuts/doc/markdown_fix_test.go

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fangshuyu-768 fangshuyu-768 merged commit 9d48ef4 into main Apr 9, 2026
14 checks passed
@fangshuyu-768 fangshuyu-768 deleted the feat/docs-export-import-fix branch April 9, 2026 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain/ccm PR touches the ccm domain size/L Large or sensitive change across domains or core paths

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants