fix(doc): post-process docs +fetch output to improve round-trip fidelity#214
fix(doc): post-process docs +fetch output to improve round-trip fidelity#214fangshuyu-768 merged 12 commits intomainfrom
docs +fetch output to improve round-trip fidelity#214Conversation
Port fixExportedMarkdown() from internal cli to improve round-trip fidelity when fetched Markdown is re-imported via +update / +create. Five fixes applied to MCP fetch-doc raw output: 1. fixBoldSpacing: remove trailing spaces before ** / *, strip redundant ** from ATX headings (CommonMark compliance) 2. fixSetextAmbiguity: insert blank line before "---" following a paragraph to prevent Setext H2 misparse 3. fixBlockquoteHardBreaks: insert bare ">" between consecutive "> " lines so create-doc preserves each line as its own paragraph 4. fixTopLevelSoftbreaks: insert blank line between adjacent top-level content lines (Lark exports blocks with single \n; parsers collapse them into one paragraph without the blank line) 5. Same blank-line insertion inside <lark-td> cells for table content Result: round-trip diff 166 lines → 1 line (image token regeneration, unavoidable by design). Accuracy ≈ 99.8%. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cover all five fix functions with table-driven tests: - fixBoldSpacing: trailing space removal and heading bold stripping - fixSetextAmbiguity: blank line before --- to prevent Setext H2 - fixBlockquoteHardBreaks: bare > inserted between consecutive blockquote lines - fixTopLevelSoftbreaks: blank line between top-level blocks, lark-td cells - fixExportedMarkdown: end-to-end integration across all fixes New code patch coverage for markdown_fix.go: 87-100% per function. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ut emoji aliases Three improvements to fixExportedMarkdown post-processing: 1. fixTopLevelSoftbreaks: promote <callout> and <quote-container> from opaque blocks to content containers, same as <lark-td>. Adjacent content lines inside these containers now get a blank line inserted between them so create-doc preserves paragraph structure instead of collapsing all lines into one. 2. fixCalloutEmoji: new function that replaces named emoji aliases (e.g. emoji="warning") with actual Unicode emoji characters (e.g. emoji="⚠️ "). fetch-doc sometimes emits named aliases that create-doc does not recognize, causing it to fall back to an unrelated emoji. 3. Tests updated to reflect new callout/quote-container behaviour and cover fixCalloutEmoji with table-driven cases. Round-trip diff on test doc (839 lines): 24 lines → 19 lines. Remaining diff: whiteboard token loss (create-doc design limitation, 4 whiteboards × ~2 lines each) + 1 emoji mapped by create-doc server. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… tracking
fetch-doc / create-doc adds a spurious blank line before each closing
code fence, e.g.:
```
last line
<- blank added by server
```
Add fixCodeBlockTrailingBlanks() to remove these blanks after fetch.
Also fix fixTopLevelSoftbreaks() to use the same simple toggle rule:
any ```lang line opens a code block; only a plain ``` closes it.
Previously the function toggled on every ``` line, causing lines inside
a nested fence (e.g. ```go inside ```markdown) to be treated as
top-level content and receive spurious blank-line insertions.
Round-trip diff for the Lark CLI dev spec: 200 lines → 156 lines.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a Markdown post-processing module and tests; the docs fetch handler now applies this fixer to MCP tool results when Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant DocsFetch as DocsFetchHandler
participant MCP as MCPTool
participant Fixer as MarkdownFixer
Client->>DocsFetch: request docs
DocsFetch->>MCP: invoke tool
MCP-->>DocsFetch: returns result (may include "markdown")
alt "markdown" is a string
DocsFetch->>Fixer: fixExportedMarkdown(result["markdown"])
Fixer-->>DocsFetch: cleaned markdown
DocsFetch-->>Client: return result with cleaned markdown
else no markdown string
DocsFetch-->>Client: return result unchanged
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
shortcuts/doc/markdown_fix.go (1)
148-167: Duplicate doc comment block should be removed.The
fixCodeBlockTrailingBlankscomment is duplicated verbatim, which adds noise and risks drift.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 148 - 167, The file contains a duplicated doc comment for fixCodeBlockTrailingBlanks; remove the redundant copy so only one explanatory comment remains immediately above the fixCodeBlockTrailingBlanks declaration, keeping the original wording, formatting and examples intact and ensuring surrounding blank lines/paragraph breaks stay consistent with Go doc comment style for that function.shortcuts/doc/markdown_fix_test.go (1)
90-125: Add fenced-code regression cases for early fixers.Please add explicit tests asserting that
fixBoldSpacing,fixSetextAmbiguity, andfixBlockquoteHardBreaksdo not modify content inside fenced code blocks.Example tests to add
+func TestFixersDoNotModifyInsideFences(t *testing.T) { + input := "```md\n**x **\n> a\n> b\nline\n---\n```" + + if got := fixBoldSpacing(input); got != input { + t.Fatalf("fixBoldSpacing modified fenced code: %q", got) + } + if got := fixSetextAmbiguity(input); got != input { + t.Fatalf("fixSetextAmbiguity modified fenced code: %q", got) + } + if got := fixBlockquoteHardBreaks(input); got != input { + t.Fatalf("fixBlockquoteHardBreaks modified fenced code: %q", got) + } +}Also applies to: 127-173, 175-203
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix_test.go` around lines 90 - 125, Add regression tests in shortcuts/doc/markdown_fix_test.go to ensure the early-fixer functions do not alter fenced code blocks: create inputs containing a fenced code block (triple backticks) with markdown that would otherwise trigger fixes (e.g., bold spacing, setext-like lines, blockquote lines) and call fixBoldSpacing, fixSetextAmbiguity, and fixBlockquoteHardBreaks with those inputs, asserting each returns the original input unchanged; reference the existing test structure and functions fixBoldSpacing, fixSetextAmbiguity, and fixBlockquoteHardBreaks when adding these assertions so they run alongside the other TestFix... cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 30-33: The three transforms (fixBoldSpacing, fixSetextAmbiguity,
fixBlockquoteHardBreaks) currently run over entire markdown and mutate content
inside fenced code blocks; update each function so it scans lines, tracks a
boolean inCode toggled on/off when encountering fenced-code delimiters (e.g. a
trimmed line == "```" or other fence variants), and only apply the transform
logic to chunks outside fenced blocks while appending fenced-block lines
verbatim (use a flush mechanism to process accumulated non-code chunk with
existing transform and then clear it); ensure the fence detection correctly
toggles inCode and preserves fence lines and inner code unchanged to maintain
round-trip fidelity.
---
Nitpick comments:
In `@shortcuts/doc/markdown_fix_test.go`:
- Around line 90-125: Add regression tests in shortcuts/doc/markdown_fix_test.go
to ensure the early-fixer functions do not alter fenced code blocks: create
inputs containing a fenced code block (triple backticks) with markdown that
would otherwise trigger fixes (e.g., bold spacing, setext-like lines, blockquote
lines) and call fixBoldSpacing, fixSetextAmbiguity, and fixBlockquoteHardBreaks
with those inputs, asserting each returns the original input unchanged;
reference the existing test structure and functions fixBoldSpacing,
fixSetextAmbiguity, and fixBlockquoteHardBreaks when adding these assertions so
they run alongside the other TestFix... cases.
In `@shortcuts/doc/markdown_fix.go`:
- Around line 148-167: The file contains a duplicated doc comment for
fixCodeBlockTrailingBlanks; remove the redundant copy so only one explanatory
comment remains immediately above the fixCodeBlockTrailingBlanks declaration,
keeping the original wording, formatting and examples intact and ensuring
surrounding blank lines/paragraph breaks stay consistent with Go doc comment
style for that function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 90a5d8be-dcb4-4422-9fbc-1e021a9cab75
📒 Files selected for processing (4)
.gitignoreshortcuts/doc/docs_fetch.goshortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
🚀 PR Preview Install Guide🧰 CLI updatenpm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@5a0185a37704f4fcc3d6d61d46105595b61fd6d2🧩 Skill updatenpx skills add larksuite/cli#feat/docs-export-import-fix -y -g |
Greptile SummaryThis PR adds a Confidence Score: 5/5Safe to merge; both remaining findings are P2 edge cases that don't affect normal Lark exports. All prior P0/P1 concerns from previous review threads have been addressed. The two new findings — 4-backtick fence detection and the Contains/HasPrefix inconsistency — are P2 robustness gaps that require atypical input to trigger. The pipeline logic is sound, tests are comprehensive, and the integration hook in docs_fetch.go is minimal and correct. shortcuts/doc/markdown_fix.go — two minor robustness gaps in fence-close detection and container-depth closing-tag matching.
|
| Filename | Overview |
|---|---|
| shortcuts/doc/markdown_fix.go | New file with the full post-processing pipeline; two minor robustness gaps: non-3-backtick fences are never closed, and container-depth closing-tag detection uses Contains vs HasPrefix inconsistently. |
| shortcuts/doc/markdown_fix_test.go | Comprehensive table-driven tests for all fix functions; covers edge cases including multiple bold spans in headings, inline code preservation, code-fence bypass, adjacent callout blocks, and list continuation detection. |
| shortcuts/doc/docs_fetch.go | Minimal integration change: calls fixExportedMarkdown on the markdown field before formatting; type assertion is safe and only applies when the key exists. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["fetch-doc MCP result (raw markdown)"] --> B["fixBoldSpacing"]
B --> C["fixSetextAmbiguity"]
C --> D["fixBlockquoteHardBreaks"]
D --> E["fixTopLevelSoftbreaks"]
E --> F["fixCalloutEmoji"]
F --> G["Collapse 3+ newlines to 2"]
G --> H["Fixed markdown ready for create-doc"]
Reviews (4): Last reviewed commit: "chore(doc): fix gofmt formatting in scan..." | Re-trigger Greptile
|
Tip: Greploops — Automatically fix all review issues by running Use the Greptile plugin for Claude Code to query reviews, search comments, and manage custom context directly from your terminal. |
- Wrap fixBoldSpacing, fixSetextAmbiguity, fixBlockquoteHardBreaks, fixCalloutEmoji and the triple-newline collapse in applyOutsideCodeFences() so fenced code block content is never modified - Skip inline code spans in fixBoldSpacing to avoid corrupting literal code - Skip ATX heading lines in line-level bold fix; headingBoldRe handles them - Change headingBoldRe from .+? to [^*]+ to avoid mismatching headings with multiple disjoint bold spans (e.g. "# **foo** and **bar**") - Preserve tight lists in fixTopLevelSoftbreaks: consecutive list items and continuation lines are no longer separated by blank lines - Fix adjacent container blocks (callout, quote-container) getting no blank line between them: only closing tags suppress insertion, not opening tags - Remove duplicate doc comment on fixCodeBlockTrailingBlanks; add note that the trailing-blank removal is a heuristic and may affect edge cases - Add fixCodeBlockTrailingBlanks as item 6 in fixExportedMarkdown top-level comment (it was previously omitted) - Remove .tmp/ from .gitignore (unrelated to this PR) - Add tests: inline code preservation, fence-aware transforms, tight lists, list continuation, adjacent callout blocks, applyOutsideCodeFences Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… edits Removing a blank line immediately before a closing ``` fence is inherently lossy: that blank line may be intentional code content. The heuristic was only cosmetic (reducing round-trip diff noise), not a correctness fix, so it does not meet the bar of the other transforms in this pipeline. The correct long-term fix belongs on the create-doc import side, where the server-inserted trailing blank should be tolerated rather than stripped on the fetch side. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
shortcuts/doc/markdown_fix.go (1)
115-116:⚠️ Potential issue | 🟠 MajorSingle-backtick matching still corrupts some valid inline code spans.
inlineCodeReonly shields one-backtick spans. If the input uses a longer backtick delimiter to carry a literal backtick inside the code span, part of that span falls back into a “non-code” segment andfixBoldSpacingLinecan rewrite literal content. This needs a backtick-run scanner rather than a regex that assumes a delimiter length of 1.Also applies to: 155-180
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 115 - 116, The regex inlineCodeRe = regexp.MustCompile("`[^`]+`") only matches single-backtick spans and breaks multi-backtick delimiters; replace usages (the inlineCodeRe variable and any code path in fixBoldSpacingLine) with a scanner that finds backtick runs by scanning the string for a starting run of N backticks, then searching for the next identical run to form the code span (treating the content between as verbatim), and skip processing of those spans in fixBoldSpacingLine; update all occurrences of inlineCodeRe and the logic between lines referenced (including the block around fixBoldSpacingLine) to use this run-length backtick scanner so longer delimiters and embedded backticks are preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 103-110: fixBlockquoteHardBreaks currently inserts a blank ">"
between any two lines starting with "> ", which breaks quoted lists and
headings; update the conditional inside fixBlockquoteHardBreaks (the loop over
lines and the check that appends ">") to first strip the "> " prefix into
variables (e.g., cur := strings.TrimPrefix(line, "> ") and next :=
strings.TrimPrefix(lines[i+1], "> ")) and only append the extra ">" when both
cur and next are plain paragraph text — explicitly skip when next (or cur)
matches list/item markers (ordered or unordered like "1. " or "-/+/* "), ATX
headings ("#"), fence/opening markers ("```" or "~~~"), thematic breaks, or
other block-level markers — this preserves quoted lists/headings/fences while
still inserting hard-break placeholders for plain quoted text.
- Around line 75-84: Replace the narrow triple-backtick checks with a single
reusable fence-detection helper (e.g., detectFence(line string) (isFence bool,
prefix string, fenceChar rune, count int)) and use it in the three places that
iterate lines (the loop using variables lines/trimmed/inCode/flush/out and the
blocks around the commented ranges). The helper must parse an optional
blockquote/prefix (leading '>' and spaces), detect a fence made of at least
three identical fence runes (backtick or tilde), and return the fence character
and length; toggle entering code when you see any fence and toggle exiting only
when you see a fence with the same fence character and count >= opening count
and the same optional prefix. Replace the inline strings.HasPrefix(trimmed,
"```") and trimmed == "```" checks with calls to this helper so quoted fences
and 4+ tick fences open/close correctly and the logic is reused in all three
locations.
---
Duplicate comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 115-116: The regex inlineCodeRe = regexp.MustCompile("`[^`]+`")
only matches single-backtick spans and breaks multi-backtick delimiters; replace
usages (the inlineCodeRe variable and any code path in fixBoldSpacingLine) with
a scanner that finds backtick runs by scanning the string for a starting run of
N backticks, then searching for the next identical run to form the code span
(treating the content between as verbatim), and skip processing of those spans
in fixBoldSpacingLine; update all occurrences of inlineCodeRe and the logic
between lines referenced (including the block around fixBoldSpacingLine) to use
this run-length backtick scanner so longer delimiters and embedded backticks are
preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9336ec51-8d06-44d3-9d05-ebb568f81e74
📒 Files selected for processing (3)
.gitignoreshortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
✅ Files skipped from review due to trivial changes (1)
- .gitignore
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
shortcuts/doc/markdown_fix.go (1)
97-107:⚠️ Potential issue | 🟠 Major
fixBlockquoteHardBreaksstill rewrites quoted lists and headings.The function inserts a blank
>between any two consecutive>lines. This changes the structure of:
- Quoted lists:
> - a\n> - b→> - a\n>\n> - b(loose list)- Quoted headings followed by content
- Quoted code fences
Per past review feedback, this should skip list markers, headings, and fence markers after stripping the
>prefix.Suggested refinement
func fixBlockquoteHardBreaks(md string) string { lines := strings.Split(md, "\n") out := make([]string, 0, len(lines)*2) + blockMarkerRe := regexp.MustCompile(`^([-*+]|\d+[.)]) |^#{1,6} |^[`+"```"+`~]{3}`) for i, line := range lines { out = append(out, line) - if strings.HasPrefix(line, "> ") && i+1 < len(lines) && strings.HasPrefix(lines[i+1], "> ") { - out = append(out, ">") + if strings.HasPrefix(line, "> ") && i+1 < len(lines) && strings.HasPrefix(lines[i+1], "> ") { + nextContent := strings.TrimPrefix(lines[i+1], "> ") + // Skip if next line is a block-level marker (list, heading, fence) + if !blockMarkerRe.MatchString(nextContent) { + out = append(out, ">") + } } } return strings.Join(out, "\n") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 97 - 107, fixBlockquoteHardBreaks currently inserts a blank ">" between any two consecutive quoted lines which rewrites quoted lists, headings, and fences; change the logic in fixBlockquoteHardBreaks to first compute nextContent := strings.TrimPrefix(lines[i+1], "> ") and only insert the extra ">" when nextContent is not a block-level marker by matching it against a blockMarkerRe (a regexp that detects list markers, ATX headings, and fence markers); add or reuse a blockMarkerRe (e.g., matching list bullets/numeric lists, ^#{1,6}\s, and fence starts like ``` ) and use it in the existing conditional so lists/headings/fences are skipped.
🧹 Nitpick comments (4)
shortcuts/doc/markdown_fix.go (2)
293-303: Fence toggle infixTopLevelSoftbreakshas the same narrowness issue.This duplicates the fence detection logic from
applyOutsideCodeFenceswith the same limitations (no tilde fences, no longer fences, no blockquote-prefixed fences). Consider extracting shared fence-detection logic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 293 - 303, The fence detection in fixTopLevelSoftbreaks duplicates the narrow logic from applyOutsideCodeFences (it only matches exactly "```"), so extract a shared helper (e.g., isFenceStartOrEnd or detectFence) used by both fixTopLevelSoftbreaks and applyOutsideCodeFences that correctly recognizes backtick and tilde fences, supports longer/variable-length fences (e.g., "```", "````", "~~~"), and accepts optional blockquote prefixes ("> " or repeated) before the fence; update the code in fixTopLevelSoftbreaks to call that helper instead of comparing trimmed == "```" and ensure it preserves inCodeBlock toggling and appends the original line to out as before.
237-242:isTableStructuralTagis overly broad.
strings.HasPrefix(s, "<lark-t")matches any tag starting with<lark-t, including potential future tags like<lark-text>. Consider matching the exact known tags explicitly for clarity and safety.Suggested fix
func isTableStructuralTag(s string) bool { - return strings.HasPrefix(s, "<lark-t") || - strings.HasPrefix(s, "</lark-t") + for _, prefix := range []string{ + "<lark-table", "</lark-table>", + "<lark-tr", "</lark-tr>", + "<lark-td", "</lark-td>", + } { + if strings.HasPrefix(s, prefix) { + return true + } + } + return false }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 237 - 242, The isTableStructuralTag function is too broad by checking strings.HasPrefix(s, "<lark-t"), which will match unintended tags (e.g., <lark-text>); update the predicate to explicitly check the known structural tags instead—use explicit prefix checks for "<lark-table", "</lark-table", "<lark-tr", "</lark-tr", "<lark-td", and "</lark-td" (or equivalent exact tag checks) inside isTableStructuralTag to limit matches to only table, tr and td open/close tags.shortcuts/doc/markdown_fix_test.go (2)
105-140: Add test for blockquote containing list items.Given the concern that
fixBlockquoteHardBreaksrewrites quoted lists, a test case would help document expected behavior:{ name: "blockquote list items", input: "> - item1\n> - item2", want: "> - item1\n> - item2", // or document if separator is expected },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix_test.go` around lines 105 - 140, Add a test case to TestFixBlockquoteHardBreaks to cover blockquotes that contain list items: insert a struct entry with name "blockquote list items", input "> - item1\n> - item2", and want "> - item1\n> - item2" (i.e., expect no inserted blank quoted separator); place it alongside the other test cases so fixBlockquoteHardBreaks is exercised for quoted list lines and the expected behavior is documented.
287-313: Test coverage forapplyOutsideCodeFencesshould include edge cases.The test only covers triple-backtick fences. Given the fence-detection limitations flagged in the implementation, consider adding tests for:
- Tilde fences (
~~~)- Longer fences (````````)
- Blockquote-prefixed fences (`> ````)
These tests would document expected behavior and catch regressions if fence detection is improved.
Suggested additional test cases
func TestApplyOutsideCodeFencesTilde(t *testing.T) { // Currently expected to fail if tilde fences aren't supported input := "~~~\n**x **\n~~~" got := applyOutsideCodeFences(input, fixBoldSpacing) if got != input { t.Errorf("tilde fence content modified: got %q", got) } } func TestApplyOutsideCodeFencesBlockquote(t *testing.T) { input := "> ```\n> **x **\n> ```" got := applyOutsideCodeFences(input, fixBoldSpacing) if got != input { t.Errorf("blockquote fence content modified: got %q", got) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix_test.go` around lines 287 - 313, The current tests only exercise triple-backtick fences; add unit tests that cover tilde fences, longer fence delimiters, and blockquote-prefixed fences to document current behavior and prevent regressions: create tests named like TestApplyOutsideCodeFencesTilde, TestApplyOutsideCodeFencesLongerFences, and TestApplyOutsideCodeFencesBlockquote that call applyOutsideCodeFences with inputs using ~~~ fences, a longer fence (e.g., 8 backticks), and a blockquote-prefixed fence ("> ```" style), then assert that content inside those fences remains unchanged while content outside is transformed by fixBoldSpacing (or other transformer) as in the existing TestApplyOutsideCodeFences assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 260-266: isListItemOrContinuation currently treats any line
starting with two spaces or a tab as a list continuation, which misclassifies
indented code blocks; change the logic so continuation detection requires that
the previous line was an actual list item. Update the call sites to pass context
(e.g., add a boolean parameter prevWasList to isListItemOrContinuation or
consult a surrounding state variable like lastWasListItem) and only return true
for indentation-based continuations when prevWasList is true; keep the existing
listItemRe.MatchString check unchanged.
---
Duplicate comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 97-107: fixBlockquoteHardBreaks currently inserts a blank ">"
between any two consecutive quoted lines which rewrites quoted lists, headings,
and fences; change the logic in fixBlockquoteHardBreaks to first compute
nextContent := strings.TrimPrefix(lines[i+1], "> ") and only insert the extra
">" when nextContent is not a block-level marker by matching it against a
blockMarkerRe (a regexp that detects list markers, ATX headings, and fence
markers); add or reuse a blockMarkerRe (e.g., matching list bullets/numeric
lists, ^#{1,6}\s, and fence starts like ``` ) and use it in the existing
conditional so lists/headings/fences are skipped.
---
Nitpick comments:
In `@shortcuts/doc/markdown_fix_test.go`:
- Around line 105-140: Add a test case to TestFixBlockquoteHardBreaks to cover
blockquotes that contain list items: insert a struct entry with name "blockquote
list items", input "> - item1\n> - item2", and want "> - item1\n> - item2"
(i.e., expect no inserted blank quoted separator); place it alongside the other
test cases so fixBlockquoteHardBreaks is exercised for quoted list lines and the
expected behavior is documented.
- Around line 287-313: The current tests only exercise triple-backtick fences;
add unit tests that cover tilde fences, longer fence delimiters, and
blockquote-prefixed fences to document current behavior and prevent regressions:
create tests named like TestApplyOutsideCodeFencesTilde,
TestApplyOutsideCodeFencesLongerFences, and TestApplyOutsideCodeFencesBlockquote
that call applyOutsideCodeFences with inputs using ~~~ fences, a longer fence
(e.g., 8 backticks), and a blockquote-prefixed fence ("> ```" style), then
assert that content inside those fences remains unchanged while content outside
is transformed by fixBoldSpacing (or other transformer) as in the existing
TestApplyOutsideCodeFences assertions.
In `@shortcuts/doc/markdown_fix.go`:
- Around line 293-303: The fence detection in fixTopLevelSoftbreaks duplicates
the narrow logic from applyOutsideCodeFences (it only matches exactly "```"), so
extract a shared helper (e.g., isFenceStartOrEnd or detectFence) used by both
fixTopLevelSoftbreaks and applyOutsideCodeFences that correctly recognizes
backtick and tilde fences, supports longer/variable-length fences (e.g., "```",
"````", "~~~"), and accepts optional blockquote prefixes ("> " or repeated)
before the fence; update the code in fixTopLevelSoftbreaks to call that helper
instead of comparing trimmed == "```" and ensure it preserves inCodeBlock
toggling and appends the original line to out as before.
- Around line 237-242: The isTableStructuralTag function is too broad by
checking strings.HasPrefix(s, "<lark-t"), which will match unintended tags
(e.g., <lark-text>); update the predicate to explicitly check the known
structural tags instead—use explicit prefix checks for "<lark-table",
"</lark-table", "<lark-tr", "</lark-tr", "<lark-td", and "</lark-td" (or
equivalent exact tag checks) inside isTableStructuralTag to limit matches to
only table, tr and td open/close tags.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: be5b48e9-786e-4869-b1da-fc847562e345
📒 Files selected for processing (2)
shortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
The previous regex "`[^`]+`" only matched single-backtick inline code spans. Any span delimited by two or more backticks (e.g. `` `foo` ``) would be split at the outer boundary, causing fixBoldSpacingLine to treat the inner content as regular text and potentially rewrite bold/ italic markers inside what should be a verbatim code span. Replace inlineCodeRe and its FindAllStringIndex usage with scanInlineCodeSpans(), a hand-written scanner that follows CommonMark §6.1: find an opening run of N backticks, then search forward for the next identical-length run to close the span. This correctly handles all delimiter lengths and embedded single backticks. Add two regression tests for double-backtick spans in TestFixBoldSpacing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 141-145: Run gofmt on the file to fix CI formatting errors: apply
`gofmt -w` to shortcuts/doc/markdown_fix.go (which contains the function
scanInlineCodeSpans) so the function declaration and surrounding code match Go
formatting rules; after formatting, re-run the tests/CI to confirm the
formatting issue is resolved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a850c073-c232-4781-90c3-a87e5f98ee9f
📒 Files selected for processing (2)
shortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- shortcuts/doc/markdown_fix_test.go
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
fetch-doc(MCP) exports Markdown with several spec violations — missing blank lines between blocks, trailing spaces in bold markers, blockquote lines collapsing, etc. — that cause ~80% of document structure to be lost on re-import viacreate-doc. This PR adds afixExportedMarkdown()post-processing pipeline todocs +fetchthat corrects these issues at the CLI layer.Changes
shortcuts/doc/docs_fetch.go: CallfixExportedMarkdown()on themarkdownfield returned byfetch-docbefore outputting.shortcuts/doc/markdown_fix.go(new): Post-processing pipeline with five fix functions:fixBoldSpacing: removes trailing spaces before**/*closing delimiters; strips redundant**inside ATX headingsfixSetextAmbiguity: inserts a blank line before---that immediately follows a non-blank line, preventing Setext H2 misparsefixBlockquoteHardBreaks: inserts an empty>separator between consecutive blockquote lines so each becomes its own paragraph on re-importfixTopLevelSoftbreaks: inserts blank lines between adjacent top-level content blocks, and between content lines inside<callout>,<quote-container>, and<lark-td>containers; code fences are left untouchedfixCalloutEmoji: replaces named emoji aliases (e.g.emoji="warning") with Unicode emoji thatcreate-docrecognisesfixCodeBlockTrailingBlanks: removes spurious blank lines thatcreate-doc/fetch-docinserts before each closing```fenceshortcuts/doc/markdown_fix_test.go(new): Table-driven unit tests for all fix functions.Test Plan
go test ./shortcuts/doc/...)Related Issues
Summary by CodeRabbit
Bug Fixes
Tests