feat(doc): guard v2 markdown against custom-tag corruption; warn on whole-paragraph style#750
feat(doc): guard v2 markdown against custom-tag corruption; warn on whole-paragraph style#750herbertliu wants to merge 2 commits intomainfrom
Conversation
📝 WalkthroughWalkthroughAdds Markdown validation to detect unsupported Lark v2 custom tags and whole-paragraph emphasis; wires the tag check into v2 document create and update validation and adds unit tests for the new checks. ChangesMarkdown Validation for Document Operations
Sequence Diagram(s)(omitted — changes are local validation checks and test additions, not multi-component sequential flows) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #750 +/- ##
==========================================
+ Coverage 64.96% 64.97% +0.01%
==========================================
Files 502 502
Lines 46224 46256 +32
==========================================
+ Hits 30030 30056 +26
- Misses 13583 13588 +5
- Partials 2611 2612 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🚀 PR Preview Install Guide🧰 CLI updatenpm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@d68af0c54dd3178ec1e5c1d026ed4d4247b117ec🧩 Skill updatenpx skills add larksuite/cli#feat/docs-v2-markdown-custom-tag-guard -y -g |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
shortcuts/doc/docs_update_check_test.go (1)
426-443: ⚡ Quick winStrengthen the multi-tag assertion to match the test name.
The case says “multiple custom tags all appear in message,” but it currently verifies only one tag. Please assert every expected tag to avoid silent regressions in message composition.
✅ Suggested test tightening
- wantTag string + wantTags []string @@ - wantTag: "<grid>", + wantTags: []string{"<grid>"}, @@ - wantTag: "<column>", + wantTags: []string{"<column>"}, @@ - wantTag: "<lark-table>", + wantTags: []string{"<lark-table>"}, @@ - wantTag: "<text color=...>", + wantTags: []string{"<text color=...>"}, @@ - wantTag: "<grid>", + wantTags: []string{"<grid>", "<lark-table>"}, @@ - if tt.wantTag != "" && !strings.Contains(got, tt.wantTag) { - t.Fatalf("expected message to contain %q, got: %q", tt.wantTag, got) - } + for _, tag := range tt.wantTags { + if !strings.Contains(got, tag) { + t.Fatalf("expected message to contain %q, got: %q", tag, got) + } + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/doc/docs_update_check_test.go` around lines 426 - 443, The test named "multiple custom tags all appear in message" only checks a single tt.wantTag; change the table-driven test to use a slice (e.g., wantTags []string) for cases expecting multiple tags and, inside the t.Run block after got := CheckV2MarkdownCustomTags(...), replace the single contains check with a loop that asserts strings.Contains(got, tag) for each tag in tt.wantTags (failing with t.Fatalf if any expected tag is missing). Also update the test case data for that scenario to list all expected tags (e.g., "<grid>", "<lark-table>", "<lark-tr>").
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@shortcuts/doc/docs_update_check.go`:
- Around line 291-295: The wholeParagraphStyleRe currently allows mismatched
asterisks and whitespace-only payloads; update the regexp used by
wholeParagraphStyleRe so it requires matching opening and closing markers and a
non-whitespace character inside. Replace the pattern used in
wholeParagraphStyleRe with one that alternates single-asterisk and
double-asterisk cases explicitly (e.g.,
^(?:\*[^*\s][^*\n]*\*|\*\*[^*\s][^*\n]*\*\*)$) so malformed `*text**`/`**text*`
and whitespace-only payloads no longer match; keep the variable name
wholeParagraphStyleRe and ensure any references such as
checkDocsUpdateBoldItalic continue to work.
---
Nitpick comments:
In `@shortcuts/doc/docs_update_check_test.go`:
- Around line 426-443: The test named "multiple custom tags all appear in
message" only checks a single tt.wantTag; change the table-driven test to use a
slice (e.g., wantTags []string) for cases expecting multiple tags and, inside
the t.Run block after got := CheckV2MarkdownCustomTags(...), replace the single
contains check with a loop that asserts strings.Contains(got, tag) for each tag
in tt.wantTags (failing with t.Fatalf if any expected tag is missing). Also
update the test case data for that scenario to list all expected tags (e.g.,
"<grid>", "<lark-table>", "<lark-tr>").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 80be699e-7f8c-432c-995b-abdde99b5e8d
📒 Files selected for processing (4)
shortcuts/doc/docs_create_v2.goshortcuts/doc/docs_update_check.goshortcuts/doc/docs_update_check_test.goshortcuts/doc/docs_update_v2.go
…; add coverage - wholeParagraphStyleRe: replace \S with [^\s*\n] so that asymmetric markers like *text** and whitespace-only spans like * * no longer match. Addresses coderabbitai review comment on PR #750. - Add TestDocsUpdateWarningsWholeParagraphStyle to cover the append path through docsUpdateWarnings (previously uncovered). - Add two edge-case entries to TestCheckDocsUpdateWholeParagraphStyle: asymmetric markers and whitespace-only content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
shortcuts/doc/docs_update_check_test.go (1)
438-443: ⚡ Quick winStrengthen the "multiple custom tags all appear in message" assertion.
The test name promises that all tags surface in the message, but the assertion only checks for
<grid>— which is also asserted by the single-taggrid tag triggers warningcase above, so this case currently adds no new coverage. Verifying the second tag (<lark-table>) is what makes this case meaningful and would catch a regression where the loop short-circuits after the first match.🧪 Proposed fix
{ name: "multiple custom tags all appear in message", content: `<grid cols="2"><lark-table><lark-tr></lark-tr></lark-table></grid>`, wantHint: true, wantTag: "<grid>", + // Note: this case is also asserted below to contain "<lark-table>" + // to verify the loop does not short-circuit after the first match. },if tt.wantTag != "" && !strings.Contains(got, tt.wantTag) { t.Fatalf("expected message to contain %q, got: %q", tt.wantTag, got) } + if tt.name == "multiple custom tags all appear in message" && + !strings.Contains(got, "<lark-table>") { + t.Fatalf("expected multi-tag message to also contain %q, got: %q", + "<lark-table>", got) + }A cleaner alternative is to extend the table struct with a
wantTags []stringfield and iterate, dropping the special case.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/doc/docs_update_check_test.go` around lines 438 - 443, The test case "multiple custom tags all appear in message" currently only asserts wantTag ("<grid>") so it doesn't validate the second tag; update the case to assert both tags — either change this entry to include a second expectation (e.g., check for "<lark-table>" in the produced message as well) or refactor the table test struct by replacing wantTag string with wantTags []string and update the test loop in the test function to iterate over wantTags and assert each is present (this prevents short-circuiting after the first match and ensures both "<grid>" and "<lark-table>" are verified).shortcuts/doc/docs_update_check.go (1)
291-301: ⚡ Quick winStale doc-comment paragraph left above
wholeParagraphStyleRe.Lines 291–294 and 295–298 are two stacked doc-comment paragraphs for the same identifier (the older draft was not removed when the more precise wording was added). Godoc will render both, and the duplication invites drift. Keep the lower paragraph (it accurately describes the regex, including the
* *exclusion), and fold the "do NOT match***…***" rationale into a single comment.♻️ Proposed fix
-// wholeParagraphStyleRe matches a line whose entire content is a single -// emphasis span: *…* or **…** (at least one non-whitespace char inside). -// CJK and ASCII content both match; we deliberately do NOT match ***…*** -// because that is already covered by checkDocsUpdateBoldItalic. -// wholeParagraphStyleRe matches a line whose entire content is exactly one -// *italic* or **bold** span. Requirements: opener and closer must be the same -// number of asterisks (1 or 2), content has no * or newline, and at least one -// non-whitespace non-asterisk character is present (prevents `* *` matches). +// wholeParagraphStyleRe matches a line whose entire content is exactly one +// *italic* or **bold** span. Requirements: opener and closer must be the same +// number of asterisks (1 or 2), content has no `*` or newline, and at least +// one non-whitespace non-asterisk character is present (prevents `* *` +// matches). CJK and ASCII content both match; we deliberately do NOT match +// `***…***` because that is already covered by checkDocsUpdateBoldItalic. var wholeParagraphStyleRe = regexp.MustCompile(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/doc/docs_update_check.go` around lines 291 - 301, The doc comment for wholeParagraphStyleRe contains a duplicated/stacked paragraph—remove the older upper paragraph (the lines describing "wholeParagraphStyleRe matches a line whose entire content... CJK and ASCII..." that precedes the more precise wording) and keep the lower, accurate paragraph that explains the regex and the `* *` exclusion; merge the "do NOT match ***…***" rationale into that single remaining comment so there is one clear godoc description for wholeParagraphStyleRe.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@shortcuts/doc/docs_update_check.go`:
- Around line 344-361: CheckV2MarkdownCustomTags currently scans raw content and
can false-positive on tags inside fenced code; modify it to first call
stripMarkdownCodeRegions(content) (same preprocessing used by
checkDocsUpdateWholeParagraphStyle) and run the strings.Contains checks against
the sanitized output so code fences are ignored, and add a regression test
mirroring the "whole paragraph style inside code fence is ignored" case in
TestCheckDocsUpdateWholeParagraphStyle to ensure tags inside ``` fences do not
trigger the hard validation error.
---
Nitpick comments:
In `@shortcuts/doc/docs_update_check_test.go`:
- Around line 438-443: The test case "multiple custom tags all appear in
message" currently only asserts wantTag ("<grid>") so it doesn't validate the
second tag; update the case to assert both tags — either change this entry to
include a second expectation (e.g., check for "<lark-table>" in the produced
message as well) or refactor the table test struct by replacing wantTag string
with wantTags []string and update the test loop in the test function to iterate
over wantTags and assert each is present (this prevents short-circuiting after
the first match and ensures both "<grid>" and "<lark-table>" are verified).
In `@shortcuts/doc/docs_update_check.go`:
- Around line 291-301: The doc comment for wholeParagraphStyleRe contains a
duplicated/stacked paragraph—remove the older upper paragraph (the lines
describing "wholeParagraphStyleRe matches a line whose entire content... CJK and
ASCII..." that precedes the more precise wording) and keep the lower, accurate
paragraph that explains the regex and the `* *` exclusion; merge the "do NOT
match ***…***" rationale into that single remaining comment so there is one
clear godoc description for wholeParagraphStyleRe.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3f40f2a2-aca7-42da-bdb7-58cc11ee1dc2
📒 Files selected for processing (2)
shortcuts/doc/docs_update_check.goshortcuts/doc/docs_update_check_test.go
| func CheckV2MarkdownCustomTags(content string) string { | ||
| if content == "" { | ||
| return "" | ||
| } | ||
| var found []string | ||
| for _, t := range v2MarkdownUnsupportedTags { | ||
| if strings.Contains(content, t.prefix) { | ||
| found = append(found, t.display) | ||
| } | ||
| } | ||
| if len(found) == 0 { | ||
| return "" | ||
| } | ||
| return "--doc-format markdown does not support Lark custom tags (" + | ||
| strings.Join(found, ", ") + "); " + | ||
| "the v2 API will silently strip or corrupt them. " + | ||
| "Switch to --doc-format xml (the default) to preserve these blocks." | ||
| } |
There was a problem hiding this comment.
Custom-tag check ignores fenced code blocks, can hard-fail on legitimate documentation payloads.
checkDocsUpdateWholeParagraphStyle runs stripMarkdownCodeRegions before scanning so that *not a paragraph* inside a ``` fence does not warn. CheckV2MarkdownCustomTags does not — it scans the raw content with strings.Contains. Since the result is wired into v2 create/update Validate (per PR description, returns a hard error), a user pushing markdown that describes these tags inside a code fence (e.g., a tutorial paragraph containing ```html\n<grid cols="2">…\n```) will be blocked even though the parser would render the fence as inert code text rather than corrupt it.
Consider sanitizing the same way the sibling check does so the gating only fires on tags that the v2 parser would actually see as live markup.
🛠️ Proposed fix
func CheckV2MarkdownCustomTags(content string) string {
if content == "" {
return ""
}
+ scanned := stripMarkdownCodeRegions(content)
var found []string
for _, t := range v2MarkdownUnsupportedTags {
- if strings.Contains(content, t.prefix) {
+ if strings.Contains(scanned, t.prefix) {
found = append(found, t.display)
}
}A regression test for the fenced-code case (parallel to the whole paragraph style inside code fence is ignored case in TestCheckDocsUpdateWholeParagraphStyle) would lock this in.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@shortcuts/doc/docs_update_check.go` around lines 344 - 361,
CheckV2MarkdownCustomTags currently scans raw content and can false-positive on
tags inside fenced code; modify it to first call
stripMarkdownCodeRegions(content) (same preprocessing used by
checkDocsUpdateWholeParagraphStyle) and run the strings.Contains checks against
the sanitized output so code fences are ignored, and add a regression test
mirroring the "whole paragraph style inside code fence is ignored" case in
TestCheckDocsUpdateWholeParagraphStyle to ensure tags inside ``` fences do not
trigger the hard validation error.
…hole-paragraph style Case 15 — CheckV2MarkdownCustomTags: - Detect Lark custom tags (<grid>, <column>, <lark-table>, <text color=...>) in markdown-mode content for docs +create and docs +update (v2). - Return a Validate error before the request is sent; the v2 API silently strips or collapses these tags causing irreversible document corruption. - 8 table-driven tests covering each tag variant, safe tags (callout), and multi-tag messages. Case 14 — checkDocsUpdateWholeParagraphStyle: - Warn when a markdown line consists entirely of *…* or **…** markers. Lark does not render such lines as italic/bold; the markers appear verbatim. Fires only for prose lines outside fenced code blocks. - 8 table-driven tests including fence-skip and triple-asterisk exclusion.
…; add coverage - wholeParagraphStyleRe: replace \S with [^\s*\n] so that asymmetric markers like *text** and whitespace-only spans like * * no longer match. Addresses coderabbitai review comment on PR #750. - Add TestDocsUpdateWarningsWholeParagraphStyle to cover the append path through docsUpdateWarnings (previously uncovered). - Add two edge-case entries to TestCheckDocsUpdateWholeParagraphStyle: asymmetric markers and whitespace-only content.
01b60fd to
d68af0c
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
shortcuts/doc/docs_update_check.go (1)
344-351:⚠️ Potential issue | 🟠 Major | ⚡ Quick winIgnore fenced/inline code before custom-tag validation
CheckV2MarkdownCustomTagscurrently scans raw markdown, so inert examples like fenced HTML snippets can hard-failValidate. Since this path is now a blocking error in create/update, this is a false-positive blocker for legitimate docs content.Suggested fix
func CheckV2MarkdownCustomTags(content string) string { if content == "" { return "" } + scanned := stripMarkdownCodeRegions(content) var found []string for _, t := range v2MarkdownUnsupportedTags { - if strings.Contains(content, t.prefix) { + if strings.Contains(scanned, t.prefix) { found = append(found, t.display) } }Also add one regression case in
TestCheckV2MarkdownCustomTagsfor<grid ...>inside a fenced block expecting no hint.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/doc/docs_update_check.go` around lines 344 - 351, CheckV2MarkdownCustomTags scans raw markdown and falsely flags custom-tag patterns inside fenced code blocks and inline code; update CheckV2MarkdownCustomTags to first remove or ignore fenced code blocks (```...``` and indented blocks) and inline code spans (`...`) before searching for v2MarkdownUnsupportedTags so examples/snippets don't produce false positives, using the same v2MarkdownUnsupportedTags and found logic; also add a regression test in TestCheckV2MarkdownCustomTags that includes a <grid ...> inside a fenced block and asserts no hint is returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@shortcuts/doc/docs_update_check.go`:
- Around line 344-351: CheckV2MarkdownCustomTags scans raw markdown and falsely
flags custom-tag patterns inside fenced code blocks and inline code; update
CheckV2MarkdownCustomTags to first remove or ignore fenced code blocks
(```...``` and indented blocks) and inline code spans (`...`) before searching
for v2MarkdownUnsupportedTags so examples/snippets don't produce false
positives, using the same v2MarkdownUnsupportedTags and found logic; also add a
regression test in TestCheckV2MarkdownCustomTags that includes a <grid ...>
inside a fenced block and asserts no hint is returned.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d6edbcdf-66d7-4fb2-9ce9-e64da7561f0c
📒 Files selected for processing (4)
shortcuts/doc/docs_create_v2.goshortcuts/doc/docs_update_check.goshortcuts/doc/docs_update_check_test.goshortcuts/doc/docs_update_v2.go
Summary
Two new pre-flight checks for the v2 docs create/update path that prevent silent data loss and surface rendering pitfalls before content is sent to the API.
Changes
shortcuts/doc/docs_update_check.go: two new check functionsCheckV2MarkdownCustomTags: detects Lark custom tags (<grid>,<column>,<lark-table>,<text color=...>) in markdown-mode content and returns an error message. Called fromvalidateCreateV2andvalidateUpdateV2.checkDocsUpdateWholeParagraphStyle: warns when a line consists entirely of*…*or**…**markers (Lark renders these as literal characters, not styled paragraphs). Integrated intodocsUpdateWarnings.shortcuts/doc/docs_create_v2.go/shortcuts/doc/docs_update_v2.go: callCheckV2MarkdownCustomTagsin Validate when--doc-format markdownis set.shortcuts/doc/docs_update_check_test.go: 16 new table-driven tests (8 per check).Motivation
Case 15 (
<grid>/<lark-table>corruption): the v2 markdown parser does not understand Lark custom tags but attempts to process them — collapsing grid columns to 1, discarding lark-table content entirely, escaping<text color=...>to literal characters. This caused complete structural collapse of a weekly report document. Blocking early in Validate is the only reliable defense before MCP-layer support lands.Case 14 (whole-paragraph style markers): a line like
*为什么?*on its own renders as the literal string*为什么?*in Lark, not as italic text. This is a different failure mode from the existing bold+italic check (#569) and is worth a dedicated warning.Test Plan
go test ./shortcuts/doc/...)go vet ./...cleangofmt -lcleanRelated
Summary by CodeRabbit
New Features
Tests