Skip to content

fix(stage5): drop broken mermaid fence on terminal mmdc failure#64

Merged
RichardHightower merged 1 commit into
mainfrom
fix/stage5-strip-broken-mermaid-on-hil
May 14, 2026
Merged

fix(stage5): drop broken mermaid fence on terminal mmdc failure#64
RichardHightower merged 1 commit into
mainfrom
fix/stage5-strip-broken-mermaid-on-hil

Conversation

@RichardHightower

Copy link
Copy Markdown
Contributor

Summary

When mermaid generation exhausts MAX_ATTEMPTS with mmdc syntax errors, drop the ```mermaid fence entirely. Ship only the HIL marker + a 'diagram unavailable' line. LLM-semantic disputes (mmdc-passing, content-disputed) still keep the fence.

Why

Discovered during the agent-brain trust-eval pass (`docs/gen/agent-brain/EVAL.md`). Three live cases (HIL-003 `SummarizationConfig`, HIL-010 `TestGeminiConverter`, HIL-012 `TestGraphStoreManagerEdgeCases`) ship as `` markers plus a syntactically broken ```mermaid block. Every viewer (GitHub, IDE preview, mkdocs) renders the fence as a blank/error — the reader sees a code block that produces nothing AND a TODO marker. A flagged-but-unrenderable block is strictly worse than no diagram.

Root cause in every case: a relationship label like `Settings ..> ValidationError : calls get_api_key(); raises ValidationError`. The `;` breaks mmdc's lexer. (The per-package merger got the label-strip retry treatment in #60; per-class diagrams don't have that escape hatch because each is a single LLM-authored unit — when the LLM keeps producing the same bad label across 3 attempts, the only sane move is to stop pretending we have a diagram.)

Closes #61.

Changes

  • `src/designdoc/stages/s5_mermaid.py`
    • New `_is_mmdc_syntax_failure(verdict)` — reads `MermaidIssue.category` (the field already exists in `verdict.py`), no fragile string matching on `verdict.summary`.
    • New `_build_diagram_section(result, mermaid_src, hil_id)` — three-way pure function: pass / shipped-with-HIL-syntax / shipped-with-HIL-semantic. The previous inline section-building in `run()` collapses to a single call.
  • `tests/unit/test_stage5_hil_section.py` (new)
    • 7 unit tests covering both helpers: pass keeps fence, syntax HIL drops fence + writes 'diagram unavailable', semantic HIL keeps fence, mixed-category verdict counts as syntax (most conservative).

Invariants

  • MAX_ATTEMPTS=3 preserved (loop.py untouched)
  • Checker isolation preserved (no doer/checker context bridging)
  • Schema-validated verdicts / fail-loud preserved — discriminator uses the existing `MermaidIssue.category` enum, schema unchanged
  • Mermaid two-checker (mmdc + LLM) preserved — both still run; we're just deciding what to write differently when both fail
  • HIL fallback preserved — HIL marker still ships, hil-issues.yaml entry still appended; only the fenced source changes

Verification

  • `task ci` green locally (107 passed in 58.56s)
  • New tests cover all three branches (pass / syntax / semantic) + the mixed-category edge case
  • TWRC discipline followed (RED with `ImportError` → implementation → GREEN)
  • Manual rerun of Stage 5 against agent-brain — will confirm HIL-003/010/012 no longer ship broken fences (follow-up: a rerun isn't part of the merge gate; the unit tests prove the behaviour)

Related

…ax — closes #61

When mermaid generation exhausts MAX_ATTEMPTS with mmdc syntax errors, the
artifact previously shipped with BOTH a `<!-- HIL: HIL-NNN -->` marker AND
the broken ```mermaid fence. The fence is strictly harmful: every viewer
(GitHub, IDE preview, mkdocs) renders it as a blank/error block. Readers
see a code fence that produces nothing.

Three live cases in the agent-brain dogfood (HIL-003, HIL-010, HIL-012)
all ship as broken-fence blocks for the same reason — a `;` in a
relationship label that mmdc's parser rejects.

This change branches the `## Diagram` section builder on whether the
terminal verdict was an mmdc syntax failure (any issue with
category="syntax") vs an LLM-semantic failure (mmdc-passing, content-
disputed):

* mmdc syntax failure: drop the fence entirely, replace with a
  "Diagram unavailable; see HIL-NNN" line. Reader gets a clear signal.
* LLM-semantic failure: keep the fence (the diagram renders fine; the
  dispute is about quality, not parseability).
* pass: unchanged.

Refactored the section-building into _build_diagram_section() so the
three-way branch is testable as a pure function. The discriminator
_is_mmdc_syntax_failure() reads MermaidIssue.category instead of
string-matching verdict.summary — robust against future prompt changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@RichardHightower RichardHightower merged commit cf1910a into main May 14, 2026
2 checks passed
@RichardHightower RichardHightower deleted the fix/stage5-strip-broken-mermaid-on-hil branch May 14, 2026 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stage 5: drop fence when mermaid fails mmdc after MAX_ATTEMPTS (ship HIL marker only)

1 participant