fix(dashboard): treat transcript text output as markdown or json only #462

Merged: dcramer merged 1 commit into main from fix/transcript-output-rendering on May 31, 2026

@sentry-junior sentry-junior Bot commented May 31, 2026

What

Prose sections in parseMarkdownBlocks used the broad detectLanguage heuristic, which could return xml, html, typescript, or shellscript. When classified as xml/html, prose was rendered through StructuredMarkup (the collapsible XML tree) instead of HighlightedCode, breaking syntax highlighting for normal LLM chat output.

Changes

  • detectOutputLanguage (new export): returns only json (valid JSON/JSONL) or markdown for LLM text output — no XML/TS/shell heuristics
  • parseMarkdownBlocks: uses detectOutputLanguage for all prose sections; broad detectLanguage is kept only for the raw debug view
  • CodeBlock.fenced (new field on type): true for explicit code fences, false for prose; gates structured XML/HTML rendering on provenance, not just language
  • canRenderStructuredMarkup: now accepts CodeBlock and requires fenced === true; auto-detected prose is never eligible regardless of language
  • ThinkingPartView: switched to detectOutputLanguage (also LLM-generated output)
  • Raw message view: keeps broad detectLanguage (developer/debug surface)
  • 13 new tests covering all prose language scenarios and the structured markup eligibility invariant
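
As a rough illustration of the narrowed detection described in the list above, here is a minimal sketch of what a json-or-markdown detector could look like. It is not the PR's implementation: the helper `parsesAsJsonContainer` is hypothetical, and restricting "valid JSON" to objects and arrays (so a bare number or quoted string in prose is not labeled json) is an assumption made for this example.

```typescript
// Hypothetical sketch; the real detectOutputLanguage in this PR may differ.
type OutputLanguage = "json" | "markdown";

// Assumption: only objects and arrays count as JSON, so prose such as "42"
// is not misclassified. The source does not say how the real code decides.
function parsesAsJsonContainer(text: string): boolean {
  try {
    const value: unknown = JSON.parse(text);
    return typeof value === "object" && value !== null;
  } catch {
    return false;
  }
}

function detectOutputLanguage(text: string): OutputLanguage {
  const trimmed = text.trim();
  if (trimmed === "") return "markdown";

  // Whole-document JSON (single-line or pretty-printed).
  if (parsesAsJsonContainer(trimmed)) return "json";

  // JSONL: every non-empty line is a JSON object or array on its own.
  const lines = trimmed
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "");
  if (lines.length > 1 && lines.every(parsesAsJsonContainer)) return "json";

  // Everything else, including XML-looking or code-looking prose, is markdown.
  return "markdown";
}

console.log(detectOutputLanguage('{"a": 1}'));           // json
console.log(detectOutputLanguage("<note>hello</note>")); // markdown
```

The point of the design is that the function has no branch that can return xml, html, typescript, or shellscript, so prose can never reach a renderer keyed on those languages.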

Verified

  • tsc --noEmit: clean
  • vitest run tests/format.test.ts: 17/17 passing
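
For readers who want a concrete picture of the structured markup eligibility invariant those tests cover, a minimal sketch follows. Only the `fenced` field and the `canRenderStructuredMarkup` name come from this PR; the `code` and `language` field names and the exact language check are assumptions for illustration.

```typescript
// Hypothetical shape; only `fenced` is confirmed by the PR description.
interface CodeBlock {
  code: string;
  language: string;
  fenced: boolean; // true for explicit code fences, false for prose
}

// Structured XML/HTML rendering requires both a markup language and an
// explicit fence. Auto-detected prose is never eligible.
function canRenderStructuredMarkup(block: CodeBlock): boolean {
  const isMarkup = block.language === "xml" || block.language === "html";
  return block.fenced === true && isMarkup;
}

const fencedXml: CodeBlock = { code: "<a><b/></a>", language: "xml", fenced: true };
const proseXml: CodeBlock = { code: "<a><b/></a>", language: "xml", fenced: false };
const fencedJson: CodeBlock = { code: "{}", language: "json", fenced: true };

console.log(canRenderStructuredMarkup(fencedXml));  // true
console.log(canRenderStructuredMarkup(proseXml));   // false
console.log(canRenderStructuredMarkup(fencedJson)); // false
```

Gating on provenance as well as language means that a future change to language detection cannot, by itself, route prose back into the collapsible tree.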

Action taken on behalf of David Cramer.

Prose sections in parseMarkdownBlocks were using the broad detectLanguage
heuristic, which could return xml/html/typescript/shellscript. When
classified as xml or html, the prose was rendered through StructuredMarkup
(the collapsible XML tree) instead of HighlightedCode, breaking syntax
highlighting for normal LLM chat output.

Changes:
- Add detectOutputLanguage: returns only 'json' (valid JSON/JSONL) or
  'markdown' for LLM text output; no XML/TS/shell heuristics
- Update parseMarkdownBlocks to use detectOutputLanguage for all prose
  sections, keeping broad detectLanguage only for the raw debug view
- Add fenced: boolean to CodeBlock so canRenderStructuredMarkup can gate
  structured XML/HTML rendering on explicit fence provenance
- canRenderStructuredMarkup now accepts CodeBlock and requires fenced===true;
  auto-detected prose is never eligible regardless of language
- ThinkingPartView switches to detectOutputLanguage (LLM reasoning output)
- raw message view keeps broad detectLanguage (debug/developer view)
- 13 new format tests covering all prose language scenarios and the
  structured markup eligibility invariant

Fixes: XML markers in raw LLM output no longer trigger the collapsible
structured markup renderer.

Co-authored-by: David Cramer <noreply>

vercel Bot commented May 31, 2026

Deployment status: junior-docs is Ready (preview available), updated May 31, 2026 6:53pm UTC.

dcramer marked this pull request as ready for review May 31, 2026 18:59
dcramer merged commit b8252e6 into main May 31, 2026 (16 checks passed)
dcramer deleted the fix/transcript-output-rendering branch May 31, 2026 19:02