fix: add remark plugin to render raw HTML as literal text #16505

ServeurpersoCom · 2025-10-10T20:07:12Z

fix: add remark plugin to render raw HTML as literal text

Implemented a missing MDAST stage to neutralize raw HTML, as major LLM WebUIs
do, ensuring consistent and safe Markdown rendering

Introduced 'remarkLiteralHtml', a plugin that converts raw HTML nodes in the
Markdown AST into plain-text equivalents while preserving indentation and
line breaks. This ensures consistent rendering and prevents unintended HTML
execution, without altering valid Markdown structure
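
As a rough illustration of the idea (not the code merged in this PR), the sketch below walks a minimal MDAST-like tree and replaces every `html` node with literal text. The `MdNode` shape, the `PHRASING_PARENTS` list, and the function names are hypothetical. The choice of `break` nodes for line breaks and non-breaking spaces for indentation is an assumption about one way to preserve layout.

```typescript
// Minimal MDAST-like node shape (hypothetical; real code would use the `mdast` types).
interface MdNode {
  type: string;
  value?: string;
  children?: MdNode[];
}

// Parents whose children are phrasing (inline) content in MDAST.
const PHRASING_PARENTS = ['paragraph', 'heading', 'emphasis', 'strong', 'delete', 'link', 'tableCell'];

// Split raw HTML into text nodes, one per line, joined by `break` nodes.
// Leading spaces and tabs become non-breaking spaces so indentation survives (assumption).
function toLiteralNodes(raw: string): MdNode[] {
  const nodes: MdNode[] = [];
  raw.split(/\r?\n/).forEach((line, index) => {
    if (index > 0) nodes.push({ type: 'break' });
    const text = line.replace(/^[ \t]+/, (ws) => ws.replace(/[ \t]/g, '\u00a0'));
    if (text.length > 0) nodes.push({ type: 'text', value: text });
  });
  return nodes;
}

// Return a copy of the tree in which every `html` node is replaced by literal text.
function literalizeHtml(node: MdNode): MdNode {
  if (!node.children) return node;
  const inline = PHRASING_PARENTS.indexOf(node.type) !== -1;
  const children: MdNode[] = [];
  for (const child of node.children) {
    if (child.type !== 'html') {
      children.push(literalizeHtml(child));
      continue;
    }
    const literal = toLiteralNodes(child.value ?? '');
    if (inline) {
      // Inline HTML: splice the literal text into the surrounding phrasing content.
      for (const n of literal) children.push(n);
    } else {
      // Block-level HTML: wrap the literal text in a paragraph.
      children.push({ type: 'paragraph', children: literal });
    }
  }
  return { ...node, children };
}
```

In a real remark plugin the same transformation would typically be returned as a transformer function that receives the parsed tree; the standalone form here only keeps the sketch free of dependencies.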

Kept 'remarkRehype' in the pipeline since it performs the required conversion
from MDAST to HAST for KaTeX, syntax highlighting, and HTML serialization

Refined the link-enhancement logic to skip unnecessary DOM rewrites,
fixing a subtle bug where extra paragraphs were injected after the first
line due to full innerHTML reconstruction, and ensuring links open in new
tabs only when required
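
The in-place approach can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `AnchorLike`, `needsNewTab`, and `enhanceLinks` are hypothetical names, and the rule that only absolute http(s) links need a new tab is an assumption. Attributes are patched directly on existing elements, so nothing is re-serialized through `innerHTML`.

```typescript
// Structural subset of the DOM Element API, so the sketch runs without a browser.
interface AnchorLike {
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

// Hypothetical rule: only absolute http(s) links are sent to a new tab.
function needsNewTab(href: string | null): boolean {
  return href !== null && /^https?:\/\//i.test(href);
}

// Patch attributes in place; returns the number of anchors actually modified.
function enhanceLinks(anchors: ArrayLike<AnchorLike>): number {
  let changed = 0;
  for (let i = 0; i < anchors.length; i++) {
    const a = anchors[i];
    if (!needsNewTab(a.getAttribute('href'))) continue;
    // Skip anchors that are already configured, to avoid needless DOM writes.
    if (a.getAttribute('target') === '_blank' && a.getAttribute('rel') === 'noopener noreferrer') continue;
    a.setAttribute('target', '_blank');
    a.setAttribute('rel', 'noopener noreferrer');
    changed++;
  }
  return changed;
}
```

In a browser, the result of `container.querySelectorAll('a')` could be passed directly, since DOM elements satisfy the `AnchorLike` shape.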

Final pipeline: remarkGfm -> remarkMath -> remarkBreaks -> remarkLiteralHtml
-> remarkRehype -> rehypeKatex -> rehypeHighlight -> rehypeStringify

Close #16417

ServeurpersoCom · 2025-10-10T20:21:26Z

Test sheet

reasoning_content:
all must render as plain text

Final content:
markdown -> must render normally
markdown link -> must be clickable and open in a new tab
html without code block -> must render as plain text
html link tag -> must render as plain text and URL not clickable
html in code block -> must render as plain text + syntax-highlighted
latex in code block -> must render as plain text
latex outside markdown -> must render as plain text
latex inside markdown (nominal LLM case) -> must render normally

This patch aligns the WebUI Markdown pipeline with industry-standard LLM renderers (OpenAI ChatGPT, Hugging Face Spaces, Anthropic, ...) by ensuring raw HTML safety without sacrificing formatting fidelity.

This patch doesn't just "sanitize HTML": it neutralizes raw XML-like output (e.g. <think>, <tool>, <meta>, <response>, <step>, <node>, <data>). These symbolic or structural tags, whether produced by LLMs or part of generic XML fragments, are displayed as plain text rather than parsed as DOM, which preserves structure while keeping the UI safe and consistent.
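
To make the effect concrete: once a tag lives in a text node, an HTML serializer emits it escaped, so the browser shows the characters instead of building an element. The `escapeText` helper below is a hypothetical stand-in for that serializer step, written only to show the outcome.

```typescript
// Hypothetical stand-in for the escaping an HTML serializer applies to text nodes.
function escapeText(value: string): string {
  return value.replace(/[&<>]/g, (ch) => (ch === '&' ? '&amp;' : ch === '<' ? '&lt;' : '&gt;'));
}

const rawOutput = '<think>plan & act</think>';
console.log(escapeText(rawOutput));
// prints: &lt;think&gt;plan &amp; act&lt;/think&gt;
```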

ServeurpersoCom · 2025-10-10T20:34:55Z

zzokkolma · 2025-10-11T10:30:15Z

I tested this PR and it seems to solve the missing HTML output part; however, I did run into a formatting issue.

It is probably caused by the fact that the model output an empty indented line before that tag.

ServeurpersoCom · 2025-10-11T11:57:21Z

Interesting case! I just ran a quick test inside ChatGPT's own WebUI, and it fails in exactly the same way 😅
Even their renderer neutralizes raw HTML and collapses line breaks when HTML is escaped inside Markdown.
So what you're seeing is consistent with how most production-safe Markdown pipelines behave when remark-rehype or rehype-sanitize flatten the tree.

Here's a screenshot from that test:

I'll dig a bit deeper, but it really confirms that our remarkLiteralHtml stage is the right approach: it keeps structure visible while neutralizing unsafe tags.

ServeurpersoCom · 2025-10-11T12:00:39Z

To reproduce the issue now, you need to explicitly ask the model to output XML-like tags in the stream, which is already a bit of a hack, since LLMs naturally know they’re emitting Markdown.
So this goes slightly beyond normal usage, and as long as it’s only a rendering glitch, it’s probably better to keep the code simple rather than over-engineering for rare edge cases that could degrade the quality of conventional Markdown rendering.

ServeurpersoCom · 2025-10-11T12:06:07Z

Test prompt:

Write HTML with real blank lines and indentation inside a code block and then output the same HTML outside a code block, so we can compare the rendering.

https://chatgpt.com/share/68ea480b-5c3c-8012-9201-62cfb687dc67

And also on llama.cpp with this PR:

conversation_6b69f066-c32c-4b0b-9f0d-92dad9c31764_tu_peux_crire_exacte.json

At this point, we’re actually doing slightly better than some major LLM WebUIs, so that’s a good sign 😄
Let’s stop here before over-tuning it; the current behavior is safe, consistent, and covers all realistic use cases.

allozaur

Great stuff overall! Just left a few architectural remarks that need to be addressed.

tools/server/webui/package-lock.json

tools/server/webui/src/lib/markdown/literalHtml.ts

tools/server/webui/src/lib/markdown/literal-html.ts

tools/server/public/index.html.gz

ServeurpersoCom requested a review from allozaur as a code owner October 10, 2025 20:07

github-actions bot added examples server labels Oct 10, 2025

allozaur requested changes Oct 11, 2025

tools/server/webui/package-lock.json

tools/server/webui/src/lib/markdown/literalHtml.ts (outdated)

tools/server/webui/src/lib/markdown/literal-html.ts

allozaur requested changes Oct 13, 2025

tools/server/public/index.html.gz

ServeurpersoCom added 3 commits October 13, 2025 10:14

fix: address review feedback from allozaur

13b4457

chore: update webui build output

24ec6ac

ServeurpersoCom force-pushed the fix-markdown-literal-html branch from 6555d6d to 24ec6ac Compare October 13, 2025 08:16

allozaur approved these changes Oct 13, 2025

allozaur merged commit 1fb9504 into ggml-org:master Oct 13, 2025
14 checks passed

3 participants
