perf(wasm): skip serde_json::Value round-trip in renderLatex#32

Merged
erweixin merged 1 commit into erweixin:main from EurFelux:perf/wasm-skip-json-value
Apr 7, 2026

Conversation

@EurFelux
Contributor

@EurFelux EurFelux commented Apr 7, 2026

Follow-up to the optimization (1) discussed in #30. Replaces the
to_value → sanitize_json_numbers → to_string pipeline with a
single in-place pass over the typed DisplayList, followed by a direct
serde_json::to_string.

What changed

Before:

let value = serde_json::to_value(&display_list)?;   // ① clone whole tree into Value
let sanitized = sanitize_json_numbers(value);       // ② recursively clone it again
serde_json::to_string(&sanitized)                   // ③ serialize Value tree

After:

let mut display_list = to_display_list(&layout_box);
sanitize_display_list(&mut display_list);           // single in-place walk, no allocation
serde_json::to_string(&display_list)                // direct typed serialize

sanitize_display_list walks DisplayList / DisplayItem / PathCommand
and clamps any non-finite f64 (NaN / Infinity) to 0.0. This preserves
the previous behavior — serde_json's default f64 serializer would
otherwise error on non-finite values, which is exactly why the original
code went through Value (whose from_f64 silently produces Null).
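
A minimal sketch of that in-place pass, using stand-in types for illustration (the real DisplayList / DisplayItem / PathCommand definitions live in ratex-types and carry more variants and fields):

```rust
// Stand-in types for illustration only; the real ratex-types
// definitions have more variants and fields.
enum PathCommand {
    MoveTo { x: f64, y: f64 },
    LineTo { x: f64, y: f64 },
}

struct DisplayItem {
    commands: Vec<PathCommand>,
}

struct DisplayList {
    items: Vec<DisplayItem>,
}

// Clamp a non-finite f64 (NaN / Infinity) to 0.0 in place.
fn clamp(v: &mut f64) {
    if !v.is_finite() {
        *v = 0.0;
    }
}

// Single mutable walk over the tree: no intermediate Value, no clones,
// no per-node Map/Vec allocations.
fn sanitize_display_list(list: &mut DisplayList) {
    for item in &mut list.items {
        for cmd in &mut item.commands {
            match cmd {
                PathCommand::MoveTo { x, y } | PathCommand::LineTo { x, y } => {
                    clamp(x);
                    clamp(y);
                }
            }
        }
    }
}

fn main() {
    let mut list = DisplayList {
        items: vec![DisplayItem {
            commands: vec![PathCommand::MoveTo { x: f64::NAN, y: 1.0 }],
        }],
    };
    sanitize_display_list(&mut list);
    match &list.items[0].commands[0] {
        PathCommand::MoveTo { x, y } => {
            assert_eq!(*x, 0.0);
            assert_eq!(*y, 1.0);
        }
        _ => unreachable!(),
    }
    println!("sanitized");
}
```

The key point is that the walk takes `&mut` borrows all the way down, so the clamp mutates the typed tree directly instead of rebuilding a Value mirror of it.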

Net effect on the hot path:

|  | Before | After |
| --- | --- | --- |
| Tree walks | 4 (to_value + sanitize + to_string + JSON.parse) | 2 (sanitize + to_string + JSON.parse) |
| Per-node Map/Vec allocations | yes (one per object/array) | none |
| Intermediate Value tree | yes | no |

Benchmark results

Measured via the harness from #30, parse+layout stage, 200 iterations + 20 warmup, Chrome. The benchmark is somewhat noisy on small formulas — relative trends are reliable, individual cells less so.

| Formula | KaTeX (ms) | RaTeX before | RaTeX after | Gap (after) |
| --- | --- | --- | --- | --- |
| simple add | 0.008 | ~0.038 | 0.033 | 0.025 |
| Euler | 0.011 | ~0.052 | 0.035 | 0.024 |
| Schrödinger | 0.019 | — | 0.040 | 0.021 |
| Einstein field eq. | 0.034 | — | 0.048 | 0.014 |
| long mixed | 0.072 | — | 0.094 | 0.022 |

Overall parse-stage speedup: 0.42× → 0.60× (sum of medians). Per-formula numbers are still <1×, but every row improved and the gain is structural rather than noise.

Full-render stage was already a RaTeX win and is not regressed by this change (it goes through the same renderLatex entry point, just with more downstream Canvas work).

What this doesn't fix — and why that's interesting

The most striking thing in the post-optimization numbers is that the gap between RaTeX and KaTeX is essentially constant at ~22 µs across all formulas, regardless of complexity. That is a very clean signal:

  • Simple formula gap: 0.025 ms
  • 5× more complex formula gap: 0.022 ms
  • 10× more complex formula gap: 0.022 ms

If the bottleneck were anything Rust-side (parse, layout, serialize), the gap would scale with formula size. It doesn't. So the remaining ~22 µs lives entirely in:

  1. wasm → JS string copy at the wasm-bindgen boundary
  2. JS-side JSON.parse fixed startup cost
  3. wasm-bindgen's Result<String, JsValue> glue

In other words, this PR has taken the Rust-side serialize cost about as low as it can go without changing the boundary protocol. Any further parse-stage improvement would need to attack the boundary itself — either via a binary protocol (Uint8Array + DataView / bincode) or by pushing the draw call entirely into wasm so the display list never crosses the boundary at all.
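
For reference, the binary-protocol direction could look roughly like the following sketch. This is a hypothetical, std-only illustration (no bincode, no wasm-bindgen glue, and nothing from the actual RaTeX codebase): it packs (x, y) pairs as little-endian f64s into a flat byte buffer, which on the JS side would arrive as a Uint8Array and be read with a DataView instead of JSON.parse.

```rust
// Hypothetical encoding for the binary-boundary idea: pack (x, y) pairs
// as little-endian f64s into a flat byte buffer (16 bytes per pair).
fn encode_points(points: &[(f64, f64)]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(points.len() * 16);
    for &(x, y) in points {
        buf.extend_from_slice(&x.to_le_bytes());
        buf.extend_from_slice(&y.to_le_bytes());
    }
    buf
}

// Decode helper, standing in for what the JS DataView reads would do.
fn decode_points(buf: &[u8]) -> Vec<(f64, f64)> {
    buf.chunks_exact(16)
        .map(|c| {
            let x = f64::from_le_bytes(c[0..8].try_into().unwrap());
            let y = f64::from_le_bytes(c[8..16].try_into().unwrap());
            (x, y)
        })
        .collect()
}

fn main() {
    let pts = vec![(1.5, -2.0), (0.0, 3.25)];
    let buf = encode_points(&pts);
    assert_eq!(buf.len(), 32);          // 2 pairs × 16 bytes
    assert_eq!(decode_points(&buf), pts);
    println!("round-trip ok");
}
```

The trade-off is exactly the one noted in #30: this removes the string copy and JSON.parse but adds a hand-maintained wire format on both sides of the boundary.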

I'm not proposing either of those in this PR — you already noted in #30 that the display list isn't large enough to justify a binary protocol, which is a fair call given the complexity cost. Just flagging the constant-gap observation as data for future reference: if parse-stage parity with KaTeX ever becomes a goal, the evidence points clearly at the FFI boundary as the only remaining lever.

(That said: I'd argue parse-stage parity matters less than it looks, because RaTeX's actual unique value lives outside the web target where KaTeX isn't even an option — see #31 for the longer version of that argument.)

Notes

  • I'll update README.md as a separate follow-up — this PR is intentionally scoped to the perf change so the diff is easy to review and revert if needed.
  • All existing tests pass (cargo test -p ratex-wasm, cargo test -p ratex-types).
  • The behavioral contract is unchanged: non-finite f64 values are still clamped to 0 before they reach JS.

cc @erweixin — thanks again for the project! 🐾

The previous pipeline did:
  to_value -> sanitize_json_numbers (recursive clone) -> to_string

For every render this walked the display list four times and allocated
a fresh Map/Vec at every node just to clamp NaN/Infinity. Replace it
with a single in-place pass that clamps non-finite f64 fields and then
calls serde_json::to_string directly on the typed display list.

Refs: erweixin#30
@erweixin
Owner

erweixin commented Apr 7, 2026

LGTM, thank you for the PR!

@erweixin erweixin merged commit 4562607 into erweixin:main Apr 7, 2026
1 check passed