perf(wasm): skip serde_json::Value round-trip in renderLatex#32
Merged
erweixin merged 1 commit into erweixin:main on Apr 7, 2026
The previous pipeline did:

  to_value -> sanitize_json_numbers (recursive clone) -> to_string

For every render this walked the display list four times and allocated a fresh Map/Vec at every node just to clamp NaN/Infinity.

Replace it with a single in-place pass that clamps non-finite f64 fields and then calls serde_json::to_string directly on the typed display list.

Refs: erweixin#30
Owner
LGTM, thank you for the PR!
Follow-up to the optimization (1) discussed in #30. Replaces the `to_value` → `sanitize_json_numbers` → `to_string` pipeline with a single in-place pass over the typed `DisplayList`, then a direct `serde_json::to_string`.

What changed
Before:
After:
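As a rough illustration of the in-place approach (not the PR's actual code): the names `DisplayList`, `DisplayItem`, `PathCommand` and `sanitize_display_list` come from this description, but every field and variant below is an assumption made for the sketch.

```rust
// Hypothetical, simplified stand-ins for the real ratex types.
#[derive(Debug, Clone, PartialEq)]
enum PathCommand {
    MoveTo { x: f64, y: f64 },
    LineTo { x: f64, y: f64 },
    Close,
}

#[derive(Debug, Clone, PartialEq)]
enum DisplayItem {
    Glyph { x: f64, y: f64, scale: f64 },
    Path { commands: Vec<PathCommand> },
}

#[derive(Debug, Clone, PartialEq)]
struct DisplayList {
    width: f64,
    height: f64,
    items: Vec<DisplayItem>,
}

/// Replace NaN / +-Infinity with 0.0; finite values are left untouched.
fn clamp(v: &mut f64) {
    if !v.is_finite() {
        *v = 0.0;
    }
}

/// Single in-place pass: no clone, no intermediate tree.
fn sanitize_display_list(list: &mut DisplayList) {
    clamp(&mut list.width);
    clamp(&mut list.height);
    for item in &mut list.items {
        match item {
            DisplayItem::Glyph { x, y, scale } => {
                clamp(x);
                clamp(y);
                clamp(scale);
            }
            DisplayItem::Path { commands } => {
                for cmd in commands {
                    match cmd {
                        PathCommand::MoveTo { x, y } | PathCommand::LineTo { x, y } => {
                            clamp(x);
                            clamp(y);
                        }
                        PathCommand::Close => {}
                    }
                }
            }
        }
    }
}
```

Because the pass mutates the typed list directly, the serializer can then run on it without any intermediate representation.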
`sanitize_display_list` walks `DisplayList` / `DisplayItem` / `PathCommand` and clamps any non-finite `f64` (NaN / Infinity) to `0.0`. This preserves the previous behavior: JSON has no literal for NaN or Infinity, so `serde_json` cannot emit a non-finite `f64` as a number, which is exactly why the original code went through `Value` (where a non-finite `f64` silently becomes `Null`).

Net effect on the hot path:
- `Map`/`Vec` allocations at every node: removed
- intermediate `Value` tree: removed

Benchmark results
Measured via the harness from #30, parse+layout stage, 200 iterations + 20 warmup, Chrome. The benchmark is somewhat noisy on small formulas — relative trends are reliable, individual cells less so.
Overall parse-stage speedup: 0.42× → 0.60× (sum of medians). Per-formula numbers are still <1×, but every row improved and the gain is structural rather than noise.
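One plausible reading of the "sum of medians" metric is sketched below. The formula (baseline total divided by candidate total), the function names, and the choice of which engine is the baseline are assumptions for illustration; they are not taken from the harness in #30.

```rust
/// Median of a set of timing samples.
/// Sorts a copy; panics if `samples` is empty.
fn median(samples: &[f64]) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Speedup of `candidate` relative to `baseline`, computed from the sum of
/// per-formula medians. Each inner Vec holds the samples for one formula.
/// A result below 1.0 means the candidate is slower than the baseline.
fn sum_of_medians_speedup(baseline: &[Vec<f64>], candidate: &[Vec<f64>]) -> f64 {
    let base_total: f64 = baseline.iter().map(|s| median(s)).sum();
    let cand_total: f64 = candidate.iter().map(|s| median(s)).sum();
    base_total / cand_total
}
```

Under this reading, a value such as 0.60× means the candidate's summed median time is still larger than the baseline's, which matches the statement that per-formula numbers remain below 1×.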
Full-render stage was already a RaTeX win and is not regressed by this change (it goes through the same `renderLatex` entry point, just with more downstream Canvas work).

What this doesn't fix, and why that's interesting
The most striking thing in the post-optimization numbers is that the gap between RaTeX and KaTeX is essentially constant at ~22 µs across all formulas, regardless of complexity. That is a very clean signal:
If the bottleneck were anything Rust-side (parse, layout, serialize), the gap would scale with formula size. It doesn't. So the remaining ~22 µs lives entirely in:
- `wasm-bindgen` boundary
- `JSON.parse` fixed startup cost
- `Result<String, JsValue>` glue

In other words, this PR has taken the Rust-side serialize cost about as low as it can go without changing the boundary protocol. Any further parse-stage improvement would need to attack the boundary itself, either via a binary protocol (`Uint8Array` + `DataView` / `bincode`) or by pushing the draw call entirely into wasm so the display list never crosses the boundary at all.

I'm not proposing either of those in this PR; you already noted in #30 that the display list isn't large enough to justify a binary protocol, which is a fair call given the complexity cost. Just flagging the constant-gap observation as data for future reference: if parse-stage parity with KaTeX ever becomes a goal, the evidence points clearly at the FFI boundary as the only remaining lever.
(That said: I'd argue parse-stage parity matters less than it looks, because RaTeX's actual unique value lives outside the web target where KaTeX isn't even an option — see #31 for the longer version of that argument.)
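The constant-gap argument can be made explicit with a simple cost model. The linear form below is an illustrative assumption (per-call time as a fixed cost plus a cost proportional to formula size $n$); only the ~22 µs gap itself comes from the measurements described here.

```latex
\begin{align*}
t_{\mathrm{RaTeX}}(n) &= c_R + s_R\, n, \qquad
t_{\mathrm{KaTeX}}(n) = c_K + s_K\, n \\
\Delta(n) &= t_{\mathrm{RaTeX}}(n) - t_{\mathrm{KaTeX}}(n)
           = (c_R - c_K) + (s_R - s_K)\, n \\
\Delta(n) &\approx 22\,\mu\mathrm{s} \ \text{for all } n
  \;\Longrightarrow\; s_R \approx s_K, \quad c_R - c_K \approx 22\,\mu\mathrm{s}
\end{align*}
```

Under this model the size-dependent work (parse, layout, serialize) costs about the same per unit of formula on both sides, and the whole difference sits in the fixed per-call term, which is where the boundary overhead lives.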
Notes
- Tests: `cargo test -p ratex-wasm`, `cargo test -p ratex-types`.
- Non-finite `f64` values are still clamped to `0` before they reach JS.

cc @erweixin, thanks again for the project! 🐾