⚡ Move scalar text out of events during decode, dropping the pre-pass #92
The fast decoder copied every scalar's text into a parallel side table in a separate pass over the whole event stream before decoding, then pulled from that table by position. The decoder is a single forward pass that only ever mutated the side table, never the events, so the table was pure overhead: one extra walk of the stream and one Vec<Option<Cow>> allocation sized to it. Borrow the events mutably and mem::take each scalar's text straight out of its event as the decoder consumes it, handing the string to Value::String with the same zero-copy ownership transfer. Same result, one fewer pass and allocation. Measured with callgrind over the real-world config corpus: -1.1% instructions on the full loads pipeline.
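The ownership transfer described above can be sketched in isolation. This is a minimal illustration, not the PR's code: `take_text` and `take_preserves_buffer` are hypothetical names, and it assumes the scalar text is stored as a `Cow<str>` and ends up in an owned `String`. `mem::take` swaps in `Cow::default()` (an empty owned string), and `into_owned` on an already-owned `Cow` moves the `String` rather than copying it.

```rust
use std::borrow::Cow;
use std::mem;

/// Hypothetical helper: moves the text out of `slot`, leaving an empty
/// `Cow` (the `Default` value) in its place.
fn take_text<'a>(slot: &mut Cow<'a, str>) -> Cow<'a, str> {
    mem::take(slot)
}

/// Returns true if taking an owned scalar and converting it to a `String`
/// reuses the original heap buffer, i.e. no bytes were copied.
fn take_preserves_buffer() -> bool {
    let original = String::from("some scalar text");
    let buffer = original.as_ptr();

    // Assumption: the event stores its scalar text as a Cow.
    let mut slot: Cow<'static, str> = Cow::Owned(original);

    // For Cow::Owned, into_owned() unwraps the String without reallocating.
    let taken: String = take_text(&mut slot).into_owned();

    taken.as_ptr() == buffer && slot.is_empty()
}
```

Note that a `Cow::Borrowed` value still allocates when `into_owned` is called; the move is only free for text that was already owned.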
Pull request overview
This PR refactors the fast-path decoder (src/decode/mod.rs) to eliminate the scalar “pre-pass” side table by moving each scalar’s text out of its EventKind::Scalar on-demand as the decoder consumes events, preserving the same zero-copy ownership transfer into the decoded Value tree while removing an extra full-stream walk and allocation.
Changes:
- Remove the up-front scalar extraction pass and the decoder's `scalars: Vec<Option<Cow<...>>>` side table.
- Switch the decoding entrypoints from `&[Event]` to `&mut [Event]` so scalar text can be `mem::take`n directly from events during decoding.
- Introduce a small helper (`take_scalar`) to move scalar text out of an event at a given position.
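To illustrate why the switch to `&mut [Event]` can be mechanical, here is a rough sketch of a recursive decoder over a mutable event slice. The `Event`, `Value`, and `Decoder` types and their fields are simplified stand-ins invented for this example, not the crate's real definitions, and the sketch assumes a well-formed event stream. The key line is `let flow = *flow;`, which copies the bound value so the borrow of `events` ends before the recursive call reborrows it mutably.

```rust
use std::mem;

// Simplified stand-in types (assumptions, not the real definitions).
enum Event {
    SequenceStart { flow: bool },
    SequenceEnd,
    Scalar(String),
}

#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    Sequence { flow: bool, items: Vec<Value> },
}

struct Decoder {
    pos: usize,
}

impl Decoder {
    fn decode_value(&mut self, events: &mut [Event]) -> Value {
        match &mut events[self.pos] {
            Event::SequenceStart { flow } => {
                // Copy the bound value: after this line nothing borrows
                // `events`, so it can be passed on mutably.
                let flow = *flow;
                self.pos += 1;
                self.decode_sequence(events, flow)
            }
            Event::Scalar(text) => {
                // Move the text out of the event, leaving an empty String.
                let text = mem::take(text);
                self.pos += 1;
                Value::String(text)
            }
            Event::SequenceEnd => panic!("unexpected end of sequence"),
        }
    }

    fn decode_sequence(&mut self, events: &mut [Event], flow: bool) -> Value {
        let mut items = Vec::new();
        while !matches!(events[self.pos], Event::SequenceEnd) {
            items.push(self.decode_value(events));
        }
        self.pos += 1; // consume SequenceEnd
        Value::Sequence { flow, items }
    }
}
```

If the arm used `flow` by reference after the recursive call instead of copying it first, the shared borrow would overlap the mutable reborrow and the borrow checker would reject it.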
The helper is only ever called right after the event was matched as a scalar, so the non-scalar arm is unreachable. Returning an empty Cow there would mask a future invariant break by silently decoding the wrong event as ""; unreachable! makes it fail loudly instead.
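A helper of the kind discussed here might look roughly like the following. The name `take_scalar` comes from the PR, but the signature, the `Event` and `EventKind` layouts, and the panic message are assumptions made for illustration.

```rust
use std::borrow::Cow;
use std::mem;

// Simplified stand-in types (assumptions, not the real definitions).
enum EventKind<'a> {
    Scalar(Cow<'a, str>),
    SequenceStart,
    SequenceEnd,
}

struct Event<'a> {
    kind: EventKind<'a>,
}

/// Moves the scalar text out of the event at `pos`.
/// Callers are expected to have already matched that event as a scalar.
fn take_scalar<'a>(events: &mut [Event<'a>], pos: usize) -> Cow<'a, str> {
    match &mut events[pos].kind {
        EventKind::Scalar(text) => mem::take(text),
        // Fail loudly instead of returning an empty Cow, which would
        // silently decode a non-scalar event as "".
        _ => unreachable!("take_scalar called on a non-scalar event at position {}", pos),
    }
}
```

With `unreachable!`, a future caller that passes the wrong position panics at the call site rather than producing an empty string in the decoded output.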
Breaking change
None. Same decoded values, same zero-copy string ownership, byte-for-byte identical output.
Proposed change
The fast decoder copied every scalar's text into a parallel side table (`Vec<Option<Cow>>`) in a separate pass over the whole event stream, before decoding, then pulled each string from that table by position. The decoder is a single forward pass that only ever mutated the side table, never the events themselves, so the table was pure overhead: one extra walk of the stream and one allocation sized to it.

This borrows the events mutably and `mem::take`s each scalar's text straight out of its event as the decoder consumes it, handing the string to `Value::String` with the exact same zero-copy ownership transfer. Same result, one fewer pass and one fewer allocation.

The change is mechanical: the decoder already copied each bound pattern value (`let flow = *flow;`) before recursing, which releases the scrutinee borrow, so flipping the seven recursing methods from `&[Event]` to `&mut [Event]` and moving the single scalar-take inline was accepted by the borrow checker without restructuring. The two read-only helpers stay on `&[Event]`.

Measured with callgrind (deterministic instruction counts, `cache-sim=no branch-sim=no`) over the real-world config corpus, loads pipeline: -1.1% instructions.

This is the safe, isolated first step toward eliminating the materialized event vector on the fast path. Full streaming (dropping `Vec<Event>` entirely) remains a possible follow-up, to be re-measured once this lands.

Type of change
Additional information
Checklist
- `uv run pytest` passes locally. A pull request cannot be merged unless CI is green.
- `uv run ruff check .` and `uv run ruff format --check .` pass.
- `cargo fmt --check` and `cargo clippy --all-targets -- -D warnings` pass.

If the change is user-facing:
- `docs/` is added or updated, and `docs/verify_examples.py` still passes.