perf: optimize decode paths for Nat/Int, primitive vecs, and strings #721
Merged
Four decode-side optimizations, all behavior-preserving:

1. Nat/Int deserialization bypass: for values fitting u64/i64, read LEB128 directly and call visitor.visit_u64/i64, avoiding the BigUint/BigInt → bytes → BigUint round-trip (saves 3 allocations per value).
2. BigNum vector fast path: batch cost tracking and skip per-element type cloning/checking for Vec<Nat>, Vec<Int>, and Vec<Int> with Nat wire type, mirroring the existing primitive vec fast path.
3. PrimitiveVecAccess with IntoDeserializer: on LE platforms, decode primitive vectors via a lightweight SeqAccess that reads directly from the input byte slice using serde's IntoDeserializer, bypassing the full Deserializer and Cursor overhead.
4. Borrowed string deserialization: use visit_borrowed_str instead of copying bytes, enabling zero-copy for &str targets.

Benchmark improvements (decode, vs previous optimized baseline):

vec_nat: 910M → 300M (-67%)
vec_nat32: 406M → 247M (-39%)
vec_nat64: 411M → 255M (-38%)
vec_int16: 411M → 251M (-39%)
btreemap: 13.3B → 11.2B (-16%)
option_list: 23M → 18M (-20%)
variant_list: 21M → 17M (-21%)

Made-with: Cursor
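For readers unfamiliar with the Nat bypass in item 1, the idea can be sketched as a direct unsigned LEB128 read that succeeds only when the value fits in a u64. This is a minimal illustration using only the standard library; `read_leb128_u64` is a hypothetical name and not the function used in this PR, and the real code would hand the result to `visitor.visit_u64` or fall back to the BigUint path on `None`.

```rust
/// Hypothetical helper (not the crate's API): decode an unsigned LEB128
/// integer that fits in a u64.
/// Returns the value and the number of bytes consumed, or None when the
/// input is truncated or the value needs more than 64 bits. On None a
/// caller would fall back to the arbitrary-precision path.
pub fn read_leb128_u64(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in input.iter().enumerate() {
        // More than ten bytes can never fit in 64 bits.
        if shift > 63 {
            return None;
        }
        let low = u64::from(byte & 0x7f);
        // The tenth byte (shift == 63) may contribute only the top bit.
        if shift == 63 && low > 1 {
            return None;
        }
        value |= low << shift;
        // A clear high bit marks the last byte of the encoding.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
        shift += 7;
    }
    // Ran out of input before seeing a terminating byte.
    None
}
```

No heap allocation happens on the success path, which is the point of the bypass: small values never touch a big-number type.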
Remove redundant explicit cleanup blocks after visit_seq — Compound::drop already resets both primitive_vec_fast_path and bignum_vec_fast_path on all paths (success and error). Restore the explanatory comment on the Drop impl. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
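The reasoning in this commit relies on Rust running `Drop` on every exit path. A minimal sketch of that pattern follows; `State`, `SeqGuard`, and `visit_elements` are hypothetical stand-ins for the deserializer, `Compound`, and the `visit_seq` call, and only the two flag names are taken from the commit message.

```rust
/// Hypothetical stand-in for the deserializer state; only the flags matter here.
#[derive(Default)]
pub struct State {
    pub primitive_vec_fast_path: bool,
    pub bignum_vec_fast_path: bool,
}

/// Hypothetical stand-in for `Compound`: borrows the state for one sequence.
pub struct SeqGuard<'a> {
    state: &'a mut State,
}

impl<'a> SeqGuard<'a> {
    /// Enable the fast path for the lifetime of the guard.
    pub fn enter(state: &'a mut State) -> Self {
        state.bignum_vec_fast_path = true;
        SeqGuard { state }
    }
}

impl<'a> Drop for SeqGuard<'a> {
    // Runs on normal return, on early return via an error, and during
    // unwinding, so callers need no explicit cleanup block.
    fn drop(&mut self) {
        self.state.primitive_vec_fast_path = false;
        self.state.bignum_vec_fast_path = false;
    }
}

/// Simulates a sequence visit that may fail part-way through.
pub fn visit_elements(state: &mut State, fail: bool) -> Result<u32, String> {
    let _guard = SeqGuard::enter(state);
    if fail {
        return Err("element decode failed".to_string());
    }
    Ok(3)
}
```

Both the success and the error return leave the flags cleared, which is why a second reset after the visit would be redundant.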
Previously set_position advanced by total_bytes unconditionally. Use access.offset (bytes actually consumed) so the cursor is correct if the visitor short-circuits before consuming all elements. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
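The difference between advancing by `total_bytes` and by the consumed offset can be illustrated with a simplified slice reader. This is a sketch under assumptions: `LeU32Access` and `take_until_zero` are hypothetical names, the element type is fixed to `u32`, and the real `PrimitiveVecAccess` goes through serde's `SeqAccess` rather than a plain method.

```rust
/// Hypothetical, simplified analogue of `PrimitiveVecAccess` for u32 elements.
pub struct LeU32Access<'a> {
    bytes: &'a [u8],
    /// Bytes actually consumed so far.
    pub offset: usize,
}

impl<'a> LeU32Access<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        LeU32Access { bytes, offset: 0 }
    }

    /// Read the next little-endian u32 directly from the slice.
    pub fn next_element(&mut self) -> Option<u32> {
        let chunk = self.bytes.get(self.offset..self.offset + 4)?;
        self.offset += 4;
        Some(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
    }
}

/// A consumer that short-circuits at the first zero element.
/// Returns the collected values and the new cursor position.
/// Panics if `input` is shorter than the declared length; real code
/// would return an error instead.
pub fn take_until_zero(input: &[u8], position: usize, len: usize) -> (Vec<u32>, usize) {
    let total_bytes = len * 4;
    let mut access = LeU32Access::new(&input[position..position + total_bytes]);
    let mut out = Vec::new();
    while let Some(v) = access.next_element() {
        if v == 0 {
            break; // stop early, leaving later elements unread
        }
        out.push(v);
    }
    // Advance by what was consumed, not by total_bytes.
    (out, position + access.offset)
}
```

With three encoded elements where the second is zero, the cursor ends at byte 8 rather than byte 12, so the unread element is still visible to whatever runs next.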
bignum_vec_fast_path is only ever set from deserialize_seq when type information is available, so is_untyped must be false whenever the fast path is active. Add debug_assert to make this invariant explicit in both deserialize_int and deserialize_nat. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_str Both methods were identical after the visit_borrowed_str change. Delegate deserialize_string to deserialize_str to avoid future drift. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
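The delegation described here can be sketched without serde: one function returns a string slice borrowed from the input, and the owning variant simply calls it. `read_str` and `read_string` are hypothetical names chosen for illustration; in the PR the borrowed slice is passed to `visit_borrowed_str` and the visitor decides whether to copy.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: return a string slice that borrows from `input`
/// (lifetime 'de), copying nothing. `pos` advances only on success.
/// Panics if the range is out of bounds; real code would return an error.
pub fn read_str<'de>(
    input: &'de [u8],
    pos: &mut usize,
    len: usize,
) -> Result<&'de str, Utf8Error> {
    let s = std::str::from_utf8(&input[*pos..*pos + len])?;
    *pos += len;
    Ok(s)
}

/// The owning variant delegates to `read_str`, so validation and cursor
/// handling live in one place and the two paths cannot drift apart.
pub fn read_string(input: &[u8], pos: &mut usize, len: usize) -> Result<String, Utf8Error> {
    read_str(input, pos, len).map(|s| s.to_owned())
}
```

A `&str` target keeps the borrowed slice as-is, while a `String` target pays for exactly one copy at the end.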
Conflict in Compound::next_element_seed: master refactored to always set expect_type/wire_type upfront and simplified the cost condition. Resolved by keeping master's unconditional type assignment while extending the is_fast check to include bignum_vec_fast_path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lwshang approved these changes on Mar 18, 2026
is_untyped can be true with bignum_vec_fast_path active when deserializing IDLValue (get_value_with_type sets is_untyped=true). The LEB128 fast path is already correctly guarded by !is_untyped; the bignum fallback path works regardless because wire_type is pre-set by the vec fast path setup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>