perf(scan): scanner micro-optimizations — shuffle structural detection + adaptive reserve + SmallVec #8

@membphis

Context

Three small scanner tweaks that fit together. All three have been on the README's Deferred list since PR #3. They are bundled because each is low-impact on its own and they all touch the same area.

B1 — shuffle-based structural set check

structural_mask_chunk (src/scan/avx2.rs) currently does 7 × _mm256_cmpeq_epi8 + 7 × _mm256_movemask_epi8 per chunk half (one per char in {}[]:,"). A single _mm256_shuffle_epi8 against a 16-byte LUT plus one cmpeq can do the same set-membership test in 2-3 ops per half.

This only affects non-fast-path chunks. On string-heavy workloads ~5% of chunks hit this path; on object-heavy workloads up to 100% do.
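For illustration, here is a minimal sketch of one way the shuffle-based check could look for a single 32-byte half. It is not the code in src/scan/avx2.rs: the function names, the LUT layout, and the OR-with-0x20 fold are all assumptions. The LUT is indexed by the low nibble of each input byte. `[` and `]` collide with `{` and `}` on the low nibble, so the input is OR-ed with 0x20 to fold brackets onto braces before the compare. That gives 3 ops (or, shuffle, cmpeq) plus one movemask. The fold also matches the control bytes 0x02, 0x0C and 0x1A, so this variant only preserves validation semantics if those bytes are rejected elsewhere. An exact variant would need an extra compare or a second shuffle.

```rust
/// Scalar reference: the seven structural bytes listed in the issue.
pub fn is_structural(b: u8) -> bool {
    matches!(b, b'{' | b'}' | b'[' | b']' | b':' | b',' | b'"')
}

/// Hypothetical 16-entry LUT, indexed by the low nibble of the input byte.
/// Zero entries can never match, because the folded input always has bit 5 set.
pub const LUT: [u8; 16] = [
    0, 0, b'"', 0, 0, 0, 0, 0, 0, 0, b':', b'{', b',', b'}', 0, 0,
];

/// Scalar model of shuffle + or + cmpeq for one byte.
pub fn lut_match(b: u8) -> bool {
    // A byte shuffle yields 0 when bit 7 of the index byte is set.
    let looked_up = if b & 0x80 != 0 { 0 } else { LUT[(b & 0x0F) as usize] };
    // OR with 0x20 folds '[' onto '{' and ']' onto '}'.
    (b | 0x20) == looked_up
}

/// Portable fallback that produces the same mask as the AVX2 path.
pub fn structural_mask_scalar(chunk: &[u8; 32]) -> u32 {
    chunk
        .iter()
        .enumerate()
        .fold(0u32, |m, (i, &b)| m | ((lut_match(b) as u32) << i))
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn structural_mask_avx2(chunk: &[u8; 32]) -> u32 {
    use std::arch::x86_64::*;
    unsafe {
        let input = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
        // Broadcast the 16-byte LUT into both 128-bit lanes.
        let lut = _mm256_broadcastsi128_si256(_mm_loadu_si128(LUT.as_ptr() as *const __m128i));
        // Per byte: look up the expected structural for this low nibble.
        let expected = _mm256_shuffle_epi8(lut, input);
        let folded = _mm256_or_si256(input, _mm256_set1_epi8(0x20));
        _mm256_movemask_epi8(_mm256_cmpeq_epi8(folded, expected)) as u32
    }
}

/// Bit i of the result is set when byte i of the half matches the LUT.
pub fn structural_mask(chunk: &[u8; 32]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just checked at runtime.
            return unsafe { structural_mask_avx2(chunk) };
        }
    }
    structural_mask_scalar(chunk)
}
```

Comparing `lut_match` against `is_structural` over all 256 byte values is a cheap way to pin down exactly which bytes the fold misclassifies before wiring anything into the scanner.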

C1 — adaptive out.reserve

out.reserve(buf.len() / 6) is calibrated for object-heavy JSON. On string-heavy multimodal payloads the actual emit rate is <1 structural per KB, so we over-reserve by 100×+. This is mainly a memory-hygiene concern, since untouched mmap'd pages are never faulted in, but a smaller reserve also reduces allocation cost on smaller buffers.

Proposal: start at max(64, buf.len() / 128) and let Vec grow naturally. Standard amortized-doubling handles the rare growth case.
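A small sketch of the proposed sizing rule, plus a helper for the regrowth check listed under the validation plan. The function names are hypothetical, and the indices are assumed to be u32 (as in the SmallVec<[u32; N]> suggestion). The regrowth count depends on Vec's growth policy, which the standard library does not pin down exactly, so the helper measures it instead of assuming it.

```rust
/// Initial capacity proposed in the issue: max(64, buf.len() / 128).
pub fn initial_reserve(buf_len: usize) -> usize {
    std::cmp::max(64, buf_len / 128)
}

/// Counts how often a Vec<u32> changes capacity while `pushes` indices are
/// appended, starting from `initial_cap`.
pub fn count_regrowths(initial_cap: usize, pushes: usize) -> usize {
    let mut out: Vec<u32> = Vec::with_capacity(initial_cap);
    let mut cap = out.capacity();
    let mut regrowths = 0;
    for i in 0..pushes {
        out.push(i as u32);
        if out.capacity() != cap {
            cap = out.capacity();
            regrowths += 1;
        }
    }
    regrowths
}
```

For a 1 MiB buffer, buf.len() / 6 reserves 174,762 slots (about 683 KiB of u32), while max(64, buf.len() / 128) reserves 8,192 slots (32 KiB). Calling count_regrowths(initial_reserve(len), len / 6) approximates the object-heavy worst case, where the old reserve would have been fully used.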

C2 — SmallVec for documents < 4 KB

For tiny payloads the indices Vec heap allocation is a meaningful fraction of total parse time. Switch indices to SmallVec<[u32; N]> for some inline N (e.g. 64). Heap alloc only triggers for documents with more than N structurals.
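To make the inline-then-heap behaviour concrete, here is a minimal std-only stand-in for SmallVec<[u32; N]>. The `Indices` type and its methods are hypothetical and exist only for illustration; the real change would presumably use the smallvec crate directly.

```rust
/// Hypothetical stand-in for SmallVec<[u32; N]>: up to N indices are stored
/// inline, and the first push beyond N moves everything to a heap Vec.
pub enum Indices<const N: usize> {
    Inline { len: usize, buf: [u32; N] },
    Heap(Vec<u32>),
}

impl<const N: usize> Indices<N> {
    pub fn new() -> Self {
        Indices::Inline { len: 0, buf: [0; N] }
    }

    pub fn push(&mut self, idx: u32) {
        let spilled = match self {
            Indices::Heap(v) => {
                v.push(idx);
                return;
            }
            Indices::Inline { len, buf } => {
                if *len < N {
                    // Still room inline: no allocation.
                    buf[*len] = idx;
                    *len += 1;
                    return;
                }
                // Inline storage is full: copy it out to the heap once.
                let mut v = Vec::with_capacity(2 * N + 1);
                v.extend_from_slice(&buf[..]);
                v.push(idx);
                v
            }
        };
        *self = Indices::Heap(spilled);
    }

    pub fn as_slice(&self) -> &[u32] {
        match self {
            Indices::Inline { len, buf } => &buf[..*len],
            Indices::Heap(v) => v.as_slice(),
        }
    }

    /// True once the indices have moved to the heap.
    pub fn spilled(&self) -> bool {
        matches!(self, Indices::Heap(_))
    }
}
```

One trade-off to keep in mind when picking N: the inline buffer lives inside the value itself, so N = 64 adds 256 bytes to whatever struct or stack frame holds the indices.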

Estimated impact

Item                    Est. speedup
B1 (string-heavy)       ~2–5%
B1 (object-heavy)       ~15–25%
C1                      <2% across the board
C2 (small docs only)    ~10–20%

Validation plan

  • All existing tests + 2000-case proptest
  • make bench median before/after, measured for each item separately (so each one's contribution is attributable)
  • Memory: confirm C1's smaller initial reserve doesn't trigger excessive Vec regrowth on object-heavy inputs

Notes

  • Validation semantics unchanged
  • Implement in one PR but with separate commits per item so individual revert is possible


Labels: enhancement (New feature or request)
