perf: decoder/document instance pooling (#6) #11

Merged: membphis merged 4 commits into main from worktree-gh-issue-6 on May 15, 2026

@membphis (Collaborator)

Closes #6.

Summary

  • New reusable Decoder (Rust) and qjd_decoder_* C ABI: qjd_decoder_new / _free / _parse / _reset / _destroy. qjd_doc becomes a thin {decoder, gen, owns_decoder} handle so all existing qjd_get_* / qjd_open / cursor APIs work unchanged.
  • qd.new_decoder() Lua surface alongside the unchanged qd.parse(). Generation counter detects stale docs/cursors across re-parse / reset / destroy and returns QJD_STALE_DOC (nil at the wrapper, same convention as path-not-found).
  • Spec at docs/superpowers/specs/2026-05-15-decoder-pooling-design.md. On the medium fixture, allocations drop from ~10/parse (legacy qjd_parse) to 2/parse (pooled qjd_decoder_parse); enforced by a new count-allocs Cargo-feature test.
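
The generation-counter idea from the summary can be illustrated with a minimal Rust sketch. This is not the crate's code: `Decoder`, `DocHandle`, and `comma_count` are hypothetical stand-ins, and the "parse" only records comma offsets. It shows the one property that matters here: a handle minted by an earlier parse no longer matches the decoder's counter after a re-parse, so an accessor can refuse it (`None` plays the role of the nil that the Lua wrapper returns).

```rust
/// Hypothetical stand-in for the pooled decoder; not the crate's real type.
pub struct Decoder {
    generation: u64,
    indices: Vec<u32>, // reused across parses instead of being reallocated
}

/// Thin handle: it remembers only which parse it belongs to.
#[derive(Clone, Copy, Debug)]
pub struct DocHandle {
    generation: u64,
}

impl Decoder {
    pub fn new() -> Self {
        Decoder { generation: 0, indices: Vec::new() }
    }

    /// Toy "parse": records the offset of every comma in `input`.
    /// Bumping the counter invalidates every handle issued earlier.
    pub fn parse(&mut self, input: &[u8]) -> DocHandle {
        self.generation += 1;
        self.indices.clear(); // keeps the existing capacity
        for (i, b) in input.iter().enumerate() {
            if *b == b',' {
                self.indices.push(i as u32);
            }
        }
        DocHandle { generation: self.generation }
    }

    /// Accessor that rejects stale handles instead of reading data
    /// that belongs to a newer parse.
    pub fn comma_count(&self, doc: DocHandle) -> Option<usize> {
        if doc.generation != self.generation {
            return None;
        }
        Some(self.indices.len())
    }
}
```

Because the handle carries no borrowed data, it can cross an FFI boundary freely; the cost is one integer comparison per access.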

Test plan

  • cargo test --release — 76 unit + 14 new FFI + existing FFI suites all pass
  • cargo test --release --no-default-features — scalar-only build green
  • cargo test --features test-panic --release — panic barrier green
  • cargo test --release --features count-allocs --test alloc_count — head-to-head alloc count: legacy=9989, pooled=2000 over 1000 iters (asserts pooled < legacy/2)
  • cargo clippy --release -- -D warnings — clean (new qjd_decoder_* exports include # Safety blocks)
  • CI Lua busted suite — runs in CI (LuaJIT + busted not installed locally); covers parse/reset/destroy/stale-as-nil/cursor-staleness/legacy-isolation
  • Equivalence test (decoder_doc_equivalence cases) confirms qd.parse and decoder:parse return byte-identical results across every accessor on both shipped fixtures
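
One way a head-to-head allocation count like the one in the test plan could be taken is with a counting global allocator. The sketch below is an assumption about the general technique, not the contents of tests/alloc_count.rs; `CountingAlloc`, `count_allocs`, `fresh_buffer`, and `reused_buffer` are hypothetical names, and a plain `Vec<u8>` copy stands in for a parse.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// Process-wide allocation counter (hypothetical; for illustration only).
static ALLOCS: AtomicUsize = AtomicUsize::new(0);

struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` `iters` times and returns how many allocations happened meanwhile.
pub fn count_allocs<F: FnMut()>(iters: usize, mut f: F) -> usize {
    let before = ALLOCS.load(Ordering::Relaxed);
    for _ in 0..iters {
        f();
    }
    ALLOCS.load(Ordering::Relaxed) - before
}

/// "Legacy" shape: a fresh buffer on every call.
pub fn fresh_buffer(input: &[u8]) -> usize {
    let mut v: Vec<u8> = Vec::new();
    v.extend_from_slice(input);
    black_box(&v).len() // keep the optimizer from eliding the allocation
}

/// "Pooled" shape: the caller's buffer is cleared and refilled in place.
pub fn reused_buffer(buf: &mut Vec<u8>, input: &[u8]) -> usize {
    buf.clear();
    buf.extend_from_slice(input);
    black_box(&*buf).len()
}
```

Comparing `count_allocs(1000, ...)` for the two shapes and asserting `pooled < legacy / 2` mirrors the ratio check described above. The counter is process-wide, so other threads would add noise to the totals.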

membphis added a commit that referenced this pull request May 15, 2026
- Critical: sync include/lua_quick_decode.h with src/error.rs +
  lua/quickdecode.lua. Add QJD_STALE_DOC enum value and the five
  qjd_decoder_* prototypes that the public C consumer needs to see.
  CLAUDE.md requires these three files to stay in lockstep.

- Important: harmonize NULL-buf handling. qjd_decoder_parse now
  rejects NULL even with len == 0, matching qjd_parse and avoiding
  any reliance on slice::from_raw_parts's edge cases.

- Important: add doc_held_across_failed_parse_is_stale (FFI-level)
  to prove the actual safety claim — a held doc fails the gen
  check after a failed parse, not merely that state/gen invariants
  hold inside the decoder.

- Important: extend equivalence coverage with two probes hitting
  the categories the existing fixture probes did not exercise:
  escape-heavy strings (scratch reuse) and 5-6 level nested object
  with repeated key access (skip-cache reuse).

- Important: cursor_opened_before_reparse_becomes_stale now also
  opens a fresh cursor on the post-reparse doc and walks it, proving
  the stale check does not poison the new path.

- Add null_buf_rejected_even_with_zero_len for the harmonized
  qjd_decoder_parse contract.
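
The NULL-buf rule can be sketched as follows. `std::slice::from_raw_parts` requires a non-null, aligned pointer even when the length is zero, so rejecting NULL before building the slice sidesteps that edge case entirely. The function name `parse_entry` and the two-value `Status` enum are hypothetical; the real export and its error codes live in the crate and are not reproduced here.

```rust
use std::slice;

/// Hypothetical status type; the real error codes are defined in src/error.rs.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    InvalidArg,
}

/// Sketch of an FFI entry point that rejects NULL regardless of `len`.
///
/// # Safety
/// If `buf` is non-null it must point to `len` readable bytes that stay
/// valid for the duration of the call.
pub unsafe extern "C" fn parse_entry(buf: *const u8, len: usize) -> Status {
    // Check first: a (NULL, 0) pair must never reach from_raw_parts.
    if buf.is_null() {
        return Status::InvalidArg;
    }
    let bytes: &[u8] = unsafe { slice::from_raw_parts(buf, len) };
    // A real implementation would hand `bytes` to the parser here.
    let _ = bytes;
    Status::Ok
}
```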
membphis added 4 commits May 15, 2026 22:29
…#6)

WIP: in-progress implementation. Header / Lua wrapper / tests still pending.

- Rename Document<'a> to Decoder (no lifetime param); add state machine
  (Ready / Parsed / Destroyed), generation counter, in-place parse() that
  truncates and re-fills indices / scratch / skip cache.
- Add Decoder::reset() (shrink) and Decoder::destroy() (terminal).
- Repurpose qjd_doc as a thin {decoder, gen, owns_decoder} handle. All
  existing qjd_get_* / qjd_open / cursor APIs keep working unchanged.
- Add qjd_decoder_new / free / parse / reset / destroy exports.
- Add check_doc_alive helper: Destroyed -> QJD_INVALID_ARG; gen mismatch
  -> QJD_STALE_DOC (new error code, value 9).
- SkipCache gains clear() and clear_and_shrink() for reset / destroy paths.

Builds clean (one dead-code warning on parse_oneshot to be addressed when
qjd_parse is refactored to use it).
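
The state machine and liveness check described in this commit message might look roughly like the sketch below. It is written from the description only: the field layout, the single `scratch` buffer, and the method bodies are assumptions, and the status enum deliberately carries no numeric values. The check order follows the message: a destroyed decoder yields an invalid-argument status, and a generation mismatch yields the stale-doc status.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State { Ready, Parsed, Destroyed }

/// Hypothetical status type; numeric codes are intentionally omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { Ok, InvalidArg, StaleDoc }

#[derive(Debug, Clone, Copy)]
pub struct Doc { generation: u64 }

pub struct Decoder {
    state: State,
    generation: u64,
    scratch: Vec<u8>, // stands in for indices / scratch / skip cache
}

impl Decoder {
    pub fn new() -> Self {
        Decoder { state: State::Ready, generation: 0, scratch: Vec::new() }
    }

    /// In-place parse: truncate and refill, keeping the allocation.
    pub fn parse(&mut self, input: &[u8]) -> Result<Doc, Status> {
        if self.state == State::Destroyed {
            return Err(Status::InvalidArg);
        }
        self.generation += 1; // older docs become stale from here on
        self.scratch.clear();
        self.scratch.extend_from_slice(input);
        self.state = State::Parsed;
        Ok(Doc { generation: self.generation })
    }

    /// Reset: invalidate docs and give memory back, but stay usable.
    pub fn reset(&mut self) {
        if self.state == State::Destroyed {
            return;
        }
        self.generation += 1;
        self.scratch.clear();
        self.scratch.shrink_to_fit();
        self.state = State::Ready;
    }

    /// Destroy: terminal state; every later call is rejected.
    pub fn destroy(&mut self) {
        self.generation += 1;
        self.scratch = Vec::new();
        self.state = State::Destroyed;
    }

    /// Destroyed is checked before the generation comparison.
    pub fn check_doc_alive(&self, doc: Doc) -> Status {
        if self.state == State::Destroyed {
            return Status::InvalidArg;
        }
        if doc.generation != self.generation {
            return Status::StaleDoc;
        }
        Status::Ok
    }

    pub fn scratch_capacity(&self) -> usize {
        self.scratch.capacity()
    }
}
```

Re-parsing a smaller input after a larger one leaves `scratch_capacity()` unchanged, which is the amortization the pooled path relies on.
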
- lua/quickdecode.lua: new_decoder + Decoder:parse/reset/destroy with the
  STALE_DOC -> nil convention. Doc table pins _decoder so the decoder
  outlives any reachable doc, and the decoder pins _payload so the
  current input buffer outlives the parse.
- tests/decoder_ffi.rs (14 tests): equivalence between qjd_parse and
  qjd_decoder_parse on shipped fixtures; stale-doc / stale-cursor /
  reset / destroy semantics; legacy path isolation.
- tests/alloc_count.rs (count-allocs feature): head-to-head allocation
  count between legacy and pooled. Asserts pooled < legacy / 2. On the
  medium fixture: legacy ~10/parse, pooled 2/parse.
- tests/lua/decoder_spec.lua: busted spec covering parse/reset/destroy,
  stale-as-nil, cursor staleness, legacy-isolation.
- .github/workflows/ci.yml: add count-allocs matrix point.
- README: pooled-API usage section + two new Roadmap / Deferred items.
- Cargo.toml: count-allocs feature.

Spec updated to drop the now-unneeded qjd_cursor gen field (cursor
freshness derives from its doc.gen via check_doc_alive).
membphis force-pushed the worktree-gh-issue-6 branch from 6b16353 to 7d9cf44 on May 15, 2026 22:32
membphis merged commit 0721d7d into main on May 15, 2026
1 check passed
membphis added a commit that referenced this pull request May 16, 2026
* Revert "perf: decoder/document instance pooling (#6) (#11)"

This reverts commit 0721d7d.

* bench: warmup + median + interleaved + pooled/one-shot scenarios

The single-run-with-mean output the bench used to print swung 30-40%
between invocations on noisy machines, making it hard to tell signal
from noise when comparing perf commits.

- bench() now runs a warmup pass (JIT trace compile, pool fill), then
  five timed rounds. Reports median and mean ops/s plus the round-by-
  round min..max range so reviewers can see whether a delta is real.
- Add an `interleaved 100k,200k,500k,1m` scenario that rotates through
  four payload sizes, matching a server that handles varying request
  sizes back to back. The single-payload loops cannot exercise the
  doc pool the way real traffic does.
- For each scenario, probe `qd.new_decoder` and run two extra qd
  variants when present:
    quickdecode pooled :parse           — reused decoder across iters
    quickdecode new_decoder()+parse     — one-shot per iter (no reuse)
  So a reader can directly compare the legacy qd.parse path, the
  pool-API-with-reuse path, and the realistic "user creates a fresh
  decoder per request" pattern in one bench run.

Also ship benches/perf_probe.lua: a minimal hammer over qd.parse on a
fixed payload for use under `perf record` when investigating FFI hot
paths. Not invoked by Makefile targets.
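
The measurement scheme in this commit message (one warmup pass, several timed rounds, median plus range, payloads rotated per iteration) can be sketched as below. The actual bench is a Lua script; this is a Rust illustration of the same scheme, and `bench`, `median`, and `Report` are hypothetical names.

```rust
use std::time::Instant;

/// Summary of per-round throughput in ops/s (hypothetical shape).
pub struct Report {
    pub min: f64,
    pub median: f64,
    pub mean: f64,
    pub max: f64,
}

/// Median of a non-empty sample set.
pub fn median(samples: &[f64]) -> f64 {
    assert!(!samples.is_empty(), "median needs at least one sample");
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

/// Runs one untimed warmup pass, then `rounds` timed passes of `iters`
/// calls each. `op` receives the iteration index so the caller can
/// rotate through payloads (the "interleaved" scenario).
pub fn bench<F: FnMut(usize)>(iters: usize, rounds: usize, mut op: F) -> Report {
    assert!(iters > 0 && rounds > 0);
    for i in 0..iters {
        op(i); // warmup: fills caches and pools, not measured
    }
    let mut rates = Vec::with_capacity(rounds);
    for _ in 0..rounds {
        let start = Instant::now();
        for i in 0..iters {
            op(i);
        }
        // Clamp to avoid dividing by zero on very fast passes.
        let secs = start.elapsed().as_secs_f64().max(1e-9);
        rates.push(iters as f64 / secs);
    }
    let min = rates.iter().copied().fold(f64::INFINITY, f64::min);
    let max = rates.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let mean = rates.iter().sum::<f64>() / rates.len() as f64;
    Report { min, median: median(&rates), mean, max }
}
```

A caller would pass something like `|i| parse(payloads[i % payloads.len()])` to interleave sizes. Reporting the median next to min..max makes a single noisy round visible instead of letting it skew the headline number.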
Development

Successfully merging this pull request may close these issues.

perf: decoder/document instance pooling to amortize per-parse allocations
