Initial v1: Rust JSON decoder for LuaJIT FFI by membphis · Pull Request #1 · api7/lua-qjson

membphis · 2026-05-15T14:19:38Z

Summary

Rust cdylib JSON decoder (libquickdecode.so) exposed to LuaJIT via FFI. Optimized for "parse once, extract a few fields, discard" workloads — skips the full Lua-table construction that lua-cjson has to pay.
Two-phase architecture: Phase 1 runs a single SIMD structural scan (scalar fallback + AVX2 with PCLMUL, runtime-dispatched) and records byte offsets of structural chars; Phase 2 lazily resolves paths and decodes values, with a per-container sibling-skip cache for shared-prefix access.
Ships the C header (include/lua_quick_decode.h), a LuaJIT wrapper module (lua/quickdecode.lua with Doc + Cursor OO API), busted spec files, and a vs lua-cjson benchmark script.
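
As a rough illustration of what Phase 1 records, here is a minimal scalar sketch of a structural scan. It is not the crate's ScalarScanner: the function name `scan_structural`, the `Vec<usize>` index list, and the choice to record quote positions are all assumptions made for the example, and it performs no bracket validation.

```rust
/// Hypothetical sketch: collect byte offsets of JSON structural characters
/// that appear outside string literals. Quote positions are recorded too,
/// so a later phase could locate string bounds without rescanning.
/// Returns Err(buf.len()) if the input ends inside a string.
pub fn scan_structural(buf: &[u8]) -> Result<Vec<usize>, usize> {
    let mut indices = Vec::new();
    let mut in_string = false;
    let mut i = 0;
    while i < buf.len() {
        let b = buf[i];
        if in_string {
            match b {
                // A backslash escapes the next byte, so skip both.
                b'\\' => {
                    i += 2;
                    continue;
                }
                b'"' => {
                    in_string = false;
                    indices.push(i);
                }
                _ => {}
            }
        } else {
            match b {
                b'"' => {
                    in_string = true;
                    indices.push(i);
                }
                b'{' | b'}' | b'[' | b']' | b':' | b',' => indices.push(i),
                _ => {}
            }
        }
        i += 1;
    }
    if in_string {
        Err(buf.len())
    } else {
        Ok(indices)
    }
}
```

Note that the comma inside `"x,y"` in an input such as `{"a":[1,"x,y"]}` is not recorded, because the scanner is inside a string at that point. A lazy Phase 2 can then walk the offset list instead of the raw bytes.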

Design spec: docs/superpowers/specs/2026-05-15-rust-quick-json-decode-design.md
Implementation plan: docs/superpowers/plans/2026-05-15-rust-quick-json-decode.md

Test plan

cargo test --release — 95 tests passing (65 unit + 29 integration + 1 proptest with 2000 cases)
cargo test --features test-panic --release — verifies the catch_unwind panic barrier
proptest cross-check: Scalar vs AVX2 produce bit-identical output across 2000 random inputs
AVX2 tail-bug repro (193-byte [{}{}...,]) confirmed fixed
Lua side: busted tests/lua --lpath='./lua/?.lua' --cpath='./target/release/lib?.so' (requires LuaJIT + busted; not run in CI here)
luajit benches/lua_bench.lua against lua-cjson for end-to-end timing/allocation comparison
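
The panic barrier exercised by the test-panic feature can be sketched roughly as follows. This is an assumption-laden illustration, not the crate's code: `guard`, `demo_first_byte`, and the `QJD_ERR_PANIC` value are hypothetical names, and a real export would also carry a no-mangle attribute.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical error code returned when a panic is caught at the boundary.
pub const QJD_ERR_PANIC: i32 = -1;

/// Run `f` and convert any panic into an error code, so that unwinding
/// never crosses the extern "C" boundary into LuaJIT.
fn guard<F: FnOnce() -> i32>(f: F) -> i32 {
    catch_unwind(AssertUnwindSafe(f)).unwrap_or(QJD_ERR_PANIC)
}

/// Hypothetical FFI entry point: returns the first byte of the buffer,
/// or QJD_ERR_PANIC if the body panics (here: on an empty buffer).
pub extern "C" fn demo_first_byte(ptr: *const u8, len: usize) -> i32 {
    guard(|| {
        let buf: &[u8] = if ptr.is_null() {
            &[]
        } else {
            // Safety: the caller promises `ptr` points to `len` readable bytes.
            unsafe { std::slice::from_raw_parts(ptr, len) }
        };
        // Indexing an empty slice panics; the barrier turns it into a code.
        i32::from(buf[0])
    })
}
```

This only works when panics unwind, which is presumably why the commit list below includes removing `panic = abort` from the release profile.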

Roadmap (deferred items captured in `README.md`)

ARM64 NEON backend, SmallVec fast path, SIMD backslash search, lexical float parser, lossless 64-bit int cdata mode, skip-cache LRU, Phase 1 error position, AVX2 tail-bypass optimization.

Captures the two-phase architecture (SIMD structural scan + lazy field decode), C ABI shape for LuaJIT FFI, sibling-skip cache for shared-prefix path access, and benchmark targets vs lua-cjson. Also seeds README Roadmap with NEON backend deferred to a later iteration.
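
Runtime dispatch between the scalar and AVX2 scanners might look like the sketch below. The `Backend` enum and `select_backend` are hypothetical names; only the `is_x86_feature_detected!` macro and its feature strings come from the Rust standard library.

```rust
/// Hypothetical backend selector for the Phase 1 scanner.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Scalar,
    Avx2,
}

/// Pick the AVX2 path only when both AVX2 and PCLMULQDQ are available
/// on the running CPU; otherwise fall back to the scalar scanner.
pub fn select_backend() -> Backend {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("pclmulqdq") {
            return Backend::Avx2;
        }
    }
    Backend::Scalar
}
```

On non-x86 targets the `cfg` block compiles away and the function always returns `Backend::Scalar`. A caller would typically run this check once and reuse the result.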

18 tasks covering: project scaffold, ScalarScanner with shallow JSON validation, Document + Phase 1 FFI, zero-alloc PathIter, Cursor with lazy sibling-skip cache, string escape decode, number decode, typed getters and cursor C ABI, panic safety, AVX2 scanner staged through four sub-tasks (structural mask, escape, PCLMUL inside-string, multi-chunk carry + dispatch + proptest cross-check), and LuaJIT wrapper with busted tests and lua-cjson benchmark.
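
The inside-string mask is a prefix XOR over the quote bitmask: bit i of the result is the XOR of quote bits 0 through i. A carry-less multiply by an all-ones word computes this in one instruction; the sketch below swaps in a portable shift-XOR cascade that yields the same result. It assumes one 64-bit mask per chunk and unescaped quotes only; the function names are hypothetical.

```rust
/// Portable prefix XOR: bit i of the result is the XOR of bits 0..=i of `x`.
/// This is a software substitute for the carry-less-multiply trick.
pub fn prefix_xor(mut x: u64) -> u64 {
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    x
}

/// Hypothetical per-chunk step: `quotes` has one bit set per unescaped
/// quote in the chunk, and `carry_in` says whether the previous chunk
/// ended inside a string. Returns the inside-string mask and the carry
/// for the next chunk.
pub fn inside_string_mask(quotes: u64, carry_in: bool) -> (u64, bool) {
    let mut mask = prefix_xor(quotes);
    if carry_in {
        // Starting inside a string flips the meaning of every bit.
        mask = !mask;
    }
    let carry_out = (mask >> 63) & 1 == 1;
    (mask, carry_out)
}
```

With quotes at bit positions 1 and 4, the mask covers bits 1 through 3: the opening quote and the string body, but not the closing quote.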

Two jobs:
- rust: cargo build --release, cargo test --release, and cargo test --features test-panic --release on ubuntu-latest with stable Rust.
- lua: depends on rust passing; installs LuaJIT 2.1 + LuaRocks via leafo/gh-actions-{lua,luarocks}, then busted and lua-cjson, and runs busted tests/lua against the built libquickdecode.so.

Triggers on push to master/main and on all pull requests.

leafo/gh-actions-lua@v10 was failing with 404 when downloading luajit-2.1.0-beta3 (the LuaJIT project removed that tag). Install LuaJIT, lua5.1 dev headers, and luarocks from Ubuntu's apt instead; LuaJIT is ABI-compatible with Lua 5.1, so rocks built against 5.1 headers load fine under luajit. busted runs the tests via '--lua=$(which luajit)'.

* chore: address PR #3 review hygiene items

Five Important findings + four small Minor follow-ups from the PR #3 local code review:

Important
- tests/scanner_crosscheck.rs: drop the stale "AVX2 does not validate brackets" comment and tighten the proptest to require full Result equality (Ok/Err verdict + error offset) and indices equality on every case, not just on Ok. After the tail-bypass fix scalar and AVX2 run the same scan_emit_resume + validate_brackets pipeline, so this is now enforceable. Still passes the 2000-case proptest.
- benches/lua_bench.lua: switch image-size RNG from math.random (which delegates to libc rand() and varies across machines) to a deterministic Park-Miller LCG. Same target_bytes now produces byte-identical output on any LuaJIT 2.1 host.
- benches/lua_bench.lua: tighten the loop so the actual payload size matches its label. Cap the per-iteration `upper` at `remaining` and allow the last image to shrink below the 50 KB floor when fewer bytes remain. Observed: every scenario now lands within ~0.1% of its label (100k -> 102351 bytes, 1m -> 1048527, 10m -> 10485711) vs up to +49% before.
- src/scan/avx2.rs: remove the dead `else if in_string != 0` branch inside the tail handler. `i < buf.len()` makes the `scalar_start <= buf.len()` check trivially true, and scan_emit_resume already returns Err(buf.len()) when start == buf.len() and in_string is set. Replace the unreachable branch with a comment that documents the invariant.
- src/scan/mod.rs: drop the inaccurate "the check is defensive" wording from validate_brackets. The function is correctness-coupled with the scanner that produced its index list; a forged quote would flip in_string and mask later mismatches.

Minor
- Makefile: match the spec's "target — description" help format (em-dash separator) and tighten the awk FS pattern from `:.*## ` to `:[^#]*## ` so descriptions containing `##` aren't truncated by the greedy `.*`.
- benches/lua_bench.lua: bump 2m / 5m / 10m iters from 10 to 20 so bigger-payload measurements ride out one-shot allocator / page-fault noise.
- src/scan/avx2.rs: rename `escaped_quotes_do_not_trip_fastpath` to `escaped_quotes_remain_correct_with_fastpath`. The test asserts parity with scalar, not that the branch was taken (we have no counter to observe that), so the name should reflect what's actually checked.

cargo test --release: 70+3+10+1+5+3+12+1+1 = 106 unit/integration tests plus the 2000-case proptest, all pass.
cargo test --release --no-default-features (scalar-only build): 60+3+10+1+5+3+12+1+1 = 96 tests, all pass.

* docs: align review-followup comments with actual behavior

Address the Important #1 + three Minor findings from the local review of PR #4 (the docs drifted from the code introduced in the same PR):
- benches/lua_bench.lua: rewrite the size-accuracy comment. The prior draft referenced a `remaining + slack` upper-bound expression that the final code does not use; replace with the two-branch behaviour actually implemented (normal `min(500K, remaining)`; final image falls through to `max(1024, remaining)`) and the real worst-case overshoot (~1 KB, not "~10 KB").
- Makefile: explain the `:[^#]*## ` FS choice and its tradeoff (targets with `#` in the prerequisite list won't render; none today).
- tests/scanner_crosscheck.rs: clarify that the indices assertion holds for both Ok and Err cases because scan_emit_resume always completes emission before any potential Err, and validate_brackets does not modify the index list.
- src/scan/avx2.rs: spell out the scalar_start invariant including the exact boundary case (scalar_start == buf.len() when i == buf.len()-1 && in_string != 0 && bs_carry != 0) that scan_emit_resume's post-loop in_str check covers.

No code changes. All tests still pass.
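
For readers unfamiliar with the generator named above, a Park-Miller LCG advances its state as state = (state * a) mod (2^31 - 1). The sketch below is in Rust rather than the benchmark's Lua, and the multiplier 16807 (the original "minimal standard" constant) is an assumption; the commit message does not say which multiplier the benchmark uses. The `ParkMiller` type and its methods are hypothetical.

```rust
/// Hypothetical Park-Miller "minimal standard" generator.
pub struct ParkMiller {
    state: u32,
}

impl ParkMiller {
    const M: u64 = 2_147_483_647; // 2^31 - 1, a Mersenne prime
    const A: u64 = 16_807; // assumed multiplier (classic minstd constant)

    /// Seed the generator; a state of 0 would be a fixed point, so map it to 1.
    pub fn new(seed: u32) -> Self {
        let s = (u64::from(seed) % Self::M) as u32;
        Self { state: if s == 0 { 1 } else { s } }
    }

    /// Advance the state and return it; values lie in 1..=2^31 - 2.
    pub fn next_u32(&mut self) -> u32 {
        self.state = ((u64::from(self.state) * Self::A) % Self::M) as u32;
        self.state
    }

    /// Value in the inclusive range lo..=hi (assumes lo <= hi).
    pub fn range(&mut self, lo: u32, hi: u32) -> u32 {
        lo + self.next_u32() % (hi - lo + 1)
    }
}
```

Because the update uses only integer arithmetic on a fixed modulus, the same seed yields the same sequence on every host, which is the property the benchmark change relies on.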

membphis added 30 commits May 15, 2026 12:07

Scaffold crate with error codes and C header skeleton

70e07aa

Add ScalarScanner with shallow JSON validation

391d92d

Fix formatting in ScalarScanner to comply with rustfmt

bf2224b

Add Document and qjd_parse/qjd_free/qjd_strerror FFI

7f0dd68

Add zero-alloc PathIter for path string parsing

22c424d

Add Cursor with brute-force path resolution

6c8ed52

Add lazy sibling-skip cache for cursor path resolution

37d6324

Add lazy string escape decode with surrogate-pair handling

8c44e2c

Add lazy i64/f64 number decode with overflow checking

c8f491b

Add qjd_typeof / qjd_is_null / qjd_len FFI

ccc7605

Fix cursor_len incorrectly treating single-scalar containers as empty

0d03a9e

Add qjd_get_str / get_i64 / get_f64 / get_bool FFI getters

8934cb6

Add qjd_cursor type and qjd_open / qjd_cursor_* FFI

d4bcaf5

Wrap FFI entry points in catch_unwind to prevent UB on panic

86953ae

Add AVX2 scanner skeleton with structural mask kernel

d593a5d

AVX2 scanner: chunk-local quote and escape masks

9fb6535

AVX2 scanner: PCLMUL prefix-XOR for inside-string mask

0c63fad

AVX2 scanner cross-chunk carry, runtime dispatch, proptest cross-check

575b67a

Finalize C header and add LuaJIT wrapper module

c052617

Add Lua integration tests and lua-cjson benchmark

125c9ea

Fix AVX2 tail scan dropping structural chars on bracket-close in tail

e68cd34

Remove panic = abort from release profile for catch_unwind safety

07411ee

Gate qjd_test_panic behind test-panic feature flag

54d9bed

Complete README Roadmap with all deferred items from design spec

0619aec

chore: add .gitignore for build artifacts and stray binaries

8e41308

ci: set LD_LIBRARY_PATH so ffi.load can find libquickdecode.so

1a45c93

membphis merged commit 63d3e00 into master May 15, 2026
2 checks passed

membphis deleted the worktree-rust-quick-json-decode-v1 branch May 15, 2026 14:45

membphis merged 30 commits into master from worktree-rust-quick-json-decode-v1
