feat: lazy table cjson-compatible API (qd.decode / qd.encode) #15
Merged
Adds qd.decode/qd.encode/qd.materialize/qd.pairs/qd.ipairs designed as a near-drop-in replacement for callers migrating from cjson. The new API lives alongside the existing path-based qd.parse + get_str surface.

Key decisions captured in the spec:
- decode returns a Lua table with LazyObject / LazyArray metatable; reads route through __index to FFI; nested containers stay lazy.
- __newindex materializes the affected level only (shallow); nested proxies remain lazy. After materialization, that level is a normal Lua table.
- qd.encode is the canonical exit point: cjson.encode bypasses metamethods in C and cannot transparently consume a lazy proxy. qd.encode emits the original JSON substring for unmodified subtrees and walks via lua_next for materialized ones.
- Sentinels alias cjson.null / cjson.empty_array_mt when cjson is loaded, fall back to local definitions otherwise.
- One new FFI export, qjd_cursor_bytes, exposes the original byte range for a cursor (needed for encode's substring fast path).
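The encode strategy described in this commit (copy the original substring for untouched subtrees, walk materialized ones) can be illustrated with a small model. This is only a sketch in Rust, not the Lua implementation: the `Node` enum, `write_string`, and `encode` are hypothetical names invented for illustration, and a `Lazy` node simply stores a byte range into the source text in place of a real cursor.

```rust
/// Simplified, hypothetical model of a decoded tree. `Lazy` stands for an
/// untouched subtree and records only its byte range in the original JSON.
enum Node {
    Lazy { start: usize, end: usize },
    Int(i64),
    Str(String),
    Array(Vec<Node>),
    Object(Vec<(String, Node)>),
}

/// Writes `s` as a JSON string, escaping quotes, backslashes and control
/// characters.
fn write_string(s: &str, out: &mut String) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Encodes `node` into `out`. Untouched subtrees take the substring fast
/// path (a plain copy from `src`); everything else is walked recursively.
fn encode(src: &str, node: &Node, out: &mut String) {
    match node {
        Node::Lazy { start, end } => out.push_str(&src[*start..*end]),
        Node::Int(n) => out.push_str(&n.to_string()),
        Node::Str(s) => write_string(s, out),
        Node::Array(items) => {
            out.push('[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                encode(src, item, out);
            }
            out.push(']');
        }
        Node::Object(fields) => {
            out.push('{');
            for (i, (key, value)) in fields.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write_string(key, out);
                out.push(':');
                encode(src, value, out);
            }
            out.push('}');
        }
    }
}
```

In this model, a root object whose field `a` is still `Lazy` and whose field `c` was overwritten with `Int(999)` encodes `a` by copying its original bytes and only walks the materialized level.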
19 bite-sized TDD tasks decomposed from docs/superpowers/specs/2026-05-16-lazy-table-cjson-compat-design.md:
1. qjd_cursor_bytes FFI (substring fast path)
2. qjd_cursor_object_entry_at FFI (object iter)
3. Lua module skeleton + sentinel bridge
4. LazyObject __index for scalars
5. __index wrapping nested containers as proxies
6. LazyArray __index for integer keys
7. __len
8. __pairs / qd.pairs for LazyObject (+ decode_cursor refactor)
9. __ipairs / qd.ipairs for LazyArray
10. __newindex: shallow first-write materialization
11. qd.materialize: recursive
12. qd.encode for lazy proxies + __tostring
13. qd.encode for scalars
14. qd.encode for real / mixed tables
15. wire qd.decode / qd.encode / etc to top-level quickdecode
16. cjson round-trip equivalence + sentinel coverage tests
17. bench scenarios for qd.decode / qd.encode
18. README usage section + roadmap entry for O(N) iterator
19. final CI gate

Each task includes the failing test, the implementation, the verification command (luajit -e smoke since busted is not on the dev machine), and a focused commit.
Address review feedback on qjd_cursor_bytes (ae8152e):
- Extract scalar_byte_range() so scalar_bytes and qjd_cursor_bytes share the start/end-with-whitespace-strip computation. Prevents the two copies from drifting as the helper evolves.
- Replace magic `7` in the NULL-out-pointer test with qjd_err::QJD_INVALID_ARG as c_int.
- Add bytes_of_root_array_covers_full_json so the `[` branch is exercised.
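The shape of that refactor can be sketched as follows. This is a hedged illustration and not the code in the repository: the function signatures, the `cursor_bytes` name, and the `QJD_OK = 0` value are assumptions; only the helper name `scalar_byte_range`, the caller name `scalar_bytes`, and the value `7` for `QJD_INVALID_ARG` come from the commit message.

```rust
use std::os::raw::c_int;

/// Error codes. Only `QJD_INVALID_ARG = 7` is taken from the review note;
/// `QJD_OK = 0` is an assumption for this sketch.
#[allow(non_camel_case_types, dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
enum qjd_err {
    QJD_OK = 0,
    QJD_INVALID_ARG = 7,
}

fn is_json_ws(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r')
}

/// Shared helper: narrows `[start, end)` so it excludes surrounding JSON
/// whitespace. Both callers below go through this one computation.
fn scalar_byte_range(buf: &[u8], mut start: usize, mut end: usize) -> (usize, usize) {
    while start < end && is_json_ws(buf[start]) {
        start += 1;
    }
    while end > start && is_json_ws(buf[end - 1]) {
        end -= 1;
    }
    (start, end)
}

/// Safe caller: returns the trimmed scalar as a slice.
fn scalar_bytes(buf: &[u8], start: usize, end: usize) -> &[u8] {
    let (s, e) = scalar_byte_range(buf, start, end);
    &buf[s..e]
}

/// FFI-style caller (hypothetical signature): reports the same range through
/// out-pointers and rejects NULL out-pointers with a named error code.
///
/// Safety: non-null `out_ptr` / `out_len` must be valid for writes.
unsafe fn cursor_bytes(
    buf: &[u8],
    start: usize,
    end: usize,
    out_ptr: *mut *const u8,
    out_len: *mut usize,
) -> c_int {
    if out_ptr.is_null() || out_len.is_null() {
        return qjd_err::QJD_INVALID_ARG as c_int;
    }
    let (s, e) = scalar_byte_range(buf, start, end);
    unsafe {
        *out_ptr = buf.as_ptr().add(s);
        *out_len = e - s;
    }
    qjd_err::QJD_OK as c_int
}
```

A test written against this sketch would compare the NULL-out-pointer result with `qjd_err::QJD_INVALID_ARG as c_int` rather than a literal `7`, which is the point of the second review item.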
Fix inaccurate comment in read_object_field (cur_box -> root_box). Explicitly handle the false return from check(trc) and check(brc) in _M.decode instead of silently discarding it.
Add read_array_index with 1-based indexing and null/bool/num/str/nested support, and fix walk_children to visit trailing scalar elements whose indices entry equals the parent closer (`i <= end` plus an empty-container guard).
Add walk_children_trailing_scalar_integer and walk_children_trailing_scalar_bool to tests/ffi_cursor.rs. Both call qjd_cursor_index with i=2 on a 3-element scalar array and assert the trailing element is reachable, locking in the `while i <= end` fix from the Task 6 cursor.rs change.
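The off-by-one these tests lock in can be reproduced in a toy model. The sketch below is an assumption-laden simplification, not the code in src/cursor.rs: it handles only flat arrays whose scalars contain no commas or brackets, and `structural_indices`, `trim`, and this `walk_children` signature are hypothetical. It keeps the one property the commit describes: the last element has no structural entry of its own, so it is delimited by the parent's closing `]`.

```rust
fn is_json_ws(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r')
}

/// Strips JSON whitespace from both ends of a byte slice.
fn trim(mut s: &[u8]) -> &[u8] {
    while let [first, rest @ ..] = s {
        if is_json_ws(*first) { s = rest; } else { break; }
    }
    while let [rest @ .., last] = s {
        if is_json_ws(*last) { s = rest; } else { break; }
    }
    s
}

/// Byte offsets of `[`, each `,`, and `]` in a flat array of scalars.
fn structural_indices(json: &[u8]) -> Vec<usize> {
    json.iter()
        .enumerate()
        .filter(|(_, b)| matches!(**b, b'[' | b',' | b']'))
        .map(|(pos, _)| pos)
        .collect()
}

/// Returns the trimmed bytes of every element. Element `i` lies between
/// `indices[i - 1]` and `indices[i]`, so the final element ends at the
/// closer entry `indices[end]`.
fn walk_children<'a>(json: &'a [u8], indices: &[usize]) -> Vec<&'a [u8]> {
    let mut out = Vec::new();
    if indices.len() < 2 {
        return out;
    }
    let end = indices.len() - 1; // position of the closing `]` entry

    // Empty-container guard: without it, `[]` would yield one phantom
    // element once the loop bound includes the closer.
    if trim(&json[indices[0] + 1..indices[end]]).is_empty() {
        return out;
    }

    let mut i = 1;
    // `while i < end` would stop before the element that ends at the
    // closer, silently dropping the trailing scalar.
    while i <= end {
        out.push(trim(&json[indices[i - 1] + 1..indices[i]]));
        i += 1;
    }
    out
}
```

With the inclusive bound, `[10, 20, true]` yields three elements in this model; with `i < end` the loop would return only the first two.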
Refactor wrap_child to take the source box (cdata array) explicitly rather than capturing the global child_box implicitly. Extract decode_cursor(parent_view, src_box) to eliminate duplicate type-dispatch logic in read_object_field and read_array_index, both of which now have a single-line tail. Add lazy_object_iter + LazyObject.__pairs using qjd_cursor_object_entry_at, and expose _M.pairs as the public entry point. Each iterator-produced child proxy gets its own own_box copy, so collected proxies are not aliased to the shared child_box scratch buffer.
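The aliasing hazard that the per-proxy own_box copy avoids can be shown by analogy. The sketch below is a Rust model, not the LuaJIT code: `Cursor`, `Proxy`, and `wrap_child_aliased` are hypothetical, and `Rc<RefCell<_>>` merely stands in for a shared cdata scratch box that later FFI calls overwrite.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in for the cursor struct held in a one-element box.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Cursor {
    start: usize,
    end: usize,
}

/// A child proxy keeps some box alive and reads its cursor from it.
struct Proxy {
    cur_box: Rc<RefCell<Cursor>>,
}

impl Proxy {
    fn cursor(&self) -> Cursor {
        *self.cur_box.borrow()
    }
}

/// Buggy variant: the proxy shares the scratch box, so the next write to
/// the scratch box changes what this proxy points at.
fn wrap_child_aliased(scratch: &Rc<RefCell<Cursor>>) -> Proxy {
    Proxy { cur_box: Rc::clone(scratch) }
}

/// Fixed variant: copy the cursor out of the explicitly passed source box
/// into a fresh allocation owned by the proxy.
fn wrap_child(src_box: &Rc<RefCell<Cursor>>) -> Proxy {
    let copy = *src_box.borrow();
    Proxy { cur_box: Rc::new(RefCell::new(copy)) }
}
```

If the scratch box is overwritten after both wrappers ran, the aliased proxy reports the new cursor while the copied one still reports the cursor it was created with, which is the property collected iterator results depend on.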
…hed proxy identity
Stock LuaJIT follows Lua 5.1 semantics and invokes __len only on userdata, not tables, unless it is built with LUAJIT_ENABLE_LUA52COMPAT (OpenResty's default). The CI runner uses Ubuntu's apt luajit, which is not a compat build, so `#lazy_t` returned rawlen=0 and two specs failed. Mirror what `__pairs` / `__ipairs` already do: keep the metamethod for LJ52 builds and expose an explicit `qd.len(t)` helper that works on both builds. The spec probes for __len-on-table support at load time and marks the `#t` cases pending when absent.
These design notes were produced by the brainstorming/planning workflow during development and are not part of the public documentation. Drop the directory and the three callers (CLAUDE.md, README.md, src/lib.rs) that linked to it.
Summary
Adds a cjson-shaped API on top of quickdecode so callers can migrate from cjson with two symbol swaps (`cjson` → `qd`, `cjson.encode` → `qd.encode`) while keeping quickdecode's lazy-evaluation win for "parse + extract a few fields" workloads.

- `qd.decode(json)` returns a Lua table with `LazyObject` / `LazyArray` metatable; reads go through `__index` to FFI on demand.
- `qd.encode(v)` works on lazy proxies (original-substring fast path), real Lua tables (matching `cjson.encode` output), and mixed trees. Nested mutations propagate correctly.
- `qd.materialize(v)` recursively converts a lazy view into a plain Lua table for callers that have to pass to a third-party encoder.
- `qd.pairs` / `qd.ipairs`, `__pairs` / `__ipairs` (LJ52), `__len`, `__newindex` (shallow first-write materialization).
- `qd.null` / `qd.empty_array_mt` aliased to `cjson.null` / `cjson.empty_array_mt` when cjson is loaded, with local fallbacks otherwise.

Two new Rust FFI exports support this:

- `qjd_cursor_bytes`: original-buffer byte range, for substring emit on unmodified subtrees.
- `qjd_cursor_object_entry_at`: i-th key/value pair of an object cursor, for iterators and materialization.

Plus a pre-existing Rust correctness fix surfaced during implementation:

- `walk_children` in `src/cursor.rs` previously skipped the trailing scalar element of arrays (`while i < end` → `while i <= end` with empty-container guard). Direct regression test in `tests/ffi_cursor.rs`.

Why
The library's headline pitch is 14–44× faster than cjson on "parse once, read a few fields" workloads, but its existing path-based API (`d:get_str("messages[0].role")`) requires rewriting every cjson-shaped call site to migrate. The lazy table API makes the migration largely a two-symbol swap while preserving the underlying lazy advantage (untouched subtrees never decode, encode emits the original bytes via memcpy).

Bench results vs cjson (median ops/s):

The lazy-table read path is ~30–40% behind the existing path-API on small payloads (per-`__index` dispatch + transient cdata wrap), converging at multi-MB sizes. Documented in README Roadmap as a structural gap worth investigating if a workload need surfaces.

Architecture
- `lua/quickdecode/table.lua` (new, 480 lines): `LazyObject` / `LazyArray` metatables, `qd.decode` / `encode` / `materialize` / `pairs` / `ipairs`, sentinel bridging.
- `lua/quickdecode.lua`: adds two `ffi.cdef` lines for the new exports, re-exports the lazy API at the top level (so `qd.decode` works without separately requiring `quickdecode.table`).
- `src/ffi.rs`, `src/doc.rs`, `src/cursor.rs`, `include/lua_quick_decode.h`: 2 new FFI exports + the `walk_children` fix.
- `tests/ffi_cursor_bytes.rs`, `tests/ffi_object_iter.rs` (new): direct FFI coverage.
- `tests/lua/lazy_table_spec.lua` (new, 332 lines): 75 busted tests including cjson round-trip equivalence over 6 fixtures + sentinel coverage + nested-mutation + cached-proxy identity regression tests.
- `benches/lua_bench.lua`: two new rows (`qd.decode + t.field x3`, `qd.decode + qd.encode (unmodified)`) in every scenario + the interleaved block.
- `docs/superpowers/specs/2026-05-16-lazy-table-cjson-compat-design.md`: design doc.
- `docs/superpowers/plans/2026-05-16-lazy-table-cjson-compat.md`: implementation plan.

Notable correctness work
- `box[0]` on a `qjd_cursor[1]` returns a reference, not a copy, so storing it in a Lua table and then reusing the box for the next FFI call silently corrupts the stored reference. Solved with a per-view `own_box` pattern (`wrap_child` does `ffi.copy` from a passed-in source box into a fresh allocation, kept alive on the view as `_cur_box`).
- `walk_children` trailing scalar: pre-existing latent bug, found via the `t[4] == qt.null` test on `[10,"x",true,null]`. Fixed and locked in by a Rust unit test (`tests/ffi_cursor.rs::walk_children_trailing_scalar_*`).
- `t.a.b.c = 999` materializes only `t.a.b`, but `t`'s metatable is still `LazyObject`. Plain substring fast path emits the original (unmutated) bytes. Fixed with an `is_dirty(v)` walker that scans rawget-cached children for materialization, falling through to a walking encoder only when needed. Substring fast path still fires for fully-untouched subtrees.
- `__newindex` was rebuilding fresh proxies and overwriting cached child references. Fixed by snapshotting the rawget cache before nilling internals and preferring cached entries during the rebuild. Uses raw `next()` rather than `pairs()` (which would invoke `__pairs` and walk the FFI side).

Test plan
- `cargo test --release` (72 unit + integration, default features)
- `cargo test --release --no-default-features` (scalar scanner)
- `cargo test --features test-panic --release`
- `make test` → 75 busted Lua tests (0 failures, 0 errors)
- `make bench` runs cleanly; new rows produce sane numbers

Migration story for callers
`cjson.null` and `cjson.empty_array_mt` are reused when cjson is loaded, so existing `v == cjson.null` checks keep working unchanged.

Related
- `docs/superpowers/specs/2026-05-16-lazy-table-cjson-compat-design.md`
- `docs/superpowers/plans/2026-05-16-lazy-table-cjson-compat.md`