feat(lazy): structural patching for decode+modify+encode by membphis · Pull Request #44 · api7/lua-qjson

membphis · 2026-05-18T10:49:00Z

Summary

Add lazy patch support to avoid materializing the entire root container when modifying a few fields on a qjson.decode() result.

Problem

Current behavior when modifying a field:

local tab = qjson.decode(json_str)  -- Returns LazyObject proxy
tab.model = "gpt4o"                 -- Triggers __newindex → materializes entire root
local out = qjson.encode(tab)       -- Walks materialized table

This defeats the lazy decode advantage for the common "decode → modify few fields → encode" workflow.

Solution

Record patches instead of materializing, then splice the original buffer during encode:

local tab = qjson.decode(json_str)  -- Returns LazyObject proxy
tab.model = "gpt4o"                 -- Records patch, no materialization
local out = qjson.encode(tab)       -- Splices original buffer with patched values

Changes

Rust FFI: Add qjson_cursor_field_bytes to get field value byte spans
Lua: Modify LazyObject.__newindex to record patches
Lua: Modify LazyObject.__index to check patches first
Lua: Add encode_with_patches for splice-based encoding
Tests: Add lazy_patch_spec.lua
Docs: Add docs/lazy-patch-spec.md with detailed design

Status

Basic functionality working (patch, read, encode)
Lazy subtrees preserved after root modification
Performance optimization (currently ~32μs, target ~10μs)
Delete field support
Array patch support

Test Results

Test 1 - Basic patch: PASS
Test 2 - Read patched value: PASS
Test 3 - Multiple patches: PASS
Test 4 - Preserve lazy subtrees: PASS
Test 5 - Performance: 32μs/op (target <15μs) - needs optimization

Summary by CodeRabbit

New Features
- Lazy patching for efficient JSON field updates without fully materializing structures.
- Ability to retrieve a field's byte-range from JSON cursors to enable targeted encoding.
Documentation
- New design memo specifying lazy-patch semantics, encoding strategy, edge cases, and rollout plan.
Tests
- Comprehensive test suite covering patching, iteration, nested behaviors, edge cases, and regressions.

Add lazy patch support to avoid materializing the entire root container when modifying a few fields. Instead of triggering full materialization on __newindex, we now record patches and splice the original buffer during encode. Changes: - Add qjson_cursor_field_bytes FFI function to get field value byte spans - Modify LazyObject.__newindex to record patches instead of materializing - Modify LazyObject.__index to check patches first - Add encode_with_patches for splice-based encoding - Add lazy_patch_spec.lua tests Performance: decode+modify+encode on 100KB payload - Before: ~30 μs/op (full materialization + walking encode) - After: ~32 μs/op (WIP - splice path needs optimization) See docs/lazy-patch-spec.md for detailed design.

Copilot

Pull request overview

Adds a "lazy patch" path so that mutating a few fields on a qjson.decode() proxy no longer materializes the whole root container. Writes are recorded in _patches/_deleted side tables, and qjson.encode() splices the original buffer (with new helper FFI qjson_cursor_field_bytes) instead of walking a materialized table.

Changes:

New FFI qjson_cursor_field_bytes (Rust + C header + LuaJIT cdef) returning the byte span of a field's value in the original buffer.
Rewrites LazyObject.__newindex/__index/__pairs to record/read patches and deletions; adds encode_with_patches and encode_lazy_object_walking_with_patches to encode lazy objects with patches.
Adds tests/lua/lazy_patch_spec.lua and docs/lazy-patch-spec.md.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
src/ffi.rs	Adds `qjson_cursor_field_bytes` returning a field value's byte range.
include/qjson.h	Declares the new FFI entry point.
lua/qjson.lua	Adds matching LuaJIT cdef for `qjson_cursor_field_bytes`.
lua/qjson/table.lua	Replaces materialize-on-write with patch recording; new patch-aware encode and iteration paths.
tests/lua/lazy_patch_spec.lua	Spec covering basic patch/read/delete/new-field/iteration/nesting/edge cases.
docs/lazy-patch-spec.md	Detailed design doc explaining problem, algorithm, edge cases, and benchmarks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

membphis · 2026-05-18T13:43:32Z

+    if has_deleted or #new_fields > 0 then
+        -- Fall back to walking for complex cases
+        return encode_lazy_object_walking_with_patches(t, patches, deleted)
+    end
+
+    -- Sort replacements by start offset
+    table.sort(replacements, function(a, b) return a.start < b.start end)
+
+    -- Build output by splicing
+    local parts = {}
+    local pos = t._bs + 1  -- 1-based Lua index
+
+    for _, r in ipairs(replacements) do
+        -- Copy unchanged portion (convert 0-based to 1-based)
+        local r_start_1based = r.start + 1
+        if r_start_1based > pos then
+            parts[#parts + 1] = buf:sub(pos, r_start_1based - 1)
+        end
+        -- Insert replacement
+        parts[#parts + 1] = r.value
+        pos = r.end_ + 1  -- end_ is exclusive, so +1 for 1-based
+    end
+
+    -- Copy remaining portion
+    if pos <= t._be then
+        parts[#parts + 1] = buf:sub(pos, t._be)
+    end
+
+    return table.concat(parts)
+end
+
+-- Walk a LazyObject with patches, handling deletions and new fields
+local function encode_lazy_object_walking_with_patches(t, patches, deleted)


Fixed in 176558a — encode_lazy_object_walking_with_patches is now defined ahead of encode_with_patches. Verified: tests/lua/lazy_patch_spec.lua 26/26 pass; full Lua suite 112/112.

membphis · 2026-05-18T13:43:40Z

+-- Check if a lazy view has patches (but not necessarily dirty children)
+local function has_patches(v)
+    local patches = rawget(v, "_patches")
+    return patches and #patches > 0
+end
+
+-- Check if a lazy view has deletions
+local function has_deletions(v)
+    local deleted = rawget(v, "_deleted")
+    if not deleted then return false end
+    for _ in pairs(deleted) do return true end
+    return false
+end


Confirmed unused. grep -n "has_patches\|has_deletions" lua/qjson/table.lua only matches the definitions. Removing both.

membphis · 2026-05-18T13:43:52Z

 LazyObject.__newindex = function(t, k, v)
-    local contents = materialize_object_contents(t)
-    -- Snapshot user-key cache BEFORE nilling internals.
-    -- Use next() for raw iteration: pairs() invokes __pairs on lazy tables,
-    -- walking the full JSON via FFI instead of the Lua-side rawget cache.
-    local cache = {}
-    local ck, cv = next(t)
-    while ck ~= nil do
-        if not INTERNAL_KEYS[ck] then
-            cache[ck] = cv
-        end
-        ck, cv = next(t, ck)
+    -- Initialize patches table if needed
+    if not rawget(t, "_patches") then
+        rawset(t, "_patches", {})
+        rawset(t, "_deleted", {})
    end
-    t._doc, t._cur_box, t._cur, t._bs, t._be = nil, nil, nil, nil, nil
-    setmetatable(t, nil)
-    for _, kv in ipairs(contents) do
-        rawset(t, kv[1], cache[kv[1]] or kv[2])
+
+    local patches = rawget(t, "_patches")
+    local deleted = rawget(t, "_deleted")
+
+    if v == nil then
+        -- Mark key as deleted
+        deleted[k] = true
+        -- Remove from patches if previously patched
+        for i, p in ipairs(patches) do
+            if p.key == k then
+                table.remove(patches, i)
+                break
+            end
+        end
+    else
+        -- Encode the new value (we need both encoded and lua_value)
+        local encoded = encode(v)
+
+        -- Update or add patch
+        local found = false
+        for _, p in ipairs(patches) do
+            if p.key == k then
+                p.encoded_value = encoded
+                p.lua_value = v
+                found = true
+                break
+            end
+        end
+        if not found then
+            patches[#patches + 1] = { key = k, encoded_value = encoded, lua_value = v }
+        end
+
+        -- Remove from deleted if previously deleted
+        deleted[k] = nil
    end
-    rawset(t, k, v)
 end


Real bug. t[1] = "x" on a LazyObject records {key=1, ...}, then read_object_field returns nil for non-string keys (line 117) so the read is silently lost, and encode_with_patches errors on #p.key later. Fix: raise in LazyObject.__newindex for non-string keys — matches the existing encode_object policy at line 774.

membphis · 2026-05-18T13:43:52Z

+    -- For each patch, find the field's byte range in the original buffer
+    for _, p in ipairs(patches) do
+        local rc = C.qjson_cursor_field_bytes(t._cur, p.key, #p.key, sz_a, sz_b)
+        if rc == QJSON_OK then
+            -- Existing field: replace value
+            replacements[#replacements + 1] = {
+                start = tonumber(sz_a[0]),
+                end_ = tonumber(sz_b[0]),
+                value = p.encoded_value,
+            }
+        end
+        -- If NOT_FOUND, it's a new field - handled separately below
+    end
+
+    -- Collect deleted field spans (we need to find the full field span including key)
+    -- For simplicity, we'll handle deletions by walking and skipping deleted keys
+    local has_deleted = false
+    for _ in pairs(deleted) do has_deleted = true; break end
+
+    -- If we have deletions or new fields, fall back to walking
+    -- (splicing deletions is complex due to comma handling)
+    local new_fields = {}
+    for _, p in ipairs(patches) do
+        local rc = C.qjson_cursor_field_bytes(t._cur, p.key, #p.key, sz_a, sz_b)
+        if rc == QJSON_NOT_FOUND then
+            new_fields[#new_fields + 1] = p
+        end
+    end
+
+    if has_deleted or #new_fields > 0 then
+        -- Fall back to walking for complex cases
+        return encode_lazy_object_walking_with_patches(t, patches, deleted)
+    end


Valid. Collapsing the two passes into one — classify each patch into replacements or new_fields in a single qjson_cursor_field_bytes call.

membphis · 2026-05-18T13:43:53Z

+    -- Check patches first
+    local patches = rawget(self, "_patches")
+    if patches then
+        for _, p in ipairs(patches) do
+            if p.key == key then
+                return p.lua_value
+            end
+        end
+    end
+
+    -- Check deleted
+    local deleted = rawget(self, "_deleted")
+    if deleted and deleted[key] then
+        return nil
+    end


Declining for now. Typical patch counts in this workflow are 1–3 modified fields; a 3-entry linear scan is faster than maintaining a parallel hash table (the FFI cost from encode/qjson_cursor_field_bytes already dominates). The O(n·m) only matters when m grows large, which the lazy-patch design discourages (assigning many fields → user is better off materializing). Happy to revisit if a profile shows it becoming hot; adding to README "Roadmap / Deferred" as a noted optimization.

coderabbitai · 2026-05-18T10:52:20Z

📝 Walkthrough

Walkthrough

Adds lazy patch tracking to qjson lazy proxies, a new FFI cursor API to obtain field byte spans, patch-aware encoding paths (splicing or walking fallback), a design/spec doc, and comprehensive Lua tests validating read/write/iterate/encode behavior.

Changes

Lazy Patching Implementation

Layer / File(s)	Summary
FFI Cursor Extension for Byte-Range Lookup `include/qjson.h`, `lua/qjson.lua`, `src/ffi.rs`	New `qjson_cursor_field_bytes` C API returns `[value_start, value_end)` for a given object-field key to support splice-based encoding.
Design spec: Lazy Patch `docs/lazy-patch-spec.md`	Design memo describing patch data structures (`_patches`, `_deleted`), read/write/iteration semantics, `encode_with_patches` splice algorithm, edge cases, phased implementation plan, tests, and benchmarks.
Lua: Patch Tracking, Read/Write, Iterate, Encode `lua/qjson/table.lua`	Adds per-view `_patches`/`_deleted`; `__index` checks patches/deletions before cursor lookups; `__newindex` records encoded patches instead of materializing; `__pairs` merges original fields with patches; `is_dirty` and `encode_proxy` route to slice, splice, or walking encoders as appropriate.
Comprehensive Lua Tests `tests/lua/lazy_patch_spec.lua`, `tests/lua/lazy_table_spec.lua`	New tests cover single/multi-field patches, additions, deletions, type changes, nested modifications, iteration semantics with patches, edge cases, and metatable preservation; `__newindex` tests updated to expect patch-tracking.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
E2e Test Quality Review	⚠️ Warning	Critical bug: lines 612, 629 use stale patch.encoded_value. When patched Lua table is modified post-assignment, encode() outputs old value. Missing test coverage for this scenario.	Add test: t.a={x:10}; t.a.x=20; encode(t) should output x=20. Fix: re-encode patch.lua_value if it's a mutable table, not just use cached patch.encoded_value.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly and specifically describes the main change: adding structural patching support that allows efficient decode→modify→encode workflows.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Security Check	✅ Passed	No security vulnerabilities found. JSON library with no logging, database, auth, or crypto code. Error messages reveal only type info, not sensitive data.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch worktree-crystalline-gliding-volcano

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 10

🧹 Nitpick comments (3)

src/ffi.rs (1)
840-867: 💤 Low value

Consider extracting the byte-range calculation into a shared helper.

The match block for computing [start, end) from a cursor (lines 846-867) duplicates the logic in qjson_cursor_bytes (lines 738-759). Both check the lead byte, handle containers/strings by using idx_end, and delegate scalars to scalar_byte_range.

Extracting this into a helper like cursor_byte_range(d: &Document, cur: Cursor) -> Result<(usize, usize), qjson_err> would reduce duplication and ensure consistency.
♻️ Suggested helper extraction
/// Compute the byte range [start, end) for a cursor's value in the original buffer.
unsafe fn cursor_byte_range(d: &Document<'_>, cur: Cursor) -> Result<(usize, usize), qjson_err> {
    let pos = d.indices[cur.idx_start as usize] as usize;
    let lead = match d.buf.get(pos) {
        Some(b) => *b,
        None => return Err(qjson_err::QJSON_PARSE_ERROR),
    };
    match lead {
        b'{' | b'[' | b'"' => {
            let end = d.indices[cur.idx_end as usize] as usize;
            if end >= d.buf.len() {
                return Err(qjson_err::QJSON_PARSE_ERROR);
            }
            Ok((pos, end + 1))
        }
        _ => scalar_byte_range(d, cur),
    }
}
Then both qjson_cursor_bytes and qjson_cursor_field_bytes can call this helper.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/ffi.rs` around lines 840 - 867, The byte-range computation for a cursor
is duplicated; extract it into a shared unsafe helper like cursor_byte_range(d:
&Document<'_>, cur: Cursor) -> Result<(usize, usize), qjson_err> that performs
the lead-byte check, handles containers/strings by using cur.idx_end and
bounds-checking against d.buf.len(), and delegates scalars to scalar_byte_range;
then replace the duplicated logic in qjson_cursor_bytes and
qjson_cursor_field_bytes (or any other sites) to call cursor_byte_range and map
its Ok/Err to the existing *value_start/*value_end and qjson_err return
conventions.
tests/lua/lazy_patch_spec.lua (1)
3-44: ⚡ Quick win

Missing test case: deletion of non-existent field.

The test suite covers deleting existing fields and re-adding them, but doesn't test deleting a field that was never in the original JSON. This edge case should verify that deleting a non-existent field doesn't cause errors and doesn't affect encoding.
🧪 Proposed additional test
it("deletes a non-existent field without error", function()
    local t = qjson.decode('{"a":1}')
    t.b = nil  -- b never existed
    assert.is_nil(t.b)
    assert.are.equal(1, t.a)
    local out = qjson.encode(t)
    local cjson = require("cjson")
    local parsed = cjson.decode(out)
    assert.is_nil(parsed.b)
    assert.are.equal(1, parsed.a)
end)
Also applies to: 72-97
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/lua/lazy_patch_spec.lua` around lines 3 - 44, Add a new test case named
"deletes a non-existent field without error" to the Lazy Patch suite that
decodes '{"a":1}' via qjson.decode, sets t.b = nil (where b was never present),
asserts t.b is nil and t.a remains 1, then qjson.encode(t) and cjson.decode the
output and assert parsed.b is nil and parsed.a equals 1; place this alongside
the other "it" blocks so the behavior of qjson.decode/encode and the lazy-patch
logic for deleting non-existent fields is validated.
docs/lazy-patch-spec.md (1)
196-208: ⚡ Quick win

The pseudocode helper functions in the specification need clarification or mapping to actual implementation.

Lines 196 and 212 reference find_field_value_span() and find_field_span() which are not defined in the spec. While the lazy patch feature is implemented and functional (confirmed by tests/lua/lazy_patch_spec.lua), the actual implementation uses C FFI calls (C.qjson_cursor_field_bytes()) rather than these abstract function names.

The specification should either:

Define what these abstract helper functions conceptually do (for algorithmic clarity)

Map them explicitly to the actual C FFI interface used in the implementation

Clarify upfront that this is pseudocode describing the algorithm rather than directly executable pseudocode

This will help developers understand the algorithm without needing to cross-reference the Lua implementation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/lazy-patch-spec.md` around lines 196 - 208, The spec references
undefined helpers find_field_value_span and find_field_span; update the document
to either (a) add brief conceptual definitions of those helpers (what inputs
they take and that they return byte offsets for a field's value/span), (b)
explicitly map them to the real implementation (e.g., call sites using
C.qjson_cursor_field_bytes() via FFI) and show the equivalent return values, or
(c) add a clear note at the top that the listing is high-level pseudocode and
that the real implementation uses C FFI (see tests/lua/lazy_patch_spec.lua);
include the helper names find_field_value_span and find_field_span and the C
symbol C.qjson_cursor_field_bytes in the explanation so readers can locate the
implementation.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/lazy-patch-spec.md`:
- Around line 332-341: Specify and add a concrete implementation for
LazyObject.__pairs that merges original fields with patch entries and excludes
deletions: read raw _patches and _deleted tables, build a patch_map mapping
patch.key -> lua_value and a seen set, iterate the original pairs (using the
preserved original__pairs iterator) yielding keys not marked deleted and not
overridden by patches (returning patch value when key exists in patch_map), then
iterate remaining patch_map keys to yield new/overridden fields, ensuring each
key is yielded only once; reference LazyObject.__pairs, rawget(t, "_patches"),
rawget(t, "_deleted"), and original__pairs to locate and implement this
behavior.
- Around line 254-256: The current splice logic uses result:find("}%s*$") to
locate the closing brace after concatenating parts, which is fragile; modify the
code handling variables result, parts and close_pos so that if
result:find("}%s*$") returns nil you immediately fallback by returning
encode_lazy_object_walking(t) (or otherwise abort the splice), ensuring you do
not proceed with unsafe splicing when the closing brace cannot be reliably
located; update any callers that rely on the splice to handle this early return
and reference t._be only when the brace is confirmed.
- Around line 244-264: The prefix handling can produce invalid JSON when all
original fields were deleted and new_fields is non-empty; in the block that
computes result/close_pos/prefix (using variables result, close_pos, prefix,
parts, new_fields and function find_field_value_span), strip any trailing commas
and whitespace from prefix before appending new_fields (e.g., normalize prefix
by removing trailing "," and spaces so you never produce "{,"), or alternatively
compute whether any fields remain after applying deletions and only insert a
comma when appropriate; update the logic that currently uses
prefix:match("[^{%s]%s*$") to perform this normalization and add unit tests for
the edge case "all fields deleted + new fields added."
- Around line 86-143: The patch structure is inconsistent: LazyObject.__newindex
currently stores only encoded_value but LazyObject.__index expects p.lua_value;
modify __newindex so that when setting a non-nil value it stores both
encoded_value = encode_value(v) and lua_value = v in the patch entry (and when
updating an existing patch update both fields), and ensure deletions still
remove patch entries and clear deleted[k]; also update the patch-record
documentation to list lua_value alongside encoded_value and key so
readers/consumers know both fields exist.

In `@lua/qjson/table.lua`:
- Around line 712-717: The loop calling C.qjson_cursor_field_bytes on t._cur
should handle unexpected return codes the same way other calls do: after getting
rc, keep the existing branch for QJSON_NOT_FOUND that appends the new part, but
add an elseif checking for rc not equal to QJSON_OK (and not QJSON_NOT_FOUND)
and propagate/report the error (e.g., return nil / raise error / log with rc and
context) so failures from C.qjson_cursor_field_bytes are surfaced; locate the
loop over patches and modify the rc handling around C.qjson_cursor_field_bytes,
referencing variables/function names: patches, p, C.qjson_cursor_field_bytes,
QJSON_NOT_FOUND, parts, encode_string, p.encoded_value, t._cur, sz_a, sz_b.
- Around line 613-623: The call to C.qjson_cursor_field_bytes in the loop
(inside the function handling patches) only checks for QJSON_OK and ignores any
other non-NOT_FOUND return codes; update the loop to explicitly handle
unexpected error codes (e.g., QJSON_PARSE_ERROR) returned by
qjson_cursor_field_bytes: after calling C.qjson_cursor_field_bytes(t._cur,
p.key, `#p.key`, sz_a, sz_b) check rc and if rc is QJSON_OK proceed to add the
replacement as now, if rc is QJSON_NOT_FOUND skip to the new-field handling as
before, but for any other rc propagate or surface the error (for example return
nil plus an error message, or log and abort) so the caller of the function can
react instead of silently continuing; reference the symbols
qjson_cursor_field_bytes, QJSON_OK, QJSON_NOT_FOUND (and QJSON_PARSE_ERROR) and
the replacements table when making the change.
- Around line 209-213: The call to C.qjson_cursor_field_bytes (inside the loop
using view._cur and p.key) only handles QJSON_NOT_FOUND and silently ignores
other error codes; update the code to check rc for non-success values (e.g.,
QJSON_TYPE_MISMATCH, QJSON_PARSE_ERROR) and handle them explicitly instead of
skipping the field: after calling qjson_cursor_field_bytes, if rc ==
QJSON_NOT_FOUND keep the existing behavior, else if rc != 0 surface the error
(return an error tuple or raise error with a descriptive message including rc
and p.key, or log it via the module logger) so callers can react to parse/type
errors rather than silently skipping fields. Ensure to reference
qjson_cursor_field_bytes, view._cur, p.key, and the QJSON_* constants when
implementing the check.

In `@tests/lua/lazy_patch_spec.lua`:
- Around line 229-275: Add a new test case inside the describe("Lazy Patch -
edge cases", ...) block named it("handles all fields deleted with new fields
added") that uses qjson.decode to load '{"a":1,"b":2}', sets t.a = nil and t.b =
nil, then assigns t.c = 3 and t.d = 4, asserts the nil and value expectations,
calls qjson.encode and parses with cjson.decode and asserts parsed.a and
parsed.b are nil and parsed.c and parsed.d equal 3 and 4; this verifies
qjson.encode handles the "all fields deleted then new fields added" edge case
without emitting invalid JSON.
- Around line 246-254: The test "handles unicode in values" currently only uses
ASCII; update the test inside the it block that calls qjson.decode/qjson.encode
to use a multi-byte UTF-8 string (e.g., include accented characters and CJK or
emoji) for t.a so the round-trip via qjson.encode and cjson.decode actually
validates Unicode handling; ensure the asserts (assert.are.equal) compare the
original Unicode string to both t.a after assignment and parsed.a after
re-decoding to confirm multi-byte sequences are preserved.
- Around line 240-244: The test "handles special characters in keys" decodes
'{"a.b":1}', patches t["a.b"]=10 and only checks the in-memory value; extend the
test to also call qjson.encode(t) and assert the encoded string matches the
expected JSON with the special key correctly quoted/escaped (e.g.,
'{"a.b":10}'), so the splice/encoding logic in qjson.encode is validated; update
the it block that uses qjson.decode, the local variable t, and add an assertion
against qjson.encode(t).

---

Nitpick comments:
In `@docs/lazy-patch-spec.md`:
- Around line 196-208: The spec references undefined helpers
find_field_value_span and find_field_span; update the document to either (a) add
brief conceptual definitions of those helpers (what inputs they take and that
they return byte offsets for a field's value/span), (b) explicitly map them to
the real implementation (e.g., call sites using C.qjson_cursor_field_bytes() via
FFI) and show the equivalent return values, or (c) add a clear note at the top
that the listing is high-level pseudocode and that the real implementation uses
C FFI (see tests/lua/lazy_patch_spec.lua); include the helper names
find_field_value_span and find_field_span and the C symbol
C.qjson_cursor_field_bytes in the explanation so readers can locate the
implementation.

In `@src/ffi.rs`:
- Around line 840-867: The byte-range computation for a cursor is duplicated;
extract it into a shared unsafe helper like cursor_byte_range(d: &Document<'_>,
cur: Cursor) -> Result<(usize, usize), qjson_err> that performs the lead-byte
check, handles containers/strings by using cur.idx_end and bounds-checking
against d.buf.len(), and delegates scalars to scalar_byte_range; then replace
the duplicated logic in qjson_cursor_bytes and qjson_cursor_field_bytes (or any
other sites) to call cursor_byte_range and map its Ok/Err to the existing
*value_start/*value_end and qjson_err return conventions.

In `@tests/lua/lazy_patch_spec.lua`:
- Around line 3-44: Add a new test case named "deletes a non-existent field
without error" to the Lazy Patch suite that decodes '{"a":1}' via qjson.decode,
sets t.b = nil (where b was never present), asserts t.b is nil and t.a remains
1, then qjson.encode(t) and cjson.decode the output and assert parsed.b is nil
and parsed.a equals 1; place this alongside the other "it" blocks so the
behavior of qjson.decode/encode and the lazy-patch logic for deleting
non-existent fields is validated.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 39a453a2-e4f7-49ae-9d3e-5ab1c9602e90

📥 Commits

Reviewing files that changed from the base of the PR and between 11a644a and 69e6f77.

📒 Files selected for processing (6)

docs/lazy-patch-spec.md
include/qjson.h
lua/qjson.lua
lua/qjson/table.lua
src/ffi.rs
tests/lua/lazy_patch_spec.lua

coderabbitai · 2026-05-18T10:57:49Z

+    -- Handle new fields (not in original)
+    local new_fields = {}
+    for _, p in ipairs(patches) do
+        if not find_field_value_span(t, p.key) then
+            new_fields[#new_fields + 1] = '"' .. p.key .. '":' .. p.encoded_value
+        end
+    end
+
+    if #new_fields > 0 then
+        -- Insert before closing brace
+        local result = table.concat(parts)
+        local close_pos = result:find("}%s*$")
+        if close_pos then
+            local prefix = result:sub(1, close_pos - 1)
+            -- Add comma if there are existing fields
+            if prefix:match("[^{%s]%s*$") then
+                prefix = prefix .. ","
+            end
+            return prefix .. table.concat(new_fields, ",") .. "}"
+        end
+    end


⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

Risk of invalid JSON when all fields are deleted.

When all original fields are deleted and new fields are added, the logic at line 259 checks if prefix:match("[^{%s]%s*$") to decide whether to add a comma. However, if the prefix is just {, this pattern won't match, and no comma is added. But if replacements included deletions that left trailing commas, you could end up with malformed JSON like {,"new_field":1}.

Consider a more robust approach:

Track whether any fields remain after applying deletions

Normalize comma handling by removing any trailing comma in the prefix before appending new fields

Add unit tests for this specific edge case (all fields deleted + new fields added)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/lazy-patch-spec.md` around lines 244 - 264, The prefix handling can produce invalid JSON when all original fields were deleted and new_fields is non-empty; in the block that computes result/close_pos/prefix (using variables result, close_pos, prefix, parts, new_fields and function find_field_value_span), strip any trailing commas and whitespace from prefix before appending new_fields (e.g., normalize prefix by removing trailing "," and spaces so you never produce "{,"), or alternatively compute whether any fields remain after applying deletions and only insert a comma when appropriate; update the logic that currently uses prefix:match("[^{%s]%s*$") to perform this normalization and add unit tests for the edge case "all fields deleted + new fields added."

The implementation does not use the spec's }%s*$ splice path for this case. When deletions or new fields are present, encode_with_patches falls through to encode_lazy_object_walking_with_patches (table.lua:641-643), which walks original keys (skipping deleted) and then appends new fields — it cannot produce {,. Splice path is only used for the "all-existing-key replacements, no deletions" case. Will add a clarifying note to the spec; adding a regression test for the all-deleted-plus-new edge case (R8) regardless.

@membphis, thanks for the detailed clarification. That makes sense — since encode_proxy routes any case involving deletions or new fields to encode_lazy_object_walking_with_patches before reaching the splice path, the {, scenario can't arise in practice. My concern was against the spec's pseudo-code in isolation, which doesn't make the fallthrough guard explicit in that section.

Adding a clarifying note to the spec (e.g., a comment in the encode_with_patches block stating "only reached when patches are pure replacements of existing keys with no deletions/new fields") would make the invariant self-documenting and prevent future confusion. The R8 regression test is a good defensive addition regardless.

coderabbitai · 2026-05-18T10:57:49Z

+
+```lua
+tab.model = "gpt4o"
+for k, v in qjson.pairs(tab) do
+    print(k, v)
+end
+```
+
+**Handling**: `__pairs` must merge original fields with patches and exclude deleted fields.
+


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Missing algorithm specification for __pairs iteration.

The edge case section mentions that "__pairs must merge original fields with patches and exclude deleted fields" (line 340), but the algorithm for implementing this is not provided. Given the complexity of correctly merging three sources (original fields, patches, deletions), this algorithm should be specified.

Add a section showing the __pairs implementation, for example:

LazyObject.__pairs = function(t) local patches = rawget(t, "_patches") or {} local deleted = rawget(t, "_deleted") or {} local seen = {} local patch_map = {} -- Build patch map for _, p in ipairs(patches) do patch_map[p.key] = p.lua_value seen[p.key] = true end -- Iterator state local orig_iter, orig_state, orig_key = original__pairs(t) local patch_idx = 1 local patch_keys = {} for k in pairs(patch_map) do patch_keys[`#patch_keys` + 1] = k end return function(state, k) -- Iterate original fields first (skipping deleted & patched) while true do local next_k, next_v = orig_iter(orig_state, k) if next_k == nil then break end if not deleted[next_k] and not seen[next_k] then seen[next_k] = true return next_k, patch_map[next_k] or next_v end k = next_k end -- Then iterate patched/new fields while patch_idx <= `#patch_keys` do local pk = patch_keys[patch_idx] patch_idx = patch_idx + 1 if not seen[pk] then seen[pk] = true return pk, patch_map[pk] end end return nil end, t, nil end

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/lazy-patch-spec.md` around lines 332 - 341, Specify and add a concrete implementation for LazyObject.__pairs that merges original fields with patch entries and excludes deletions: read raw _patches and _deleted tables, build a patch_map mapping patch.key -> lua_value and a seen set, iterate the original pairs (using the preserved original__pairs iterator) yielding keys not marked deleted and not overridden by patches (returning patch value when key exists in patch_map), then iterate remaining patch_map keys to yield new/overridden fields, ensuring each key is yielded only once; reference LazyObject.__pairs, rawget(t, "_patches"), rawget(t, "_deleted"), and original__pairs to locate and implement this behavior.

__pairs is implemented at lua/qjson/table.lua:169-225 (lazy_object_iter + LazyObject.__pairs): iterates original entries via qjson_cursor_object_entry_at, skips deleted, returns patched value when a patch covers the key, then iterates remaining patches and yields only those whose key is QJSON_NOT_FOUND in the original. Adding a pointer to this in the spec.

@membphis, thanks for the clarification! The implementation at lua/qjson/table.lua:169-225 covers the algorithm well — using qjson_cursor_object_entry_at to iterate originals, skipping deleted keys, returning patched values inline, and then draining new-only patches at the end is exactly the right approach. Adding a cross-reference from the spec's edge-case section to that implementation will close the documentation gap nicely.

- Move encode_lazy_object_walking_with_patches before encode_with_patches (Lua local function defined later was unresolvable at compile time, causing nil-call errors on new-field/deletion paths). - Update tests in lazy_patch_spec.lua and lazy_table_spec.lua to match the spec's edge-case-1 design: writing to a LazyObject records a patch on the object instead of materializing it, so the metatable stays LazyObject after assignment.

membphis · 2026-05-18T13:44:32Z

Responses to the three nitpicks in the top-level review:

src/ffi.rs 840-867 (extract cursor_byte_range helper) — declining for now. Two call sites with different return conventions (qjson_cursor_bytes writes via *mut usize outparams; qjson_cursor_field_bytes does the same but after first resolving a child cursor). The shared logic is ~10 lines and only one path. Adding the helper introduces an indirection without consolidating any divergent logic. Will revisit if a third call site appears.
tests/lua/lazy_patch_spec.lua (delete non-existent field test) — adding.
docs/lazy-patch-spec.md 196-208 (undefined find_field_value_span / find_field_span helpers) — these are pseudocode names in the design listing; the actual implementation calls C.qjson_cursor_field_bytes (rust FFI defined in src/ffi.rs:821). Adding an upfront note that the doc is high-level pseudocode and pointing at the FFI symbol.

- LazyObject.__newindex: raise on non-string keys instead of silently recording an unreadable patch (Copilot C3). - Remove unused has_patches / has_deletions helpers (Copilot C2). - encode_with_patches: classify each patch in a single qjson_cursor_field_bytes pass (was doing two) and propagate unexpected return codes via check() (Copilot C4 + CodeRabbit R6). - encode_lazy_object_walking_with_patches: propagate unexpected qjson_cursor_field_bytes return codes (CodeRabbit R7). - lazy_object_iter: same error-propagation fix (CodeRabbit R5). - tests/lua/lazy_patch_spec.lua: extend special-key test to verify encoded output, swap the unicode test to multi-byte UTF-8, add "delete non-existent field", "all originals deleted + new fields added", and "non-string key raises" cases (CodeRabbit R8/R9/R10 + nitpick N2). - docs/lazy-patch-spec.md: add upfront note that the listing is design pseudocode (pointing helper names at qjson_cursor_field_bytes), and correct the patch-record example to include lua_value alongside encoded_value (CodeRabbit R1 + nitpick N3).

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/lua/lazy_patch_spec.lua (1)

120-128: ⚡ Quick win

Add a regression test for mutating a patched table after assignment.

Please add a case like t.a = {x=1}; t.a.x = 2; and assert qjson.encode(t) contains x=2. Current coverage only checks immediate assignment snapshots.

Also applies to: 251-260

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/lua/lazy_patch_spec.lua` around lines 120 - 128, Add a regression case
to the "changes scalar to object" test in lazy_patch_spec.lua that verifies
mutations to a patched table after assignment are preserved: after creating t
via qjson.decode and assigning t.a = {x = 1}, mutate the nested field (t.a.x =
2) and then qjson.encode(t) and decode with cjson (or assert the encoded string)
to assert the value is 2; update the existing it("changes scalar to object"...)
block (and similar block around lines 251-260) to perform t.a = {x=1}; t.a.x =
2; and assert the encoded/decoded result shows x == 2 so the lazy patching
preserves later mutations.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/lazy-patch-spec.md`:
- Line 472: In docs/lazy-patch-spec.md update the two fenced code blocks that
currently have bare backticks (the blocks showing the
"decode/modify/encode/total" timings around the examples including "decode:  
5.97 μs (20%)" and "decode:   5.97 μs (55%)") to include a language identifier
(e.g., change ``` to ```text or ```lua) so the MD040 lint rule is satisfied;
locate the fences that wrap the timing lines and add the chosen language token
to the opening fence.

In `@lua/qjson/table.lua`:
- Around line 351-366: The code stores a one-time snapshot in p.encoded_value
which becomes stale if p.lua_value (a table) is mutated later; instead stop
caching encoded bytes: remove p.encoded_value from patches and always call
encode(p.lua_value) at emit time (replace reads of p.encoded_value with
encode(p.lua_value)). Update the creation/update site (where patches[`#patches`+1]
and the loop that sets p.encoded_value and p.lua_value) to only store lua_value
= v, and modify all emit sites that reference p.encoded_value (including the
places noted around the other ranges) to re-encode via encode(p.lua_value) so
mutations are serialized correctly.

---

Nitpick comments:
In `@tests/lua/lazy_patch_spec.lua`:
- Around line 120-128: Add a regression case to the "changes scalar to object"
test in lazy_patch_spec.lua that verifies mutations to a patched table after
assignment are preserved: after creating t via qjson.decode and assigning t.a =
{x = 1}, mutate the nested field (t.a.x = 2) and then qjson.encode(t) and decode
with cjson (or assert the encoded string) to assert the value is 2; update the
existing it("changes scalar to object"...) block (and similar block around lines
251-260) to perform t.a = {x=1}; t.a.x = 2; and assert the encoded/decoded
result shows x == 2 so the lazy patching preserves later mutations.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d05f2b5c-890c-485e-9098-2ca7cf5b41ac

📥 Commits

Reviewing files that changed from the base of the PR and between 69e6f77 and 208d8b9.

📒 Files selected for processing (4)

docs/lazy-patch-spec.md
lua/qjson/table.lua
tests/lua/lazy_patch_spec.lua
tests/lua/lazy_table_spec.lua

coderabbitai · 2026-05-18T13:52:38Z

+
+### Current Implementation Profile
+
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifiers to fenced code blocks (Line 472, Line 487).

These fences trigger MD040 and can fail markdown linting in CI. Add a language like text (or lua if preferred).

Proposed fix

-``` +```text decode: 5.97 μs (20%) modify: 0.37 μs (1%) -- triggers materialization encode: 12.55 μs (41%) -- walks materialized root total: 30.34 μs @@ -``` +```text decode: 5.97 μs (55%) modify: 0.10 μs (1%) -- just records patch encode: 4.80 μs (44%) -- splices original buffer total: 10.87 μs

Also applies to: 487-487

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 472-472: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/lazy-patch-spec.md` at line 472, In docs/lazy-patch-spec.md update the two fenced code blocks that currently have bare backticks (the blocks showing the "decode/modify/encode/total" timings around the examples including "decode: 5.97 μs (20%)" and "decode: 5.97 μs (55%)") to include a language identifier (e.g., change ``` to ```text or ```lua) so the MD040 lint rule is satisfied; locate the fences that wrap the timing lines and add the chosen language token to the opening fence.

coderabbitai · 2026-05-18T13:52:38Z

+        -- Encode the new value (we need both encoded and lua_value)
+        local encoded = encode(v)
+
+        -- Update or add patch
+        local found = false
+        for _, p in ipairs(patches) do
+            if p.key == k then
+                p.encoded_value = encoded
+                p.lua_value = v
+                found = true
+                break
+            end
+        end
+        if not found then
+            patches[#patches + 1] = { key = k, encoded_value = encoded, lua_value = v }
+        end


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Patched table values can be serialized from stale snapshots.

Line 352 snapshots encoded_value once, but later mutations to p.lua_value (e.g. t.a = {x=1}; t.a.x = 2) are not re-encoded. The emit sites then use stale bytes and can drop updates.

💡 Minimal fix approach

+local function patch_encoded_value(p) + if type(p.lua_value) == "table" then + return encode(p.lua_value) + end + return p.encoded_value +end ... - parts[`#parts` + 1] = encode_string(k) .. ":" .. patch.encoded_value + parts[`#parts` + 1] = encode_string(k) .. ":" .. patch_encoded_value(patch) ... - parts[`#parts` + 1] = encode_string(p.key) .. ":" .. p.encoded_value + parts[`#parts` + 1] = encode_string(p.key) .. ":" .. patch_encoded_value(p) ... - value = p.encoded_value, + value = patch_encoded_value(p),

Also applies to: 610-611, 629-629, 653-657

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lua/qjson/table.lua` around lines 351 - 366, The code stores a one-time snapshot in p.encoded_value which becomes stale if p.lua_value (a table) is mutated later; instead stop caching encoded bytes: remove p.encoded_value from patches and always call encode(p.lua_value) at emit time (replace reads of p.encoded_value with encode(p.lua_value)). Update the creation/update site (where patches[`#patches`+1] and the loop that sets p.encoded_value and p.lua_value) to only store lua_value = v, and modify all emit sites that reference p.encoded_value (including the places noted around the other ranges) to re-encode via encode(p.lua_value) so mutations are serialized correctly.

Copilot AI review requested due to automatic review settings May 18, 2026 10:49

Copilot started reviewing on behalf of membphis May 18, 2026 10:49 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Conversation

membphis commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Changes

Status

Test Results

Summary by CodeRabbit

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

membphis May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

membphis commented May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

membphis commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading

coderabbitai Bot May 18, 2026 •

edited

Loading

coderabbitai Bot May 18, 2026 •

edited

Loading