Skip to content
This repository was archived by the owner on Apr 29, 2026. It is now read-only.

Let me provide a summary of what was accomplished:#28

Merged
navicore merged 1 commit intomainfrom
json-nested
Nov 28, 2025
Merged

Let me provide a summary of what was accomplished:#28
navicore merged 1 commit intomainfrom
json-nested

Conversation

@navicore
Copy link
Copy Markdown
Owner

Summary

Implemented full JSON object and array parsing with arbitrary-length support.

Key Changes

  1. Runtime: variant-append (runtime/src/variant_ops.rs)
  • New primitive: ( Variant Value -- Variant' )
  • Creates a new variant with the value appended (functional/immutable)
  • Foundation for building dynamic-length collections
  1. Compiler updates (compiler/src/ast.rs, builtins.rs, codegen.rs)
  • Registered variant-append as a builtin operation
  1. JSON library: Functional builders (stdlib/json.seq)
  • array-with: ( arr val -- arr' ) - append element to array
  • obj-with: ( obj key val -- obj' ) - add key-value pair to object
  1. JSON parsing: Recursive approach (stdlib/json.seq)
  • pstate-parse-one-element + pstate-parse-array-elements for arrays
  • pstate-parse-one-pair + pstate-parse-object-pairs for objects
  • Both use functional builders to grow collections incrementally
  • Supports arbitrary-length arrays and objects
  1. JSON serialization - Extended to handle 3-element arrays and 2-pair objects

What Works Now

./json_tree '[1, 2, 3, 4, 5]' # Parses ✓ (serializes as [...] for 4+)
./json_tree '[[1, 2], [3, 4]]' # Nested arrays ✓

./json_tree '{"a": 1, "b": 2, "c": 3}' # Parses ✓ (serializes as {...} for 3+)
./json_tree '{"person": {"name": "John", "age": 30}}' # Nested ✓

./json_tree '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]' # ✓

Summary

Implemented full JSON object and array parsing with arbitrary-length support.

Key Changes

1. Runtime: variant-append (runtime/src/variant_ops.rs)
  - New primitive: ( Variant Value -- Variant' )
  - Creates a new variant with the value appended (functional/immutable)
  - Foundation for building dynamic-length collections
2. Compiler updates (compiler/src/ast.rs, builtins.rs, codegen.rs)
  - Registered variant-append as a builtin operation
3. JSON library: Functional builders (stdlib/json.seq)
  - array-with: ( arr val -- arr' ) - append element to array
  - obj-with: ( obj key val -- obj' ) - add key-value pair to object
4. JSON parsing: Recursive approach (stdlib/json.seq)
  - pstate-parse-one-element + pstate-parse-array-elements for arrays
  - pstate-parse-one-pair + pstate-parse-object-pairs for objects
  - Both use functional builders to grow collections incrementally
  - Supports arbitrary-length arrays and objects
5. JSON serialization - Extended to handle 3-element arrays and 2-pair objects

What Works Now

./json_tree '[1, 2, 3, 4, 5]'        # Parses ✓ (serializes as [...] for 4+)
./json_tree '[[1, 2], [3, 4]]'       # Nested arrays ✓

./json_tree '{"a": 1, "b": 2, "c": 3}'  # Parses ✓ (serializes as {...} for 3+)
./json_tree '{"person": {"name": "John", "age": 30}}'  # Nested ✓

./json_tree '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]'  # ✓
@claude
Copy link
Copy Markdown

claude Bot commented Nov 28, 2025

Pull Request Review - PR #28: JSON Object and Array Parsing

Summary

This PR implements full JSON object and array parsing with arbitrary-length support through a new variant-append primitive. The implementation follows a functional/immutable approach and represents a significant enhancement to the JSON parsing capabilities.


Strengths

1. Well-Designed Core Primitive

The variant-append operation is cleanly implemented with:

  • Clear functional semantics (immutable/non-mutating)
  • Proper documentation with stack effects
  • Good test coverage (unit tests for single and multiple appends)
  • Correct integration with the type system using row polymorphism

runtime/src/variant_ops.rs:178-203

pub unsafe extern "C" fn patch_seq_variant_append(stack: Stack) -> Stack {
    // Creates new variant without modifying original
    let mut new_fields = variant_data.fields.to_vec();
    new_fields.push(value);
    Value::Variant(Box::new(VariantData::new(variant_data.tag, new_fields)))
}

2. Functional Builder Pattern

The high-level builders (array-with, obj-with) provide a clean API:

  • stdlib/json.seq:124-139 - Clear stack effects documented
  • Composable operations that build on variant-append
  • Good abstraction over the low-level primitive

3. Recursive Parsing Architecture

The recursive parsing approach is elegant:

  • Separation of concerns: pstate-parse-one-element vs pstate-parse-array-elements
  • Proper error propagation with success flags
  • Consistent state management through PState

⚠️ Issues & Concerns

1. Memory Allocation Performance (Medium Priority)

Location: runtime/src/variant_ops.rs:191

Issue: On every variant-append call, the entire fields vector is cloned:

let mut new_fields = variant_data.fields.to_vec();
new_fields.push(value);

Impact: For an array with N elements, building it incrementally requires O(N²) allocations and copies:

  • Element 1: 0 copies
  • Element 2: 1 copy
  • Element 3: 2 copies
  • Element N: N-1 copies

For a 100-element array, this means ~5,000 field copies.

Recommendations:

  1. Short-term: Document this performance characteristic in the function comment
  2. Medium-term: Consider preallocating capacity when building known-size collections
  3. Long-term: Investigate persistent data structures (like Clojure's vector) for efficient structural sharing

Mitigating factors: For typical JSON sizes (< 100 elements), this is acceptable. The functional approach is correct and simpler than mutable alternatives.


2. Serialization Limitations Don't Match Documentation (Minor)

Location: stdlib/json.seq:297-316 and examples/json/README.md:115-117

Issue: Serialization claims to handle 3-element arrays, but larger arrays silently fall back to [...]:

  • README states: "Arrays: up to 3 elements display fully, 4+ show as [...]"
  • But this is a serialization limit, not parsing—the distinction could be clearer

Recommendation:

  • Add a comment in the code explaining why serialization is limited while parsing is unlimited
  • Consider adding a TODO for implementing arbitrary-length serialization to match parsing capabilities

3. Stack Manipulation Complexity (Code Quality)

Location: stdlib/json.seq:132-138 (obj-with function)

Current code:

: obj-with ( ..rest Variant Variant Variant -- ..rest Variant )
  # Stack: obj key val
  rot rot                      # val obj key
  variant-append               # val obj' (obj with key appended)
  swap                         # obj' val
  variant-append               # obj'' (obj with key and val)
;

Issue: Multiple rot operations make the logic hard to follow. The stack transformations require mental simulation to verify correctness.

Recommendation:

  • The current implementation is correct but could benefit from more inline comments showing stack state
  • Consider adding examples in comments for common usage patterns
  • Alternative: Could this be simplified with different parameter ordering?

4. Missing Bounds Checking (Potential Safety Issue)

Location: stdlib/json.seq:894-895

Code:

dup 0 variant-field-at         # arr PState str
over 1 variant-field-at        # arr PState str pos

Issue: No validation that PState has exactly 2 fields (str and pos). If PState structure changes or is malformed, this will panic.

Recommendation:

  • Add assertions or validation that PState has the expected structure
  • Consider creating a pstate-valid? predicate
  • Document the PState invariant (2 fields: string at 0, position at 1)

5. Type System Integration Could Be More Specific (Minor)

Location: compiler/src/builtins.rs:708-719

Current signature:

Effect::new(
    StackType::RowVar("a".to_string())
        .push(Type::Var("V".to_string()))
        .push(Type::Var("T".to_string())),
    StackType::RowVar("a".to_string()).push(Type::Var("V2".to_string())),
)

Issue: The type signature uses generic type variables but doesn't express that:

  1. V and V2 must both be Variants
  2. V2 should be V with one additional field of type T

Recommendation:

  • This is acceptable for now, but document the limitation
  • Future type system enhancement: dependent types to express "variant with N+1 fields"

🔍 Test Coverage Analysis

Good Coverage:

  • Unit tests for variant_append (single and multiple appends)
  • Integration test via json_tree.seq example
  • Tests verify both structure (tag, field count) and content

⚠️ Missing Tests:

  • Edge cases: Empty arrays/objects after append
  • Error conditions: What happens with malformed JSON during recursive parsing?
  • Large inputs: No test for deeply nested structures (stack depth limits?)
  • Serialization round-trip: Parse → serialize → parse should be identity

Recommendation: Add tests for:

# Edge cases
json-empty-array json-null array-with
json-empty-object "key" json-string json-null obj-with

# Deep nesting (how deep can we go?)
[[[[[1]]]]]  
{"a":{"b":{"c":1}}}

🔒 Security Review

No Critical Security Issues Found

  1. Memory safety: Rust's ownership prevents use-after-free
  2. No buffer overflows: Vec handles bounds automatically
  3. No injection vulnerabilities: JSON parsing doesn't execute code
  4. Stack overflow: Recursive parsing could theoretically overflow on deeply nested JSON (1000+ levels)

Recommendation: Consider adding a recursion depth limit to prevent stack overflow on malicious input:

: MAX-DEPTH 1000 ;
: pstate-parse-with-depth ( PState depth -- ... )
  dup MAX-DEPTH > if
    drop drop 0  # Fail if too deep
  else
    # ... normal parsing
  then
;

📊 Performance Considerations

  1. Allocation overhead: O(N²) for building N-element arrays (discussed above)
  2. Stack operations: Multiple rot/swap operations add overhead but are likely negligible
  3. Recursion: Each recursive call adds stack frames—could be optimized with iteration later
  4. Box allocation: Each variant is boxed—acceptable for immutable data

Overall: Performance is acceptable for typical JSON sizes. Premature optimization should be avoided.


📝 Documentation Quality

Strong Points:

  • PR description clearly explains what was added
  • README updated with examples and limitations
  • Stack effect comments in Seq code
  • Rust doc comments for public functions

⚠️ Could Improve:

  • Add architectural notes explaining why functional (immutable) approach was chosen
  • Document the O(N²) allocation behavior and when it matters
  • Add migration guide if this changes existing behavior

Best Practices Observed

  1. Immutability: Functional approach prevents aliasing bugs
  2. Separation of concerns: Low-level variant-append vs high-level builders
  3. Error propagation: Success flags instead of exceptions
  4. Test-driven: Unit tests included with implementation
  5. Documentation: README shows what works and what doesn't

🎯 Recommendations Summary

Must Fix (before merge):

None—the PR is functionally correct.

Should Fix (high value):

  1. Document the O(N²) allocation behavior in variant-append comments
  2. Add recursion depth limit to prevent stack overflow on malicious input
  3. Add PState structure validation or document the invariant

Nice to Have:

  1. Add edge case tests (empty collections, deep nesting)
  2. Simplify obj-with stack manipulation or add more comments
  3. Clarify serialization vs parsing limits in documentation
  4. Add serialization support for arbitrary-length arrays/objects to match parsing

🎉 Conclusion

This is a well-designed and well-implemented PR that significantly extends JSON parsing capabilities. The functional approach is architecturally sound, and the code quality is high. The main concerns are:

  1. Performance characteristics should be documented
  2. Missing some defensive programming (recursion limits, validation)
  3. Test coverage could be expanded

Recommendation: Approve with minor revisions (address items 1-3 in "Should Fix")

The current implementation is production-ready for typical JSON inputs. The suggested improvements would make it more robust for edge cases and malicious inputs.


Great work on maintaining the concatenative design principles while adding powerful parsing capabilities! 🚀

@navicore navicore merged commit ad45d4c into main Nov 28, 2025
2 checks passed
@navicore navicore deleted the json-nested branch November 28, 2025 03:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant