
feat: Add new feature related to debugging and tracing #31

Merged: TheUncharted merged 10 commits into master from develop on Mar 12, 2026
Conversation

TheUncharted (Owner) commented Mar 12, 2026

Summary

Add new feature related to debugging and tracing

Changes

  • New Features

    • Added execution tracing to display per-phase timing and execution paths for debugging.
    • Added debug mode enabling step-by-step code, tool call, and output logging.
    • Added auto-fix feature allowing automatic error recovery and self-correction.
  • Documentation

    • Enhanced main documentation with new sections on tracing, debugging, and auto-fix capabilities.
    • Reorganized examples with individual language-specific directories and setup guides.
    • Added new example-specific READMEs and project configurations for TypeScript, Python, and Rust.

Test plan

  • Unit tests pass (cargo test)
  • CI passes

Summary by CodeRabbit

Release Notes

  • New Features

    • Added execution tracing to capture detailed timing and hierarchical information across parsing, compilation, and execution phases.
    • Added debug mode for detailed logging of generated code, tool calls, and execution results.
    • Added auto-fix mode to automatically handle and recover from execution errors.
  • Examples & Documentation

    • Reorganized examples with improved directory structure and new debug-tracing examples.
    • Added comprehensive documentation for all example categories and use cases.

Adds a structured trace tree (TraceSpan) that captures timing for each
phase of execution: parse → compile → execute. This gives AI agent
developers visibility into where time is spent inside the sandbox,
which is critical for debugging latency in production agent loops
where Zapcode executes thousands of code snippets.

Each span records name, status (Ok/Error), start/end timestamps,
duration in microseconds, key-value attributes (e.g. suspended
function name, argument count), and child spans.
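
The span layout described above can be pictured with a small data structure. This is only a sketch in Python: the field names (`start_us`, `end_us`, `suspended_on`, `arg_count`) and the sample timings are illustrative assumptions, not the actual definition in zapcode-core.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class TraceStatus(Enum):
    OK = "ok"
    ERROR = "error"


@dataclass
class TraceSpan:
    # Hypothetical field names; the real type lives in the Rust crate.
    name: str
    status: TraceStatus
    start_us: int  # start timestamp in microseconds
    end_us: int    # end timestamp in microseconds
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["TraceSpan"] = field(default_factory=list)

    @property
    def duration_us(self) -> int:
        """Duration in microseconds, derived from the two timestamps."""
        return self.end_us - self.start_us


# A made-up run: parse -> compile -> execute, with the execute span
# carrying suspension attributes.
root = TraceSpan("run", TraceStatus.OK, 0, 180, children=[
    TraceSpan("parse", TraceStatus.OK, 0, 40),
    TraceSpan("compile", TraceStatus.OK, 40, 70),
    TraceSpan("execute", TraceStatus.OK, 70, 180,
              attributes={"suspended_on": "getWeather", "arg_count": "1"}),
])
```

Deriving the duration from the timestamps keeps the two from drifting apart; a real implementation may well store it directly.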

Surfaces the trace tree from zapcode-core through all three binding
layers so that agent developers can inspect execution timing regardless
of their language. Without this, trace data was only accessible from
Rust; now every SDK gets the same observability.

13 tests covering trace structure, timing validity, error handling,
suspension attributes, pretty printing, and independence of multiple
runs. Ensures the trace system is correct before building higher-level
features (autoFix, debug logging) on top of it.

autoFix catches execution errors and returns them as tool results
instead of throwing, letting the LLM see the error and self-correct
on the next step. This eliminates the main risk of code execution:
a single bad generation no longer kills the entire agent loop.
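
The autoFix behaviour described above amounts to turning an exception into a value the model can read. A minimal sketch, assuming a hypothetical `run_with_auto_fix` helper and a stand-in `fake_execute` in place of the real sandbox; neither name comes from the zapcode-ai packages.

```python
from typing import Any, Callable, Dict


def run_with_auto_fix(execute: Callable[[str], Any], code: str,
                      auto_fix: bool = True) -> Dict[str, Any]:
    """Run `code`; with auto_fix on, report failures as data instead of raising."""
    try:
        return {"ok": True, "output": execute(code)}
    except Exception as exc:
        if not auto_fix:
            raise  # previous behaviour: the error aborts the agent loop
        # The error text becomes the tool result, so the model can see it
        # and try a corrected snippet on its next step.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


def fake_execute(code: str) -> str:
    # Stand-in for the sandbox, used only to make this sketch runnable.
    if "undefinedVar" in code:
        raise NameError("undefinedVar is not defined")
    return "42"


good = run_with_auto_fix(fake_execute, "40 + 2")
bad = run_with_auto_fix(fake_execute, "undefinedVar + 1")
```

With `auto_fix=False` the same failing call raises, which is the behaviour the commit message describes as killing the agent loop.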

Execution trace collects a session-level span tree across all
executions, accessible via printTrace()/getTrace() (TS) and
print_trace()/get_trace() (Python). This gives developers a single
view of every code execution, tool call, and retry in the session.
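
A session-level trace of this kind can be sketched as a root span that collects one child per execution. The `TraceSession` class, its `record`/`render`/`reset` methods, and the span names below are assumptions made for illustration; only `get_trace()` and `print_trace()` are names taken from the description above, and their real signatures may differ.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    name: str
    start: float = field(default_factory=time.perf_counter)
    end: Optional[float] = None
    children: List["Span"] = field(default_factory=list)

    def finish(self) -> "Span":
        self.end = time.perf_counter()
        return self


class TraceSession:
    """Hypothetical session collector: one root span, one child per execution."""

    def __init__(self) -> None:
        self.root = Span("session")

    def record(self, span: Span) -> None:
        self.root.children.append(span)

    def get_trace(self) -> Span:
        return self.root

    def render(self) -> str:
        lines: List[str] = []

        def walk(span: Span, depth: int) -> None:
            # An unfinished span is measured up to "now".
            end = span.end if span.end is not None else time.perf_counter()
            lines.append(f"{'  ' * depth}{span.name} {(end - span.start) * 1000:.2f}ms")
            for child in span.children:
                walk(child, depth + 1)

        walk(self.root, 0)
        return "\n".join(lines)

    def print_trace(self) -> None:
        print(self.render())

    def reset(self) -> None:
        # Start a fresh tree, e.g. at the beginning of a new conversation.
        self.root = Span("session")


session = TraceSession()
session.record(Span("execute#1").finish())
session.record(Span("execute#2").finish())
session.print_trace()
```

An explicit `reset()` is one way to keep spans from one conversation out of the next.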

Both features are implemented in the TypeScript and Python AI packages.

Restructures examples/ from a flat layout to language-first,
topic-second (e.g. examples/typescript/debug-tracing/). Each example
is now a self-contained project with its own package.json/pyproject.toml,
making it easy to cd in and run without affecting other examples.

Also adds debug-tracing examples (TypeScript + Python) that demonstrate
autoFix, step-by-step logging of generated code and tool calls, and
execution trace printing — serving as the reference for developers
who want full observability into their agent's code execution.

These features were implemented but undiscoverable: a user looking at
the README had no idea they could enable error recovery or inspect
execution timing. Adds a dedicated section explaining the why (LLM
self-correction, production observability) and links to the
debug-tracing examples for step-by-step logging patterns.

Also updates example paths to reflect the new directory structure.

coderabbitai bot (Contributor) commented Mar 12, 2026

Walkthrough

This PR adds an execution tracing system (TraceSpan, TraceStatus, ExecutionTrace) to core, instruments VM parse/compile/execute flows to emit traces, exposes traces through JS/Python/WASM bindings, extends zapcode-ai (JS/Python) with debug/auto-fix and tracing, and reorganizes examples and docs to per-example subdirectories and new tracing/debug examples.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Execution Tracing Core: crates/zapcode-core/src/lib.rs, crates/zapcode-core/src/trace.rs, crates/zapcode-core/tests/trace.rs | Adds trace module, public types TraceSpan, TraceStatus, ExecutionTrace, SpanBuilder utilities, pretty-printing, serialization, and comprehensive tests validating structure, timing, errors, suspension, and printing. |
| VM Integration: crates/zapcode-core/src/vm/mod.rs | Instruments parse/compile/execute with SpanBuilder, attaches ExecutionTrace to RunResult, records statuses and attributes, and returns traces on success/error/suspension. |
| JavaScript Bindings: crates/zapcode-js/src/lib.rs | Adds JsTraceSpan, converts core traces to JS, extends ZapcodeResult with trace, and threads ExecutionTrace through run/start/resume via updated vm_state_to_either. |
| Python Bindings: crates/zapcode-py/src/lib.rs | Adds trace serialization (trace_span_to_py), includes trace in run_result_to_py, and unifies start/run path to return trace when present. |
| WASM Bindings: crates/zapcode-wasm/src/lib.rs | Adds trace-aware conversion in vm_state_to_js, populates optional trace field for start/run results, and keeps resume behavior consistent. |
| Zapcode-AI JS Package: packages/zapcode-ai/src/index.ts | Introduces TraceSpan and tracing utilities, extends ExecutionResult with code, error, trace, adds debug/autoFix options, records per-step and per-tool spans, and exposes getTrace/printTrace. |
| Zapcode-AI Python Package: packages/zapcode-ai-python/src/zapcode_ai/__init__.py | Adds TraceSpan dataclass, session and per-attempt tracing, extends ExecutionResult, wires debug/auto_fix through execution, and exposes get_trace/print_trace. |
| Examples (TypeScript): examples/typescript/basic/..., examples/typescript/ai-agent/..., examples/typescript/debug-tracing/..., examples/typescript/ai-bedrock/... | Restructures TS examples into per-example folders with README/package.json/main.ts, adds debug-tracing example demonstrating Bedrock, updates inline docs and run scripts, and switches some package deps to local file: paths. |
| Examples (Python): examples/python/basic/..., examples/python/ai-agent/..., examples/python/debug-tracing/..., examples/python/ai-bedrock/... | Adds per-example Python subprojects (pyproject.toml, README, main.py), new debug-tracing and ai-agent examples, updates docstrings and run instructions, and removes the obsolete top-level examples/python README. |
| Examples (Rust): examples/rust/basic/... | Creates Rust basic example subdir, updates local path in Cargo.toml, and adds README; removes parent Rust README. |
| Examples Index & Docs: examples/README.md, CONTRIBUTING.md, README.md | Updates examples overview and quick-start, updates CONTRIBUTING E2E example invocations to new per-example build/run steps, updates README references and adds "Auto-Fix, Debug & Execution Tracing" sections (duplicated in places). |
| CI Workflow: .github/workflows/ci.yml | Adjusts npm cache key and E2E CI paths to new example subdirectories (examples/typescript/basic, examples/python/basic). |
| Removed Top-level TS Configs: examples/typescript/package.json, examples/typescript/tsconfig.json, examples/typescript/README.md | Removes top-level TypeScript example package/config and README as examples moved to subdirectories. |
| Misc Examples README edits/deletions: examples/python/README.md, examples/rust/README.md, examples/ai-bedrock/... | Deletes or moves various example READMEs and adds targeted READMEs for new subfolders and integration examples. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant ZapcodeVM as Zapcode VM
    participant Parser
    participant Compiler
    participant Executor
    participant Trace as SpanBuilder/Trace

    Client->>ZapcodeVM: run(code)
    activate ZapcodeVM

    ZapcodeVM->>Trace: new("parse")
    Trace->>Parser: parse(code)
    Parser-->>Trace: parse_result
    Trace-->>ZapcodeVM: parse_span

    ZapcodeVM->>Trace: new("compile")
    Trace->>Compiler: compile(ast)
    Compiler-->>Trace: compile_result
    Trace-->>ZapcodeVM: compile_span

    ZapcodeVM->>Trace: new("execute")
    Trace->>Executor: execute(bytecode)
    Executor-->>Trace: execute_result
    Trace-->>ZapcodeVM: execute_span

    ZapcodeVM->>ZapcodeVM: assemble ExecutionTrace(root with children)
    ZapcodeVM-->>Client: RunResult{state, stdout, trace}
    deactivate ZapcodeVM
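
The flow in the diagram (open a span per phase, run the phase, close the span, assemble a root with children) can be sketched as follows. This is a Python illustration, not the Rust SpanBuilder: the `span` context manager, the dictionary layout, and the placeholder parse/compile/execute bodies are all assumptions.

```python
import time
from contextlib import contextmanager


@contextmanager
def span(parent, name):
    """Time one phase and attach the finished span to `parent` (hypothetical helper)."""
    node = {"name": name, "status": "ok", "children": []}
    start = time.perf_counter_ns()
    try:
        yield node
    except Exception as exc:
        node["status"] = "error"
        node["error"] = str(exc)
        raise
    finally:
        # Runs on success and on failure, so failed phases are recorded too.
        node["duration_us"] = (time.perf_counter_ns() - start) // 1000
        parent["children"].append(node)


def run(code):
    root = {"name": "run", "status": "ok", "children": []}
    try:
        with span(root, "parse"):
            tokens = code.split()                  # placeholder "parser"
        with span(root, "compile"):
            program = [t.upper() for t in tokens]  # placeholder "compiler"
        with span(root, "execute"):
            if not program:
                raise ValueError("empty program")
            output = " ".join(program)             # placeholder "executor"
    except Exception:
        root["status"] = "error"
        return {"output": None, "trace": root}
    return {"output": output, "trace": root}


ok_result = run("let x")
failed_result = run("")
```

Closing the span in `finally` means a failed phase still appears in the tree, so the trace is returned on both the success and the error path.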

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'feat: Add new feature related to debugging and tracing' is vague and generic, using non-descriptive terms that don't clearly convey the main changes. | Replace with a more specific title that clearly describes the primary feature, such as 'feat: Add execution tracing, debug mode, and auto-fix functionality' or similar. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description covers key features and changes with appropriate structure, though the test plan section remains incomplete with unchecked boxes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


CI was referencing old flat paths (examples/typescript/package.json,
examples/python/basic.py) which no longer exist after the move to
language-first, topic-second structure.

github-actions bot (Contributor) commented Mar 12, 2026

Benchmark Results

    Updating crates.io index
   Compiling zapcode-core v1.3.0 (/home/runner/work/zapcode/zapcode/crates/zapcode-core)
    Finished `bench` profile [optimized] target(s) in 8.35s
     Running unittests src/lib.rs (target/release/deps/zapcode_core-39d9b0417d9f9c10)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/execution.rs (target/release/deps/execution-1041eb5029624734)
Timer precision: 20 ns
execution                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ array_creation          3.766 µs      │ 84.17 µs      │ 3.912 µs      │ 4.923 µs      │ 100     │ 100
├─ fibonacci_10            175 µs        │ 234.7 µs      │ 176.3 µs      │ 180 µs        │ 100     │ 100
├─ function_call           6.782 µs      │ 18.04 µs      │ 7.012 µs      │ 7.229 µs      │ 100     │ 100
├─ loop_100                102.7 µs      │ 121.5 µs      │ 106.9 µs      │ 106.9 µs      │ 100     │ 100
├─ object_creation         7.523 µs      │ 19.26 µs      │ 7.723 µs      │ 8.057 µs      │ 100     │ 100
├─ promise_all_3           10.06 µs      │ 21.89 µs      │ 10.5 µs       │ 10.82 µs      │ 100     │ 100
├─ promise_catch_resolved  6.481 µs      │ 18.21 µs      │ 6.762 µs      │ 7.014 µs      │ 100     │ 100
├─ promise_resolve_await   4.478 µs      │ 6.641 µs      │ 4.658 µs      │ 4.689 µs      │ 100     │ 100
├─ promise_then_chain_3    13.73 µs      │ 44.51 µs      │ 14.39 µs      │ 15.08 µs      │ 100     │ 100
├─ promise_then_single     7.884 µs      │ 29.92 µs      │ 8.31 µs       │ 8.669 µs      │ 100     │ 100
├─ simple_expression       3.455 µs      │ 6.401 µs      │ 3.576 µs      │ 3.624 µs      │ 100     │ 100
├─ string_concat           4.086 µs      │ 6.952 µs      │ 4.222 µs      │ 4.271 µs      │ 100     │ 100
├─ template_literal        4.477 µs      │ 26.1 µs       │ 4.633 µs      │ 5.066 µs      │ 100     │ 100
╰─ variable_arithmetic     4.478 µs      │ 9.226 µs      │ 4.757 µs      │ 4.917 µs      │ 100     │ 100


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 18

🧹 Nitpick comments (2)
packages/zapcode-ai-python/README.md (1)

1-5: Add a minimal install/usage snippet here.

This README will likely be read standalone, so pointing only to the repo README makes the package page a dead end for first-time users.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/README.md` around lines 1 - 5, Add a minimal
install and usage snippet to the package README so first-time users can get
started without visiting the repo README: include a short "Install" line showing
how to install the zapcode-ai package (e.g., pip or similar), and a concise
"Usage" example that demonstrates importing the package, creating the primary
client/entry object, and calling one simple method (e.g., run/execute) with a
brief comment about expected output; place this under the existing header in
packages/zapcode-ai-python/README.md so readers immediately see installation and
a one-shot usage example.
examples/typescript/ai-agent/package.json (1)

6-8: Use the pinned local tsx binary in these npm scripts.

npm run already resolves node_modules/.bin, so the npx prefix is unnecessary here and makes the example less reproducible than just invoking the devDependency directly.

♻️ Proposed refactor
-    "agent": "npx tsx ai-agent-zapcode-ai.ts",
-    "agent:anthropic": "npx tsx ai-agent-anthropic.ts",
-    "agent:vercel": "npx tsx ai-agent-vercel-ai.ts"
+    "agent": "tsx ai-agent-zapcode-ai.ts",
+    "agent:anthropic": "tsx ai-agent-anthropic.ts",
+    "agent:vercel": "tsx ai-agent-vercel-ai.ts"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/typescript/ai-agent/package.json` around lines 6 - 8, The npm
scripts "agent", "agent:anthropic", and "agent:vercel" use "npx tsx ..." which
is unnecessary because npm run already resolves devDependencies; remove the "npx
" prefix so the scripts invoke the pinned local tsx binary directly (update the
values for "agent", "agent:anthropic", and "agent:vercel" to call "tsx
ai-agent-zapcode-ai.ts", "tsx ai-agent-anthropic.ts", and "tsx
ai-agent-vercel-ai.ts" respectively).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CONTRIBUTING.md`:
- Around line 63-64: Update the CONTRIBUTING.md lines under the E2E checks so
they are runnable shell commands instead of plain file paths: for "E2E JS"
replace "build bindings then run `examples/typescript/basic/main.ts`" with a
concrete command sequence (e.g., build bindings, install deps, then run the
TypeScript example via a specified runner such as `npx ts-node
examples/typescript/basic/main.ts` or `node
dist/examples/typescript/basic/main.js` if compiled) and for "E2E Python"
replace "build bindings then run `examples/python/basic/main.py`" with an
explicit command like `python3 examples/python/basic/main.py` (including any
required venv/installation steps or flags); ensure the updated text names the
required preparatory step (build bindings) and shows a single copy-pasteable
shell command for each platform (E2E JS and E2E Python).

In `@crates/zapcode-core/src/vm/mod.rs`:
- Around line 2297-2300: The start() method currently discards RunResult.trace
by calling self.run(...) and returning only result.state; change the API to
preserve and return the trace (either by: 1) modifying start() to return
Result<RunResult> instead of Result<VmState>, or 2) adding a new
start_with_trace / start_full that returns RunResult while keeping start() as a
convenience wrapper (if backward compatibility is required, have start() call
run() and map/runResult.state but provide the new trace-carrying variant).
Update callers that expect VmState to use the new return type or unwrap the
.state from RunResult, and ensure documentation/comments reference
RunResult.trace and the documented suspension API.
- Around line 2214-2219: The error-path builds an ExecutionTrace
(root_span.finish(TraceStatus::Error)) into a local _trace and then immediately
drops it by returning Err(e); instead preserve and return or attach that trace
to the error so callers can inspect it. Modify the error return at the
parse/compile/execute failure sites (where root_span,
parse_span.finish_error(...), ExecutionTrace and TraceStatus::Error are used and
you currently do return Err(e)) to wrap or augment the error with the
constructed ExecutionTrace (e.g. return Err(ErrorType::with_trace(e, trace)) or
convert to an error type that contains the ExecutionTrace), and update the
function return types/signatures accordingly; apply the same change to the other
two spots that build _trace (the blocks around compile_span and execute_span).
Ensure the chosen approach preserves the original error payload while making the
ExecutionTrace reachable to callers.

In `@crates/zapcode-wasm/src/lib.rs`:
- Around line 350-354: The resume() implementation is discarding tracing by
passing None to vm_state_to_js; instead preserve and forward the trace from the
VM state returned by self.inner.clone().resume(val). After obtaining state in
resume(), extract its trace (e.g., state.trace or similar field) and pass that
(or its cloned/borrowed form as the expected Option<&str>/Option<String>) into
vm_state_to_js rather than None so resumed executions keep end-to-end trace
data; update the call in resume() accordingly while keeping zapcode_err mapping
intact.

In `@examples/python/ai-bedrock/README.md`:
- Around line 5-19: Update the README Setup section to add a prerequisite note
that AWS credentials and model access are required before running the examples:
state that users must configure AWS credentials (environment variables, shared
credentials file, or an IAM role) and ensure their account has permission and
model access for the chosen MODEL_ID in the target AWS_REGION referenced in the
Run snippet (e.g., MODEL_ID and AWS_REGION overrides); keep the instruction
short and add a pointer to common AWS credential methods and to verify model
access in the AWS console or via the Bedrock permissions.

In `@examples/python/debug-tracing/main.py`:
- Line 93: The print statement in main.py uses an unnecessary f-string: change
the print call that currently reads print(f"Debug: ON | AutoFix: ON") to use a
plain string by removing the f prefix so it becomes print("Debug: ON | AutoFix:
ON"); this removes Ruff F541 by eliminating the unused f-string interpolation
while keeping behavior identical.

In `@examples/python/debug-tracing/README.md`:
- Around line 11-25: The README's setup/run instructions omit AWS credential and
region prerequisites required when using Bedrock-backed models (e.g.,
MODEL_ID=anthropic.claude-sonnet-4-20250514 in main.py); update the docs to
instruct users to configure the AWS credential chain (AWS CLI
~/.aws/credentials, environment variables, or IAM role) and to set AWS_REGION
(example: export AWS_REGION=us-east-1) before running python main.py, and
include a short note that missing credentials will cause authentication errors
on first run.

In `@examples/README.md`:
- Around line 5-21: The fenced tree block in examples/README.md is missing a
language tag; update the opening triple-backtick for that block (the line
containing "```" before the tree) to include a language identifier such as
"text" so it reads "```text", keeping the block content unchanged (locate the
fenced block shown under the examples/ tree and add the language tag).
- Around line 27-38: Update the quick-start commands so they are runnable from
the repository root: prefix each example path with "examples/" (e.g., change "cd
typescript/basic" to "cd examples/typescript/basic") and replace the macOS-only
"open wasm/basic/index.html" with a portable invocation such as using "xdg-open"
on Linux or a note to open the file in a browser (or provide both "xdg-open" and
"start" alternatives) so users on different OSes can run the WASM example
without first changing directories.

In `@examples/rust/basic/README.md`:
- Around line 5-8: Update the README entry for the "cargo run --example basic"
command to clarify the required working directory: either add a note "Run this
command from the examples/rust/basic/ directory" or provide the alternative full
command to run from the repo root using the manifest flag (use the Cargo.toml
for the basic example with --manifest-path and keep --example basic) so users
know how to run it from root versus inside the example folder.

In `@examples/typescript/ai-bedrock/README.md`:
- Around line 5-18: Update the README "Setup/Run" section to document the
required AWS authentication and model access before running npm start: explain
that Bedrock requires AWS credentials (e.g., AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY, optional AWS_SESSION_TOKEN or AWS_PROFILE via aws
configure), the correct AWS_REGION, and IAM permissions to call Bedrock and
access the specified MODEL_ID; instruct users how to set these (environment
variables or aws cli config) and note that MODEL_ID and AWS_REGION environment
variables shown (MODEL_ID, AWS_REGION) must correspond to a model the
account/region has access to.

In `@examples/typescript/debug-tracing/README.md`:
- Around line 29-52: The fenced code block that starts with "Model:
global.amazon.nova-2-lite-v1:0 | Region: eu-west-1" is missing a language tag,
causing markdownlint MD040; update the opening fence from ``` to ```text (or
another appropriate language) for that block so the example output is treated as
plain text; ensure the closing ``` remains and do not alter the block contents
(this applies to the fenced example containing the zapcode tool calls like
getWeather and searchFlights).

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py`:
- Around line 212-275: The returned ExecutionResult currently only includes the
synthetic exec_span (with tool_call children) and does not attach the Zapcode
engine's per-phase trace; fix by retrieving the underlying Zapcode trace from
the sandbox/state (e.g., call sandbox.get_trace() or use state["trace"] /
snapshot trace if available) after execution and merge or attach it to the
exec_span (or set ExecutionResult.trace to that full trace) so callers get the
parse/compile/execute timing tree; update the code around where exec_span is
created and where the ExecutionResult is returned to combine the engine trace
with exec_span (or replace exec_span) before constructing ExecutionResult.
- Around line 414-418: The session_trace and attempt_count are stored in the
closure created by zapcode(), causing trace state to persist across chats;
modify zapcode()/the returned ZapcodeAI so that session_trace and attempt_count
are reinitialized at the start of each new conversation or API call (e.g., move
their initialization into the per-conversation handler or add an explicit reset
method) and ensure get_trace() reads only the current per-conversation span
created via _create_span("session", ...) (also update any logic in the related
block around the existing 420-434 code to reference the per-call/session
variables rather than the outer closure values).
- Line 185: The f-string prefix on the literal "<1ms" is unnecessary and
triggers Ruff F541; update the assignment to remove the f-prefix so the ternary
assigns a plain string when span.duration_ms < 1 and keeps the f-string for the
else branch (i.e., change the duration assignment that references
span.duration_ms accordingly).
- Around line 247-256: The code calls tool_def.execute(named_args) synchronously
and may receive an awaitable from async tool implementations; detect awaitable
results (use inspect.isawaitable/asyncio.iscoroutine) after calling
tool_def.execute and resolve them before storing to tool_calls and passing to
snapshot.resume: if inside an async context await the result (result = await
result), otherwise run it to completion with the event loop (e.g.,
loop.run_until_complete(result) or asyncio.run(result)) so that tool_calls,
tool_span attribute logging, and snapshot.resume receive the actual resolved
value; update the block around tool_def.execute, tool_calls, tool_span, and
snapshot.resume accordingly.
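
The awaitable handling suggested in the last item could look roughly like this. It is a sketch under assumptions: `resolve_tool_result` and the two example tools are hypothetical names, and the choice to raise inside a running event loop (rather than block) is one possible policy, not what the package does.

```python
import asyncio
import inspect
from typing import Any


def resolve_tool_result(result: Any) -> Any:
    """Return a plain value, driving awaitables to completion when that is safe."""
    if not inspect.isawaitable(result):
        return result
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so blocking here is safe.
        async def _drive() -> Any:
            return await result
        return asyncio.run(_drive())
    # A loop is already running: blocking would deadlock, so the caller must await.
    raise RuntimeError("already inside an event loop; await the tool result instead")


def sync_tool(args: dict) -> int:
    return args["x"] + 1


async def async_tool(args: dict) -> int:
    await asyncio.sleep(0)
    return args["x"] * 2


sync_value = resolve_tool_result(sync_tool({"x": 1}))
async_value = resolve_tool_result(async_tool({"x": 4}))
```

Wrapping the awaitable in `_drive()` lets `asyncio.run` accept futures and other awaitables, not only coroutine objects.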

In `@packages/zapcode-ai/src/index.ts`:
- Around line 455-460: The execOptions object is hardcoding debug:false so
per-attempt traces never get forwarded into executeCode(); change execOptions to
pass the computed tracing flag (or the original debug flag) instead of false so
executeCode() receives debug=true when zapcode({ debug: true }) is used — update
the execOptions declaration (the variable named execOptions) to set its debug
property to tracing (or debug) and ensure any call sites that pass execOptions
into executeCode(...)/executeCode are using that updated object.
- Around line 318-387: The current executeCode path creates a top-level execSpan
via createSpan("execute", ...) which replaces the VM's own run trace so
parse/compile spans never appear; locate the VM/run-level trace produced by
Zapcode (look for the run/root span exposed by Zapcode instance or the
snapshot/state object returned from Zapcode.start()/ZapcodeSnapshotHandle.load —
e.g., a runSpan on the sandbox or state) and make the execSpan a child of that
run-level span instead of a standalone root: create execSpan using createSpan as
before but attach it into runSpan.children, and when ending spans use
endSpan(toolSpan) and endSpan(execSpan) then ensure you add execSpan to the
runSpan.children and pass runSpan into printTrace(debug) so the final trace
shows parse → compile → execute with tool_call children.

---

Nitpick comments:
In `@examples/typescript/ai-agent/package.json`:
- Around line 6-8: The npm scripts "agent", "agent:anthropic", and
"agent:vercel" use "npx tsx ..." which is unnecessary because npm run already
resolves devDependencies; remove the "npx " prefix so the scripts invoke the
pinned local tsx binary directly (update the values for "agent",
"agent:anthropic", and "agent:vercel" to call "tsx ai-agent-zapcode-ai.ts", "tsx
ai-agent-anthropic.ts", and "tsx ai-agent-vercel-ai.ts" respectively).

In `@packages/zapcode-ai-python/README.md`:
- Around line 1-5: Add a minimal install and usage snippet to the package README
so first-time users can get started without visiting the repo README: include a
short "Install" line showing how to install the zapcode-ai package (e.g., pip or
similar), and a concise "Usage" example that demonstrates importing the package,
creating the primary client/entry object, and calling one simple method (e.g.,
run/execute) with a brief comment about expected output; place this under the
existing header in packages/zapcode-ai-python/README.md so readers immediately
see installation and a one-shot usage example.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cb36cee8-57da-4a76-9f7c-c7762b1ea5ab

📥 Commits

Reviewing files that changed from the base of the PR and between 50486ae and ccfc2ee.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • examples/rust/basic/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (51)
  • .github/workflows/ci.yml
  • CONTRIBUTING.md
  • README.md
  • crates/zapcode-core/src/lib.rs
  • crates/zapcode-core/src/trace.rs
  • crates/zapcode-core/src/vm/mod.rs
  • crates/zapcode-core/tests/trace.rs
  • crates/zapcode-js/src/lib.rs
  • crates/zapcode-py/src/lib.rs
  • crates/zapcode-wasm/src/lib.rs
  • examples/README.md
  • examples/ai-bedrock/README.md
  • examples/python/README.md
  • examples/python/ai-agent/README.md
  • examples/python/ai-agent/ai_agent_anthropic.py
  • examples/python/ai-agent/ai_agent_zapcode_ai.py
  • examples/python/ai-agent/pyproject.toml
  • examples/python/ai-bedrock/README.md
  • examples/python/ai-bedrock/main.py
  • examples/python/ai-bedrock/pyproject.toml
  • examples/python/basic/README.md
  • examples/python/basic/main.py
  • examples/python/basic/pyproject.toml
  • examples/python/debug-tracing/README.md
  • examples/python/debug-tracing/main.py
  • examples/python/debug-tracing/pyproject.toml
  • examples/rust/README.md
  • examples/rust/basic/Cargo.toml
  • examples/rust/basic/README.md
  • examples/rust/basic/basic.rs
  • examples/typescript/README.md
  • examples/typescript/ai-agent/README.md
  • examples/typescript/ai-agent/ai-agent-anthropic.ts
  • examples/typescript/ai-agent/ai-agent-vercel-ai.ts
  • examples/typescript/ai-agent/ai-agent-zapcode-ai.ts
  • examples/typescript/ai-agent/package.json
  • examples/typescript/ai-bedrock/README.md
  • examples/typescript/ai-bedrock/main.ts
  • examples/typescript/ai-bedrock/package.json
  • examples/typescript/basic/README.md
  • examples/typescript/basic/main.ts
  • examples/typescript/basic/package.json
  • examples/typescript/debug-tracing/README.md
  • examples/typescript/debug-tracing/main.ts
  • examples/typescript/debug-tracing/package.json
  • examples/typescript/package.json
  • examples/typescript/tsconfig.json
  • examples/wasm/basic/index.html
  • packages/zapcode-ai-python/README.md
  • packages/zapcode-ai-python/src/zapcode_ai/__init__.py
  • packages/zapcode-ai/src/index.ts
💤 Files with no reviewable changes (6)
  • examples/ai-bedrock/README.md
  • examples/rust/README.md
  • examples/typescript/README.md
  • examples/python/README.md
  • examples/typescript/package.json
  • examples/typescript/tsconfig.json

Comment on lines +2214 to +2219
Err(e) => {
    root_span.add_child(parse_span.finish_error(&e.to_string()));
    let _trace = ExecutionTrace {
        root: root_span.finish(TraceStatus::Error),
    };
    return Err(e);

⚠️ Potential issue | 🟠 Major

The error-path trace never leaves this function.

Lines 2216, 2233, and 2276 build an ExecutionTrace into _trace and immediately drop it by returning Err(e). Callers still cannot inspect parse/compile/execute traces after a failure, even though this code already paid the cost to build them.

Also applies to: 2231-2236, 2274-2279
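
The general idea of keeping the trace reachable on failure, shown here as a Python analogue rather than the Rust change itself: the error carries the trace that was already built. `TracedError`, the toy `parse` function, and the dictionary layout are hypothetical; in the Rust code the equivalent would be an error type or variant that holds the ExecutionTrace.

```python
class TracedError(Exception):
    """Wraps the original error together with the trace built before failing."""

    def __init__(self, cause: Exception, trace: dict) -> None:
        super().__init__(str(cause))
        self.cause = cause  # original error payload, preserved
        self.trace = trace  # trace tree that would otherwise be dropped


def parse(source: str) -> str:
    # Toy stand-in for the real parser.
    if source.count("(") != source.count(")"):
        raise ValueError("unbalanced parentheses")
    return source


def run(source: str) -> dict:
    root = {"name": "run", "status": "ok", "children": []}
    try:
        parse(source)
        root["children"].append({"name": "parse", "status": "ok"})
    except ValueError as exc:
        root["children"].append({"name": "parse", "status": "error", "error": str(exc)})
        root["status"] = "error"
        # Attach the finished trace instead of discarding it.
        raise TracedError(exc, root) from exc
    return root


try:
    run("f(")
except TracedError as err:
    captured = err
```

Callers that only care about the failure can keep reading `err.cause`, while callers that want timing data read `err.trace`.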

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/zapcode-core/src/vm/mod.rs` around lines 2214 - 2219, The error-path
builds an ExecutionTrace (root_span.finish(TraceStatus::Error)) into a local
_trace and then immediately drops it by returning Err(e); instead preserve and
return or attach that trace to the error so callers can inspect it. Modify the
error return at the parse/compile/execute failure sites (where root_span,
parse_span.finish_error(...), ExecutionTrace and TraceStatus::Error are used and
you currently do return Err(e)) to wrap or augment the error with the
constructed ExecutionTrace (e.g. return Err(ErrorType::with_trace(e, trace)) or
convert to an error type that contains the ExecutionTrace), and update the
function return types/signatures accordingly; apply the same change to the other
two spots that build _trace (the blocks around compile_span and execute_span).
Ensure the chosen approach preserves the original error payload while making the
ExecutionTrace reachable to callers.
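
The pattern this prompt describes, wrapping the original error together with the trace built before the failure, can be sketched independently of Rust. The Python sketch below is illustrative only: `TraceSpan`, `TracedError`, and `run_with_trace` are hypothetical names, not Zapcode's API, and the actual change would be made in `crates/zapcode-core/src/vm/mod.rs`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TraceSpan:
    """Hypothetical stand-in for the engine's span type."""
    name: str
    status: str = "ok"
    children: list[TraceSpan] = field(default_factory=list)


class TracedError(Exception):
    """Wraps the original error and keeps the trace built before the failure."""

    def __init__(self, cause: Exception, trace: TraceSpan) -> None:
        super().__init__(str(cause))
        self.cause = cause  # original error payload is preserved
        self.trace = trace  # trace stays reachable to the caller


def run_with_trace(source: str) -> TraceSpan:
    """Toy 'parse' phase that fails on empty input (assumption for illustration)."""
    root = TraceSpan("run")
    parse_span = TraceSpan("parse")
    root.children.append(parse_span)
    try:
        if not source.strip():
            raise ValueError("empty source")  # stand-in for a parse failure
    except ValueError as err:
        parse_span.status = "error"
        root.status = "error"
        # Hand the finished trace to the caller instead of dropping it.
        raise TracedError(err, root) from err
    return root
```

A caller can then catch `TracedError` and read `err.trace` to see which phase failed, while `err.cause` still carries the original error.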

Comment on lines 2297 to 2300
 pub fn start(&self, input_values: Vec<(String, Value)>) -> Result<VmState> {
-    let program = crate::parser::parse(&self.source)?;
-    let ext_set: HashSet<String> = self.external_functions.iter().cloned().collect();
-    let compiled = crate::compiler::compile_with_externals(&program, ext_set.clone())?;
-    let mut vm = Vm::new(compiled, self.limits.clone(), ext_set);
-
-    for (name, value) in input_values {
-        vm.globals.insert(name, value);
-    }
-
-    vm.run()
+    let result = self.run(input_values)?;
+    Ok(result.state)
 }

⚠️ Potential issue | 🟠 Major

start() throws away the new trace.

Lines 2298-2299 call self.run() and then return only result.state. Direct Rust callers using the documented suspension API have no way to access tracing on this path; this needs a trace-carrying start API instead of discarding RunResult.trace.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/zapcode-core/src/vm/mod.rs` around lines 2297 - 2300, The start()
method currently discards RunResult.trace by calling self.run(...) and returning
only result.state; change the API to preserve and return the trace, either by
1) modifying start() to return Result<RunResult> instead of Result<VmState>, or
2) adding a new start_with_trace / start_full that returns RunResult while
keeping start() as a convenience wrapper (if backward compatibility is required,
have start() call run() and return only result.state, and provide the new
trace-carrying variant alongside it). Update callers that expect VmState to use
the new return type or unwrap the .state from RunResult, and ensure
documentation/comments reference RunResult.trace and the documented suspension
API.

Comment on lines 350 to 354
 pub fn resume(&self, return_value: JsValue) -> Result<JsValue, JsError> {
     let val = js_to_value(&return_value)?;
     let state = self.inner.clone().resume(val).map_err(zapcode_err)?;
-    vm_state_to_js(state, "")
+    vm_state_to_js(state, "", None)
 }

⚠️ Potential issue | 🟠 Major

resume() drops tracing on the main suspension path.

Lines 353-354 force trace to None, so any execution that suspends loses trace data as soon as it resumes. That leaves multi-step external-call flows without end-to-end tracing in the wasm API.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/zapcode-wasm/src/lib.rs` around lines 350 - 354, The resume()
implementation is discarding tracing by passing None to vm_state_to_js; instead
preserve and forward the trace from the VM state returned by
self.inner.clone().resume(val). After obtaining state in resume(), extract its
trace (e.g., state.trace or similar field) and pass that (or its cloned/borrowed
form as the expected Option<&str>/Option<String>) into vm_state_to_js rather
than None so resumed executions keep end-to-end trace data; update the call in
resume() accordingly while keeping zapcode_err mapping intact.

Comment on lines +212 to +275
exec_span = _create_span("execute", {"zapcode.code": code}) if tracing else None

result = tool_def.execute(named_args)
tool_calls.append({"name": fn_name, "args": args, "result": result})
try:
kwargs: dict[str, Any] = {"external_functions": tool_names}
if time_limit_ms is not None:
kwargs["time_limit_ms"] = time_limit_ms
if memory_limit_bytes is not None:
kwargs["memory_limit_bytes"] = memory_limit_bytes

snapshot: ZapcodeSnapshot = state["snapshot"]
state = snapshot.resume(result)
sandbox = Zapcode(code, **kwargs)
state = sandbox.start()

return ExecutionResult(
output=state.get("output"),
stdout=state.get("stdout", ""),
tool_calls=tool_calls,
)
while state.get("suspended"):
fn_name = state["function_name"]
args = state["args"]

tool_def = tool_defs.get(fn_name)
if not tool_def:
raise ValueError(
f"Guest code called unknown function '{fn_name}'. "
f"Available: {', '.join(tool_names)}"
)

# Build named args from positional args
param_names = list(tool_def.parameters.keys())
named_args = {
param_names[i]: args[i]
for i in range(min(len(param_names), len(args)))
}

tool_span = _create_span("tool_call", {
"zapcode.tool.name": fn_name,
"zapcode.tool.args": json.dumps(args, default=str),
}) if tracing else None

result = tool_def.execute(named_args)
tool_calls.append({"name": fn_name, "args": args, "result": result})

if tool_span:
tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
_end_span(tool_span)
exec_span.children.append(tool_span)

snapshot: ZapcodeSnapshot = state["snapshot"]
state = snapshot.resume(result)

stdout = state.get("stdout", "")

if exec_span:
exec_span.attributes["zapcode.output"] = json.dumps(state.get("output"), default=str)
if stdout:
exec_span.attributes["zapcode.stdout"] = stdout
_end_span(exec_span)

if debug and exec_span:
_print_trace(exec_span)

return ExecutionResult(
code=code,
output=state.get("output"),
stdout=stdout,
tool_calls=tool_calls,
trace=exec_span,
)

⚠️ Potential issue | 🟠 Major

ExecutionResult.trace drops the per-phase trace this feature is supposed to expose.

The wrapper always returns the synthetic span built here, and its children are only tool_call spans. Nothing in this path threads the underlying Zapcode trace into result.trace, so callers cannot get the documented parse/compile/execute timing tree from get_trace().

🧰 Tools
🪛 Ruff (0.15.5)

[warning] 230-233: Abstract raise to an inner function

(TRY301)


[warning] 230-233: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 212 -
275, The returned ExecutionResult currently only includes the synthetic
exec_span (with tool_call children) and does not attach the Zapcode engine's
per-phase trace; fix by retrieving the underlying Zapcode trace from the
sandbox/state (e.g., call sandbox.get_trace() or use state["trace"] / snapshot
trace if available) after execution and merge or attach it to the exec_span (or
set ExecutionResult.trace to that full trace) so callers get the
parse/compile/execute timing tree; update the code around where exec_span is
created and where the ExecutionResult is returned to combine the engine trace
with exec_span (or replace exec_span) before constructing ExecutionResult.
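
One way to read "merge or attach" is to keep the engine's root span and nest the wrapper span under it. The sketch below assumes a simple `TraceSpan` shape and a hypothetical helper `attach_wrapper_span`; how the engine trace is actually obtained from the binding is not shown in this review, so `engine_root` is just a parameter here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class TraceSpan:
    """Assumed span shape: a name, free-form attributes, and child spans."""
    name: str
    attributes: dict[str, Any] = field(default_factory=dict)
    children: list[TraceSpan] = field(default_factory=list)


def attach_wrapper_span(engine_root: TraceSpan | None, exec_span: TraceSpan) -> TraceSpan:
    """Return the trace to expose on the result.

    If the engine produced a trace, nest the wrapper span (with its
    tool_call children) under that root; otherwise fall back to the
    wrapper span alone.
    """
    if engine_root is None:
        return exec_span
    engine_root.children.append(exec_span)
    return engine_root
```

With this shape, the value placed in `ExecutionResult.trace` would keep the parse/compile/execute children and gain the AI-layer span as an additional child, rather than replacing the engine tree.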

Comment on lines +247 to +256
result = tool_def.execute(named_args)
tool_calls.append({"name": fn_name, "args": args, "result": result})

if tool_span:
tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
_end_span(tool_span)
exec_span.children.append(tool_span)

snapshot: ZapcodeSnapshot = state["snapshot"]
state = snapshot.resume(result)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find packages/zapcode-ai-python -type f -name "*.py" | head -20

Repository: TheUncharted/zapcode

Length of output: 117


🏁 Script executed:

rg "class ToolDefinition" packages/zapcode-ai-python -A 10

Repository: TheUncharted/zapcode

Length of output: 984


🏁 Script executed:

rg "def execute" packages/zapcode-ai-python -B 2 -A 10

Repository: TheUncharted/zapcode

Length of output: 1084


🏁 Script executed:

sed -n '240,265p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 1192


🏁 Script executed:

rg "def resume" packages/zapcode-ai-python -B 2 -A 15

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

rg "snapshot.resume" packages/zapcode-ai-python -B 3 -A 3

Repository: TheUncharted/zapcode

Length of output: 640


🏁 Script executed:

sed -n '1,50p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 1802


🏁 Script executed:

rg "class ZapcodeSnapshot" packages/zapcode-ai-python

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

rg "ZapcodeSnapshot" packages/zapcode-ai-python | head -20

Repository: TheUncharted/zapcode

Length of output: 274


🏁 Script executed:

rg "async def execute" packages/zapcode-ai-python

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

sed -n '195,210p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 704


🏁 Script executed:

sed -n '200,220p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 904


🏁 Script executed:

rg "def handle_tool_call" packages/zapcode-ai-python -A 30

Repository: TheUncharted/zapcode

Length of output: 2683


🏁 Script executed:

rg "import inspect" packages/zapcode-ai-python

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

sed -n '225,280p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 2390


🏁 Script executed:

sed -n '60,90p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 1148


🏁 Script executed:

rg "async" packages/zapcode-ai-python

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

find packages/zapcode-ai-python -name "*test*" -o -name "*spec*" | head -10

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

sed -n '52,65p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n

Repository: TheUncharted/zapcode

Length of output: 565


🏁 Script executed:

rg "Awaitable" packages/zapcode-ai-python -B 2 -A 2

Repository: TheUncharted/zapcode

Length of output: 476


🏁 Script executed:

cat packages/zapcode-ai-python/src/zapcode_ai/__init__.py | grep -n "class ToolDefinition" -A 20

Repository: TheUncharted/zapcode

Length of output: 747


🏁 Script executed:

find packages/zapcode-ai-python -type f \( -name "*.py" -o -name "*.md" \) | xargs grep -l "async def" 2>/dev/null

Repository: TheUncharted/zapcode

Length of output: 48


🏁 Script executed:

cat packages/zapcode-ai-python/README.md 2>/dev/null || cat packages/zapcode-ai-python/README.rst 2>/dev/null || echo "No README found"

Repository: TheUncharted/zapcode

Length of output: 244


🏁 Script executed:

ls -la packages/zapcode-ai-python/

Repository: TheUncharted/zapcode

Length of output: 350


🏁 Script executed:

rg "Awaitable" packages/zapcode-ai-python --no-filename

Repository: TheUncharted/zapcode

Length of output: 107


🏁 Script executed:

grep -n "result = tool_def.execute" packages/zapcode-ai-python/src/zapcode_ai/__init__.py

Repository: TheUncharted/zapcode

Length of output: 117


🏁 Script executed:

python3 << 'EOF'
import inspect

# Verify inspect.isawaitable works as expected
async def async_func():
    return 42

def sync_func():
    return 42

coro = async_func()
print(f"inspect.isawaitable(coro) = {inspect.isawaitable(coro)}")
print(f"inspect.isawaitable(sync_func()) = {inspect.isawaitable(sync_func())}")

# Clean up
coro.close()
EOF

Repository: TheUncharted/zapcode

Length of output: 137


Add guard against async tool implementations passed to synchronous execution.

Line 247 calls tool_def.execute(named_args) synchronously, yet ToolDefinition.execute is documented as returning Any or an awaitable. If a caller supplies an async def tool, the coroutine object gets stored in tool_calls and passed to snapshot.resume(result) instead of being awaited, causing unexpected behavior.

Suggested fix
 import json
 import time
+import inspect
 from dataclasses import dataclass, field
             result = tool_def.execute(named_args)
+            if inspect.isawaitable(result):
+                raise TypeError(
+                    f"Tool '{fn_name}' returned an awaitable, but handle_tool_call() is synchronous. "
+                    "Use a synchronous tool or add an async entry point."
+                )
             tool_calls.append({"name": fn_name, "args": args, "result": result})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            result = tool_def.execute(named_args)
-            tool_calls.append({"name": fn_name, "args": args, "result": result})
-
-            if tool_span:
-                tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
-                _end_span(tool_span)
-                exec_span.children.append(tool_span)
-
-            snapshot: ZapcodeSnapshot = state["snapshot"]
-            state = snapshot.resume(result)
+            result = tool_def.execute(named_args)
+            if inspect.isawaitable(result):
+                raise TypeError(
+                    f"Tool '{fn_name}' returned an awaitable, but handle_tool_call() is synchronous. "
+                    "Use a synchronous tool or add an async entry point."
+                )
+            tool_calls.append({"name": fn_name, "args": args, "result": result})
+
+            if tool_span:
+                tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
+                _end_span(tool_span)
+                exec_span.children.append(tool_span)
+
+            snapshot: ZapcodeSnapshot = state["snapshot"]
+            state = snapshot.resume(result)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 247 -
256, The code calls tool_def.execute(named_args) synchronously and may receive
an awaitable from async tool implementations; detect awaitable results (use
inspect.isawaitable/asyncio.iscoroutine) after calling tool_def.execute and
resolve them before storing to tool_calls and passing to snapshot.resume: if
inside an async context await the result (result = await result), otherwise run
it to completion with the event loop (e.g., loop.run_until_complete(result) or
asyncio.run(result)) so that tool_calls, tool_span attribute logging, and
snapshot.resume receive the actual resolved value; update the block around
tool_def.execute, tool_calls, tool_span, and snapshot.resume accordingly.
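
The prompt above asks for awaitables to be resolved rather than rejected, which differs from the `TypeError` guard in the suggested fix. A minimal sketch of the resolving variant for a fully synchronous caller follows; `resolve_tool_result` is a hypothetical helper name, not part of `zapcode_ai`.

```python
import asyncio
import inspect
from typing import Any


def resolve_tool_result(result: Any) -> Any:
    """Return a plain value, running an awaitable to completion if needed."""
    if not inspect.isawaitable(result):
        return result

    async def _await(awaitable: Any) -> Any:
        # asyncio.run() expects a coroutine, so wrap arbitrary awaitables.
        return await awaitable

    # asyncio.run() raises RuntimeError if an event loop is already running
    # in this thread, so this only suits a caller with no active loop.
    return asyncio.run(_await(result))
```

In the loop under review this would be applied right after `tool_def.execute(named_args)`, so that `tool_calls`, the span attribute, and `snapshot.resume()` all receive the resolved value. When the caller may already be inside an event loop, rejecting the awaitable (as in the suggested fix) or adding an async entry point is the safer choice.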

Comment on lines +414 to +418
session_trace: TraceSpan | None = (
_create_span("session", {"zapcode.tools": ", ".join(tools.keys())})
if tracing else None
)
attempt_count = 0

⚠️ Potential issue | 🟠 Major

Session trace state leaks across separate uses of the same ZapcodeAI instance.

session_trace and attempt_count live in the closure created by zapcode(). If an app reuses one ZapcodeAI across multiple chats, later attempts append onto the earlier trace and keep prior tool args/results reachable via get_trace(). This needs an explicit per-conversation session boundary or reset.

Also applies to: 420-434

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 414 -
418, The session_trace and attempt_count are stored in the closure created by
zapcode(), causing trace state to persist across chats; modify zapcode()/the
returned ZapcodeAI so that session_trace and attempt_count are reinitialized at
the start of each new conversation or API call (e.g., move their initialization
into the per-conversation handler or add an explicit reset method) and ensure
get_trace() reads only the current per-conversation span created via
_create_span("session", ...) (also update any logic in the related block around
the existing 420-434 code to reference the per-call/session variables rather
than the outer closure values).
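
A minimal sketch of the "explicit reset" option, assuming the trace state is grouped into one small object that can be swapped out per conversation. `SessionState`, `Handler`, and `reset_session` are hypothetical names used only for illustration, and spans are reduced to plain strings to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical per-conversation container for trace state."""
    spans: list[str] = field(default_factory=list)
    attempt_count: int = 0


class Handler:
    """Hypothetical stand-in for the object returned by zapcode()."""

    def __init__(self) -> None:
        self._session = SessionState()

    def reset_session(self) -> None:
        # Start a new conversation: drop earlier spans and the attempt counter.
        self._session = SessionState()

    def handle_tool_call(self, code: str) -> int:
        # Record one attempt; a real handler would also execute `code`.
        self._session.attempt_count += 1
        self._session.spans.append(f"attempt_{self._session.attempt_count}")
        return self._session.attempt_count

    def get_trace(self) -> list[str]:
        # Only spans from the current conversation are visible.
        return list(self._session.spans)
```

Replacing the whole state object, instead of clearing fields one by one, also avoids mutating a span that an earlier caller may still hold a reference to.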

Comment on lines +318 to +387
const execSpan = tracing ? createSpan("execute", { "zapcode.code": code }) : undefined;

try {
const sandbox = new Zapcode(code, {
externalFunctions: toolNames,
timeLimitMs: options.timeLimitMs ?? 10_000,
memoryLimitMb: options.memoryLimitMb ?? 32,
});

let state = sandbox.start();
let stdout = "";

// Snapshot/resume loop — resolve each tool call as the VM suspends
while (!state.completed) {
const { functionName, args } = state;

const toolDef = toolDefs[functionName];
if (!toolDef) {
throw new Error(
`Guest code called unknown function '${functionName}'. ` +
`Available: ${toolNames.join(", ")}`
);
}

// Build named args from positional args using the parameter schema
const paramNames = Object.keys(toolDef.parameters);
const namedArgs: Record<string, unknown> = {};
for (let i = 0; i < paramNames.length && i < args.length; i++) {
namedArgs[paramNames[i]] = args[i];
}

const toolSpan = tracing ? createSpan("tool_call", {
"zapcode.tool.name": functionName,
"zapcode.tool.args": JSON.stringify(args),
}) : undefined;

const result = await toolDef.execute(namedArgs);
toolCalls.push({ name: functionName, args, result });

if (toolSpan) {
toolSpan.attributes["zapcode.tool.result"] = JSON.stringify(result);
endSpan(toolSpan);
execSpan!.children.push(toolSpan);
}

// Resume the VM with the tool's return value
const snapshot = ZapcodeSnapshotHandle.load(state.snapshot);
state = snapshot.resume(result);
}

const sandbox = new Zapcode(code, {
externalFunctions: toolNames,
timeLimitMs: options.timeLimitMs ?? 10_000,
memoryLimitMb: options.memoryLimitMb ?? 32,
});

let state = sandbox.start();
let stdout = "";

// Snapshot/resume loop — resolve each tool call as the VM suspends
while (!state.completed) {
const { functionName, args } = state;

const toolDef = toolDefs[functionName];
if (!toolDef) {
throw new Error(
`Guest code called unknown function '${functionName}'. ` +
`Available: ${toolNames.join(", ")}`
);
if (state.stdout) {
stdout = state.stdout;
}

// Build named args from positional args using the parameter schema
const paramNames = Object.keys(toolDef.parameters);
const namedArgs: Record<string, unknown> = {};
for (let i = 0; i < paramNames.length && i < args.length; i++) {
namedArgs[paramNames[i]] = args[i];
if (execSpan) {
execSpan.attributes["zapcode.output"] = JSON.stringify(state.output);
if (stdout) execSpan.attributes["zapcode.stdout"] = stdout;
endSpan(execSpan);
}

const result = await toolDef.execute(namedArgs);
toolCalls.push({ name: functionName, args, result });
if (debug && execSpan) {
printTrace(execSpan);
}

// Resume the VM with the tool's return value
const snapshot = ZapcodeSnapshotHandle.load(state.snapshot);
state = snapshot.resume(result);
}
return {
code,
output: state.output,
stdout,
toolCalls,
...(execSpan ? { trace: execSpan } : {}),

⚠️ Potential issue | 🟠 Major

This trace tree can never show parse/compile.

executeCode() only creates an execute span plus tool_call children. Nothing in this path adds parse or compile, so printTrace() cannot produce the parse → compile → execute tree shown by the new debug-tracing example. Reuse the lower-level run trace as the attempt root and hang the AI-layer spans under it instead of replacing it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai/src/index.ts` around lines 318 - 387, The current
executeCode path creates a top-level execSpan via createSpan("execute", ...)
which replaces the VM's own run trace so parse/compile spans never appear;
locate the VM/run-level trace produced by Zapcode (look for the run/root span
exposed by the Zapcode instance or by the snapshot/state object returned from
Zapcode.start()/ZapcodeSnapshotHandle.load, e.g., a runSpan on the sandbox or
state) and make the execSpan a child of that run-level span instead of a
standalone root: create execSpan using createSpan as before, end spans with
endSpan(toolSpan) and endSpan(execSpan), push execSpan into runSpan.children,
and pass runSpan (not execSpan) to printTrace when debug is set so the final
trace shows parse → compile → execute with tool_call children.

- Fix debug flag not passed through in zapcode-ai TS (was hardcoded
  to false, so debug:true had no effect on per-attempt traces)
- Remove unnecessary f-strings (Ruff F541) in Python files
- Add language tags to fenced code blocks (markdownlint MD040)
- Fix cd paths in examples/README.md to work from repo root
- Remove unnecessary npx prefix in npm scripts (tsx is a devDep)
- Make CONTRIBUTING.md E2E commands copy-pasteable
- Add AWS credentials prerequisites to Bedrock/debug-tracing READMEs
- Clarify working directory for Rust example

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (4)
packages/zapcode-ai-python/src/zapcode_ai/__init__.py (3)

414-418: ⚠️ Potential issue | 🟠 Major

Make trace state per conversation, not per zapcode() instance.

session_trace and attempt_count live in the outer closure, so a reused ZapcodeAI instance keeps appending later chats onto the old session. That leaks prior tool args/results through get_trace() and keeps mutating a span that may already have been finalized.

Also applies to: 420-434, 470-475

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 414 -
418, session_trace and attempt_count are declared in the closure created by
zapcode(), so they persist across every conversation that reuses the returned
ZapcodeAI instance, leaking previous conversation state into get_trace() and
mutating finalized spans; move their initialization into a per-conversation
scope (e.g., the per-call handler or a new Conversation context, or add an
explicit reset method) so each conversation gets a fresh session_trace (created
via _create_span("session", ...)) and attempt_count initialized to 0, and update
any logic that references session_trace/attempt_count (including code paths
around _create_span, get_trace(), and span finalization) to use the
per-conversation variables rather than the outer-closure ones.

247-256: ⚠️ Potential issue | 🟠 Major

Reject or resolve awaitable tool results here.

tool_def.execute() is invoked synchronously, but the public contract still says it may return an awaitable. An async def tool will hand a coroutine object to snapshot.resume() instead of the real tool result.

🛠 Suggested guard
+import inspect
 import json
 import time
@@
-            result = tool_def.execute(named_args)
+            result = tool_def.execute(named_args)
+            if inspect.isawaitable(result):
+                raise TypeError(
+                    f"Tool '{fn_name}' returned an awaitable, but handle_tool_call() is synchronous."
+                )
             tool_calls.append({"name": fn_name, "args": args, "result": result})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 247 -
256, tool_def.execute() can return an awaitable (coroutine/awaitable) which must
be resolved before calling snapshot.resume; detect awaitables (e.g. via
inspect.isawaitable(result)) after result = tool_def.execute(named_args), await
them to get the real value, then use that resolved value for tool_calls,
tool_span logging (zapcode.tool.result) and snapshot.resume(result); ensure this
change touches the block around tool_def.execute, tool_span handling, and
snapshot.resume so that awaitable results for async tools are resolved before
resuming the ZapcodeSnapshot.

212-213: ⚠️ Potential issue | 🟠 Major

Return the engine trace, not only exec_span.

trace=exec_span still drops the per-phase trace coming from Zapcode itself, so get_trace() can never show parse/compile/execute timings. This path needs to thread the binding's trace through and attach the wrapper span under that root instead of replacing it.

Also applies to: 269-275

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 212 -
213, The current code sets trace=exec_span which overwrites the binding's
existing per-phase trace; instead, retrieve the binding's root trace (e.g.,
binding.get_trace() or the trace object returned/available from the binding
call), attach or nest the wrapper span created by _create_span("execute",
{"zapcode.code": code}) under that root (e.g., as a child span) and pass the
original binding trace as trace to the caller; update both occurrences where
exec_span is used (the earlier exec path and the second occurrence around lines
269-275) so you thread the binding trace through and only add the wrapper span
rather than replacing the binding's trace.
packages/zapcode-ai/src/index.ts (1)

318-319: ⚠️ Potential issue | 🟠 Major

Use the Zapcode run trace as ExecutionResult.trace.

ExecutionResult.trace still points at the synthetic span created in this wrapper, so callers only ever see execute/tool_call nodes. The new tracing feature still won't expose the documented parse → compile → execute tree until this path threads the engine trace through and attaches the AI-layer span under that root.

Also applies to: 382-388

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai/src/index.ts` around lines 318 - 319, The current wrapper
creates execSpan via createSpan("execute", { "zapcode.code": code }) and then
returns an ExecutionResult whose trace points at that synthetic exec span,
preventing callers from seeing the engine's parse→compile→execute tree; update
the code path to accept/obtain the engine's trace (the trace produced by the
engine run/execute call), set ExecutionResult.trace to that engine trace, and
attach the AI-layer execSpan as a child of the engine root (either by creating
execSpan with the engine trace/span as parent or by linking execSpan into
engineTrace using the engine trace API) so callers receive the full engine trace
with the AI-layer span nested under it; apply the same change for the other
occurrence referenced (the block around lines 382–388).
🧹 Nitpick comments (1)
examples/python/debug-tracing/main.py (1)

189-200: Exception messages are acceptable for example code.

Static analysis flags TRY003 (long messages in exception raises), which suggests moving messages to custom exception classes. For a self-contained example script, inline descriptive messages in RuntimeError are appropriate and improve readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/python/debug-tracing/main.py` around lines 189 - 200, The inline
descriptive exception messages in the two RuntimeError raises inside main() are
acceptable for an example script, so suppress the static analysis warning by
adding a per-line noqa pragma (e.g., append "# noqa: TRY003") to each raise
statement (the one raising for unexpected stop_reason and the one for exceeding
max_steps) so the linter ignores TRY003 here while keeping the helpful inline
messages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CONTRIBUTING.md`:
- Line 64: Update the E2E Python instruction that currently runs "cd
crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic
&& python main.py" to first create/activate a Python virtualenv and install
maturin into it; specifically, add steps to create a venv (python -m venv
.venv), activate it (.venv\\Scripts\\activate on Windows or source
.venv/bin/activate on Unix), and run pip install --upgrade pip maturin before
running maturin develop so that maturin installs into the active environment.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py`:
- Around line 186-190: The debug trace currently omits the generated code
because _print_trace() filters out attributes with keys starting with
"zapcode.code", which contradicts the public docstring promise; update
_print_trace() to include a truncated representation of the code attribute
(e.g., take span.attributes.get("zapcode.code") and include only the first N
characters with an ellipsis) when present instead of filtering it out, while
still avoiding dumping full code content; locate the attrs construction that
iterates span.attributes.items() and add special handling for the "zapcode.code"
key so it is included truncated in the printed attrs string.

In `@packages/zapcode-ai/src/index.ts`:
- Around line 463-466: The sessionTrace and attemptCount are currently captured
once inside zapcode() causing a single TraceSpan and attempt counter to be
reused across requests; move creation of sessionTrace (from createSpan) and
initialization of attemptCount into the per-request execution scope (the
function returned by zapcode()) so each invocation creates a fresh TraceSpan and
resets attemptCount, and ensure a new ZapcodeAIResult is constructed per call
(not reused) so getTrace() only returns the current request's spans and tool
args/results. Also audit related code paths referenced by the comment (the
blocks around where sessionTrace is used and where ZapcodeAIResult is
built/returned) to remove shared mutable state across invocations.
- Around line 349-352: The current tracing code calls JSON.stringify(args) which
can throw on bigints or circular structures; wrap the serialization in a
non-throwing routine before calling createSpan (referencing tracing, toolSpan,
createSpan, "tool_call", functionName, args) — implement a safeSerialize that
tries JSON.stringify(args) in a try/catch and on failure falls back to a safe
replacer (handle bigint by converting to string and track seen objects to avoid
circular errors) or uses a fallback like String(args) or util.inspect, then pass
the safe result into createSpan so tracing cannot throw and will always produce
a string for "zapcode.tool.args".
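
For the `_print_trace()` item in the list above, the truncation it asks for could look roughly like the following. This is a sketch under assumptions: `format_attrs` is a hypothetical helper, and the 80-character limit is an arbitrary choice, since the review does not specify one.

```python
from typing import Any

MAX_CODE_CHARS = 80  # assumed limit; the review does not specify one


def format_attrs(attributes: dict[str, Any], max_code_chars: int = MAX_CODE_CHARS) -> str:
    """Render span attributes on one line, truncating only the code attribute."""
    parts = []
    for key, value in attributes.items():
        text = str(value)
        if key == "zapcode.code" and len(text) > max_code_chars:
            # Keep a prefix of the generated code instead of filtering it out.
            text = text[:max_code_chars] + "..."
        parts.append(f"{key}={text}")
    return ", ".join(parts)
```

Only the `zapcode.code` value is shortened; other attributes are printed in full, which matches the request to show the code without dumping its entire content.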

---

Duplicate comments:
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py`:
- Around line 414-418: session_trace and attempt_count are declared in the outer
closure so they persist across calls to zapcode(), leaking previous conversation
state into get_trace() and mutating finalized spans; move their declarations
into the per-conversation scope (e.g., inside the zapcode() function or a new
Conversation context) so each invocation gets a fresh session_trace (created via
_create_span("session", ...)) and attempt_count initialized to 0, and update any
logic that references session_trace/attempt_count (including code paths around
_create_span, get_trace(), and span finalization) to use the per-call variables
rather than the outer-closure ones.
- Around line 247-256: tool_def.execute() can return an awaitable
(coroutine/awaitable) which must be resolved before calling snapshot.resume;
detect awaitables (e.g. via inspect.isawaitable(result)) after result =
tool_def.execute(named_args), await them to get the real value, then use that
resolved value for tool_calls, tool_span logging (zapcode.tool.result) and
snapshot.resume(result); ensure this change touches the block around
tool_def.execute, tool_span handling, and snapshot.resume so that awaitable
results for async tools are resolved before resuming the ZapcodeSnapshot.
- Around line 212-213: The current code sets trace=exec_span which overwrites
the binding's existing per-phase trace; instead, retrieve the binding's root
trace (e.g., binding.get_trace() or the trace object returned/available from the
binding call), attach or nest the wrapper span created by
_create_span("execute", {"zapcode.code": code}) under that root (e.g., as a
child span) and pass the original binding trace as trace to the caller; update
both occurrences where exec_span is used (the earlier exec path and the second
occurrence around lines 269-275) so you thread the binding trace through and
only add the wrapper span rather than replacing the binding's trace.
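
The awaitable-handling point above can be sketched as follows. This is a minimal illustration, not the package's actual code: `resolve_tool_result`, `sync_tool`, and `async_tool` are hypothetical names, and it assumes the surrounding call site is itself running inside a coroutine so that `await` is available.

```python
import asyncio
import inspect
from typing import Any, Callable


async def resolve_tool_result(execute: Callable[[dict], Any], named_args: dict) -> Any:
    """Call a tool and, if it returned an awaitable, await it before returning."""
    result = execute(named_args)
    # Async tools return a coroutine; resolve it so the caller only ever
    # sees the real value (the one that would be logged and resumed with).
    if inspect.isawaitable(result):
        result = await result
    return result


def sync_tool(args: dict) -> int:
    # Hypothetical synchronous tool.
    return args["a"] + args["b"]


async def async_tool(args: dict) -> int:
    # Hypothetical asynchronous tool.
    await asyncio.sleep(0)
    return args["a"] * args["b"]


sync_value = asyncio.run(resolve_tool_result(sync_tool, {"a": 2, "b": 3}))
async_value = asyncio.run(resolve_tool_result(async_tool, {"a": 2, "b": 3}))
print(sync_value, async_value)
```

Both kinds of tool go through the same path, so the tool-call log and the resume step never receive an unresolved coroutine object.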

In `@packages/zapcode-ai/src/index.ts`:
- Around line 318-319: The current wrapper creates execSpan via
createSpan("execute", { "zapcode.code": code }) and then returns an
ExecutionResult whose trace points at that synthetic exec span, preventing
callers from seeing the engine's parse→compile→execute tree; update the code
path to accept/obtain the engine's trace (the trace produced by the engine
run/execute call), set ExecutionResult.trace to that engine trace, and attach
the AI-layer execSpan as a child of the engine root (either by creating execSpan
with the engine trace/span as parent or by linking execSpan into engineTrace
using the engine trace API) so callers receive the full engine trace with the
AI-layer span nested under it; apply the same change for the other occurrence
referenced (the block around lines 382–388).
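
The nesting described above can be sketched like this, in Python for consistency with the Python package comments. The `Span` dataclass, the `attach_wrapper` helper, and the span names `run` and `ai.execute` are assumptions made for illustration; the real trace types and the way the engine trace is obtained may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical stand-in for the real trace span type.
    name: str
    attributes: dict = field(default_factory=dict)
    children: list["Span"] = field(default_factory=list)


def attach_wrapper(engine_trace: Span, wrapper: Span) -> Span:
    """Nest the AI-layer span under the engine root and return the engine trace."""
    engine_trace.children.append(wrapper)
    return engine_trace


# Engine trace with its per-phase children (parse, compile, execute).
engine_trace = Span("run", children=[Span("parse"), Span("compile"), Span("execute")])
wrapper = Span("ai.execute", {"zapcode.code": "1 + 1"})

trace = attach_wrapper(engine_trace, wrapper)
names = [child.name for child in trace.children]
print(trace.name, names)
```

The caller still receives the engine root, so the per-phase spans stay visible and the wrapper span is only an addition.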

---

Nitpick comments:
In `@examples/python/debug-tracing/main.py`:
- Around line 189-200: The inline descriptive exception messages in the two
RuntimeError raises inside main() are acceptable for an example script, so
suppress the static analysis warning by adding a per-line noqa pragma (e.g.,
append "# noqa: TRY003") to each raise statement (the one raising for unexpected
stop_reason and the one for exceeding max_steps) so the linter ignores TRY003
here while keeping the helpful inline messages.
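
A small sketch of the suppression described above. The function name `check_stop_reason` and the stop-reason values are hypothetical; only the `# noqa: TRY003` pragma is the point.

```python
def check_stop_reason(stop_reason: str, expected: str = "end_turn") -> None:
    """Raise with an inline message; the pragma silences TRY003 on that line only."""
    if stop_reason != expected:
        raise RuntimeError(f"Unexpected stop_reason: {stop_reason}")  # noqa: TRY003


try:
    check_stop_reason("max_tokens")
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

A per-line pragma keeps the rule active for the rest of the file, which is usually preferable to disabling it globally for one example script.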

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 852875e0-aa90-45f9-95c7-e1bb697851a0

📥 Commits

Reviewing files that changed from the base of the PR and between ccfc2ee and cdbc6ee.

📒 Files selected for processing (11)
  • CONTRIBUTING.md
  • examples/README.md
  • examples/python/ai-bedrock/README.md
  • examples/python/debug-tracing/README.md
  • examples/python/debug-tracing/main.py
  • examples/rust/basic/README.md
  • examples/typescript/ai-agent/package.json
  • examples/typescript/ai-bedrock/README.md
  • examples/typescript/debug-tracing/README.md
  • packages/zapcode-ai-python/src/zapcode_ai/__init__.py
  • packages/zapcode-ai/src/index.ts
🚧 Files skipped from review as they are similar to previous changes (7)
  • examples/typescript/debug-tracing/README.md
  • examples/README.md
  • examples/python/ai-bedrock/README.md
  • examples/rust/basic/README.md
  • examples/python/debug-tracing/README.md
  • examples/typescript/ai-agent/package.json
  • examples/typescript/ai-bedrock/README.md

-- E2E JS: build bindings then run `examples/typescript/basic.ts`
-- E2E Python: build bindings then run `examples/python/basic.py`
+- E2E JS: `cd crates/zapcode-js && npm install && npx napi build --release --platform --js index.js --dts index.d.ts && cd ../../examples/typescript/basic && npm install && npx tsx main.ts`
+- E2E Python: `cd crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic && python main.py`

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "CONTRIBUTING.md" -type f

Repository: TheUncharted/zapcode

Length of output: 81


🏁 Script executed:

head -n 70 CONTRIBUTING.md | tail -n 20

Repository: TheUncharted/zapcode

Length of output: 1138


🏁 Script executed:

# Check if there's a Python environment setup already documented
rg -A 5 -B 5 "E2E Python" CONTRIBUTING.md

Repository: TheUncharted/zapcode

Length of output: 663


🏁 Script executed:

# Check for any existing venv or setup instructions
rg "venv|virtualenv|maturin" CONTRIBUTING.md

Repository: TheUncharted/zapcode

Length of output: 181


🌐 Web query:

maturin develop virtualenv requirements documentation

💡 Result:

  • Create/activate a virtualenv (recommended for local dev): maturin’s own tutorial starts by creating a virtual environment, then installing maturin. [1]
  • Use maturin develop to install into the current environment: maturin develop builds your Rust crate and installs it directly into the currently active virtualenv (skipping wheel generation). This is the standard workflow for fast local iteration. [1][2][3]
  • Docs (Sphinx) “requirements” setup: maturin’s Sphinx integration docs explicitly suggest putting documentation dependencies in docs/requirements.txt (i.e., Python packages needed to build the docs). [4]
  • If you’re packaging (pyproject.toml): the tutorial notes you typically declare maturin as a requirement (with a pinned version) and set it as the build-backend per PEP 517. [1]

Sources: [1] Maturin User Guide – Tutorial, [2] Maturin User Guide – Local Development, [3] PyPI maturin (current release page), [4] Maturin User Guide – Sphinx Integration.


🌐 Web query:

maturin installation requirements Python binding

💡 Result:

Maturin (Python binding / Rust-extension builds): installation requirements

  • Python environment: a working Python install (for PyO3-based projects, Python ≥ 3.7 is required). [3]
  • Rust toolchain: rustc/cargo installed (for PyO3, the minimum Rust version is 1.83). [3]
  • Maturin itself (recommended install): pipx install maturin (or pip install maturin). [1][3]

Optional / scenario-specific dependencies

  • zig: optional, used as a linker to simplify cross-compiling and help with manylinux compliance. [1]
  • patchelf: optional, used to repair wheels that link to other shared libraries (install via maturin[patchelf]). [1]
  • If using cffi bindings: ensure cffi is installed (maturin may auto-install it only in some virtualenv cases; otherwise you must pip install cffi). [2]

Sources: [1] Maturin Installation docs, [2] Maturin Bindings docs, [3] PyO3 “Getting started / Installation” docs.


Document the Python env/bootstrap step before maturin develop.

maturin develop installs into the current virtualenv, and maturin itself must be installed separately. A fresh contributor will hit a failure before the example runs if neither is set up. Please add the venv creation and maturin installation step.

♻️ Suggested doc update
-- E2E Python: `cd crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic && python main.py`
+- E2E Python: `python -m venv .venv && . .venv/bin/activate && pip install maturin && cd crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic && python main.py`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CONTRIBUTING.md` at line 64, Update the E2E Python instruction that currently
runs "cd crates/zapcode-py && maturin develop --release && cd
../../examples/python/basic && python main.py" to first create/activate a Python
virtualenv and install maturin into it; specifically, add steps to create a venv
(python -m venv .venv), activate it (.venv\Scripts\activate on Windows or
source .venv/bin/activate on Unix), and run pip install --upgrade pip maturin
before running maturin develop so that maturin installs into the active
environment.

Comment on lines +186 to +190
attrs = " ".join(
f"{k}={str(v)[:80]}" for k, v in span.attributes.items()
if not k.startswith("zapcode.code") # don't dump full code in trace
)
print(f"{prefix}{icon} {span.name} ({duration}){' ' + attrs if attrs else ''}")

⚠️ Potential issue | 🟡 Minor

debug=True currently omits the generated code.

_print_trace() filters out zapcode.code, so the implementation doesn't match the public promise on Lines 376-379. Either print a truncated code attribute here or narrow the docstring.

Also applies to: 376-379

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 186 -
190, The debug trace currently omits the generated code because _print_trace()
filters out attributes with keys starting with "zapcode.code", which contradicts
the public docstring promise; update _print_trace() to include a truncated
representation of the code attribute (e.g., take
span.attributes.get("zapcode.code") and include only the first N characters with
an ellipsis) when present instead of filtering it out, while still avoiding
dumping full code content; locate the attrs construction that iterates
span.attributes.items() and add special handling for the "zapcode.code" key so
it is included truncated in the printed attrs string.
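
One way to include a truncated code preview, sketched under assumptions: `format_attrs` and `CODE_PREVIEW_CHARS` are hypothetical names, and the 40-character limit is arbitrary. Other attributes keep the 80-character cut from the snippet above.

```python
CODE_PREVIEW_CHARS = 40  # hypothetical preview length for generated code


def format_attrs(attributes: dict) -> str:
    """Format span attributes, showing only a short preview of generated code."""
    parts = []
    for key, value in attributes.items():
        text = str(value)
        if key.startswith("zapcode.code"):
            # Collapse newlines so the preview stays on one trace line,
            # then cut it and mark the cut with an ellipsis.
            text = " ".join(text.split())
            if len(text) > CODE_PREVIEW_CHARS:
                text = text[:CODE_PREVIEW_CHARS] + "..."
        else:
            text = text[:80]
        parts.append(f"{key}={text}")
    return " ".join(parts)


attrs = format_attrs({
    "zapcode.code": "x" * 100,
    "zapcode.tool.name": "get_weather",
})
print(attrs)
```

With this shape, debug output shows that code was generated and how it starts, without dumping the whole snippet into every trace line.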

Comment on lines +349 to +352
const toolSpan = tracing ? createSpan("tool_call", {
"zapcode.tool.name": functionName,
"zapcode.tool.args": JSON.stringify(args),
}) : undefined;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check if the file exists and read it
cd packages/zapcode-ai && wc -l src/index.ts

Repository: TheUncharted/zapcode

Length of output: 80


🏁 Script executed:

# Read the specific sections mentioned in the review
cd packages/zapcode-ai && sed -n '340,380p' src/index.ts | cat -n

Repository: TheUncharted/zapcode

Length of output: 1620


🏁 Script executed:

# Search for all JSON.stringify calls in the file
cd packages/zapcode-ai && rg 'JSON\.stringify' src/index.ts -n -A 2 -B 2

Repository: TheUncharted/zapcode

Length of output: 851


🏁 Script executed:

cd packages/zapcode-ai && rg 'safeStringify|function.*Stringify' src/index.ts -n

Repository: TheUncharted/zapcode

Length of output: 46


🏁 Script executed:

# Find the printTrace function to understand the context for the proposed fix
cd packages/zapcode-ai && rg 'function printTrace' src/index.ts -A 5 -n

Repository: TheUncharted/zapcode

Length of output: 411


🏁 Script executed:

# Get all JSON.stringify occurrences with context to see if any are already protected
cd packages/zapcode-ai && rg 'JSON\.stringify' src/index.ts -n

Repository: TheUncharted/zapcode

Length of output: 364


🏁 Script executed:

# Get more context around the printTrace function
cd packages/zapcode-ai && sed -n '280,310p' src/index.ts | cat -n

Repository: TheUncharted/zapcode

Length of output: 1418


🏁 Script executed:

# Check if there are any error handling patterns for tracing in the code
cd packages/zapcode-ai && rg 'tracing|createSpan|endSpan' src/index.ts -n | head -20

Repository: TheUncharted/zapcode

Length of output: 812


🏁 Script executed:

# Check the structure to see if this is in a try-catch or similar error handling context
cd packages/zapcode-ai && sed -n '300,380p' src/index.ts | cat -n

Repository: TheUncharted/zapcode

Length of output: 3225


Make trace serialization non-throwing.

JSON.stringify() will throw on values like bigint and circular objects. With debug or autoFix enabled, that means tracing can fail an otherwise successful execution path.

🛠 Suggested change
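+function safeStringify(value: unknown): string {
+  try {
+    // JSON.stringify returns undefined for undefined and functions,
+    // so fall back to String() to always produce a string.
+    return JSON.stringify(value, (_key, current) =>
+      typeof current === "bigint" ? `${current}n` : current
+    ) ?? String(value);
+  } catch {
+    // Circular structures and other serialization failures land here.
+    return String(value);
+  }
+}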
+
 function printTrace(span: TraceSpan, indent = 0): void {
   const prefix = indent === 0 ? "" : "│ ".repeat(indent - 1) + "├─ ";
   const icon = span.status === "error" ? "✗" : "✓";
@@
       const toolSpan = tracing ? createSpan("tool_call", {
         "zapcode.tool.name": functionName,
-        "zapcode.tool.args": JSON.stringify(args),
+        "zapcode.tool.args": safeStringify(args),
       }) : undefined;
@@
       if (toolSpan) {
-        toolSpan.attributes["zapcode.tool.result"] = JSON.stringify(result);
+        toolSpan.attributes["zapcode.tool.result"] = safeStringify(result);
         endSpan(toolSpan);
         execSpan!.children.push(toolSpan);
       }
@@
     if (execSpan) {
-      execSpan.attributes["zapcode.output"] = JSON.stringify(state.output);
+      execSpan.attributes["zapcode.output"] = safeStringify(state.output);
       if (stdout) execSpan.attributes["zapcode.stdout"] = stdout;
       endSpan(execSpan);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai/src/index.ts` around lines 349 - 352, The current tracing
code calls JSON.stringify(args) which can throw on bigints or circular
structures; wrap the serialization in a non-throwing routine before calling
createSpan (referencing tracing, toolSpan, createSpan, "tool_call",
functionName, args) — implement a safeSerialize that tries JSON.stringify(args)
in a try/catch and on failure falls back to a safe replacer (handle bigint by
converting to string and track seen objects to avoid circular errors) or uses a
fallback like String(args) or util.inspect, then pass the safe result into
createSpan so tracing cannot throw and will always produce a string for
"zapcode.tool.args".

Comment on lines +463 to +466
const sessionTrace: TraceSpan | undefined = tracing
? createSpan("session", { "zapcode.tools": Object.keys(toolDefs).join(", ") })
: undefined;
let attemptCount = 0;

⚠️ Potential issue | 🟠 Major

Don't keep one session trace for the entire wrapper lifetime.

sessionTrace and attemptCount are captured once in zapcode(). Reusing the same ZapcodeAIResult across chats/requests keeps appending new attempts onto the previous session and exposes earlier tool args/results through getTrace().

Also applies to: 469-477, 527-530

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/zapcode-ai/src/index.ts` around lines 463 - 466, The sessionTrace
and attemptCount are currently captured once inside zapcode() causing a single
TraceSpan and attempt counter to be reused across requests; move creation of
sessionTrace (from createSpan) and initialization of attemptCount into the
per-request execution scope (the function returned by zapcode()) so each
invocation creates a fresh TraceSpan and resets attemptCount, and ensure a new
ZapcodeAIResult is constructed per call (not reused) so getTrace() only returns
the current request's spans and tool args/results. Also audit related code paths
referenced by the comment (the blocks around where sessionTrace is used and
where ZapcodeAIResult is built/returned) to remove shared mutable state across
invocations.
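
The per-request scoping can be sketched as below, shown in Python to match the duplicate comment on the Python package. `Span`, `Conversation`, `ZapcodeWrapper`, `new_conversation`, and the `attempt_N` span names are all hypothetical; the sketch only illustrates keeping mutable trace state out of the long-lived wrapper.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical stand-in for the real trace span type.
    name: str
    attributes: dict = field(default_factory=dict)
    children: list["Span"] = field(default_factory=list)


@dataclass
class Conversation:
    """Holds the mutable state for exactly one conversation."""
    session_trace: Span
    attempt_count: int = 0

    def record_attempt(self, code: str) -> Span:
        self.attempt_count += 1
        span = Span(f"attempt_{self.attempt_count}", {"zapcode.code": code})
        self.session_trace.children.append(span)
        return span

    def get_trace(self) -> Span:
        return self.session_trace


class ZapcodeWrapper:
    """Long-lived wrapper: shares only immutable configuration."""

    def __init__(self, tool_names: list[str]) -> None:
        self._tool_names = tuple(tool_names)

    def new_conversation(self) -> Conversation:
        # A fresh session span and counter for every conversation.
        session = Span("session", {"zapcode.tools": ", ".join(self._tool_names)})
        return Conversation(session_trace=session)


wrapper = ZapcodeWrapper(["get_weather", "search"])
first = wrapper.new_conversation()
first.record_attempt("1 + 1")
first.record_attempt("2 + 2")
second = wrapper.new_conversation()
second.record_attempt("3 + 3")

print(first.attempt_count, len(first.get_trace().children))
print(second.attempt_count, len(second.get_trace().children))
```

Because each conversation owns its own span tree, a trace read from the second conversation cannot expose tool arguments or results from the first.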

@TheUncharted changed the title from "Develop" to "feat: Add new feature related to debugging and tracing" on Mar 12, 2026
@TheUncharted merged commit d3a37cc into master on Mar 12, 2026
20 checks passed