feat: Add new feature related to debugging and tracing #31
TheUncharted merged 10 commits into master from
Conversation
Adds a structured trace tree (TraceSpan) that captures timing for each phase of execution: parse → compile → execute. This gives AI agent developers visibility into where time is spent inside the sandbox, which is critical for debugging latency in production agent loops where Zapcode executes thousands of code snippets. Each span records name, status (Ok/Error), start/end timestamps, duration in microseconds, key-value attributes (e.g. suspended function name, argument count), and child spans.
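As a rough illustration of the span shape described above, a parse → compile → execute tree could look like the following sketch. Field and attribute names here are assumptions drawn from this description, not the actual zapcode-core definitions:

```python
from dataclasses import dataclass, field

# Hypothetical mirror of the TraceSpan shape described above; field and
# attribute names are assumptions, not the real zapcode-core API.
@dataclass
class TraceSpan:
    name: str
    status: str = "Ok"                      # "Ok" or "Error"
    start_us: int = 0                       # start timestamp (microseconds)
    end_us: int = 0                         # end timestamp (microseconds)
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    @property
    def duration_us(self) -> int:
        return self.end_us - self.start_us

# One run: parse -> compile -> execute as children of a root span.
root = TraceSpan("run", start_us=0, end_us=1500)
for name, start, end in [("parse", 0, 200), ("compile", 200, 700), ("execute", 700, 1500)]:
    root.children.append(TraceSpan(name, start_us=start, end_us=end))
root.children[-1].attributes["suspended_function"] = "getWeather"

print([(c.name, c.duration_us) for c in root.children])
# → [('parse', 200), ('compile', 500), ('execute', 800)]
```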
Surfaces the trace tree from zapcode-core through all three binding layers so that agent developers can inspect execution timing regardless of their language. Without this, trace data was only accessible from Rust — now every SDK gets the same observability.
13 tests covering trace structure, timing validity, error handling, suspension attributes, pretty printing, and independence of multiple runs. Ensures the trace system is correct before building higher-level features (autoFix, debug logging) on top of it.
autoFix catches execution errors and returns them as tool results instead of throwing, letting the LLM see the error and self-correct on the next step. This eliminates the main risk of code execution: a single bad generation no longer kills the entire agent loop. Execution trace collects a session-level span tree across all executions, accessible via printTrace()/getTrace() (TS) and print_trace()/get_trace() (Python). This gives developers a single view of every code execution, tool call, and retry in the session. Both features are implemented in the TypeScript and Python AI packages.
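The autoFix contract can be sketched in plain Python. The functions below (`execute_code`, `llm_fix`, `run_with_auto_fix`) are illustrative stand-ins, not the real package API, which exposes this behavior through the zapcode options described above:

```python
# Sketch of the autoFix idea: an execution error becomes feedback the model
# can react to, instead of an exception that kills the agent loop.
def execute_code(code):
    if "bug" in code:
        raise RuntimeError("undefined variable 'bug'")
    return {"output": 42}

def llm_fix(code, error):
    # A real agent would re-prompt the model with the error text.
    return code.replace("bug", "x")

def run_with_auto_fix(code, max_attempts=2):
    for _ in range(max_attempts):
        try:
            return execute_code(code)
        except RuntimeError as e:
            code = llm_fix(code, str(e))  # error surfaces as feedback, not a crash
    return {"error": "gave up"}

print(run_with_auto_fix("x = 1; bug"))  # → {'output': 42}
```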
Restructures examples/ from a flat layout to language-first, topic-second (e.g. examples/typescript/debug-tracing/). Each example is now a self-contained project with its own package.json/pyproject.toml, making it easy to cd in and run without affecting other examples. Also adds debug-tracing examples (TypeScript + Python) that demonstrate autoFix, step-by-step logging of generated code and tool calls, and execution trace printing — serving as the reference for developers who want full observability into their agent's code execution.
These features were implemented but undiscoverable — a user looking at the README had no idea they could enable error recovery or inspect execution timing. Adds a dedicated section explaining the why (LLM self-correction, production observability) and links to the debug-tracing examples for step-by-step logging patterns. Also updates example paths to reflect the new directory structure.
📝 Walkthrough
This PR adds an execution tracing system (TraceSpan, TraceStatus, ExecutionTrace) to core, instruments VM parse/compile/execute flows to emit traces, exposes traces through JS/Python/WASM bindings, extends zapcode-ai (JS/Python) with debug/auto-fix and tracing, and reorganizes examples and docs into per-example subdirectories with new tracing/debug examples.

Changes
Sequence Diagram
sequenceDiagram
participant Client
participant ZapcodeVM as Zapcode VM
participant Parser
participant Compiler
participant Executor
participant Trace as SpanBuilder/Trace
Client->>ZapcodeVM: run(code)
activate ZapcodeVM
ZapcodeVM->>Trace: new("parse")
Trace->>Parser: parse(code)
Parser-->>Trace: parse_result
Trace-->>ZapcodeVM: parse_span
ZapcodeVM->>Trace: new("compile")
Trace->>Compiler: compile(ast)
Compiler-->>Trace: compile_result
Trace-->>ZapcodeVM: compile_span
ZapcodeVM->>Trace: new("execute")
Trace->>Executor: execute(bytecode)
Executor-->>Trace: execute_result
Trace-->>ZapcodeVM: execute_span
ZapcodeVM->>ZapcodeVM: assemble ExecutionTrace(root with children)
ZapcodeVM-->>Client: RunResult{state, stdout, trace}
deactivate ZapcodeVM
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
CI was referencing old flat paths (examples/typescript/package.json, examples/python/basic.py) which no longer exist after the move to language-first, topic-second structure.
Actionable comments posted: 18
🧹 Nitpick comments (2)
packages/zapcode-ai-python/README.md (1)
1-5: Add a minimal install/usage snippet here. This README will likely be read standalone, so pointing only to the repo README makes the package page a dead end for first-time users.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/zapcode-ai-python/README.md` around lines 1 - 5, Add a minimal install and usage snippet to the package README so first-time users can get started without visiting the repo README: include a short "Install" line showing how to install the zapcode-ai package (e.g., pip or similar), and a concise "Usage" example that demonstrates importing the package, creating the primary client/entry object, and calling one simple method (e.g., run/execute) with a brief comment about expected output; place this under the existing header in packages/zapcode-ai-python/README.md so readers immediately see installation and a one-shot usage example.

examples/typescript/ai-agent/package.json (1)
6-8: Use the pinned local `tsx` binary in these npm scripts.
`npm run` already resolves `node_modules/.bin`, so the `npx` prefix is unnecessary here and makes the example less reproducible than just invoking the devDependency directly.

♻️ Proposed refactor

- "agent": "npx tsx ai-agent-zapcode-ai.ts",
- "agent:anthropic": "npx tsx ai-agent-anthropic.ts",
- "agent:vercel": "npx tsx ai-agent-vercel-ai.ts"
+ "agent": "tsx ai-agent-zapcode-ai.ts",
+ "agent:anthropic": "tsx ai-agent-anthropic.ts",
+ "agent:vercel": "tsx ai-agent-vercel-ai.ts"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/typescript/ai-agent/package.json` around lines 6 - 8, The npm scripts "agent", "agent:anthropic", and "agent:vercel" use "npx tsx ..." which is unnecessary because npm run already resolves devDependencies; remove the "npx " prefix so the scripts invoke the pinned local tsx binary directly (update the values for "agent", "agent:anthropic", and "agent:vercel" to call "tsx ai-agent-zapcode-ai.ts", "tsx ai-agent-anthropic.ts", and "tsx ai-agent-vercel-ai.ts" respectively).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CONTRIBUTING.md`:
- Around line 63-64: Update the CONTRIBUTING.md lines under the E2E checks so
they are runnable shell commands instead of plain file paths: for "E2E JS"
replace "build bindings then run `examples/typescript/basic/main.ts`" with a
concrete command sequence (e.g., build bindings, install deps, then run the
TypeScript example via a specified runner such as `npx ts-node
examples/typescript/basic/main.ts` or `node
dist/examples/typescript/basic/main.js` if compiled) and for "E2E Python"
replace "build bindings then run `examples/python/basic/main.py`" with an
explicit command like `python3 examples/python/basic/main.py` (including any
required venv/installation steps or flags); ensure the updated text names the
required preparatory step (build bindings) and shows a single copy-pasteable
shell command for each platform (E2E JS and E2E Python).
In `@crates/zapcode-core/src/vm/mod.rs`:
- Around line 2297-2300: The start() method currently discards RunResult.trace
by calling self.run(...) and returning only result.state; change the API to
preserve and return the trace (either by: 1) modifying start() to return
Result<RunResult> instead of Result<VmState>, or 2) adding a new
start_with_trace / start_full that returns RunResult while keeping start() as a
convenience wrapper (if backward compatibility is required, have start() call
run() and map/runResult.state but provide the new trace-carrying variant).
Update callers that expect VmState to use the new return type or unwrap the
.state from RunResult, and ensure documentation/comments reference
RunResult.trace and the documented suspension API.
- Around line 2214-2219: The error-path builds an ExecutionTrace
(root_span.finish(TraceStatus::Error)) into a local _trace and then immediately
drops it by returning Err(e); instead preserve and return or attach that trace
to the error so callers can inspect it. Modify the error return at the
parse/compile/execute failure sites (where root_span,
parse_span.finish_error(...), ExecutionTrace and TraceStatus::Error are used and
you currently do return Err(e)) to wrap or augment the error with the
constructed ExecutionTrace (e.g. return Err(ErrorType::with_trace(e, trace)) or
convert to an error type that contains the ExecutionTrace), and update the
function return types/signatures accordingly; apply the same change to the other
two spots that build _trace (the blocks around compile_span and execute_span).
Ensure the chosen approach preserves the original error payload while making the
ExecutionTrace reachable to callers.
In `@crates/zapcode-wasm/src/lib.rs`:
- Around line 350-354: The resume() implementation is discarding tracing by
passing None to vm_state_to_js; instead preserve and forward the trace from the
VM state returned by self.inner.clone().resume(val). After obtaining state in
resume(), extract its trace (e.g., state.trace or similar field) and pass that
(or its cloned/borrowed form as the expected Option<&str>/Option<String>) into
vm_state_to_js rather than None so resumed executions keep end-to-end trace
data; update the call in resume() accordingly while keeping zapcode_err mapping
intact.
In `@examples/python/ai-bedrock/README.md`:
- Around line 5-19: Update the README Setup section to add a prerequisite note
that AWS credentials and model access are required before running the examples:
state that users must configure AWS credentials (environment variables, shared
credentials file, or an IAM role) and ensure their account has permission and
model access for the chosen MODEL_ID in the target AWS_REGION referenced in the
Run snippet (e.g., MODEL_ID and AWS_REGION overrides); keep the instruction
short and add a pointer to common AWS credential methods and to verify model
access in the AWS console or via the Bedrock permissions.
In `@examples/python/debug-tracing/main.py`:
- Line 93: The print statement in main.py uses an unnecessary f-string: change
the print call that currently reads print(f"Debug: ON | AutoFix: ON") to use a
plain string by removing the f prefix so it becomes print("Debug: ON | AutoFix:
ON"); this removes Ruff F541 by eliminating the unused f-string interpolation
while keeping behavior identical.
In `@examples/python/debug-tracing/README.md`:
- Around line 11-25: The README's setup/run instructions omit AWS credential and
region prerequisites required when using Bedrock-backed models (e.g.,
MODEL_ID=anthropic.claude-sonnet-4-20250514 in main.py); update the docs to
instruct users to configure the AWS credential chain (AWS CLI
~/.aws/credentials, environment variables, or IAM role) and to set AWS_REGION
(example: export AWS_REGION=us-east-1) before running python main.py, and
include a short note that missing credentials will cause authentication errors
on first run.
In `@examples/README.md`:
- Around line 5-21: The fenced tree block in examples/README.md is missing a
language tag; update the opening triple-backtick for that block (the line
containing "```" before the tree) to include a language identifier such as
"text" so it reads "```text", keeping the block content unchanged (locate the
fenced block shown under the examples/ tree and add the language tag).
- Around line 27-38: Update the quick-start commands so they are runnable from
the repository root: prefix each example path with "examples/" (e.g., change "cd
typescript/basic" to "cd examples/typescript/basic") and replace the macOS-only
"open wasm/basic/index.html" with a portable invocation such as using "xdg-open"
on Linux or a note to open the file in a browser (or provide both "xdg-open" and
"start" alternatives) so users on different OSes can run the WASM example
without first changing directories.
In `@examples/rust/basic/README.md`:
- Around line 5-8: Update the README entry for the "cargo run --example basic"
command to clarify the required working directory: either add a note "Run this
command from the examples/rust/basic/ directory" or provide the alternative full
command to run from the repo root using the manifest flag (use the Cargo.toml
for the basic example with --manifest-path and keep --example basic) so users
know how to run it from root versus inside the example folder.
In `@examples/typescript/ai-bedrock/README.md`:
- Around line 5-18: Update the README "Setup/Run" section to document the
required AWS authentication and model access before running npm start: explain
that Bedrock requires AWS credentials (e.g., AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY, optional AWS_SESSION_TOKEN or AWS_PROFILE via aws
configure), the correct AWS_REGION, and IAM permissions to call Bedrock and
access the specified MODEL_ID; instruct users how to set these (environment
variables or aws cli config) and note that MODEL_ID and AWS_REGION environment
variables shown (MODEL_ID, AWS_REGION) must correspond to a model the
account/region has access to.
In `@examples/typescript/debug-tracing/README.md`:
- Around line 29-52: The fenced code block that starts with "Model:
global.amazon.nova-2-lite-v1:0 | Region: eu-west-1" is missing a language tag,
causing markdownlint MD040; update the opening fence from ``` to ```text (or
another appropriate language) for that block so the example output is treated as
plain text; ensure the closing ``` remains and do not alter the block contents
(this applies to the fenced example containing the zapcode tool calls like
getWeather and searchFlights).
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py`:
- Around line 212-275: The returned ExecutionResult currently only includes the
synthetic exec_span (with tool_call children) and does not attach the Zapcode
engine's per-phase trace; fix by retrieving the underlying Zapcode trace from
the sandbox/state (e.g., call sandbox.get_trace() or use state["trace"] /
snapshot trace if available) after execution and merge or attach it to the
exec_span (or set ExecutionResult.trace to that full trace) so callers get the
parse/compile/execute timing tree; update the code around where exec_span is
created and where the ExecutionResult is returned to combine the engine trace
with exec_span (or replace exec_span) before constructing ExecutionResult.
- Around line 414-418: The session_trace and attempt_count are stored in the
closure created by zapcode(), causing trace state to persist across chats;
modify zapcode()/the returned ZapcodeAI so that session_trace and attempt_count
are reinitialized at the start of each new conversation or API call (e.g., move
their initialization into the per-conversation handler or add an explicit reset
method) and ensure get_trace() reads only the current per-conversation span
created via _create_span("session", ...) (also update any logic in the related
block around the existing 420-434 code to reference the per-call/session
variables rather than the outer closure values).
- Line 185: The f-string prefix on the literal "<1ms" is unnecessary and
triggers Ruff F541; update the assignment to remove the f-prefix so the ternary
assigns a plain string when span.duration_ms < 1 and keeps the f-string for the
else branch (i.e., change the duration assignment that references
span.duration_ms accordingly).
- Around line 247-256: The code calls tool_def.execute(named_args) synchronously
and may receive an awaitable from async tool implementations; detect awaitable
results (use inspect.isawaitable/asyncio.iscoroutine) after calling
tool_def.execute and resolve them before storing to tool_calls and passing to
snapshot.resume: if inside an async context await the result (result = await
result), otherwise run it to completion with the event loop (e.g.,
loop.run_until_complete(result) or asyncio.run(result)) so that tool_calls,
tool_span attribute logging, and snapshot.resume receive the actual resolved
value; update the block around tool_def.execute, tool_calls, tool_span, and
snapshot.resume accordingly.
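The awaitable handling suggested above can be sketched self-contained. The tool functions below are stand-ins; the real fix would live around the tool_def.execute call site:

```python
import asyncio
import inspect

# Sketch: resolve possibly-async tool results before resuming the snapshot.
def sync_tool(args):
    return {"temp_c": 21}

async def async_tool(args):
    return {"temp_c": 22}

def resolve(result):
    if inspect.isawaitable(result):
        # Outside a running event loop; inside one you would `await` instead.
        return asyncio.run(result)
    return result

print(resolve(sync_tool({})), resolve(async_tool({})))
# → {'temp_c': 21} {'temp_c': 22}
```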
In `@packages/zapcode-ai/src/index.ts`:
- Around line 455-460: The execOptions object is hardcoding debug:false so
per-attempt traces never get forwarded into executeCode(); change execOptions to
pass the computed tracing flag (or the original debug flag) instead of false so
executeCode() receives debug=true when zapcode({ debug: true }) is used — update
the execOptions declaration (the variable named execOptions) to set its debug
property to tracing (or debug) and ensure any call sites that pass execOptions
into executeCode(...)/executeCode are using that updated object.
- Around line 318-387: The current executeCode path creates a top-level execSpan
via createSpan("execute", ...) which replaces the VM's own run trace so
parse/compile spans never appear; locate the VM/run-level trace produced by
Zapcode (look for the run/root span exposed by Zapcode instance or the
snapshot/state object returned from Zapcode.start()/ZapcodeSnapshotHandle.load —
e.g., a runSpan on the sandbox or state) and make the execSpan a child of that
run-level span instead of a standalone root: create execSpan using createSpan as
before but attach it into runSpan.children, and when ending spans use
endSpan(toolSpan) and endSpan(execSpan) then ensure you add execSpan to the
runSpan.children and pass runSpan into printTrace(debug) so the final trace
shows parse → compile → execute with tool_call children.
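The suggested re-parenting can be sketched with plain dicts (span field names are assumptions, not the package's actual span type):

```python
# Sketch: instead of emitting execSpan as a standalone root, attach it under
# the VM's run-level span so parse/compile/execute form one tree.
run_span = {"name": "run", "children": [{"name": "parse"}, {"name": "compile"}]}
exec_span = {"name": "execute", "children": [{"name": "tool_call"}]}

run_span["children"].append(exec_span)  # execSpan becomes a child of runSpan

print([child["name"] for child in run_span["children"]])
# → ['parse', 'compile', 'execute']
```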
---
Nitpick comments:
In `@examples/typescript/ai-agent/package.json`:
- Around line 6-8: The npm scripts "agent", "agent:anthropic", and
"agent:vercel" use "npx tsx ..." which is unnecessary because npm run already
resolves devDependencies; remove the "npx " prefix so the scripts invoke the
pinned local tsx binary directly (update the values for "agent",
"agent:anthropic", and "agent:vercel" to call "tsx ai-agent-zapcode-ai.ts", "tsx
ai-agent-anthropic.ts", and "tsx ai-agent-vercel-ai.ts" respectively).
In `@packages/zapcode-ai-python/README.md`:
- Around line 1-5: Add a minimal install and usage snippet to the package README
so first-time users can get started without visiting the repo README: include a
short "Install" line showing how to install the zapcode-ai package (e.g., pip or
similar), and a concise "Usage" example that demonstrates importing the package,
creating the primary client/entry object, and calling one simple method (e.g.,
run/execute) with a brief comment about expected output; place this under the
existing header in packages/zapcode-ai-python/README.md so readers immediately
see installation and a one-shot usage example.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cb36cee8-57da-4a76-9f7c-c7762b1ea5ab
⛔ Files ignored due to path filters (2)
Cargo.lock is excluded by !**/*.lock
examples/rust/basic/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (51)
- .github/workflows/ci.yml
- CONTRIBUTING.md
- README.md
- crates/zapcode-core/src/lib.rs
- crates/zapcode-core/src/trace.rs
- crates/zapcode-core/src/vm/mod.rs
- crates/zapcode-core/tests/trace.rs
- crates/zapcode-js/src/lib.rs
- crates/zapcode-py/src/lib.rs
- crates/zapcode-wasm/src/lib.rs
- examples/README.md
- examples/ai-bedrock/README.md
- examples/python/README.md
- examples/python/ai-agent/README.md
- examples/python/ai-agent/ai_agent_anthropic.py
- examples/python/ai-agent/ai_agent_zapcode_ai.py
- examples/python/ai-agent/pyproject.toml
- examples/python/ai-bedrock/README.md
- examples/python/ai-bedrock/main.py
- examples/python/ai-bedrock/pyproject.toml
- examples/python/basic/README.md
- examples/python/basic/main.py
- examples/python/basic/pyproject.toml
- examples/python/debug-tracing/README.md
- examples/python/debug-tracing/main.py
- examples/python/debug-tracing/pyproject.toml
- examples/rust/README.md
- examples/rust/basic/Cargo.toml
- examples/rust/basic/README.md
- examples/rust/basic/basic.rs
- examples/typescript/README.md
- examples/typescript/ai-agent/README.md
- examples/typescript/ai-agent/ai-agent-anthropic.ts
- examples/typescript/ai-agent/ai-agent-vercel-ai.ts
- examples/typescript/ai-agent/ai-agent-zapcode-ai.ts
- examples/typescript/ai-agent/package.json
- examples/typescript/ai-bedrock/README.md
- examples/typescript/ai-bedrock/main.ts
- examples/typescript/ai-bedrock/package.json
- examples/typescript/basic/README.md
- examples/typescript/basic/main.ts
- examples/typescript/basic/package.json
- examples/typescript/debug-tracing/README.md
- examples/typescript/debug-tracing/main.ts
- examples/typescript/debug-tracing/package.json
- examples/typescript/package.json
- examples/typescript/tsconfig.json
- examples/wasm/basic/index.html
- packages/zapcode-ai-python/README.md
- packages/zapcode-ai-python/src/zapcode_ai/__init__.py
- packages/zapcode-ai/src/index.ts
💤 Files with no reviewable changes (6)
- examples/ai-bedrock/README.md
- examples/rust/README.md
- examples/typescript/README.md
- examples/python/README.md
- examples/typescript/package.json
- examples/typescript/tsconfig.json
    Err(e) => {
        root_span.add_child(parse_span.finish_error(&e.to_string()));
        let _trace = ExecutionTrace {
            root: root_span.finish(TraceStatus::Error),
        };
        return Err(e);
The error-path trace never leaves this function.
Lines 2216, 2233, and 2276 build an ExecutionTrace into _trace and immediately drop it by returning Err(e). Callers still cannot inspect parse/compile/execute traces after a failure, even though this code already paid the cost to build them.
Also applies to: 2231-2236, 2274-2279
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/zapcode-core/src/vm/mod.rs` around lines 2214 - 2219, The error-path
builds an ExecutionTrace (root_span.finish(TraceStatus::Error)) into a local
_trace and then immediately drops it by returning Err(e); instead preserve and
return or attach that trace to the error so callers can inspect it. Modify the
error return at the parse/compile/execute failure sites (where root_span,
parse_span.finish_error(...), ExecutionTrace and TraceStatus::Error are used and
you currently do return Err(e)) to wrap or augment the error with the
constructed ExecutionTrace (e.g. return Err(ErrorType::with_trace(e, trace)) or
convert to an error type that contains the ExecutionTrace), and update the
function return types/signatures accordingly; apply the same change to the other
two spots that build _trace (the blocks around compile_span and execute_span).
Ensure the chosen approach preserves the original error payload while making the
ExecutionTrace reachable to callers.
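The suggested shape can be sketched in Python for brevity (the actual fix would be a Rust error type; ZapcodeError here is hypothetical):

```python
# Sketch: wrap the original error with the trace built on the failure path,
# so callers can still inspect parse/compile/execute timing after an Err.
class ZapcodeError(Exception):
    def __init__(self, message, trace=None):
        super().__init__(message)
        self.trace = trace  # the ExecutionTrace that would otherwise be dropped

def parse(code):
    trace = {"root": {"name": "run", "status": "Error",
                      "children": [{"name": "parse", "status": "Error"}]}}
    raise ZapcodeError("unexpected token", trace=trace)

try:
    parse("let = ;")
except ZapcodeError as e:
    print(str(e), e.trace["root"]["status"])  # → unexpected token Error
```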
 pub fn start(&self, input_values: Vec<(String, Value)>) -> Result<VmState> {
-    let program = crate::parser::parse(&self.source)?;
-    let ext_set: HashSet<String> = self.external_functions.iter().cloned().collect();
-    let compiled = crate::compiler::compile_with_externals(&program, ext_set.clone())?;
-    let mut vm = Vm::new(compiled, self.limits.clone(), ext_set);
-
-    for (name, value) in input_values {
-        vm.globals.insert(name, value);
-    }
-
-    vm.run()
+    let result = self.run(input_values)?;
+    Ok(result.state)
 }
start() throws away the new trace.
Lines 2298-2299 call self.run() and then return only result.state. Direct Rust callers using the documented suspension API have no way to access tracing on this path; this needs a trace-carrying start API instead of discarding RunResult.trace.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/zapcode-core/src/vm/mod.rs` around lines 2297 - 2300, The start()
method currently discards RunResult.trace by calling self.run(...) and returning
only result.state; change the API to preserve and return the trace (either by:
1) modifying start() to return Result<RunResult> instead of Result<VmState>, or
2) adding a new start_with_trace / start_full that returns RunResult while
keeping start() as a convenience wrapper (if backward compatibility is required,
have start() call run() and map/runResult.state but provide the new
trace-carrying variant). Update callers that expect VmState to use the new
return type or unwrap the .state from RunResult, and ensure
documentation/comments reference RunResult.trace and the documented suspension
API.
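One way the suggested trace-carrying variant could look, sketched in Python for brevity (the real change would be in Rust; start_with_trace is the reviewer's suggested name, not an existing API):

```python
# Sketch: keep start() as a convenience wrapper and add a variant that
# returns the full run result, so the trace is not discarded.
def run(input_values):
    # stand-in for the VM's run(); returns state plus the trace tree
    return {"state": "done", "trace": {"root": {"name": "run", "status": "Ok"}}}

def start_with_trace(input_values):
    # proposed variant: full result, trace included
    return run(input_values)

def start(input_values):
    # existing behavior preserved: state only
    return start_with_trace(input_values)["state"]

result = start_with_trace([])
print(start([]), result["trace"]["root"]["name"])  # → done run
```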
 pub fn resume(&self, return_value: JsValue) -> Result<JsValue, JsError> {
     let val = js_to_value(&return_value)?;
     let state = self.inner.clone().resume(val).map_err(zapcode_err)?;
-    vm_state_to_js(state, "")
+    vm_state_to_js(state, "", None)
 }
resume() drops tracing on the main suspension path.
Lines 353-354 force trace to None, so any execution that suspends loses trace data as soon as it resumes. That leaves multi-step external-call flows without end-to-end tracing in the wasm API.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/zapcode-wasm/src/lib.rs` around lines 350 - 354, The resume()
implementation is discarding tracing by passing None to vm_state_to_js; instead
preserve and forward the trace from the VM state returned by
self.inner.clone().resume(val). After obtaining state in resume(), extract its
trace (e.g., state.trace or similar field) and pass that (or its cloned/borrowed
form as the expected Option<&str>/Option<String>) into vm_state_to_js rather
than None so resumed executions keep end-to-end trace data; update the call in
resume() accordingly while keeping zapcode_err mapping intact.
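The suggested forwarding can be sketched in Python (the real fix is in the wasm crate; the state dict keys here are assumptions):

```python
# Sketch: forward the trace carried on the resumed VM state instead of None.
def vm_state_to_js(state, stdout, trace):
    return {"state": state, "stdout": stdout, "trace": trace}

def resume(resumed_state):
    # before: vm_state_to_js(resumed_state, "", None)  -- trace dropped
    return vm_state_to_js(resumed_state["status"], "", resumed_state.get("trace"))

out = resume({"status": "done", "trace": {"root": "run"}})
print(out["trace"])  # → {'root': 'run'}
```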
exec_span = _create_span("execute", {"zapcode.code": code}) if tracing else None

result = tool_def.execute(named_args)
tool_calls.append({"name": fn_name, "args": args, "result": result})
try:
    kwargs: dict[str, Any] = {"external_functions": tool_names}
    if time_limit_ms is not None:
        kwargs["time_limit_ms"] = time_limit_ms
    if memory_limit_bytes is not None:
        kwargs["memory_limit_bytes"] = memory_limit_bytes

    snapshot: ZapcodeSnapshot = state["snapshot"]
    state = snapshot.resume(result)
    sandbox = Zapcode(code, **kwargs)
    state = sandbox.start()

    return ExecutionResult(
        output=state.get("output"),
        stdout=state.get("stdout", ""),
        tool_calls=tool_calls,
    )
    while state.get("suspended"):
        fn_name = state["function_name"]
        args = state["args"]

        tool_def = tool_defs.get(fn_name)
        if not tool_def:
            raise ValueError(
                f"Guest code called unknown function '{fn_name}'. "
                f"Available: {', '.join(tool_names)}"
            )

        # Build named args from positional args
        param_names = list(tool_def.parameters.keys())
        named_args = {
            param_names[i]: args[i]
            for i in range(min(len(param_names), len(args)))
        }

        tool_span = _create_span("tool_call", {
            "zapcode.tool.name": fn_name,
            "zapcode.tool.args": json.dumps(args, default=str),
        }) if tracing else None

        result = tool_def.execute(named_args)
        tool_calls.append({"name": fn_name, "args": args, "result": result})

        if tool_span:
            tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
            _end_span(tool_span)
            exec_span.children.append(tool_span)

        snapshot: ZapcodeSnapshot = state["snapshot"]
        state = snapshot.resume(result)

    stdout = state.get("stdout", "")

    if exec_span:
        exec_span.attributes["zapcode.output"] = json.dumps(state.get("output"), default=str)
        if stdout:
            exec_span.attributes["zapcode.stdout"] = stdout
        _end_span(exec_span)

    if debug and exec_span:
        _print_trace(exec_span)

    return ExecutionResult(
        code=code,
        output=state.get("output"),
        stdout=stdout,
        tool_calls=tool_calls,
        trace=exec_span,
    )
ExecutionResult.trace drops the per-phase trace this feature is supposed to expose.
The wrapper always returns the synthetic span built here, and its children are only tool_call spans. Nothing in this path threads the underlying Zapcode trace into result.trace, so callers cannot get the documented parse/compile/execute timing tree from get_trace().
🧰 Tools
🪛 Ruff (0.15.5)
[warning] 230-233: Abstract raise to an inner function
(TRY301)
[warning] 230-233: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 212 -
275, The returned ExecutionResult currently only includes the synthetic
exec_span (with tool_call children) and does not attach the Zapcode engine's
per-phase trace; fix by retrieving the underlying Zapcode trace from the
sandbox/state (e.g., call sandbox.get_trace() or use state["trace"] / snapshot
trace if available) after execution and merge or attach it to the exec_span (or
set ExecutionResult.trace to that full trace) so callers get the
parse/compile/execute timing tree; update the code around where exec_span is
created and where the ExecutionResult is returned to combine the engine trace
with exec_span (or replace exec_span) before constructing ExecutionResult.
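The suggested merge can be sketched with plain dicts. The state/trace keys below are assumptions about the binding contract, not the actual API:

```python
# Sketch: merge the engine's parse/compile/execute spans under the wrapper's
# execute span so get_trace() returns the full timing tree.
engine_trace = {"root": {"name": "run",
                         "children": [{"name": "parse"}, {"name": "compile"},
                                      {"name": "execute"}]}}
exec_span = {"name": "execute", "children": [{"name": "tool_call"}]}

# Attach the engine phases ahead of the wrapper's tool_call children.
exec_span["children"] = engine_trace["root"]["children"] + exec_span["children"]

print([c["name"] for c in exec_span["children"]])
# → ['parse', 'compile', 'execute', 'tool_call']
```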
result = tool_def.execute(named_args)
tool_calls.append({"name": fn_name, "args": args, "result": result})

if tool_span:
    tool_span.attributes["zapcode.tool.result"] = json.dumps(result, default=str)
    _end_span(tool_span)
    exec_span.children.append(tool_span)

snapshot: ZapcodeSnapshot = state["snapshot"]
state = snapshot.resume(result)
🧩 Analysis chain
🏁 Scripts executed during verification:
- find packages/zapcode-ai-python -type f -name "*.py" | head -20
- rg "class ToolDefinition" packages/zapcode-ai-python -A 10
- rg "def execute" packages/zapcode-ai-python -B 2 -A 10
- sed -n '240,265p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- rg "def resume" packages/zapcode-ai-python -B 2 -A 15
- rg "snapshot.resume" packages/zapcode-ai-python -B 3 -A 3
- sed -n '1,50p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- rg "class ZapcodeSnapshot" packages/zapcode-ai-python
- rg "ZapcodeSnapshot" packages/zapcode-ai-python | head -20
- rg "async def execute" packages/zapcode-ai-python
- sed -n '195,210p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- sed -n '200,220p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- rg "def handle_tool_call" packages/zapcode-ai-python -A 30
- rg "import inspect" packages/zapcode-ai-python
- sed -n '225,280p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- sed -n '60,90p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- rg "async" packages/zapcode-ai-python
- find packages/zapcode-ai-python -name "*test*" -o -name "*spec*" | head -10
- sed -n '52,65p' packages/zapcode-ai-python/src/zapcode_ai/__init__.py | cat -n
- rg "Awaitable" packages/zapcode-ai-python -B 2 -A 2
- cat packages/zapcode-ai-python/src/zapcode_ai/__init__.py | grep -n "class ToolDefinition" -A 20
- find packages/zapcode-ai-python -type f \( -name "*.py" -o -name "*.md" \) | xargs grep -l "async def" 2>/dev/null
- cat packages/zapcode-ai-python/README.md 2>/dev/null || cat packages/zapcode-ai-python/README.rst 2>/dev/null || echo "No README found"
- ls -la packages/zapcode-ai-python/
- rg "Awaitable" packages/zapcode-ai-python --no-filename
- grep -n "result = tool_def.execute" packages/zapcode-ai-python/src/zapcode_ai/__init__.py
Length of output: 117
🏁 Script executed:
python3 << 'EOF'
import inspect
# Verify inspect.isawaitable works as expected
async def async_func():
return 42
def sync_func():
return 42
coro = async_func()
print(f"inspect.isawaitable(coro) = {inspect.isawaitable(coro)}")
print(f"inspect.isawaitable(sync_func()) = {inspect.isawaitable(sync_func())}")
# Clean up
coro.close()
EOFRepository: TheUncharted/zapcode
Length of output: 137
Add guard against async tool implementations passed to synchronous execution.
Line 247 calls `tool_def.execute(named_args)` synchronously, yet `ToolDefinition.execute` is documented as returning `Any` or an awaitable. If a caller supplies an `async def` tool, the coroutine object gets stored in `tool_calls` and passed to `snapshot.resume(result)` instead of being awaited, causing unexpected behavior.
Suggested fix:

```diff
+import inspect
 import json
 import time
 from dataclasses import dataclass, field
```

```diff
 result = tool_def.execute(named_args)
+if inspect.isawaitable(result):
+    raise TypeError(
+        f"Tool '{fn_name}' returned an awaitable, but handle_tool_call() is synchronous. "
+        "Use a synchronous tool or add an async entry point."
+    )
 tool_calls.append({"name": fn_name, "args": args, "result": result})
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 247 -
256, The code calls tool_def.execute(named_args) synchronously and may receive
an awaitable from async tool implementations; detect awaitable results (use
inspect.isawaitable/asyncio.iscoroutine) after calling tool_def.execute and
resolve them before storing to tool_calls and passing to snapshot.resume: if
inside an async context await the result (result = await result), otherwise run
it to completion with the event loop (e.g., loop.run_until_complete(result) or
asyncio.run(result)) so that tool_calls, tool_span attribute logging, and
snapshot.resume receive the actual resolved value; update the block around
tool_def.execute, tool_calls, tool_span, and snapshot.resume accordingly.
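Standalone, the suggested guard behaves like this. A minimal sketch, not the package's actual `handle_tool_call()` — `run_tool` and the two sample tools are hypothetical, but the `inspect.isawaitable` check is the one the fix proposes.

```python
import inspect


def run_tool(name, execute, named_args):
    """Invoke a tool synchronously, rejecting coroutine results up front."""
    result = execute(named_args)
    if inspect.isawaitable(result):
        result.close()  # avoid a "coroutine was never awaited" warning
        raise TypeError(
            f"Tool '{name}' returned an awaitable, but this entry point is synchronous."
        )
    return result


def sync_double(args):
    return args["x"] * 2


async def async_double(args):
    return args["x"] * 2


print(run_tool("double", sync_double, {"x": 21}))  # prints 42
try:
    run_tool("double", async_double, {"x": 21})
except TypeError as exc:
    print(exc)
```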
```python
session_trace: TraceSpan | None = (
    _create_span("session", {"zapcode.tools": ", ".join(tools.keys())})
    if tracing else None
)
attempt_count = 0
```
Session trace state leaks across separate uses of the same `ZapcodeAI` instance.
`session_trace` and `attempt_count` live in the closure created by `zapcode()`. If an app reuses one `ZapcodeAI` instance across multiple chats, later attempts append onto the earlier trace and keep prior tool args/results reachable via `get_trace()`. This needs an explicit per-conversation session boundary or reset.
Also applies to: 420-434
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 414 -
418, The session_trace and attempt_count are stored in the closure created by
zapcode(), causing trace state to persist across chats; modify zapcode()/the
returned ZapcodeAI so that session_trace and attempt_count are reinitialized at
the start of each new conversation or API call (e.g., move their initialization
into the per-conversation handler or add an explicit reset method) and ensure
get_trace() reads only the current per-conversation span created via
_create_span("session", ...) (also update any logic in the related block around
the existing 420-434 code to reference the per-call/session variables rather
than the outer closure values).
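One shape the per-conversation boundary could take, sketched with simplified stand-ins — the `TraceSpan` dataclass and the `make_agent`/`run_conversation` names here are illustrative, not the package's API. The point is that the session span and attempt counter are created inside the per-conversation call, not captured once in the factory closure:

```python
from dataclasses import dataclass, field


@dataclass
class TraceSpan:
    name: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def make_agent():
    # Anti-pattern would be to create session_trace/attempt_count HERE:
    # that state would then be shared by every conversation.
    def run_conversation(prompt: str) -> TraceSpan:
        # Fresh per-conversation state — nothing leaks between calls.
        session_trace = TraceSpan("session")
        attempt_count = 0
        # ...agent loop would go here; record one attempt as a demo...
        attempt_count += 1
        session_trace.children.append(
            TraceSpan("attempt", {"n": attempt_count, "prompt": prompt})
        )
        return session_trace

    return run_conversation


run = make_agent()
first = run("hello")
second = run("world")
assert first is not second            # separate session roots
assert len(second.children) == 1      # no carry-over from the first chat
```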
```typescript
const execSpan = tracing ? createSpan("execute", { "zapcode.code": code }) : undefined;

try {
  const sandbox = new Zapcode(code, {
    externalFunctions: toolNames,
    timeLimitMs: options.timeLimitMs ?? 10_000,
    memoryLimitMb: options.memoryLimitMb ?? 32,
  });

  let state = sandbox.start();
  let stdout = "";

  // Snapshot/resume loop — resolve each tool call as the VM suspends
  while (!state.completed) {
    const { functionName, args } = state;

    const toolDef = toolDefs[functionName];
    if (!toolDef) {
      throw new Error(
        `Guest code called unknown function '${functionName}'. ` +
        `Available: ${toolNames.join(", ")}`
      );
    }

    // Build named args from positional args using the parameter schema
    const paramNames = Object.keys(toolDef.parameters);
    const namedArgs: Record<string, unknown> = {};
    for (let i = 0; i < paramNames.length && i < args.length; i++) {
      namedArgs[paramNames[i]] = args[i];
    }

    const toolSpan = tracing ? createSpan("tool_call", {
      "zapcode.tool.name": functionName,
      "zapcode.tool.args": JSON.stringify(args),
    }) : undefined;

    const result = await toolDef.execute(namedArgs);
    toolCalls.push({ name: functionName, args, result });

    if (toolSpan) {
      toolSpan.attributes["zapcode.tool.result"] = JSON.stringify(result);
      endSpan(toolSpan);
      execSpan!.children.push(toolSpan);
    }

    // Resume the VM with the tool's return value
    const snapshot = ZapcodeSnapshotHandle.load(state.snapshot);
    state = snapshot.resume(result);
  }

  if (state.stdout) {
    stdout = state.stdout;
  }

  if (execSpan) {
    execSpan.attributes["zapcode.output"] = JSON.stringify(state.output);
    if (stdout) execSpan.attributes["zapcode.stdout"] = stdout;
    endSpan(execSpan);
  }

  if (debug && execSpan) {
    printTrace(execSpan);
  }

  return {
    code,
    output: state.output,
    stdout,
    toolCalls,
    ...(execSpan ? { trace: execSpan } : {}),
```
This trace tree can never show parse/compile.
`executeCode()` only creates an `execute` span plus `tool_call` children. Nothing in this path adds `parse` or `compile`, so `printTrace()` cannot produce the parse → compile → execute tree shown by the new debug-tracing example. Reuse the lower-level run trace as the attempt root and hang the AI-layer spans under it instead of replacing it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai/src/index.ts` around lines 318 - 387, The current
executeCode path creates a top-level execSpan via createSpan("execute", ...)
which replaces the VM's own run trace so parse/compile spans never appear;
locate the VM/run-level trace produced by Zapcode (look for the run/root span
exposed by Zapcode instance or the snapshot/state object returned from
Zapcode.start()/ZapcodeSnapshotHandle.load — e.g., a runSpan on the sandbox or
state) and make the execSpan a child of that run-level span instead of a
standalone root: create execSpan using createSpan as before but attach it into
runSpan.children, and when ending spans use endSpan(toolSpan) and
endSpan(execSpan) then ensure you add execSpan to the runSpan.children and pass
runSpan into printTrace(debug) so the final trace shows parse → compile →
execute with tool_call children.
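The re-rooting this asks for is a small tree operation. Sketched here in Python with an illustrative span shape (field names may differ from the real bindings): keep the engine's run span — which already holds the parse/compile/execute children — as the root, and append the AI-layer span under it rather than returning the AI-layer span alone.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    children: list = field(default_factory=list)


def attach_under_engine_root(engine_root: Span, ai_span: Span) -> Span:
    """Return the engine trace as the result root, with the AI-layer span nested."""
    engine_root.children.append(ai_span)
    return engine_root


# Hypothetical engine trace: phases captured by the core VM itself.
engine = Span("run", [Span("parse"), Span("compile"), Span("execute")])
ai = Span("execute_code", [Span("tool_call")])

root = attach_under_engine_root(engine, ai)
assert [c.name for c in root.children] == ["parse", "compile", "execute", "execute_code"]
```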
- Fix debug flag not passed through in zapcode-ai TS (was hardcoded to false, so `debug: true` had no effect on per-attempt traces)
- Remove unnecessary f-strings (Ruff F541) in Python files
- Add language tags to fenced code blocks (markdownlint MD040)
- Fix cd paths in examples/README.md to work from repo root
- Remove unnecessary npx prefix in npm scripts (tsx is a devDep)
- Make CONTRIBUTING.md E2E commands copy-pasteable
- Add AWS credentials prerequisites to Bedrock/debug-tracing READMEs
- Clarify working directory for Rust example
Actionable comments posted: 4
♻️ Duplicate comments (4)
packages/zapcode-ai-python/src/zapcode_ai/__init__.py (3)
414-418: ⚠️ Potential issue | 🟠 Major — Make trace state per conversation, not per `zapcode()` instance.

`session_trace` and `attempt_count` live in the outer closure, so a reused `ZapcodeAI` instance keeps appending later chats onto the old session. That leaks prior tool args/results through `get_trace()` and keeps mutating a span that may already have been finalized.

Also applies to: 420-434, 470-475
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 414 - 418, session_trace and attempt_count are declared in the outer closure so they persist across calls to zapcode(), leaking previous conversation state into get_trace() and mutating finalized spans; move their declarations into the per-conversation scope (e.g., inside the zapcode() function or a new Conversation context) so each invocation gets a fresh session_trace (created via _create_span("session", ...)) and attempt_count initialized to 0, and update any logic that references session_trace/attempt_count (including code paths around _create_span, get_trace(), and span finalization) to use the per-call variables rather than the outer-closure ones.
247-256: ⚠️ Potential issue | 🟠 Major — Reject or resolve awaitable tool results here.

`tool_def.execute()` is invoked synchronously, but the public contract still says it may return an awaitable. An `async def` tool will hand a coroutine object to `snapshot.resume()` instead of the real tool result.

🛠 Suggested guard:

```diff
+import inspect
 import json
 import time
@@
 result = tool_def.execute(named_args)
+if inspect.isawaitable(result):
+    raise TypeError(
+        f"Tool '{fn_name}' returned an awaitable, but handle_tool_call() is synchronous."
+    )
 tool_calls.append({"name": fn_name, "args": args, "result": result})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 247 - 256, tool_def.execute() can return an awaitable (coroutine/awaitable) which must be resolved before calling snapshot.resume; detect awaitables (e.g. via inspect.isawaitable(result)) after result = tool_def.execute(named_args), await them to get the real value, then use that resolved value for tool_calls, tool_span logging (zapcode.tool.result) and snapshot.resume(result); ensure this change touches the block around tool_def.execute, tool_span handling, and snapshot.resume so that awaitable results for async tools are resolved before resuming the ZapcodeSnapshot.
212-213: ⚠️ Potential issue | 🟠 Major — Return the engine trace, not only `exec_span`.

`trace=exec_span` still drops the per-phase trace coming from Zapcode itself, so `get_trace()` can never show parse/compile/execute timings. This path needs to thread the binding's trace through and attach the wrapper span under that root instead of replacing it.

Also applies to: 269-275

🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 212 - 213, The current code sets trace=exec_span which overwrites the binding's existing per-phase trace; instead, retrieve the binding's root trace (e.g., binding.get_trace() or the trace object returned/available from the binding call), attach or nest the wrapper span created by _create_span("execute", {"zapcode.code": code}) under that root (e.g., as a child span) and pass the original binding trace as trace to the caller; update both occurrences where exec_span is used (the earlier exec path and the second occurrence around lines 269-275) so you thread the binding trace through and only add the wrapper span rather than replacing the binding's trace.

packages/zapcode-ai/src/index.ts (1)
318-319: ⚠️ Potential issue | 🟠 Major — Use the Zapcode run trace as `ExecutionResult.trace`.

`ExecutionResult.trace` still points at the synthetic span created in this wrapper, so callers only ever see `execute`/`tool_call` nodes. The new tracing feature still won't expose the documented parse → compile → execute tree until this path threads the engine trace through and attaches the AI-layer span under that root.

Also applies to: 382-388
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/zapcode-ai/src/index.ts` around lines 318 - 319, The current wrapper creates execSpan via createSpan("execute", { "zapcode.code": code }) and then returns an ExecutionResult whose trace points at that synthetic exec span, preventing callers from seeing the engine's parse→compile→execute tree; update the code path to accept/obtain the engine's trace (the trace produced by the engine run/execute call), set ExecutionResult.trace to that engine trace, and attach the AI-layer execSpan as a child of the engine root (either by creating execSpan with the engine trace/span as parent or by linking execSpan into engineTrace using the engine trace API) so callers receive the full engine trace with the AI-layer span nested under it; apply the same change for the other occurrence referenced (the block around lines 382–388).
🧹 Nitpick comments (1)
examples/python/debug-tracing/main.py (1)
189-200: Exception messages are acceptable for example code.

Static analysis flags TRY003 (long messages in exception raises), which suggests moving messages to custom exception classes. For a self-contained example script, inline descriptive messages in `RuntimeError` are appropriate and improve readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/python/debug-tracing/main.py` around lines 189 - 200, The inline descriptive exception messages in the two RuntimeError raises inside main() are acceptable for an example script, so suppress the static analysis warning by adding a per-line noqa pragma (e.g., append "# noqa: TRY003") to each raise statement (the one raising for unexpected stop_reason and the one for exceeding max_steps) so the linter ignores TRY003 here while keeping the helpful inline messages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CONTRIBUTING.md`:
- Line 64: Update the E2E Python instruction that currently runs "cd
crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic
&& python main.py" to first create/activate a Python virtualenv and install
maturin into it; specifically, add steps to create a venv (python -m venv
.venv), activate it (.venv\\Scripts\\activate on Windows or source
.venv/bin/activate on Unix), and run pip install --upgrade pip maturin before
running maturin develop so that maturin installs into the active environment.
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py`:
- Around line 186-190: The debug trace currently omits the generated code
because _print_trace() filters out attributes with keys starting with
"zapcode.code", which contradicts the public docstring promise; update
_print_trace() to include a truncated representation of the code attribute
(e.g., take span.attributes.get("zapcode.code") and include only the first N
characters with an ellipsis) when present instead of filtering it out, while
still avoiding dumping full code content; locate the attrs construction that
iterates span.attributes.items() and add special handling for the "zapcode.code"
key so it is included truncated in the printed attrs string.
In `@packages/zapcode-ai/src/index.ts`:
- Around line 463-466: The sessionTrace and attemptCount are currently captured
once inside zapcode() causing a single TraceSpan and attempt counter to be
reused across requests; move creation of sessionTrace (from createSpan) and
initialization of attemptCount into the per-request execution scope (the
function returned by zapcode()) so each invocation creates a fresh TraceSpan and
resets attemptCount, and ensure a new ZapcodeAIResult is constructed per call
(not reused) so getTrace() only returns the current request's spans and tool
args/results. Also audit related code paths referenced by the comment (the
blocks around where sessionTrace is used and where ZapcodeAIResult is
built/returned) to remove shared mutable state across invocations.
- Around line 349-352: The current tracing code calls JSON.stringify(args) which
can throw on bigints or circular structures; wrap the serialization in a
non-throwing routine before calling createSpan (referencing tracing, toolSpan,
createSpan, "tool_call", functionName, args) — implement a safeSerialize that
tries JSON.stringify(args) in a try/catch and on failure falls back to a safe
replacer (handle bigint by converting to string and track seen objects to avoid
circular errors) or uses a fallback like String(args) or util.inspect, then pass
the safe result into createSpan so tracing cannot throw and will always produce
a string for "zapcode.tool.args".
---
Duplicate comments:
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py`:
- Around line 414-418: session_trace and attempt_count are declared in the outer
closure so they persist across calls to zapcode(), leaking previous conversation
state into get_trace() and mutating finalized spans; move their declarations
into the per-conversation scope (e.g., inside the zapcode() function or a new
Conversation context) so each invocation gets a fresh session_trace (created via
_create_span("session", ...)) and attempt_count initialized to 0, and update any
logic that references session_trace/attempt_count (including code paths around
_create_span, get_trace(), and span finalization) to use the per-call variables
rather than the outer-closure ones.
- Around line 247-256: tool_def.execute() can return an awaitable
(coroutine/awaitable) which must be resolved before calling snapshot.resume;
detect awaitables (e.g. via inspect.isawaitable(result)) after result =
tool_def.execute(named_args), await them to get the real value, then use that
resolved value for tool_calls, tool_span logging (zapcode.tool.result) and
snapshot.resume(result); ensure this change touches the block around
tool_def.execute, tool_span handling, and snapshot.resume so that awaitable
results for async tools are resolved before resuming the ZapcodeSnapshot.
- Around line 212-213: The current code sets trace=exec_span which overwrites
the binding's existing per-phase trace; instead, retrieve the binding's root
trace (e.g., binding.get_trace() or the trace object returned/available from the
binding call), attach or nest the wrapper span created by
_create_span("execute", {"zapcode.code": code}) under that root (e.g., as a
child span) and pass the original binding trace as trace to the caller; update
both occurrences where exec_span is used (the earlier exec path and the second
occurrence around lines 269-275) so you thread the binding trace through and
only add the wrapper span rather than replacing the binding's trace.
In `@packages/zapcode-ai/src/index.ts`:
- Around line 318-319: The current wrapper creates execSpan via
createSpan("execute", { "zapcode.code": code }) and then returns an
ExecutionResult whose trace points at that synthetic exec span, preventing
callers from seeing the engine's parse→compile→execute tree; update the code
path to accept/obtain the engine's trace (the trace produced by the engine
run/execute call), set ExecutionResult.trace to that engine trace, and attach
the AI-layer execSpan as a child of the engine root (either by creating execSpan
with the engine trace/span as parent or by linking execSpan into engineTrace
using the engine trace API) so callers receive the full engine trace with the
AI-layer span nested under it; apply the same change for the other occurrence
referenced (the block around lines 382–388).
---
Nitpick comments:
In `@examples/python/debug-tracing/main.py`:
- Around line 189-200: The inline descriptive exception messages in the two
RuntimeError raises inside main() are acceptable for an example script, so
suppress the static analysis warning by adding a per-line noqa pragma (e.g.,
append "# noqa: TRY003") to each raise statement (the one raising for unexpected
stop_reason and the one for exceeding max_steps) so the linter ignores TRY003
here while keeping the helpful inline messages.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 852875e0-aa90-45f9-95c7-e1bb697851a0
📒 Files selected for processing (11)
- CONTRIBUTING.md
- examples/README.md
- examples/python/ai-bedrock/README.md
- examples/python/debug-tracing/README.md
- examples/python/debug-tracing/main.py
- examples/rust/basic/README.md
- examples/typescript/ai-agent/package.json
- examples/typescript/ai-bedrock/README.md
- examples/typescript/debug-tracing/README.md
- packages/zapcode-ai-python/src/zapcode_ai/__init__.py
- packages/zapcode-ai/src/index.ts
🚧 Files skipped from review as they are similar to previous changes (7)
- examples/typescript/debug-tracing/README.md
- examples/README.md
- examples/python/ai-bedrock/README.md
- examples/rust/basic/README.md
- examples/python/debug-tracing/README.md
- examples/typescript/ai-agent/package.json
- examples/typescript/ai-bedrock/README.md
```diff
-- E2E JS: build bindings then run `examples/typescript/basic.ts`
-- E2E Python: build bindings then run `examples/python/basic.py`
+- E2E JS: `cd crates/zapcode-js && npm install && npx napi build --release --platform --js index.js --dts index.d.ts && cd ../../examples/typescript/basic && npm install && npx tsx main.ts`
+- E2E Python: `cd crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic && python main.py`
```
Document the Python env/bootstrap step before `maturin develop`.
`maturin develop` installs into the current virtualenv, and maturin itself must be installed separately. A fresh contributor will hit a failure before the example runs if neither is set up. Please add the venv creation and maturin installation step.
♻️ Suggested doc update:

```diff
-- E2E Python: `cd crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic && python main.py`
+- E2E Python: `python -m venv .venv && . .venv/bin/activate && pip install maturin && cd crates/zapcode-py && maturin develop --release && cd ../../examples/python/basic && python main.py`
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CONTRIBUTING.md` at line 64, Update the E2E Python instruction that currently
runs "cd crates/zapcode-py && maturin develop --release && cd
../../examples/python/basic && python main.py" to first create/activate a Python
virtualenv and install maturin into it; specifically, add steps to create a venv
(python -m venv .venv), activate it (.venv\\Scripts\\activate on Windows or
source .venv/bin/activate on Unix), and run pip install --upgrade pip maturin
before running maturin develop so that maturin installs into the active
environment.
```python
attrs = " ".join(
    f"{k}={str(v)[:80]}" for k, v in span.attributes.items()
    if not k.startswith("zapcode.code")  # don't dump full code in trace
)
print(f"{prefix}{icon} {span.name} ({duration}){' ' + attrs if attrs else ''}")
```
`debug=True` currently omits the generated code.
`_print_trace()` filters out `zapcode.code`, so the implementation doesn't match the public promise on Lines 376-379. Either print a truncated code attribute here or narrow the docstring.
Also applies to: 376-379
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai-python/src/zapcode_ai/__init__.py` around lines 186 -
190, The debug trace currently omits the generated code because _print_trace()
filters out attributes with keys starting with "zapcode.code", which contradicts
the public docstring promise; update _print_trace() to include a truncated
representation of the code attribute (e.g., take
span.attributes.get("zapcode.code") and include only the first N characters with
an ellipsis) when present instead of filtering it out, while still avoiding
dumping full code content; locate the attrs construction that iterates
span.attributes.items() and add special handling for the "zapcode.code" key so
it is included truncated in the printed attrs string.
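One possible shape for the truncation — a sketch, not the package's `_print_trace()`; the 120-character cap and the `format_attrs` name are arbitrary choices:

```python
def format_attrs(attributes: dict, code_limit: int = 120) -> str:
    """Render span attributes, including a truncated zapcode.code instead of dropping it."""
    parts = []
    for k, v in attributes.items():
        text = str(v)
        if k == "zapcode.code" and len(text) > code_limit:
            text = text[:code_limit] + "…"  # keep the code visible, just capped
        else:
            text = text[:80]  # same cap the current implementation applies
        parts.append(f"{k}={text}")
    return " ".join(parts)


print(format_attrs({"zapcode.code": "x" * 200, "zapcode.output": "42"}))
```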
const toolSpan = tracing ? createSpan("tool_call", {
  "zapcode.tool.name": functionName,
  "zapcode.tool.args": JSON.stringify(args),
}) : undefined;
🧩 Analysis chain
Repository: TheUncharted/zapcode
🏁 Scripts executed to verify the finding (output collapsed):

# First, check that the file exists and read it
cd packages/zapcode-ai && wc -l src/index.ts

# Read the specific sections mentioned in the review
cd packages/zapcode-ai && sed -n '340,380p' src/index.ts | cat -n

# Search for all JSON.stringify calls in the file, with context
cd packages/zapcode-ai && rg 'JSON\.stringify' src/index.ts -n -A 2 -B 2

# Check for an existing safe serialization helper
cd packages/zapcode-ai && rg 'safeStringify|function.*Stringify' src/index.ts -n

# Find the printTrace function to understand the context for the proposed fix
cd packages/zapcode-ai && rg 'function printTrace' src/index.ts -A 5 -n

# List all JSON.stringify occurrences by line
cd packages/zapcode-ai && rg 'JSON\.stringify' src/index.ts -n

# Get more context around the printTrace function
cd packages/zapcode-ai && sed -n '280,310p' src/index.ts | cat -n

# Check for error-handling patterns around tracing
cd packages/zapcode-ai && rg 'tracing|createSpan|endSpan' src/index.ts -n | head -20

# Check whether the call sits in a try-catch or similar error-handling context
cd packages/zapcode-ai && sed -n '300,380p' src/index.ts | cat -n
Make trace serialization non-throwing.
JSON.stringify() throws on bigint values and circular objects. With debug or autoFix enabled, that means a tracing failure can take down an otherwise successful execution path.
🛠 Suggested change
+function safeStringify(value: unknown): string {
+ try {
+ return JSON.stringify(value, (_key, current) =>
+ typeof current === "bigint" ? `${current}n` : current
+ );
+ } catch {
+ return String(value);
+ }
+}
+
function printTrace(span: TraceSpan, indent = 0): void {
const prefix = indent === 0 ? "" : "│ ".repeat(indent - 1) + "├─ ";
const icon = span.status === "error" ? "✗" : "✓";
@@
const toolSpan = tracing ? createSpan("tool_call", {
"zapcode.tool.name": functionName,
- "zapcode.tool.args": JSON.stringify(args),
+ "zapcode.tool.args": safeStringify(args),
}) : undefined;
@@
if (toolSpan) {
- toolSpan.attributes["zapcode.tool.result"] = JSON.stringify(result);
+ toolSpan.attributes["zapcode.tool.result"] = safeStringify(result);
endSpan(toolSpan);
execSpan!.children.push(toolSpan);
}
@@
if (execSpan) {
- execSpan.attributes["zapcode.output"] = JSON.stringify(state.output);
+ execSpan.attributes["zapcode.output"] = safeStringify(state.output);
if (stdout) execSpan.attributes["zapcode.stdout"] = stdout;
endSpan(execSpan);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai/src/index.ts` around lines 349 - 352, The current tracing
code calls JSON.stringify(args) which can throw on bigints or circular
structures; wrap the serialization in a non-throwing routine before calling
createSpan (referencing tracing, toolSpan, createSpan, "tool_call",
functionName, args) — implement a safeSerialize that tries JSON.stringify(args)
in a try/catch and on failure falls back to a safe replacer (handle bigint by
converting to string and track seen objects to avoid circular errors) or uses a
fallback like String(args) or util.inspect, then pass the safe result into
createSpan so tracing cannot throw and will always produce a string for
"zapcode.tool.args".
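The suggested serializer can be exercised on its own. Below is a standalone sketch (the `safeStringify` body mirrors the diff above; the sample values are illustrative, not taken from the codebase) showing that bigints and circular objects no longer throw:

```typescript
// Non-throwing serializer from the suggested change above.
function safeStringify(value: unknown): string {
  try {
    return JSON.stringify(value, (_key, current) =>
      // JSON.stringify rejects bigint; encode it with an `n` suffix instead.
      typeof current === "bigint" ? `${current}n` : current
    );
  } catch {
    // Circular structures (and anything else JSON.stringify rejects)
    // fall back to a best-effort string rather than throwing.
    return String(value);
  }
}

// bigint values are encoded instead of throwing:
console.log(safeStringify({ id: 123n })); // {"id":"123n"}

// circular objects fall back to String(value) instead of throwing:
const circular: Record<string, unknown> = {};
circular.self = circular;
console.log(safeStringify(circular)); // [object Object]
```

The replacer-based path keeps normal JSON output byte-for-byte identical to JSON.stringify for serializable values, so existing trace consumers are unaffected.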
const sessionTrace: TraceSpan | undefined = tracing
  ? createSpan("session", { "zapcode.tools": Object.keys(toolDefs).join(", ") })
  : undefined;
let attemptCount = 0;
Don't keep one session trace for the entire wrapper lifetime.
sessionTrace and attemptCount are captured once in zapcode(). Reusing the same ZapcodeAIResult across chats/requests keeps appending new attempts onto the previous session and exposes earlier tool args/results through getTrace().
Also applies to: 469-477, 527-530
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/zapcode-ai/src/index.ts` around lines 463 - 466, The sessionTrace
and attemptCount are currently captured once inside zapcode() causing a single
TraceSpan and attempt counter to be reused across requests; move creation of
sessionTrace (from createSpan) and initialization of attemptCount into the
per-request execution scope (the function returned by zapcode()) so each
invocation creates a fresh TraceSpan and resets attemptCount, and ensure a new
ZapcodeAIResult is constructed per call (not reused) so getTrace() only returns
the current request's spans and tool args/results. Also audit related code paths
referenced by the comment (the blocks around where sessionTrace is used and
where ZapcodeAIResult is built/returned) to remove shared mutable state across
invocations.
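The scoping bug can be illustrated in isolation. The sketch below uses simplified stand-ins for TraceSpan/createSpan (not the real zapcode-ai API) to contrast a span tree captured at wrapper-creation time with one created per request:

```typescript
// Hypothetical simplified types, only to illustrate trace scoping.
interface TraceSpan {
  name: string;
  attributes: Record<string, string>;
  children: TraceSpan[];
}

function createSpan(name: string, attributes: Record<string, string>): TraceSpan {
  return { name, attributes, children: [] };
}

// Problematic shape: sessionTrace is captured once in the factory closure,
// so every later request appends onto the same tree and leaks earlier spans.
function makeWrapperShared() {
  const sessionTrace = createSpan("session", {});
  return (toolName: string): TraceSpan => {
    sessionTrace.children.push(createSpan("tool_call", { "zapcode.tool.name": toolName }));
    return sessionTrace;
  };
}

// Fixed shape: each invocation builds a fresh span tree, so the returned
// trace only ever exposes the current request's spans.
function makeWrapperPerRequest() {
  return (toolName: string): TraceSpan => {
    const sessionTrace = createSpan("session", {});
    sessionTrace.children.push(createSpan("tool_call", { "zapcode.tool.name": toolName }));
    return sessionTrace;
  };
}

const shared = makeWrapperShared();
shared("fetch");
console.log(shared("save").children.length); // 2 — earlier call leaks in

const fresh = makeWrapperPerRequest();
fresh("fetch");
console.log(fresh("save").children.length); // 1 — isolated per request
```

The same reasoning applies to attemptCount: any mutable state initialized in the factory rather than the per-request scope accumulates across invocations.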
Summary
Add new feature related to debugging and tracing
Changes
New Features
Documentation
Test plan
cargo test
Summary by CodeRabbit
Release Notes
New Features
Examples & Documentation