fix: comprehensive bug fixes for agent debugger#1
Conversation
BUG-004: Remove await on sync methods in PersistenceManager - Fixed TypeError by removing await from get_session_ids() and flush() BUG-005: Fix race condition in EventBuffer - Moved all mutations inside async with self._lock - Changed _enforce_bounds to sync method BUG-002: Remove duplicate event publishing in adapters - Removed explicit buffer.publish() calls after _emit_event() - Events are now published only once via context._emit_event() BUG-003: Remove duplicate AGENT_START/END in trace_agent decorator - Decorator no longer emits its own lifecycle events - TraceContext.__aenter__/__aexit__ handles all events BUG-012: Pass exception info to __aexit__ in decorators - trace_tool and trace_llm now pass (type, value, traceback) - Error events properly capture exception info Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BUG-001: Replace dangerouslySetInnerHTML with safe JsonHighlight - Created safe JsonHighlight component that renders JSON recursively - Eliminates XSS vulnerability from regex-based HTML injection - ToolInspector.tsx now uses React elements instead of innerHTML BUG-007: Align frontend types with backend - EventType now includes agent_start, agent_end, checkpoint - Removed user_prompt (not in backend) - Session interface aligned with backend Session.to_dict() - Status uses 'running' instead of 'active' - Added proper fields: started_at, ended_at, total_tokens, etc. BUG-008: Fix API response type mismatches - getTraceBundle returns TraceBundle with session, events, tree - Types properly aligned with backend API responses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BUG-013: Fix LangChain on_tool_start type annotation
- Changed input_str from str to str | dict[str, Any] to match
LangChain's actual behavior
BUG-014: Filter None values from LangChain settings dict
- Settings now excludes None values to prevent serialization issues
BUG-015: Fix memory leak in _run_map
- Changed .get() to .pop() in on_chain_end to clean up entries
- Added _run_map.pop() in on_chain_error for error cleanup
BUG-022: Add offset parameter to list_events
- Added offset parameter to repository list_events() method
- Added offset query param to /api/sessions/{session_id}/traces endpoint
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BUG-009: Differentiate duplicate replay buttons in SessionReplay - Add onJumpToPrevCheckpoint and onJumpToNextCheckpoint handlers - Replace step-back/step-forward with checkpoint jump buttons - Use distinct SVG icons: double-chevron for checkpoint jumps, single-chevron for single steps - Update titles and aria-labels for clarity BUG-010: Fix stale closure in DecisionTree handleNodeDoubleClick - Add renderTreeRef to break circular dependency between handleNodeDoubleClick and renderTree - handleNodeDoubleClick now calls renderTreeRef.current() instead of renderTree directly - Keep renderTreeRef in sync with renderTree via useEffect BUG-011: Add error recovery to SSE hook - Implement exponential backoff with max 5 retries, max 30s delay - Add 30% jitter to prevent thundering herd - Propagate connection error state via setError() to Zustand store - Return reconnectionState for UI display of reconnection status - Properly clean up timeouts and reset retry count on new session Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BUG-017: Read CORS origins from AGENT_DEBUGGER_CORS_ORIGINS env var - Supports comma-separated list of origins for production - Defaults to wildcard (*) for development convenience BUG-018: Add input validation to prevent DoS attacks - 1MB max size for data and metadata dicts - 255 character max for name field - Uses Pydantic field_validator for early rejection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR targets a broad set of “agent debugger” fixes across the FastAPI backend, Python SDK tracing/decorators, and a new Vite/React frontend, with a focus on correctness (duplicate events, race conditions, error propagation) and security hardening (XSS + ingest limits), plus pagination support.
Changes:
- Backend: add pagination
offsetfor event listing and make CORS origins configurable viaAGENT_DEBUGGER_CORS_ORIGINS. - SDK/collector: fix event duplication paths, improve concurrency in
EventBuffer, and propagate exception info through tracing decorators/context. - Frontend: add a Vite + React UI with timeline/tree/replay/inspectors and typed API client.
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| storage/repository.py | Adds offset parameter to event pagination query. |
| api/main.py | Configurable CORS origins + exposes offset on traces endpoint. |
| collector/server.py | Adds ingest input size validation for name/data/metadata. |
| collector/buffer.py | Makes bounds enforcement synchronous and wraps publish flow in a lock. |
| collector/persistence.py | Removes await from sync buffer methods during flush. |
| agent_debugger_sdk/core/decorators.py | Removes duplicate agent start/end events; passes exception info to context exit. |
| agent_debugger_sdk/core/context.py | Adds stack trace handling when exiting context on error. |
| agent_debugger_sdk/adapters/pydantic_ai.py | Removes duplicate event publishing to the buffer. |
| agent_debugger_sdk/adapters/langchain.py | Filters None settings, fixes tool input typing, and cleans run map entries. |
| agent_debugger_sdk/adapters/tests/test_bug_fixes.py | Adds regression tests for the phase-1 bug fixes. |
| frontend/package.json | Introduces the frontend package (Vite/React/d3/zustand). |
| frontend/package-lock.json | Locks frontend dependencies. |
| frontend/vite.config.ts | Adds Vite dev server config with /api proxy. |
| frontend/tsconfig.json | Adds frontend TS config for app code. |
| frontend/tsconfig.node.json | Adds TS config for Vite config typing/build. |
| frontend/index.html | Adds Vite entry HTML. |
| frontend/.gitignore | Adds frontend-specific ignore rules. |
| frontend/src/main.tsx | Adds React entrypoint rendering <App />. |
| frontend/src/api/client.ts | Adds typed REST client + SSE EventSource creator. |
| frontend/src/types/index.ts | Adds core frontend types aligned with backend events/sessions/replay. |
| frontend/src/stores/sessionStore.ts | Adds Zustand session/trace state store. |
| frontend/src/hooks/useTrace.ts | Adds trace bundle loading hook. |
| frontend/src/hooks/useSSE.ts | Adds SSE connection with exponential backoff retries. |
| frontend/src/components/TraceTimeline.tsx | Adds timeline list UI for events. |
| frontend/src/components/ToolInspector.tsx | Adds tool call/result inspector with safe JSON rendering. |
| frontend/src/components/LLMViewer.tsx | Adds request/response viewer for LLM interactions. |
| frontend/src/components/SessionReplay.tsx | Adds playback controls with checkpoint navigation. |
| frontend/src/components/DecisionTree.tsx | Adds D3-based decision tree visualization. |
| frontend/src/App.tsx | Adds main console UI wiring sessions, replay, tree, timeline, inspectors. |
| frontend/src/App.css | Adds styling for the new console UI. |
Files not reviewed (1)
- frontend/package-lock.json: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| status = "completed" | ||
| if exc_type is not None: | ||
| status = "error" | ||
| # Get stack trace from traceback object | ||
| import traceback as tb_module | ||
|
|
||
| tb_str = tb_module.format_exc() if exc_tb else None | ||
|
|
||
| await self.record_error( | ||
| error_type=exc_type.__name__, | ||
| error_message=str(exc_val), | ||
| stack_trace=tb_str, | ||
| ) |
There was a problem hiding this comment.
TraceContext.aexit computes stack_trace via traceback.format_exc(), which ignores the provided exc_type/exc_val/exc_tb. When aexit is invoked manually (as in trace_tool/trace_llm when own_context), there may be no active exception, so format_exc() can return an unhelpful "NoneType: None" instead of the real traceback. Use traceback.format_exception(exc_type, exc_val, exc_tb) (and join) or format_exception_only when exc_tb is None, so stack_trace reliably matches the passed exception info.
| // handleNodeDoubleClick uses renderTreeRef.current to avoid stale closure | ||
| const handleNodeDoubleClick = useCallback( | ||
| (event: MouseEvent, d: d3.HierarchyNode<D3TreeNode>) => { | ||
| event.stopPropagation() | ||
| if (d.data.children.length > 0) { | ||
| d.data._collapsed = !d.data._collapsed | ||
| renderTreeRef.current() | ||
| } | ||
| }, | ||
| [] |
There was a problem hiding this comment.
Node collapse state is toggled on the current D3 hierarchy node (d.data._collapsed), but renderTree reconstructs a fresh hierarchy via convertToD3Tree(tree) on every render. That loses the mutated _collapsed flag, so double-click collapsing won’t persist or visibly change the tree. Persist collapsed state outside the ephemeral hierarchy (e.g., a ref/map keyed by event id) and apply it during convertToD3Tree / hierarchy construction.
| const collapsedNode = g.select(`.node[data-id="${child.data.id}"]`) | ||
| collapsedNode | ||
| .append('text') | ||
| .attr('x', 20) | ||
| .attr('dy', 4) | ||
| .attr('fill', '#9ca3af') | ||
| .attr('font-size', '11px') | ||
| .text(`+${child.data.children.length}`) |
There was a problem hiding this comment.
Collapsed-node indicator selects .node[data-id="..."], but the node elements never set a data-id attribute. As a result, collapsedNode is empty and the "+N" label will never render. Add an attribute (e.g., .attr('data-id', d => d.data.id)) or adjust the selector to match existing attributes.
| const collapsedNode = g.select(`.node[data-id="${child.data.id}"]`) | |
| collapsedNode | |
| .append('text') | |
| .attr('x', 20) | |
| .attr('dy', 4) | |
| .attr('fill', '#9ca3af') | |
| .attr('font-size', '11px') | |
| .text(`+${child.data.children.length}`) | |
| const collapsedNode = nodes.filter((d) => d.data.id === child.data.id) | |
| if (!collapsedNode.empty()) { | |
| collapsedNode | |
| .append('text') | |
| .attr('x', 20) | |
| .attr('dy', 4) | |
| .attr('fill', '#9ca3af') | |
| .attr('font-size', '11px') | |
| .text(`+${child.data.children.length}`) | |
| } |
| # Read CORS origins from environment variable for production configurability | ||
| # Default to wildcard for development convenience | ||
| cors_origins_str = os.environ.get("AGENT_DEBUGGER_CORS_ORIGINS", "*") | ||
| cors_origins = [origin.strip() for origin in cors_origins_str.split(",")] if cors_origins_str != "*" else ["*"] | ||
|
|
||
| app.add_middleware( | ||
| CORSMiddleware, | ||
| allow_origins=["*"], | ||
| allow_origins=cors_origins, | ||
| allow_methods=["*"], | ||
| allow_headers=["*"], | ||
| ) |
There was a problem hiding this comment.
CORS origin parsing can yield an empty-string origin when AGENT_DEBUGGER_CORS_ORIGINS is set to an empty value or contains trailing commas (e.g., "https://a.com,"). That produces allow_origins entries that won't match any real Origin header and can cause confusing behavior. Consider filtering out empty strings after splitting/stripping.
| @field_validator("data") | ||
| @classmethod | ||
| def validate_data_size(cls, v: dict[str, Any]) -> dict[str, Any]: | ||
| """Validate data dict size to prevent DoS attacks.""" | ||
| import json | ||
|
|
||
| try: | ||
| size = len(json.dumps(v)) | ||
| except (TypeError, ValueError): | ||
| size = len(str(v)) | ||
| if size > MAX_DATA_SIZE_BYTES: | ||
| raise ValueError(f"data must be {MAX_DATA_SIZE_BYTES} bytes or less") | ||
| return v | ||
|
|
||
| @field_validator("metadata") | ||
| @classmethod | ||
| def validate_metadata_size(cls, v: dict[str, Any]) -> dict[str, Any]: | ||
| """Validate metadata dict size to prevent DoS attacks.""" | ||
| import json | ||
|
|
||
| try: | ||
| size = len(json.dumps(v)) | ||
| except (TypeError, ValueError): | ||
| size = len(str(v)) | ||
| if size > MAX_METADATA_SIZE_BYTES: | ||
| raise ValueError(f"metadata must be {MAX_METADATA_SIZE_BYTES} bytes or less") | ||
| return v |
There was a problem hiding this comment.
The data/metadata size checks use len(json.dumps(v)), which measures Python string characters, not bytes. For multi-byte characters this undercounts relative to the stated "bytes" limit, and the error message becomes misleading. Compute the byte size instead (e.g., len(json.dumps(v, ensure_ascii=False).encode('utf-8'))) so the limit is actually enforced in bytes.
|
@copilot open a new pull request to apply changes based on the comments in this thread |
…, DecisionTree collapse (#2) * Initial plan * fix: address PR review comments - traceback, CORS, byte sizes, DecisionTree collapse Co-authored-by: acailic <12486087+acailic@users.noreply.github.com> Agent-Logs-Url: https://github.com/acailic/agent_debugger/sessions/40fb79e4-9e45-40b9-aeaa-110d748e671b --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: acailic <12486087+acailic@users.noreply.github.com>
Cover search returning no results (#1), error count semantic mismatch (#2), null optional fields (#3), drift detection data requirements (#4), empty behavior alerts (#5), and zero cost/tokens (#6). 29 tests across 3 files: 27 passing, 2 xfail documenting real bugs where session.errors is not auto-computed and the tokens-implies-cost invariant is not enforced. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes 18 verified bugs across backend (Python/FastAPI), SDK (tracing library), and frontend (React/TypeScript):
Phase 1: Critical Backend Fixes
awaiton sync methods in PersistenceManager (fixes TypeError crashes)Phase 2: Critical Frontend Fixes
dangerouslySetInnerHTMLwith safeJsonHighlightcomponent (XSS fix)Phase 3: Medium Backend Fixes
on_tool_starttype annotation_run_mapmemory leak with.pop()instead of.get()offsetparameter for pagination inlist_eventsPhase 4: Medium Frontend Fixes
Phase 5: Low-Priority Security Fixes
AGENT_DEBUGGER_CORS_ORIGINSenv varTest plan
🤖 Generated with Claude Code