Agent debugger: step through an agent's tool calls and intermediate results #131

@jaylfc

Description

When an agent produces a bad answer, the only option today is to stare at logs. A proper debugger would let the user step through the agent's execution (the LLM call, the tool invocation, the intermediate state, the next LLM call) and pause at any step.

What needs to ship:

  • Record every agent execution as a structured trace: LLM call → tool call → result → next LLM call
  • Debugger UI panel in the agent chat window that shows the trace as a tree
  • Click a node to see inputs/outputs, token counts, wall time
  • "Replay from here" — fork the agent from an intermediate step with a modified input and see how the rest plays out differently
  • "Re-run with different model" — swap the LLM backend at a specific step and re-execute downstream
  • Export a replay file that can be attached to bug reports
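
To make the list above more concrete, here is a minimal sketch of what the trace recording, "Replay from here" fork, and replay-file export could look like. Everything in it is an assumption for illustration: the `Step` and `Trace` names, the field layout, the JSON shape, and the `fake_llm` / `fake_add_tool` stand-ins are hypothetical and not taken from the existing codebase.

```python
from __future__ import annotations

import copy
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Step:
    """One recorded step of an agent run (hypothetical schema)."""
    id: int
    parent_id: Optional[int]   # lets a UI render the steps as a tree
    kind: str                  # "llm_call" or "tool_call"
    name: str                  # model or tool name
    inputs: Any
    outputs: Any = None
    tokens: int = 0
    wall_time_s: float = 0.0


@dataclass
class Trace:
    steps: list[Step] = field(default_factory=list)

    def record(self, kind: str, name: str, inputs: Any,
               fn: Callable[[Any], tuple[Any, int]],
               parent_id: Optional[int] = None) -> Step:
        """Run fn(inputs), time it, and append the result as a new step."""
        start = time.perf_counter()
        outputs, tokens = fn(inputs)
        step = Step(id=len(self.steps), parent_id=parent_id, kind=kind,
                    name=name, inputs=inputs, outputs=outputs, tokens=tokens,
                    wall_time_s=time.perf_counter() - start)
        self.steps.append(step)
        return step

    def fork(self, step_id: int) -> "Trace":
        """Copy every step before step_id so that step can be re-run
        with a modified input without touching the original trace."""
        return Trace(steps=copy.deepcopy(self.steps[:step_id]))

    def to_json(self) -> str:
        """Serialize to a shareable replay file (assumed format)."""
        payload = {"version": 1, "steps": [asdict(s) for s in self.steps]}
        return json.dumps(payload, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "Trace":
        data = json.loads(text)
        return cls(steps=[Step(**s) for s in data["steps"]])


# Stand-ins for a real model and tool; each returns (output, token count).
def fake_llm(prompt: str) -> tuple[str, int]:
    return f"call add({prompt})", len(prompt.split())


def fake_add_tool(args: list[int]) -> tuple[int, int]:
    return sum(args), 0


# Record an original run: one LLM call followed by one tool call.
trace = Trace()
s0 = trace.record("llm_call", "fake-model", "2 3", fake_llm)
s1 = trace.record("tool_call", "add", [2, 3], fake_add_tool, parent_id=s0.id)

# "Replay from here": fork just before the tool call and change its input.
forked = trace.fork(s1.id)
r1 = forked.record("tool_call", "add", [2, 30], fake_add_tool, parent_id=s0.id)

# Round-trip the original run through the replay-file format.
restored = Trace.from_json(trace.to_json())
```

Keeping steps in a flat list with a `parent_id` is one possible design: it serializes trivially and still gives the UI enough to build a tree. "Re-run with different model" could reuse the same `fork` by passing a different callable to `record` at the chosen step, though a real implementation would also need to re-execute every downstream step.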

User impact: debugging agents today is "cross your fingers and re-run"; a real debugger would turn agent development from a vibes-based activity into an engineering discipline.

Metadata

Assignees

No one assigned

    Labels

    agents: Agent frameworks and deployment
    enhancement: New feature or request
    feature: New feature
    kilo-auto-fix: Auto-generated label by Kilo
    kilo-triaged: Auto-generated label by Kilo

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests