Debug, replay, and audit every AI agent run.
Agent Flight Recorder is a local observability tool for AI coding agents and MCP-powered workflows. It records what the agent was asked to do, what tools it called, what files changed, what commands ran, what errors happened, and what the final result looked like.
Think of it as:
Chrome DevTools for AI agents
AI coding agents can make useful changes quickly, but the path they take is often hard to inspect after the fact.
Developers need answers to questions like:
- What prompt started this run?
- Which agent and model were used?
- What files did the agent read or edit?
- What commands did it run?
- Did tests fail before they passed?
- Did the agent retry or loop?
- What did the final diff contain?
- How long did the run take?
- What did the run cost?
Agent Flight Recorder turns an agent session into a local timeline.
npm install -g @agentopssec/agent-flight-recorder
Or run it without installing:
npx -y @agentopssec/agent-flight-recorder list
To update an installed copy:
agent-flight update # check the registry, prompt before installing
agent-flight update --yes # update without prompting
Agent Flight Recorder starts by wrapping an agent command:
agent-flight run -- codex "fix the failing tests"
The workflow should do three things well:
- Record the agent command and timeline.
- Capture files, commands, errors, and diffs.
- Produce a local report that can be inspected later.
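The three steps above can be sketched in a few lines of Python. This is only an illustration of the command-wrapping idea, not the tool's actual implementation: the function name `record_run` and the `stderrTail` field are hypothetical, and the other field names simply mirror the sample run log in this README.

```python
import json
import subprocess
import sys
import time
from datetime import datetime, timezone


def record_run(command, prompt=None):
    """Run `command` (a list of argv strings) and return a run record dict."""
    started_at = datetime.now(timezone.utc)
    t0 = time.monotonic()
    # Capture output so errors can be stored alongside the exit code.
    proc = subprocess.run(command, capture_output=True, text=True)
    duration_ms = int((time.monotonic() - t0) * 1000)
    ended_at = datetime.now(timezone.utc)
    return {
        "startedAt": started_at.isoformat(),
        "endedAt": ended_at.isoformat(),
        "agent": command[0],
        "prompt": prompt,
        "events": [
            {
                "type": "shell.exec",
                "command": " ".join(command),
                "exitCode": proc.returncode,
                "durationMs": duration_ms,
            }
        ],
        # Hypothetical field: keep only the last part of stderr for the report.
        "stderrTail": proc.stderr[-2000:],
    }


if __name__ == "__main__":
    # Stand-in for an agent CLI: a Python one-liner that exits non-zero.
    demo = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(json.dumps(record_run(demo, prompt="demo"), indent=2))
```

A real wrapper would also stream output to the terminal and snapshot the repository before and after the run; the sketch only shows the record-keeping part.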
agent-flight run "codex fix the failing tests"
agent-flight run -- codex "fix the failing tests"
agent-flight run -- gemini "refactor this component"
agent-flight inspect latest
agent-flight inspect run_001
agent-flight replay run_001
agent-flight export run_001 --html
agent-flight list
agent-flight diff run_001
agent-flight update [--yes]
Agent Flight Recorder runs on its own as a command wrapper:
agent-flight run -- codex "fix the failing tests"
agent-flight inspect latest
When used with the full AgentOpsSec stack, its local run logs can feed Agent Review and Agent Cost Lens without either tool importing Agent Flight Recorder code:
agent-review --from-agent-flight latest
agent-cost month
Agent Flight Recorder captures a timeline of an agent run, including:
- User prompt
- Agent command
- Model and provider metadata when available
- Token usage when available
- Estimated cost
- Latency
- Tool calls observed in agent output or provided through structured metadata
- MCP server calls observed in agent output or provided through structured metadata
- Shell commands
- File reads
- File writes
- Git diffs
- Test runs
- Errors
- Retries
- Final response
- Final code diff
Agent Flight Recorder includes lightweight adapters for common CLI names such as Codex, Claude, Gemini, and OpenCode. Exact tool-call and token capture depends on what the wrapped CLI prints or provides through environment metadata.
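One way to picture the adapter lookup is a match on the wrapped executable's name. The sketch below is an assumption about how such a lookup could work, not the project's code: `KNOWN_AGENTS`, `detect_agent`, and the lowercase executable names are hypothetical, and the `"unknown"` fallback only mirrors the `"model": "unknown"` value in the sample run log.

```python
import os

# Hypothetical set of executable names, based on the CLIs named above.
KNOWN_AGENTS = {"codex", "claude", "gemini", "opencode"}


def detect_agent(argv):
    """Guess the agent from the first argv entry; fall back to "unknown"."""
    if not argv:
        return "unknown"
    name = os.path.basename(argv[0]).lower()
    # Drop a trailing extension such as ".cmd" or ".exe".
    name = os.path.splitext(name)[0]
    return name if name in KNOWN_AGENTS else "unknown"
```

For example, `detect_agent(["codex", "fix the failing tests"])` returns `"codex"`, while an unrecognized script name returns `"unknown"`.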
Agent Flight Recorder Run by github.com/AgentOpsSec
Run: fix-failing-tests-0425
1. Prompt received
2. Model call: 12.8s, 18,400 tokens
3. Tool call: filesystem.read package.json
4. Tool call: shell.exec npm test
5. Error detected: TypeScript compile failure
6. File edit: src/lib/parser.ts
7. File edit: package.json
8. Tool call: shell.exec npm install
9. Tool call: shell.exec npm test
10. Tests passed
11. Final diff: 6 files changed
12. Estimated cost: $0.42
{
  "tool": {
    "name": "Agent Flight Recorder",
    "by": "github.com/AgentOpsSec",
    "repository": "github.com/AgentOpsSec/agent-flight-recorder"
  },
  "runId": "run_001",
  "startedAt": "2026-04-25T10:00:00Z",
  "endedAt": "2026-04-25T10:04:12Z",
  "agent": "codex",
  "model": "unknown",
  "prompt": "fix the failing tests",
  "events": [
    {
      "type": "shell.exec",
      "command": "npm test",
      "exitCode": 1,
      "durationMs": 12944
    },
    {
      "type": "file.change",
      "path": "src/lib/parser.ts",
      "changeType": "modified"
    }
  ],
  "gitDiffSummary": {
    "filesChanged": 3,
    "insertions": 44,
    "deletions": 18
  },
  "estimatedCost": 0.18
}
Reports are designed for both terminal inspection and automation:
- Local JSON run logs
- Terminal run summaries
- Final diff summaries
- HTML exports
- Run list and latest-run inspection
- Cost and latency summaries where available
- Local-first
- Open-source
- No telemetry by default
- Complete local run history
- Useful terminal summaries
- Exportable reports
- Replayable context
- Practical cost visibility
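Because run logs are plain local JSON, small scripts can answer questions such as "how long did the run take?" or "which commands failed?". The sketch below assumes a log shaped like the sample run log in this README (trimmed here to the fields it uses); `summarize` and `parse_ts` are illustrative helper names, not part of the tool.

```python
import json
from datetime import datetime

# Trimmed copy of the sample run log shown in this README.
SAMPLE_LOG = """
{
  "runId": "run_001",
  "startedAt": "2026-04-25T10:00:00Z",
  "endedAt": "2026-04-25T10:04:12Z",
  "events": [
    {"type": "shell.exec", "command": "npm test", "exitCode": 1, "durationMs": 12944},
    {"type": "file.change", "path": "src/lib/parser.ts", "changeType": "modified"}
  ]
}
"""


def parse_ts(value):
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(run):
    """Return run duration, failed shell commands, and changed files."""
    duration = parse_ts(run["endedAt"]) - parse_ts(run["startedAt"])
    events = run.get("events", [])
    return {
        "runId": run["runId"],
        "durationSeconds": duration.total_seconds(),
        "failedCommands": [
            e["command"]
            for e in events
            if e["type"] == "shell.exec" and e.get("exitCode", 0) != 0
        ],
        "changedFiles": [e["path"] for e in events if e["type"] == "file.change"],
    }


summary = summarize(json.loads(SAMPLE_LOG))
print(summary)
```

For the sample log this reports a 252-second run, one failed command (`npm test`), and one changed file.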
The initial release includes command wrapping, run timeline capture, git diff capture, local reports, HTML export, and cost estimation.
- Wrap local agent commands
- Capture the requested prompt and command
- Record start and end timestamps
- Detect the current repository
- Store each run under a local run ID
- Print a clear terminal summary when the run finishes
- Capture shell commands when available
- Capture file changes before and after the run
- Capture git diff summaries
- Record errors and non-zero exits
- Track test command execution
- Store structured JSON run logs
- Inspect the latest run
- List previous runs
- View final diffs for a run
- Export a run as HTML
- Estimate token and provider cost where possible
- Compare basic run metadata across sessions
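A git diff summary like the `gitDiffSummary` object in the sample run log can be derived from the one-line output of `git diff --shortstat`. The parser below is a minimal sketch of that idea and is not taken from the project; `parse_shortstat` is a hypothetical name. Git omits the insertions or deletions part when the count is zero, so missing parts default to 0.

```python
import re


def parse_shortstat(line):
    """Parse `git diff --shortstat` output into a gitDiffSummary-style dict."""

    def grab(pattern):
        match = re.search(pattern, line)
        return int(match.group(1)) if match else 0

    return {
        "filesChanged": grab(r"(\d+) files? changed"),
        "insertions": grab(r"(\d+) insertions?\(\+\)"),
        "deletions": grab(r"(\d+) deletions?\(-\)"),
    }


print(parse_shortstat(" 3 files changed, 44 insertions(+), 18 deletions(-)"))
```

An empty string (no changes) yields zeros for all three fields.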
Reports use plain-language status words rather than raw exit codes:
- ok: the step ran successfully (green).
- failed (exit N): the step exited non-zero (red); the original code is preserved.
- skipped (reason): the step was not applicable (dim).
Severity colors follow the AgentOpsSec palette (safe = green, warning = amber, risk = red). The palette honors NO_COLOR and FORCE_COLOR, and JSON / CSV output stays plain.
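The mapping from exit codes to status words, and the color switch, can be sketched as follows. This is an illustration under stated assumptions rather than the tool's code: the function names are hypothetical, the ANSI codes are generic stand-ins for the AgentOpsSec palette, and the precedence chosen here (a non-empty NO_COLOR wins over FORCE_COLOR) is an assumption, since tools differ on how they resolve the two.

```python
GREEN, RED, DIM, RESET = "\033[32m", "\033[31m", "\033[2m", "\033[0m"


def use_color(env, is_tty):
    """Decide whether to emit ANSI colors (precedence is an assumption)."""
    if env.get("NO_COLOR"):  # set and non-empty: disable color
        return False
    force = env.get("FORCE_COLOR")
    if force is not None and force != "0":
        return True
    return is_tty


def status_word(exit_code, skipped_reason=None):
    """Map a step result to a plain-language status word and a color."""
    if skipped_reason is not None:
        return f"skipped ({skipped_reason})", DIM
    if exit_code == 0:
        return "ok", GREEN
    # The original exit code is preserved in the status text.
    return f"failed (exit {exit_code})", RED


def render(exit_code, env, is_tty, skipped_reason=None):
    word, color = status_word(exit_code, skipped_reason)
    return f"{color}{word}{RESET}" if use_color(env, is_tty) else word
```

In a script, `render(proc.returncode, os.environ, sys.stdout.isatty())` would produce the status text for a finished step; machine-readable output would call `status_word` and ignore the color.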
- Repo: https://github.com/AgentOpsSec/agent-flight-recorder
- npm: https://www.npmjs.com/package/@agentopssec/agent-flight-recorder
- AgentOpsSec stack: https://github.com/AgentOpsSec/stack
- Website: https://AgentOpsSec.com
Created and developed by Aunt Gladys Nephew.
- Website: https://auntgladysnephew.com
- GitHub: https://github.com/auntgladysnephew
- X: https://x.com/AGNonX