
Codex Blackbox

Codex runs can be hard to judge after the fact.

A run may finish, but you still do not know what happened: which model answered, whether the response actually completed, how many tokens were used, whether cached input helped, or whether the run is worth continuing.

Codex Blackbox gives you a local postmortem for a Codex CLI session. It turns the run into a short report with the outcome, model, token use, cost estimate, important signals, and a practical next step.

It is built for local debugging. The database, metrics, dashboard, and CLI run on your machine.


Quick Start

Install:

curl -fsSL https://raw.githubusercontent.com/softcane/codex-blackbox/main/install.sh | sh

Start the local stack:

codex-blackbox doctor
codex-blackbox up

Run Codex normally through the wrapper:

codex-blackbox run --watch -- codex

Read the latest report:

codex-blackbox postmortem last

For a quick one-shot check instead of an interactive Codex session:

codex-blackbox run --watch -- codex exec --sandbox read-only "Read README.md and summarize this repo. Do not edit files."

Open Grafana:

http://127.0.0.1:3000/d/codex-blackbox-main

What You Get

The postmortem is redacted by default. It shows:

  • the session id
  • whether the run completed, failed, or ended incomplete
  • the requested model and served model
  • input, cached input, uncached input, output, and reasoning tokens
  • local token and cost estimates
  • important signals, like high context use or model mismatch
  • tool calls the model tried to make

Example:

# Codex Responses Postmortem

## Snapshot
- Session: 019e0743-63c2-7c61-b326-8088e4ae0c7b
- State: final or persisted snapshot
- Outcome: Likely Completed
- Requested Model: gpt-5.5
- Served Model: gpt-5.5
- Turns: 3
- Tokens: input 54231, cached 41600, uncached 12631, output 610, reasoning 445, local total 54841
- Local Estimate: $0.10

## Recommendations
- Continue from the latest response summary if it matches the intended task.
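The token fields in the snapshot hang together by simple arithmetic. A minimal sketch checking the example report above, assuming (not confirmed by the tool's docs) that uncached input is input minus cached input and that the local total is input plus output:

```python
def check_token_accounting(tokens):
    """Sanity-check token fields from a postmortem snapshot.

    Assumed relationships (illustrative, not Codex Blackbox internals):
      uncached input = input - cached input
      local total    = input + output
    """
    ok_uncached = tokens["input"] - tokens["cached"] == tokens["uncached"]
    ok_total = tokens["input"] + tokens["output"] == tokens["local_total"]
    return ok_uncached and ok_total

# Values copied from the example report above.
example = {"input": 54231, "cached": 41600, "uncached": 12631,
           "output": 610, "reasoning": 445, "local_total": 54841}
print(check_token_accounting(example))  # True
```

The reasoning-token count is reported separately and, under this sketch's assumptions, is not added on top of the output total.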

For a specific session:

codex-blackbox postmortem <session_id>

For local debugging without redaction:

codex-blackbox postmortem last --no-redact

To write the report to a file:

codex-blackbox postmortem last --output report.md

What It Can Tell You

Codex Blackbox can report what it observed during the model run:

  • did the model response complete, fail, or stop incomplete?
  • which model was requested, and which model answered?
  • how many input, cached input, uncached input, output, and reasoning tokens were used?
  • what was the local cost estimate?
  • did the run show context pressure, model mismatch, or accounting oddities?
  • which tools did the model try to call?
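Signals like these can be derived from per-run observations. The sketch below shows the general idea; the field names, the context-window size, and the 80% pressure threshold are all illustrative assumptions, not Codex Blackbox's actual rules:

```python
def derive_signals(session, context_window=272_000, pressure_threshold=0.8):
    """Derive postmortem-style signals from observed run fields.

    All field names and thresholds here are hypothetical; the real
    tool's heuristics are not documented in this README.
    """
    signals = []
    # Model mismatch: the serving side answered with a different model.
    if session["served_model"] != session["requested_model"]:
        signals.append("model mismatch")
    # Context pressure: input tokens approaching the context window.
    if session["input_tokens"] / context_window > pressure_threshold:
        signals.append("high context use")
    # Accounting oddity: cached input should never exceed total input.
    if session["input_tokens"] < session["cached_input_tokens"]:
        signals.append("token accounting oddity")
    return signals

# The example run from the report: no signals fire.
run = {"requested_model": "gpt-5.5", "served_model": "gpt-5.5",
       "input_tokens": 54231, "cached_input_tokens": 41600}
print(derive_signals(run))  # []
```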

Common Commands

codex-blackbox doctor
codex-blackbox up
codex-blackbox watch --url http://127.0.0.1:9091
codex-blackbox sessions --limit 20 --days 7
codex-blackbox postmortem last
codex-blackbox postmortem last --no-redact

API shortcuts:

curl -s 'http://127.0.0.1:9091/api/sessions?limit=5'
curl -s 'http://127.0.0.1:9091/api/postmortem/last'
curl -s http://127.0.0.1:9091/metrics
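The API endpoints return JSON, so they are easy to consume from a script. A minimal parsing sketch, using a hypothetical payload shape for /api/sessions (the real schema is not shown in this README):

```python
import json

# Hypothetical response body from /api/sessions?limit=5; the actual
# field names are assumptions for illustration only.
body = """
{"sessions": [
  {"id": "019e0743-63c2-7c61-b326-8088e4ae0c7b",
   "outcome": "likely_completed",
   "model": "gpt-5.5"}
]}
"""

payload = json.loads(body)
for s in payload["sessions"]:
    print(s["id"], s["outcome"], s["model"])
```

In practice you would fetch the body from http://127.0.0.1:9091 with curl or an HTTP client instead of hardcoding it.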

Testing

Local fake tests:

./test/validate-openai-config.sh
./test/e2e-openai-responses-full.sh

These tests use fake Responses fixtures. They do not contact OpenAI, and they do not prove live Codex support.

"Live" or "dogfood" evidence means a real Codex CLI run went through codex-blackbox run -- codex ... and Codex Blackbox captured at least one real Codex Responses request for that run.

Development

Developer notes:

docs/reference/developing.md

About

Codex CLI session supervision: see failed or incomplete turns, token use, model changes, context pressure, and postmortems.
