What version of Codex CLI is running?
codex-cli 0.105.0
What subscription do you have?
Pro
Which model were you using?
gpt-5.3-codex
What platform is your computer?
Darwin 24.6.0 arm64 arm
What terminal emulator and version are you using (if applicable)?
Ghostty
What issue are you seeing?
When I enable voice transcription/realtime voice in Codex, token usage appears to spike much faster than expected and I can hit usage limits very quickly.
Notes
This looks like it may involve repeated transcript delegation/echo behavior in realtime voice flows (possibly assistant text being re-ingested as new transcript input), causing extra turns/tokens.
I also noticed that voice transcription context may include prior draft text, which could further increase billed tokens in some auth modes.
What steps can reproduce the bug?
- Start Codex CLI.
- Enable voice transcription / realtime voice mode.
- Speak a short prompt.
- Let it run for a bit (or do a few short voice turns).
What is the expected behavior?
Expected
Token usage should scale roughly with actual spoken/transcribed content and normal assistant output.
Actual
Usage grows disproportionately fast, and I can hit rate/usage limits much sooner than expected.
Additional information
- Root cause in the realtime flow: echoed `spawn_transcript` items could be re-delegated into new/steered turns, creating a feedback loop that can burn tokens quickly. The delegation path is in realtime_conversation.rs:245 (/Users/baron/projects/codex/codex-rs/core/src/realtime_conversation.rs:245), and routing goes through `Op::UserInput`.
- I added an echo-guard cache for outbound mirrored text, plus suppression of matching inbound `spawn_transcript` echoes, in realtime_conversation.rs:178 (/Users/baron/projects/codex/codex-rs/core/src/realtime_conversation.rs:178), realtime_conversation.rs:348 (/Users/baron/projects/codex/codex-rs/core/src/realtime_conversation.rs:348), and realtime_conversation.rs:443 (/Users/baron/projects/codex/codex-rs/core/src/realtime_conversation.rs:443).
- Added regression coverage for this exact loop in realtime_conversation.rs:844 (/Users/baron/projects/codex/codex-rs/core/tests/suite/realtime_conversation.rs:844), plus new unit tests in realtime_conversation.rs:487 (/Users/baron/projects/codex/codex-rs/core/src/realtime_conversation.rs:487).
- `just fmt` ran successfully.
- `cargo test -p codex-core realtime_conversation` passed, including the new test.
- `cargo test -p codex-core` mostly passed, with 2 unrelated seatbelt sandbox failures ("Operation not permitted" in existing seatbelt tests).
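The actual guard lives in the codex-rs changes referenced above; as a rough illustration of the idea only (the names `EchoGuard`, `record_outbound`, and `suppress_inbound` are hypothetical, not the real API), a small bounded cache of recently mirrored text can be consulted before an inbound transcript is delegated into a new turn:

```rust
use std::collections::VecDeque;

/// Hypothetical sketch of the echo-guard idea: remember the last N pieces of
/// text mirrored out to the realtime channel, and drop inbound
/// `spawn_transcript` items that exactly match one of them, so the model's own
/// output is never re-delegated as a fresh user turn (breaking the feedback loop).
pub struct EchoGuard {
    recent_outbound: VecDeque<String>,
    capacity: usize,
}

impl EchoGuard {
    pub fn new(capacity: usize) -> Self {
        Self {
            recent_outbound: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Record text we are about to mirror outbound; the oldest entry is
    /// evicted once the bounded cache is full.
    pub fn record_outbound(&mut self, text: &str) {
        if self.recent_outbound.len() == self.capacity {
            self.recent_outbound.pop_front();
        }
        self.recent_outbound.push_back(text.to_owned());
    }

    /// Returns true if this inbound transcript is an echo of something we
    /// sent; a match is consumed, so a genuine user repetition of the same
    /// words later still gets through.
    pub fn suppress_inbound(&mut self, text: &str) -> bool {
        if let Some(pos) = self.recent_outbound.iter().position(|t| t == text) {
            self.recent_outbound.remove(pos);
            true
        } else {
            false
        }
    }
}
```

Consuming the cache entry on a match (rather than keeping it) is deliberate in this sketch: it suppresses exactly one echo per mirrored message without permanently blacklisting that text.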
Additional finding (not patched yet): with API-key auth, voice transcription currently sends the composer context as a prompt on every transcription call (added on Feb 23, 2026), at chat_composer.rs:3822 (/Users/baron/projects/codex/codex-rs/tui/src/bottom_pane/chat_composer.rs:3822) and voice.rs:785 (/Users/baron/projects/codex/codex-rs/tui/src/voice.rs:785). That can also inflate usage for API-key users.
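One possible shape for a fix, sketched here purely as an assumption (the names `TranscriptionRequest` and `build_transcription_request` are illustrative, not the real codex-rs API): gate the composer draft behind an explicit opt-in instead of attaching it as a billed prompt on every call.

```rust
/// Illustrative request shape; not the real API.
#[derive(Debug)]
pub struct TranscriptionRequest {
    pub audio_len_bytes: usize,
    pub prompt: Option<String>,
}

/// Hypothetical builder: only attach composer context when explicitly
/// requested, since a prompt sent on every transcription call is billed
/// every time.
pub fn build_transcription_request(
    audio: &[u8],
    composer_draft: &str,
    include_context: bool,
) -> TranscriptionRequest {
    let prompt = if include_context && !composer_draft.is_empty() {
        Some(composer_draft.to_owned())
    } else {
        None
    };
    TranscriptionRequest {
        audio_len_bytes: audio.len(),
        prompt,
    }
}
```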