Setting model_context_window in config.toml breaks auto-compaction (fill_to_context_window resets token counter)

### What version of Codex CLI is running?

codex-cli 0.117.0 (also reproduced on 0.116.0)

### What subscription do you have?

Pro

### Which model were you using?

gpt-5.3-codex

### What platform is your computer?

macOS Darwin 25.3.0 (Apple Silicon)

### What issue are you seeing?

Setting `model_context_window` in `~/.codex/config.toml` (tested with 300K, 350K, and 400K) causes auto-compaction to fail permanently after the first context overflow. Removing the setting and using the default (272K from server metadata) works correctly.

**Reproduction rate:** 100% with any custom `model_context_window` value. 0% with defaults.

### Root cause (traced in source)

After the first `ContextWindowExceeded` error, `fill_to_context_window()` in `protocol/src/protocol.rs:1940-1950` sets `last_token_usage.total_tokens` to a delta value (context_window - previous_total), which can be near zero:

```rust
fn fill_to_context_window(&mut self, context_window: i64) {
    let previous_total = self.total_token_usage.total_tokens;
    let delta = (context_window - previous_total).max(0);
    self.last_token_usage = TokenUsage {
        total_tokens: delta,  // <-- near zero when previous_total ~ context_window
        ..TokenUsage::default()
    };
}
```

However, the compaction trigger at `core/src/codex.rs:5878` calls `get_total_token_usage()`, which reads `last_token_usage.total_tokens` (in `context_manager/history.rs:300-305`):

```rust
let last_tokens = self.token_info
    .as_ref()
    .map(|info| info.last_token_usage.total_tokens)  // reads the delta, not cumulative
    .unwrap_or(0);
```

So after any overflow, the compaction check sees ~0 tokens used and never triggers compaction. Every subsequent retry also overflows, creating a permanent crash loop.
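The interaction can be reproduced in isolation. The following is a minimal sketch, not the actual Codex code: `TokenUsage` and `TokenUsageInfo` are simplified stand-ins with a single field, `fill_to_context_window` mirrors only the arithmetic quoted above, and `should_compact` is a hypothetical gate that compares `last_token_usage.total_tokens` against a threshold.

```rust
// Simplified stand-ins for the real types; only the field relevant here.
#[derive(Debug, Default, Clone, PartialEq)]
struct TokenUsage {
    total_tokens: i64,
}

#[derive(Debug, Default)]
struct TokenUsageInfo {
    total_token_usage: TokenUsage,
    last_token_usage: TokenUsage,
}

impl TokenUsageInfo {
    // Same arithmetic as the snippet quoted from protocol.rs:
    // last_token_usage ends up holding only the remaining headroom.
    fn fill_to_context_window(&mut self, context_window: i64) {
        let previous_total = self.total_token_usage.total_tokens;
        let delta = (context_window - previous_total).max(0);
        self.last_token_usage = TokenUsage { total_tokens: delta };
    }
}

// Hypothetical gate (assumption): compaction fires when the value read
// from last_token_usage reaches the threshold.
fn should_compact(info: &TokenUsageInfo, threshold: i64) -> bool {
    info.last_token_usage.total_tokens >= threshold
}
```

With a 400K window and 399K tokens already counted, the gate passes before the call to `fill_to_context_window(400_000)` and fails afterwards, because the stored value drops to 1,000.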

### Why defaults work but custom values don't

The bundled `models.json` reports `context_window: 272000` for gpt-5.3-codex. With defaults:
- Compaction threshold: 272K * 90% = 244.8K (~245K)
- The model actually accepts more input than 272K (confirmed by other users, see #14133)
- Compaction triggers at ~245K, well before the real limit, so `fill_to_context_window` is never reached

With custom `model_context_window = 400000` (the actual model context window per OpenAI docs):
- Config overrides server metadata (`models_manager/model_info.rs:30-35`)
- Compaction threshold becomes 400K * 90% = 360K
- But remote compaction sends the full history to `/responses/compact`, which may reject oversized payloads
- Once overflow happens, `fill_to_context_window` poisons the token counter permanently

Even `model_context_window = 300000` breaks because:
- Any first overflow (for any reason) triggers `fill_to_context_window`
- After that, token counter reads ~0, compaction never fires again
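The threshold arithmetic in this section can be checked with a small sketch. The 90% ratio is taken from the figures in this report; `compaction_threshold` is a hypothetical helper name, not a function from the Codex source.

```rust
// Hypothetical helper: compaction threshold as 90% of the context window.
// Integer arithmetic avoids floating-point rounding.
fn compaction_threshold(context_window: i64) -> i64 {
    context_window * 90 / 100
}
```

For the values discussed here, this gives 244,800 for the default 272,000 window, 270,000 for 300,000, and 360,000 for 400,000.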

### Additional factors

1. **Remote compaction has no fallback.** OpenAI models use remote compaction (`compact.rs:50-52`). If it fails, the error propagates immediately, with no retry and no fallback to local compaction (`compact_remote.rs:127-139`).

2. **Pre-compaction trim only removes Codex-generated items** (`compact_remote.rs:287-292`). If history is dominated by user/tool content, trimming stops early and an oversized payload is still sent to `/responses/compact`.

3. **A fuller token estimate exists but is not used for compaction decisions.** `estimated_token_count` is computed at `codex.rs:5881-5882` but only logged. The compaction gate at `codex.rs:5895` uses `total_usage_tokens` from `get_total_token_usage()` which can be stale/poisoned.
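One way the third factor could be addressed is to let the gate consider the estimate as well as the reported usage. This is only a sketch under assumptions: `gate_tokens` and `should_compact` are hypothetical names, and taking the larger of the two values is an illustrative policy, not something the Codex source does.

```rust
// Hypothetical: pick the value the compaction gate compares against.
// `reported` stands in for the result of get_total_token_usage();
// `estimated` stands in for estimated_token_count, when available.
fn gate_tokens(reported: i64, estimated: Option<i64>) -> i64 {
    // Prefer the larger value so a stale or reset counter cannot hide
    // a history that is actually near the limit.
    estimated.map_or(reported, |e| e.max(reported))
}

// Hypothetical gate using the combined value.
fn should_compact(reported: i64, estimated: Option<i64>, threshold: i64) -> bool {
    gate_tokens(reported, estimated) >= threshold
}
```

Under this policy, a reported value of 1,000 with an estimate of 398,000 still crosses a 360,000 threshold.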

### Related issues

- #13769 — same symptom: "zero compaction events, then 100% left / 0 used after overflow"
- #16042 — regression >=0.115 in compaction behavior
- #14133 — confirms model accepts >272K input with config override
- #13653 — proposes context tier presets with paired compaction limits

### What steps can reproduce the bug?

1. Add `model_context_window = 400000` to `~/.codex/config.toml`
2. Start a session that reads many files (e.g., `codex exec "Read all source files in this repo"`)
3. Agent accumulates context past the threshold
4. First overflow triggers `fill_to_context_window`
5. All subsequent turns see ~0 tokens, compaction never fires, permanent crash loop

Remove `model_context_window` from config.toml and the same workload completes successfully with compaction working.

### What is the expected behavior?

`model_context_window` should work correctly at any value. After context overflow, `get_total_token_usage()` should return the actual context size, not a poisoned delta value. Compaction should fire and recover.

### Suggested fix

`get_total_token_usage()` should use `total_token_usage.total_tokens` (cumulative actual usage) instead of `last_token_usage.total_tokens` (incremental delta) for the compaction threshold comparison. Alternatively, `fill_to_context_window` should set `last_token_usage.total_tokens = context_window` (the full value) rather than the delta.
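Both options can be sketched against simplified stand-in types. This is not a patch against the real code: `TokenUsage` and `TokenUsageInfo` are reduced to one field, and `tokens_for_compaction_gate` and `fill_to_context_window_fixed` are hypothetical names used only for illustration.

```rust
// Simplified stand-ins for the real types; only the field relevant here.
#[derive(Debug, Default, Clone, PartialEq)]
struct TokenUsage {
    total_tokens: i64,
}

#[derive(Debug, Default)]
struct TokenUsageInfo {
    total_token_usage: TokenUsage,
    last_token_usage: TokenUsage,
}

impl TokenUsageInfo {
    // Option 1 (hypothetical name): the gate reads the cumulative total
    // instead of last_token_usage.
    fn tokens_for_compaction_gate(&self) -> i64 {
        self.total_token_usage.total_tokens
    }

    // Option 2 (hypothetical name): on overflow, record the full window
    // rather than the remaining delta.
    fn fill_to_context_window_fixed(&mut self, context_window: i64) {
        self.last_token_usage = TokenUsage {
            total_tokens: context_window,
        };
    }
}
```

With either option, a session that overflowed a 400K window reports a value at or above the 360K threshold afterwards, so the compaction check can fire on the next turn.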
