fix: show heartbeat model bleed hint on context overflow error #67381
Knightmare6890 wants to merge 3 commits into openclaw:main
Conversation
When the context limit is hit while the session is running a smaller-context model than its primary model, suggest `heartbeat.isolatedSession`/`heartbeat.lightContext` instead of the misleading `reserveTokensFloor` advice. Three-tier hint logic:
1. Small model + model mismatch → heartbeat override bleed detected
2. Small model, no mismatch → suggest a larger-context model
3. Normal/large model → existing `reserveTokensFloor` advice

Fixes openclaw#67314
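The three-tier logic above can be sketched as follows. This is an illustrative sketch only: the parameter shape, the 64k "small window" cutoff, and the message text are assumptions for demonstration, not the actual OpenClaw implementation.

```typescript
// Illustrative sketch of the three-tier hint. The 64k cutoff, parameter
// shape, and message wording are assumptions, not the real implementation.
const SMALL_CONTEXT_THRESHOLD = 64_000;

interface OverflowHintParams {
  primaryModel: string;    // model the session started the turn with
  currentModel: string;    // model that actually hit the context limit
  currentWindow?: number;  // resolved context window in tokens, if known
}

function buildContextOverflowHint(p: OverflowHintParams): string {
  const isSmallModel =
    p.currentWindow !== undefined && p.currentWindow <= SMALL_CONTEXT_THRESHOLD;
  const modelMismatch = p.currentModel !== p.primaryModel;

  if (isSmallModel && modelMismatch) {
    // Tier 1: a smaller model (e.g. the heartbeat model) bled into the session
    const windowLabel = Math.round((p.currentWindow ?? 0) / 1024);
    return (
      `\n\nThe session is using ${p.currentModel} (${windowLabel}k context) ` +
      `instead of ${p.primaryModel}. Consider heartbeat.isolatedSession: true ` +
      `or heartbeat.lightContext: true.`
    );
  }
  if (isSmallModel) {
    // Tier 2: the session's own model simply has a small window
    return `\n\n${p.currentModel} has a small context window; consider a larger-context model.`;
  }
  // Tier 3: normal/large window keeps the existing advice
  return (
    "\n\nTo prevent this, increase your compaction buffer by setting " +
    "agents.defaults.compaction.reserveTokensFloor to 20000 or higher."
  );
}
```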
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7225181a65
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```diff
  kind: "final",
  payload: {
-   text: "⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.\n\nTo prevent this, increase your compaction buffer by setting `agents.defaults.compaction.reserveTokensFloor` to 20000 or higher in your config.",
+   text: `⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.${buildContextOverflowHint({ primaryModel: params.followupRun.run.model ?? "unknown", currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown", provider: params.followupRun.run.provider, cfg: runtimeConfig })}`,
```
Use the active provider when resolving overflow context window
buildContextOverflowHint resolves context size from a (provider, model) pair, but this call passes params.followupRun.run.provider even when the turn actually ran on a fallback provider/model. In cross-provider fallback runs, that mismatched provider can return the wrong window (or undefined), so the new hint logic may miss the small-window condition and show incorrect guidance. Please pass the effective provider for the run (e.g. fallbackProvider) alongside fallbackModel here (and in the similar compaction branch).
Useful? React with 👍 / 👎.
Greptile Summary
Adds a three-tier hint to both context-overflow error messages in `src/auto-reply/reply/agent-runner-execution.ts`. Two improvements worth a follow-up:
Confidence Score: 4/5. Safe to merge; the fix is an unambiguous improvement. The only regression risk is silently falling back to the pre-existing `reserveTokensFloor` hint in specific edge cases. All findings are P2. The embedded-overflow-error path (the scenario the PR targets) works correctly. Two gaps in `src/auto-reply/reply/agent-runner-execution.ts` — the wrong provider used for the context-window lookup, and `fallbackModel` equaling `primaryModel` on first-attempt compaction failures — are detailed below.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 1446
Comment:
**`fallbackModel` equals `primaryModel` on first-attempt compaction failures**
When `runWithModelFallback` throws on the very first attempt, execution jumps to the `catch` block before line 1306 (`fallbackModel = fallbackResult.model`) is ever reached. Because `fallbackModel` is initialized to `params.followupRun.run.model` at line 689, both `currentModel` and `primaryModel` resolve to the same value, so `modelMismatch` is always `false` here — the heartbeat bleed hint is silently suppressed and users still see the old "increase `reserveTokensFloor`" advice.
This is the less-common path (compaction throw vs. embedded error), but it's exactly the scenario described in the PR: the heartbeat model causes overflow during compaction.
A snapshot of the "true primary model" should be captured before the loop starts (before any `LiveSessionModelSwitchError` retry can mutate `params.followupRun.run.model`) and used as `primaryModel` at both call sites:
```typescript
const primaryModel = params.followupRun.run.model;
// ... later in the catch block ...
text: `...${buildContextOverflowHint({ primaryModel: primaryModel ?? "unknown", currentModel: fallbackModel ?? ... })}`,
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 396-401
Comment:
**Wrong provider passed for heartbeat model context-window lookup**
`buildContextOverflowHint` always receives `params.followupRun.run.provider` (the primary session's provider, e.g. `"bailian"`), but `currentModel` may come from `fallbackModel` which belongs to a different provider (e.g. the `"ollama"` heartbeat model). Inside `resolveContextTokensForModel`, the explicit `provider` drives:
- `resolveConfiguredProviderContextTokens(cfg, "bailian", heartbeatModel)` → likely `undefined`
- the qualified cache probe `"bailian/heartbeatModel"` → cache miss
If the heartbeat model was stored only under `"ollama/qwen3.5-9b-32k:latest"`, both lookups miss and `currentWindow` returns `undefined` → `isSmallModel = false` → the hint silently degrades to "increase `reserveTokensFloor`" even when the heartbeat model is the culprit.
Use `fallbackProvider` at both call sites so the lookup targets the right provider when `currentModel` came from the fallback result:
```typescript
buildContextOverflowHint({
primaryModel: params.followupRun.run.model ?? "unknown",
currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown",
provider: fallbackProvider ?? params.followupRun.run.provider, // ← use fallback provider
cfg: runtimeConfig,
})
```
How can I resolve this? If you propose a fix, please make it concise.
```diff
  kind: "final",
  payload: {
-   text: "⚠️ Context limit exceeded during compaction. I've reset our conversation to start fresh - please try again.\n\nTo prevent this, increase your compaction buffer by setting `agents.defaults.compaction.reserveTokensFloor` to 20000 or higher in your config.",
+   text: `⚠️ Context limit exceeded during compaction. I've reset our conversation to start fresh - please try again.${buildContextOverflowHint({ primaryModel: params.followupRun.run.model ?? "unknown", currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown", provider: params.followupRun.run.provider, cfg: runtimeConfig })}`,
```
fallbackModel equals primaryModel on first-attempt compaction failures
When runWithModelFallback throws on the very first attempt, execution jumps to the catch block before line 1306 (fallbackModel = fallbackResult.model) is ever reached. Because fallbackModel is initialized to params.followupRun.run.model at line 689, both currentModel and primaryModel resolve to the same value, so modelMismatch is always false here — the heartbeat bleed hint is silently suppressed and users still see the old "increase reserveTokensFloor" advice.
This is the less-common path (compaction throw vs. embedded error), but it's exactly the scenario described in the PR: the heartbeat model causes overflow during compaction.
A snapshot of the "true primary model" should be captured before the loop starts (before any LiveSessionModelSwitchError retry can mutate params.followupRun.run.model) and used as primaryModel at both call sites:
```typescript
const primaryModel = params.followupRun.run.model;
// ... later in the catch block ...
text: `...${buildContextOverflowHint({ primaryModel: primaryModel ?? "unknown", currentModel: fallbackModel ?? ... })}`,
```
```diff
  const currentWindow = resolveContextTokensForModel({
    cfg,
    provider,
    model: currentModel,
    allowAsyncLoad: false,
  });
```
Wrong provider passed for heartbeat model context-window lookup
buildContextOverflowHint always receives params.followupRun.run.provider (the primary session's provider, e.g. "bailian"), but currentModel may come from fallbackModel which belongs to a different provider (e.g. the "ollama" heartbeat model). Inside resolveContextTokensForModel, the explicit provider drives:
- `resolveConfiguredProviderContextTokens(cfg, "bailian", heartbeatModel)` → likely `undefined`
- the qualified cache probe `"bailian/heartbeatModel"` → cache miss
If the heartbeat model was stored only under "ollama/qwen3.5-9b-32k:latest", both lookups miss and currentWindow returns undefined → isSmallModel = false → the hint silently degrades to "increase reserveTokensFloor" even when the heartbeat model is the culprit.
Use fallbackProvider at both call sites so the lookup targets the right provider when currentModel came from the fallback result:
```typescript
buildContextOverflowHint({
  primaryModel: params.followupRun.run.model ?? "unknown",
  currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown",
  provider: fallbackProvider ?? params.followupRun.run.provider, // ← use fallback provider
  cfg: runtimeConfig,
})
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 50ad9c882e
```diff
  if (isSmallModel && modelMismatch) {
    const windowLabel = Math.round(currentWindow / 1024);
    return (
      `\n\nThe session is using ${currentModel} (${windowLabel}k context) ` +
      `instead of ${primaryModel}. This may be caused by a heartbeat model ` +
```
Avoid heartbeat diagnosis for generic model fallback mismatches
This branch treats any currentModel !== primaryModel overflow on a small-context model as heartbeat bleed, but currentModel is updated by normal runtime fallback/live-switch paths too, so a fallback from a large model to a smaller model will incorrectly instruct users to change heartbeat.* settings even when heartbeat is unrelated. Because primaryModel is just the turn-start model and not a heartbeat-specific baseline, this condition can both misdiagnose regular fallback overflows and miss true persisted heartbeat bleed cases; the hint should require heartbeat-specific evidence or use neutral mismatch wording.
Useful? React with 👍 / 👎.
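Codex's suggestion — require heartbeat-specific evidence before naming heartbeat settings, or fall back to neutral mismatch wording — could be approached by tracking why the active model diverged. A hedged sketch under assumptions: the `overrideSource` field and the message text below are hypothetical, not part of the current OpenClaw run state.

```typescript
// Hypothetical sketch: track why the active model diverged so the hint can
// distinguish heartbeat bleed from an ordinary fallback. "overrideSource"
// is an assumed field, not part of the current OpenClaw run state.
type OverrideSource = "heartbeat" | "fallback" | "live-switch";

function mismatchHint(
  primaryModel: string,
  currentModel: string,
  source?: OverrideSource,
): string {
  if (currentModel === primaryModel) return "";
  if (source === "heartbeat") {
    // Heartbeat-specific evidence: safe to name the heartbeat settings
    return (
      `The heartbeat model ${currentModel} is overriding ${primaryModel}; ` +
      `consider heartbeat.isolatedSession: true.`
    );
  }
  // No heartbeat evidence: use neutral mismatch wording instead
  return `The session overflowed while running ${currentModel} instead of ${primaryModel}.`;
}
```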
Summary
Describe the problem and fix in 2–5 bullets:
- When `heartbeat.model` uses a small-context local model (e.g. a 32k Ollama model), the model override bleeds into the main session. The next context overflow then triggers the misleading "increase `reserveTokensFloor`" error message.
- The fix detects the mismatch at error time and suggests the heartbeat isolation settings instead (`heartbeat.isolatedSession: true`, `heartbeat.lightContext: true`).

Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
- When `heartbeat.model` is set to a small-context model (e.g. `ollama/qwen3.5-9b-32k:latest`), the heartbeat bootstrap-context reload switches the session's active model. The original model (e.g. `qwen3.6-plus` with 1M context) gets replaced by the 32k model. When the accumulated conversation exceeds 32k tokens, OpenClaw triggers overflow and resets the session.
- The existing `isHeartbeat` flag only covers overflow during the heartbeat turn itself. In practice, the model persists after the heartbeat fires, so subsequent user messages (not heartbeats) hit the limit. Detection must compare the current vs. primary model at error time.

Regression Test Plan (if applicable)
User-visible / Behavior Changes
- On a context overflow caused by heartbeat model bleed, the error message now suggests `heartbeat.isolatedSession: true` / `heartbeat.lightContext: true` instead of `reserveTokensFloor`.

Diagram (if applicable)
```text
Before:
Context overflow → "increase reserveTokensFloor" (wrong advice for heartbeat bleed)

After:
Context overflow → detect current model vs primary model
  → mismatch + small window    → "heartbeat model override bleeding into session"
  → no mismatch + small window → "model has small context window"
  → normal/large window        → "increase reserveTokensFloor" (existing advice)
```
Security Impact (required)
(all items: No)

Repro + Verification
Environment
- Primary model: `bailian/qwen3.6-plus` (1M context); heartbeat model: `ollama/qwen3.5-9b-32k:latest` (32k context)

```json
{
  "agents": {
    "list": [{
      "id": "agent",
      "model": "qwen3.6-plus",
      "heartbeat": {
        "every": "30m",
        "model": "ollama/qwen3.5-9b-32k:latest"
      }
    }]
  }
}
```
Steps
Expected
Actual (before fix)
- The error message tells the user to increase `reserveTokensFloor`, which does not fix the problem.
Gateway log model-snapshot timeline from session `6589d855.jsonl`:

```text
09:16 — qwen3.6-plus (1M)     ✅ normal operation
09:36 — qwen3.5-9b-32k (32k)  ← heartbeat fired, model switched
09:46 — qwen3.6-plus (1M)     ✅ compaction switched back
09:49 — qwen3.5-9b-32k (32k)  ← heartbeat fired again
09:52 — qwen3.6-plus (1M)     ✅ compaction switched back
09:56 — qwen3.5-9b-32k (32k)  ← 💥 "Context limit exceeded" — session reset
```

Same pattern repeated at 11:02.
Human Verification (required)
- Confirmed `resolveContextTokensForModel` is already imported and available in `agent-runner-execution.ts`.
- Confirmed `runtimeConfig` is in scope at both error locations.

Review Conversations
Compatibility / Migration
(answers: Yes, No, No)

Risks and Mitigations
None. This is a pure error message text change — no logic, behavior, or API changes.