
fix: show heartbeat model bleed hint on context overflow error#67381

Open
Knightmare6890 wants to merge 3 commits into openclaw:main from Knightmare6890:fix/misleading-context-overflow-error

Conversation

@Knightmare6890

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: When heartbeat.model uses a small-context local model (e.g. 32k Ollama), the model override bleeds into the main session. The next context overflow triggers the misleading "increase reserveTokensFloor" error message.
  • Why it matters: Users following the advice don't fix the problem because the root cause is heartbeat model bleed, not compaction tuning. Sessions keep crashing every ~30 minutes on each heartbeat.
  • What changed: Replaced both hardcoded "Context limit exceeded" error messages with a three-tier hint that detects model mismatch and suggests the real fix (heartbeat.isolatedSession: true, heartbeat.lightContext: true).
  • What did NOT change (scope boundary): Only the user-facing error text changed. No compaction logic, heartbeat scheduling, or model-switching behavior was modified.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: When heartbeat.model is set to a small-context model (e.g. ollama/qwen3.5-9b-32k:latest), the heartbeat bootstrap-context reload switches the session's active model. The original model (e.g. qwen3.6-plus with 1M context) gets replaced by the 32k model. When accumulated conversation exceeds 32k tokens, OpenClaw triggers overflow and resets the session.
  • Missing detection / guardrail: The error message never surfaced which model actually hit the limit or that it differed from the session's primary model.
  • Contributing context (if known): The isHeartbeat flag only covers overflow during the heartbeat turn. In practice, the model persists after the heartbeat fires, so subsequent user messages (not heartbeats) hit the limit. Detection must compare current vs primary model at error time.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: N/A — this is an error message improvement, not behavior change.
  • Scenario the test should lock in: N/A
  • Why this is the smallest reliable guardrail: N/A
  • Existing test that already covers this (if any): N/A
  • If no new test is added, why not: The change is purely user-facing text — the overflow detection and session reset logic are unchanged.

User-visible / Behavior Changes

  • The "Context limit exceeded" error now shows which model was active and whether it differs from the primary model.
  • When a small-context model (< 64k) is active instead of the primary model, users are directed to heartbeat.isolatedSession: true / heartbeat.lightContext: true instead of reserveTokensFloor.

Diagram (if applicable)

text
Before:
Context overflow → "increase reserveTokensFloor" (wrong advice for heartbeat bleed)

After:
Context overflow → detect current model vs primary model
→ mismatch + small window → "heartbeat model override bleeding into session"
→ no mismatch + small window → "model has small context window"
→ normal/large window → "increase reserveTokensFloor" (existing advice)
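The tiered branching in the diagram can be sketched as a small standalone helper. This is a sketch, not the code in agent-runner-execution.ts: the 64k threshold comes from the PR description, and the signature (a pre-resolved currentWindow instead of provider/cfg) is simplified for illustration.

```typescript
// Minimal sketch of the three-tier overflow hint described in the PR.
// Assumption: "small context" means under 64k tokens.
const SMALL_WINDOW_TOKENS = 64 * 1024;

interface HintParams {
  primaryModel: string;
  currentModel: string;
  currentWindow?: number; // resolved context size in tokens, if known
}

function buildContextOverflowHint(p: HintParams): string {
  const isSmallModel =
    p.currentWindow !== undefined && p.currentWindow < SMALL_WINDOW_TOKENS;
  const modelMismatch = p.currentModel !== p.primaryModel;

  if (isSmallModel && modelMismatch) {
    // Tier 1: likely a heartbeat model override bleeding into the session.
    const windowLabel = Math.round((p.currentWindow ?? 0) / 1024);
    return (
      `\n\nThe session is using ${p.currentModel} (${windowLabel}k context) ` +
      `instead of ${p.primaryModel}. This may be caused by a heartbeat model ` +
      `override. Consider \`heartbeat.isolatedSession: true\` or ` +
      `\`heartbeat.lightContext: true\`.`
    );
  }
  if (isSmallModel) {
    // Tier 2: the active model itself has a small context window.
    return `\n\n${p.currentModel} has a small context window; consider a larger-context model.`;
  }
  // Tier 3: normal/large window, keep the existing advice.
  return (
    `\n\nTo prevent this, increase \`agents.defaults.compaction.reserveTokensFloor\` ` +
    `to 20000 or higher in your config.`
  );
}
```

A mismatch with a 32k window yields the heartbeat hint; the same model with a 1M window falls through to the existing reserveTokensFloor advice.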

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: WSL2 (Linux 6.6.87.2-microsoft-standard-WSL2)
  • Runtime/container: Node v22.22.1
  • Model/provider: Primary bailian/qwen3.6-plus (1M), Heartbeat ollama/qwen3.5-9b-32k:latest (32k)
  • Integration/channel (if any): WhatsApp + Discord
  • Relevant config (redacted):

json
{
  "agents": {
    "list": [{
      "id": "agent",
      "model": "qwen3.6-plus",
      "heartbeat": {
        "every": "30m",
        "model": "ollama/qwen3.5-9b-32k:latest"
      }
    }]
  }
}

Steps

  1. Configure an agent with a large-context primary model and a small-context heartbeat model.
  2. Have a long-running conversation that accumulates > 32k tokens.
  3. Wait for heartbeat to fire (or trigger manually).
  4. Observe session reset with "Context limit exceeded" error.

Expected

  • Error message identifies the small-context model as the cause and suggests heartbeat isolation.

Actual (before fix)

  • Error message points to reserveTokensFloor, which does not fix the problem.

Evidence

  • Trace/log snippets

Gateway log model-snapshot timeline from session 6589d855.jsonl:

09:16 — qwen3.6-plus (1M) ✅ normal operation
09:36 — qwen3.5-9b-32k (32k) ← heartbeat fired, model switched
09:46 — qwen3.6-plus (1M) ✅ compaction switched back
09:49 — qwen3.5-9b-32k (32k) ← heartbeat fired again
09:52 — qwen3.6-plus (1M) ✅ compaction switched back
09:56 — qwen3.5-9b-32k (32k) ← 💥 "Context limit exceeded" — session reset
Same pattern repeated at 11:02.

Human Verification (required)

  • Verified scenarios: Confirmed resolveContextTokensForModel is already imported and available in agent-runner-execution.ts. Confirmed runtimeConfig is in scope at both error locations.
  • Edge cases checked: Three-tier branching covers (1) model mismatch + small window, (2) small window no mismatch, (3) normal/large window.
  • What you did not verify: Runtime test on a live OpenClaw instance — this requires rebuilding the project.

Review Conversations

  • N/A — first PR submission

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

Risks and Mitigations

None. This is a pure error message text change — no logic, behavior, or API changes.

When context limit is hit with a smaller context model than the session's
primary model, suggest heartbeat.isolatedSession/lightContext instead of
misleading reserveTokensFloor advice.

Three-tier hint logic:
1. Small model + model mismatch → heartbeat override bleed detected
2. Small model, no mismatch → suggest larger context model
3. Normal/large model → existing reserveTokensFloor advice
Fixes openclaw#67314

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7225181a65


diff
  kind: "final",
  payload: {
-   text: "⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.\n\nTo prevent this, increase your compaction buffer by setting `agents.defaults.compaction.reserveTokensFloor` to 20000 or higher in your config.",
+   text: `⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.${buildContextOverflowHint({ primaryModel: params.followupRun.run.model ?? "unknown", currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown", provider: params.followupRun.run.provider, cfg: runtimeConfig })}`,

P2 Use the active provider when resolving overflow context window

buildContextOverflowHint resolves context size from a (provider, model) pair, but this call passes params.followupRun.run.provider even when the turn actually ran on a fallback provider/model. In cross-provider fallback runs, that mismatched provider can return the wrong window (or undefined), so the new hint logic may miss the small-window condition and show incorrect guidance. Please pass the effective provider for the run (e.g. fallbackProvider) alongside fallbackModel here (and in the similar compaction branch).


@greptile-apps

greptile-apps bot commented Apr 15, 2026

Greptile Summary

Adds a three-tier hint to both context-overflow error messages in agent-runner-execution.ts — detecting heartbeat model bleed, a generic small-context-window case, and falling back to the existing reserveTokensFloor advice. The embedded-overflow-error path (where fallbackModel is set by a completed run) works correctly for the primary scenario described in the PR.

Two improvements worth a follow-up:

  • provider at both call sites should be fallbackProvider so the context-window lookup targets the heartbeat model's own provider rather than the primary session's provider; with a bare model ID and the wrong provider, the lookup returns undefined → isSmallModel = false → wrong hint.
  • In the compaction-failure throw path, fallbackModel is still set to params.followupRun.run.model when the run throws on its first attempt, making currentModel === primaryModel → modelMismatch = false → the heartbeat hint is suppressed on that path.

Confidence Score: 4/5

Safe to merge; the fix is an unambiguous improvement — the only regression risk is silently falling back to the pre-existing "reserveTokensFloor" hint in specific edge cases.

All findings are P2. The embedded-overflow-error path (the scenario the PR targets) works correctly. Two gaps — wrong provider for context-window lookup and fallbackModel equality on first-attempt throw — can cause the hint to degrade to the old message, not to incorrect or harmful output. No logic, behavior, or security changes; purely user-facing text improvement.

src/auto-reply/reply/agent-runner-execution.ts — specifically the provider argument at both buildContextOverflowHint call sites and the fallbackModel initialization semantics in the compaction-failure catch path.


diff
  kind: "final",
  payload: {
-   text: "⚠️ Context limit exceeded during compaction. I've reset our conversation to start fresh - please try again.\n\nTo prevent this, increase your compaction buffer by setting `agents.defaults.compaction.reserveTokensFloor` to 20000 or higher in your config.",
+   text: `⚠️ Context limit exceeded during compaction. I've reset our conversation to start fresh - please try again.${buildContextOverflowHint({ primaryModel: params.followupRun.run.model ?? "unknown", currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown", provider: params.followupRun.run.provider, cfg: runtimeConfig })}`,

P2 fallbackModel equals primaryModel on first-attempt compaction failures

When runWithModelFallback throws on the very first attempt, execution jumps to the catch block before line 1306 (fallbackModel = fallbackResult.model) is ever reached. Because fallbackModel is initialized to params.followupRun.run.model at line 689, both currentModel and primaryModel resolve to the same value, so modelMismatch is always false here — the heartbeat bleed hint is silently suppressed and users still see the old "increase reserveTokensFloor" advice.

This is the less-common path (compaction throw vs. embedded error), but it's exactly the scenario described in the PR: the heartbeat model causes overflow during compaction.

A snapshot of the "true primary model" should be captured before the loop starts (before any LiveSessionModelSwitchError retry can mutate params.followupRun.run.model) and used as primaryModel at both call sites:

typescript
const primaryModel = params.followupRun.run.model;
// ... later in the catch block ...
text: `...${buildContextOverflowHint({ primaryModel: primaryModel ?? "unknown", currentModel: fallbackModel ?? ... })}`,
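Why the early snapshot helps can be shown in isolation. The following is a sketch, not the actual control flow of agent-runner-execution.ts; the helper name and the mutation step are illustrative. If run.model has already been mutated by a live model switch, the snapshot taken before the loop preserves the true primary model even when the first attempt throws before fallbackModel is reassigned:

```typescript
// Sketch: snapshot the primary model before the retry loop. If a live model
// switch mutates run.model before the first attempt throws, the snapshot still
// reflects the true primary, so the mismatch is detectable at error time.
interface Run { model: string }

function hintInputsAfterFirstAttemptThrow(run: Run, mutatedModel?: string) {
  const primaryModel = run.model;             // snapshot BEFORE the loop (the proposed fix)
  if (mutatedModel) run.model = mutatedModel; // LiveSessionModelSwitchError retry analogue
  const fallbackModel = run.model;            // initialized from the (possibly mutated) run model
  // runWithModelFallback throws on the first attempt here, so fallbackModel
  // is never reassigned from a completed fallback result.
  return {
    primaryModel,
    currentModel: fallbackModel,
    modelMismatch: fallbackModel !== primaryModel,
  };
}
```

Without the snapshot, both values would be read from the same mutated run.model at error time and the mismatch would always be false on this path.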

Comment on lines +396 to +401

typescript
const currentWindow = resolveContextTokensForModel({
  cfg,
  provider,
  model: currentModel,
  allowAsyncLoad: false,
});

P2 Wrong provider passed for heartbeat model context-window lookup

buildContextOverflowHint always receives params.followupRun.run.provider (the primary session's provider, e.g. "bailian"), but currentModel may come from fallbackModel which belongs to a different provider (e.g. the "ollama" heartbeat model). Inside resolveContextTokensForModel, the explicit provider drives:

  • resolveConfiguredProviderContextTokens(cfg, "bailian", heartbeatModel) → likely undefined
  • the qualified cache probe "bailian/heartbeatModel" → cache miss

If the heartbeat model was stored only under "ollama/qwen3.5-9b-32k:latest", both lookups miss and currentWindow returns undefined → isSmallModel = false → the hint silently degrades to "increase reserveTokensFloor" even when the heartbeat model is the culprit.

Use fallbackProvider at both call sites so the lookup targets the right provider when currentModel came from the fallback result:

typescript
buildContextOverflowHint({
  primaryModel: params.followupRun.run.model ?? "unknown",
  currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown",
  provider: fallbackProvider ?? params.followupRun.run.provider, // ← use fallback provider
  cfg: runtimeConfig,
})
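The cache-miss mechanics are easy to reproduce with a toy registry. The "provider/model" keying here is an assumption based on the qualified-probe description in the comment above, not the actual internals of resolveContextTokensForModel:

```typescript
// Toy context-window registry keyed by "provider/model", mirroring the
// qualified cache probe the review describes.
const contextWindows = new Map<string, number>([
  ["bailian/qwen3.6-plus", 1024 * 1024],
  ["ollama/qwen3.5-9b-32k:latest", 32 * 1024],
]);

function lookupWindow(provider: string, model: string): number | undefined {
  // The probe only checks under the provider it was handed; a bare model ID
  // qualified with the wrong provider is a cache miss.
  return contextWindows.get(`${provider}/${model}`);
}
```

Probing with the primary session's provider and the heartbeat model ID misses (lookupWindow("bailian", "qwen3.5-9b-32k:latest") is undefined), while the fallback provider hits, which is exactly why the hint degrades without the suggested change.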


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50ad9c882e


Comment on lines +405 to +409

typescript
if (isSmallModel && modelMismatch) {
  const windowLabel = Math.round(currentWindow / 1024);
  return (
    `\n\nThe session is using ${currentModel} (${windowLabel}k context) ` +
    `instead of ${primaryModel}. This may be caused by a heartbeat model ` +

P2 Avoid heartbeat diagnosis for generic model fallback mismatches

This branch treats any currentModel !== primaryModel overflow on a small-context model as heartbeat bleed, but currentModel is also updated by normal runtime fallback and live-switch paths, so a fallback from a large model to a smaller one would incorrectly instruct users to change heartbeat.* settings even when heartbeat is unrelated. Because primaryModel is just the turn-start model, not a heartbeat-specific baseline, this condition can both misdiagnose regular fallback overflows and miss true persisted heartbeat-bleed cases; the hint should require heartbeat-specific evidence or use neutral mismatch wording.




Successfully merging this pull request may close these issues.

[Bug]: Misleading "Context limit exceeded" error points to compaction when root cause is heartbeat model bleed
