
fix: show heartbeat model bleed hint on context overflow error#67381

Open
Knightmare6890 wants to merge 3 commits into openclaw:main from Knightmare6890:fix/misleading-context-overflow-error

Conversation

@Knightmare6890

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: When heartbeat.model uses a small-context local model (e.g. 32k Ollama), the model override bleeds into the main session. The next context overflow triggers the misleading "increase reserveTokensFloor" error message.
  • Why it matters: Users following the advice don't fix the problem because the root cause is heartbeat model bleed, not compaction tuning. Sessions keep crashing every ~30 minutes on each heartbeat.
  • What changed: Replaced both hardcoded "Context limit exceeded" error messages with a three-tier hint that detects model mismatch and suggests the real fix (heartbeat.isolatedSession: true, heartbeat.lightContext: true).
  • What did NOT change (scope boundary): Only the user-facing error text changed. No compaction logic, heartbeat scheduling, or model-switching behavior was modified.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: When heartbeat.model is set to a small-context model (e.g. ollama/qwen3.5-9b-32k:latest), the heartbeat bootstrap-context reload switches the session's active model. The original model (e.g. qwen3.6-plus with 1M context) gets replaced by the 32k model. When accumulated conversation exceeds 32k tokens, OpenClaw triggers overflow and resets the session.
  • Missing detection / guardrail: The error message never surfaced which model actually hit the limit or that it differed from the session's primary model.
  • Contributing context (if known): The isHeartbeat flag only covers overflow during the heartbeat turn. In practice, the model persists after the heartbeat fires, so subsequent user messages (not heartbeats) hit the limit. Detection must compare current vs primary model at error time.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: N/A — this is an error message improvement, not behavior change.
  • Scenario the test should lock in: N/A
  • Why this is the smallest reliable guardrail: N/A
  • Existing test that already covers this (if any): N/A
  • If no new test is added, why not: The change is purely user-facing text — the overflow detection and session reset logic are unchanged.

User-visible / Behavior Changes

  • The "Context limit exceeded" error now shows which model was active and whether it differs from the primary model.
  • When a small-context model (< 64k) is active instead of the primary model, users are directed to heartbeat.isolatedSession: true / heartbeat.lightContext: true instead of reserveTokensFloor.

Diagram (if applicable)

text
Before:
Context overflow → "increase reserveTokensFloor" (wrong advice for heartbeat bleed)

After:
Context overflow → detect current model vs primary model
→ mismatch + small window → "heartbeat model override bleeding into session"
→ no mismatch + small window → "model has small context window"
→ normal/large window → "increase reserveTokensFloor" (existing advice)
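The tiered branching in the diagram can be sketched as a small standalone helper. This is a sketch, not the code in agent-runner-execution.ts: the 64k threshold comes from the PR description, and the signature (a pre-resolved currentWindow instead of provider/cfg) is simplified for illustration.

```typescript
// Minimal sketch of the three-tier overflow hint described in the PR.
// Assumption: "small context" means under 64k tokens.
const SMALL_WINDOW_TOKENS = 64 * 1024;

interface HintParams {
  primaryModel: string;
  currentModel: string;
  currentWindow?: number; // resolved context size in tokens, if known
}

function buildContextOverflowHint(p: HintParams): string {
  const isSmallModel =
    p.currentWindow !== undefined && p.currentWindow < SMALL_WINDOW_TOKENS;
  const modelMismatch = p.currentModel !== p.primaryModel;

  if (isSmallModel && modelMismatch) {
    // Tier 1: likely a heartbeat model override bleeding into the session.
    const windowLabel = Math.round((p.currentWindow ?? 0) / 1024);
    return (
      `\n\nThe session is using ${p.currentModel} (${windowLabel}k context) ` +
      `instead of ${p.primaryModel}. This may be caused by a heartbeat model ` +
      `override. Consider \`heartbeat.isolatedSession: true\` or ` +
      `\`heartbeat.lightContext: true\`.`
    );
  }
  if (isSmallModel) {
    // Tier 2: the active model itself has a small context window.
    return `\n\n${p.currentModel} has a small context window; consider a larger-context model.`;
  }
  // Tier 3: normal/large window, keep the existing advice.
  return (
    `\n\nTo prevent this, increase \`agents.defaults.compaction.reserveTokensFloor\` ` +
    `to 20000 or higher in your config.`
  );
}
```

A mismatch with a 32k window yields the heartbeat hint; the same model with a 1M window falls through to the existing reserveTokensFloor advice.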

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: WSL2 (Linux 6.6.87.2-microsoft-standard-WSL2)
  • Runtime/container: Node v22.22.1
  • Model/provider: Primary bailian/qwen3.6-plus (1M), Heartbeat ollama/qwen3.5-9b-32k:latest (32k)
  • Integration/channel (if any): WhatsApp + Discord
  • Relevant config (redacted):

json
{
  "agents": {
    "list": [{
      "id": "agent",
      "model": "qwen3.6-plus",
      "heartbeat": {
        "every": "30m",
        "model": "ollama/qwen3.5-9b-32k:latest"
      }
    }]
  }
}

Steps

  1. Configure an agent with a large-context primary model and a small-context heartbeat model.
  2. Have a long-running conversation that accumulates > 32k tokens.
  3. Wait for heartbeat to fire (or trigger manually).
  4. Observe session reset with "Context limit exceeded" error.

Expected

  • Error message identifies the small-context model as the cause and suggests heartbeat isolation.

Actual (before fix)

  • Error message points to reserveTokensFloor, which does not fix the problem.

Evidence

  • Trace/log snippets

Gateway log model-snapshot timeline from session 6589d855.jsonl:

09:16 — qwen3.6-plus (1M) ✅ normal operation
09:36 — qwen3.5-9b-32k (32k) ← heartbeat fired, model switched
09:46 — qwen3.6-plus (1M) ✅ compaction switched back
09:49 — qwen3.5-9b-32k (32k) ← heartbeat fired again
09:52 — qwen3.6-plus (1M) ✅ compaction switched back
09:56 — qwen3.5-9b-32k (32k) ← 💥 "Context limit exceeded" — session reset
Same pattern repeated at 11:02.

Human Verification (required)

  • Verified scenarios: Confirmed resolveContextTokensForModel is already imported and available in agent-runner-execution.ts. Confirmed runtimeConfig is in scope at both error locations.
  • Edge cases checked: Three-tier branching covers (1) model mismatch + small window, (2) small window no mismatch, (3) normal/large window.
  • What you did not verify: Runtime test on a live OpenClaw instance — this requires rebuilding the project.

Review Conversations

  • N/A — first PR submission

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

Risks and Mitigations

None. This is a pure error message text change — no logic, behavior, or API changes.

When context limit is hit with a smaller context model than the session's
primary model, suggest heartbeat.isolatedSession/lightContext instead of
misleading reserveTokensFloor advice.

Three-tier hint logic:
1. Small model + model mismatch → heartbeat override bleed detected
2. Small model, no mismatch → suggest larger context model
3. Normal/large model → existing reserveTokensFloor advice
Fixes openclaw#67314

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7225181a65


diff
  kind: "final",
  payload: {
-   text: "⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.\n\nTo prevent this, increase your compaction buffer by setting `agents.defaults.compaction.reserveTokensFloor` to 20000 or higher in your config.",
+   text: `⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.${buildContextOverflowHint({ primaryModel: params.followupRun.run.model ?? "unknown", currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown", provider: params.followupRun.run.provider, cfg: runtimeConfig })}`,

P2 Use the active provider when resolving overflow context window

buildContextOverflowHint resolves context size from a (provider, model) pair, but this call passes params.followupRun.run.provider even when the turn actually ran on a fallback provider/model. In cross-provider fallback runs, that mismatched provider can return the wrong window (or undefined), so the new hint logic may miss the small-window condition and show incorrect guidance. Please pass the effective provider for the run (e.g. fallbackProvider) alongside fallbackModel here (and in the similar compaction branch).


@greptile-apps

greptile-apps bot commented Apr 15, 2026

Greptile Summary

Adds a three-tier hint to both context-overflow error messages in agent-runner-execution.ts — detecting heartbeat model bleed, a generic small-context-window case, and falling back to the existing reserveTokensFloor advice. The embedded-overflow-error path (where fallbackModel is set by a completed run) works correctly for the primary scenario described in the PR.

Two improvements worth a follow-up:

  • provider at both call sites should be fallbackProvider so the context-window lookup targets the heartbeat model's own provider rather than the primary session's provider; with a bare model ID and the wrong provider, the lookup returns undefined → isSmallModel = false → wrong hint.
  • In the compaction-failure throw path, fallbackModel is still set to params.followupRun.run.model when the run throws on its first attempt, making currentModel === primaryModel → modelMismatch = false → the heartbeat hint is suppressed on that path.

Confidence Score: 4/5

Safe to merge; the fix is an unambiguous improvement — the only regression risk is silently falling back to the pre-existing "reserveTokensFloor" hint in specific edge cases.

All findings are P2. The embedded-overflow-error path (the scenario the PR targets) works correctly. Two gaps — wrong provider for context-window lookup and fallbackModel equality on first-attempt throw — can cause the hint to degrade to the old message, not to incorrect or harmful output. No logic, behavior, or security changes; purely user-facing text improvement.

src/auto-reply/reply/agent-runner-execution.ts — specifically the provider argument at both buildContextOverflowHint call sites and the fallbackModel initialization semantics in the compaction-failure catch path.


diff
  kind: "final",
  payload: {
-   text: "⚠️ Context limit exceeded during compaction. I've reset our conversation to start fresh - please try again.\n\nTo prevent this, increase your compaction buffer by setting `agents.defaults.compaction.reserveTokensFloor` to 20000 or higher in your config.",
+   text: `⚠️ Context limit exceeded during compaction. I've reset our conversation to start fresh - please try again.${buildContextOverflowHint({ primaryModel: params.followupRun.run.model ?? "unknown", currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown", provider: params.followupRun.run.provider, cfg: runtimeConfig })}`,

P2 fallbackModel equals primaryModel on first-attempt compaction failures

When runWithModelFallback throws on the very first attempt, execution jumps to the catch block before line 1306 (fallbackModel = fallbackResult.model) is ever reached. Because fallbackModel is initialized to params.followupRun.run.model at line 689, both currentModel and primaryModel resolve to the same value, so modelMismatch is always false here — the heartbeat bleed hint is silently suppressed and users still see the old "increase reserveTokensFloor" advice.

This is the less-common path (compaction throw vs. embedded error), but it's exactly the scenario described in the PR: the heartbeat model causes overflow during compaction.

A snapshot of the "true primary model" should be captured before the loop starts (before any LiveSessionModelSwitchError retry can mutate params.followupRun.run.model) and used as primaryModel at both call sites:

typescript
const primaryModel = params.followupRun.run.model;
// ... later in the catch block ...
text: `...${buildContextOverflowHint({ primaryModel: primaryModel ?? "unknown", currentModel: fallbackModel ?? ... })}`,
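Why the early snapshot helps can be shown in isolation. The following is a sketch, not the actual control flow of agent-runner-execution.ts; the helper name and the mutation step are illustrative. If run.model has already been mutated by a live model switch, the snapshot taken before the loop preserves the true primary model even when the first attempt throws before fallbackModel is reassigned:

```typescript
// Sketch: snapshot the primary model before the retry loop. If a live model
// switch mutates run.model before the first attempt throws, the snapshot still
// reflects the true primary, so the mismatch is detectable at error time.
interface Run { model: string }

function hintInputsAfterFirstAttemptThrow(run: Run, mutatedModel?: string) {
  const primaryModel = run.model;             // snapshot BEFORE the loop (the proposed fix)
  if (mutatedModel) run.model = mutatedModel; // LiveSessionModelSwitchError retry analogue
  const fallbackModel = run.model;            // initialized from the (possibly mutated) run model
  // runWithModelFallback throws on the first attempt here, so fallbackModel
  // is never reassigned from a completed fallback result.
  return {
    primaryModel,
    currentModel: fallbackModel,
    modelMismatch: fallbackModel !== primaryModel,
  };
}
```

Without the snapshot, both values would be read from the same mutated run.model at error time and the mismatch would always be false on this path.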

Comment on lines +396 to +401

typescript
const currentWindow = resolveContextTokensForModel({
  cfg,
  provider,
  model: currentModel,
  allowAsyncLoad: false,
});

P2 Wrong provider passed for heartbeat model context-window lookup

buildContextOverflowHint always receives params.followupRun.run.provider (the primary session's provider, e.g. "bailian"), but currentModel may come from fallbackModel which belongs to a different provider (e.g. the "ollama" heartbeat model). Inside resolveContextTokensForModel, the explicit provider drives:

  • resolveConfiguredProviderContextTokens(cfg, "bailian", heartbeatModel) → likely undefined
  • the qualified cache probe "bailian/heartbeatModel" → cache miss

If the heartbeat model was stored only under "ollama/qwen3.5-9b-32k:latest", both lookups miss and currentWindow returns undefined → isSmallModel = false → the hint silently degrades to "increase reserveTokensFloor" even when the heartbeat model is the culprit.

Use fallbackProvider at both call sites so the lookup targets the right provider when currentModel came from the fallback result:

typescript
buildContextOverflowHint({
  primaryModel: params.followupRun.run.model ?? "unknown",
  currentModel: fallbackModel ?? params.followupRun.run.model ?? "unknown",
  provider: fallbackProvider ?? params.followupRun.run.provider, // ← use fallback provider
  cfg: runtimeConfig,
})
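The cache-miss mechanics are easy to reproduce with a toy registry. The "provider/model" keying here is an assumption based on the qualified-probe description in the comment above, not the actual internals of resolveContextTokensForModel:

```typescript
// Toy context-window registry keyed by "provider/model", mirroring the
// qualified cache probe the review describes.
const contextWindows = new Map<string, number>([
  ["bailian/qwen3.6-plus", 1024 * 1024],
  ["ollama/qwen3.5-9b-32k:latest", 32 * 1024],
]);

function lookupWindow(provider: string, model: string): number | undefined {
  // The probe only checks under the provider it was handed; a bare model ID
  // qualified with the wrong provider is a cache miss.
  return contextWindows.get(`${provider}/${model}`);
}
```

Probing with the primary session's provider and the heartbeat model ID misses (lookupWindow("bailian", "qwen3.5-9b-32k:latest") is undefined), while the fallback provider hits, which is exactly why the hint degrades without the suggested change.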


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50ad9c882e


Comment on lines +405 to +409

typescript
if (isSmallModel && modelMismatch) {
  const windowLabel = Math.round(currentWindow / 1024);
  return (
    `\n\nThe session is using ${currentModel} (${windowLabel}k context) ` +
    `instead of ${primaryModel}. This may be caused by a heartbeat model ` +

P2 Avoid heartbeat diagnosis for generic model fallback mismatches

This branch treats any currentModel !== primaryModel overflow on a small-context model as heartbeat bleed, but currentModel is also updated by normal runtime fallback and live-switch paths, so a fallback from a large model to a smaller one would incorrectly instruct users to change heartbeat.* settings even when heartbeat is unrelated. Because primaryModel is just the turn-start model, not a heartbeat-specific baseline, this condition can both misdiagnose regular fallback overflows and miss true persisted heartbeat-bleed cases; the hint should require heartbeat-specific evidence or use neutral mismatch wording.




Successfully merging this pull request may close these issues.

[Bug]: Misleading "Context limit exceeded" error points to compaction when root cause is heartbeat model bleed
