
Auto-reply: honor compaction model for memory flush#51159

Closed
johnkhagler wants to merge 466 commits into openclaw:main from johnkhagler:main

Conversation

@johnkhagler

@johnkhagler johnkhagler commented Mar 20, 2026

Summary

Describe the problem and fix in 2–5 bullets:

Problem: memory flush used generic model fallback resolution instead of compaction-aware resolution.
Why it matters: when agents.defaults.compaction.model is set, memory flush could still run on the primary fallback path, causing inconsistent behavior.
What changed: memory flush now resolves model fallbacks through compaction-aware logic, with focused helper and test updates.
What did NOT change: no non-memory-flush execution paths changed, no config key renames, no auth/token flow changes, no UI changes.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

  • Memory flush now honors agents.defaults.compaction.model fallback behavior, including provider/model and model-only override forms.
  • No default values were changed.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: local openclaw CLI/gateway dev workspace
  • Model/provider: ollama (llama3.1:8b during validation)
  • Integration/channel: none required for unit verification
  • Relevant config: agents.defaults.compaction.model set, memory flush enabled

Steps

  1. Configure compaction model override in config.
  2. Run focused resolver unit tests.
  3. Confirm compaction override tests pass and fallback behavior remains covered.

Expected

  • Memory flush resolves model fallback options through compaction-aware logic.

Actual

  • Focused tests validate compaction-aware resolution and existing fallback compatibility.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: Agent-assisted local validation was run in this workspace (build plus focused unit tests for the changed resolver path).
  • Edge cases checked: provider/model override and model-only override fallback behavior in focused tests.
  • What I did not verify: I did not personally run an end-to-end manual runtime reproduction in this pass.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
    - If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert this PR's commits.
  • Files/config to restore: resolver usage in src/auto-reply/reply/agent-runner-memory.ts, helper logic in src/auto-reply/reply/agent-runner-utils.ts, and related tests in src/auto-reply/reply/agent-runner-utils.test.ts.
  • Known bad symptoms reviewers should watch for: memory flush appears to ignore compaction model override and reverts to primary fallback behavior.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: compaction model parsing could behave unexpectedly for uncommon override formats.
  • Mitigation: focused tests cover provider/model and model-only forms, and fallback behavior remains intact when override is absent.

@greptile-apps
Contributor

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR fixes memory flush to use compaction-aware model fallback resolution (resolveCompactionModelFallbackOptions) instead of the generic resolveModelFallbackOptions, ensuring that agents.defaults.compaction.model overrides are honored during memory flush runs. The implementation is clean and well-scoped.

  • Core fix (agent-runner-memory.ts): single call-site swap in runMemoryFlushIfNeeded; no other execution paths are affected.
  • New helper (agent-runner-utils.ts): resolveCompactionModelFallbackOptions correctly handles both "provider/model" and model-only override forms by delegating to resolveModelFallbackOptions first, then overlaying the compaction override.
  • Tests (agent-runner-utils.test.ts): refactored from mock-based to integration-style, adding two new cases for the compaction path. An unrelated assertion (currentThreadTs: "928") was dropped from a Telegram threading test outside the PR's stated scope — this silently reduces coverage for that field and warrants explanation.
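As described, the helper delegates to the generic resolver and then overlays the compaction override. A minimal sketch of that behavior (the option shape and the base resolver are simplified stand-ins, not the repo's actual code):

```typescript
interface ModelFallbackOptions {
  provider?: string;
  model?: string;
}

// Stand-in for the generic resolver the helper delegates to.
function resolveModelFallbackOptions(run: { model?: string }): ModelFallbackOptions {
  return { provider: "anthropic", model: run.model ?? "default-model" };
}

function resolveCompactionModelFallbackOptions(
  run: { model?: string },
  compactionModel?: string,
): ModelFallbackOptions {
  const resolved = resolveModelFallbackOptions(run);
  const override = compactionModel?.trim();
  if (!override) return resolved;
  const slashIdx = override.indexOf("/");
  if (slashIdx > 0) {
    // "provider/model" form: override both fields.
    return {
      ...resolved,
      provider: override.slice(0, slashIdx),
      model: override.slice(slashIdx + 1),
    };
  }
  // Model-only form: keep the resolved provider, swap only the model.
  return { ...resolved, model: override };
}
```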

Confidence Score: 4/5

  • Safe to merge with minor open questions on one unrelated test change and a low-probability edge case in model string parsing.
  • The core fix is minimal, correct, and well-tested for the main use cases (provider/model split and model-only override). The only concerns are a dropped test assertion for currentThreadTs that is outside the PR scope (reduces confidence in test coverage) and a benign edge case where a model name starting with / would pass through unchanged with a leading slash. Neither is a runtime blocker for normal usage.
  • src/auto-reply/reply/agent-runner-utils.test.ts — the removal of the currentThreadTs assertion from the Telegram threading test should be explained or reverted.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-utils.test.ts
Line: 238-241

Comment:
**Unrelated test assertion removed — reduces coverage**

The removal of `currentThreadTs: "928"` from this `toMatchObject` expectation is outside the stated scope of this PR (memory flush model resolution). The PR description explicitly says "no non-memory-flush execution paths changed," but `buildThreadingToolContext` is unrelated to memory flush.

Since `toMatchObject` only checks for _inclusion_, the removal of this assertion means the test no longer verifies that `currentThreadTs` is correctly set from `MessageThreadId: 928`. If the Telegram plugin stopped populating `currentThreadTs`, this test would still pass silently.

Was this intentional? If so, it should be in its own commit/PR with an explanation. If not, consider restoring the assertion:

```suggestion
    expect(context).toMatchObject({
      currentChannelId: "telegram:-1003841603622",
      currentThreadTs: "928",
      currentMessageId: "2284",
    });
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-utils.ts
Line: 178-189

Comment:
**Edge case: model starting with `/` falls through to model-only path with leading slash**

If `agents.defaults.compaction.model` is set to a value like `"/some-model"`, `slashIdx` will be `0`, which fails the `> 0` guard and falls through to:

```ts
return { ...resolved, model: override }; // model === "/some-model"
```

The resulting model string would contain a leading `/`, which is almost certainly not the intended behavior. This is an unlikely config value, but it could produce a confusing runtime error.

Consider using `slashIdx > 0 && provider && model` with an explicit check, or document the `> 0` constraint in a comment to clarify the intent.

How can I resolve this? If you propose a fix, please make it concise.
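One way to make the `> 0` constraint explicit, sketched with a hypothetical helper (the actual fix in the PR may differ):

```typescript
// Hypothetical guard: reject overrides like "/some-model" instead of
// passing the leading slash through to the model-only path.
function splitCompactionOverride(override: string): { provider?: string; model: string } {
  const slashIdx = override.indexOf("/");
  if (slashIdx > 0) {
    const provider = override.slice(0, slashIdx);
    const model = override.slice(slashIdx + 1);
    if (provider && model) return { provider, model };
  }
  if (override.startsWith("/")) {
    throw new Error(`invalid compaction model override: ${override}`);
  }
  // Model-only form.
  return { model: override };
}
```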

Last reviewed commit: "Tests: align telegra..."


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b5c2f91cb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 934a3ffcba


@johnkhagler
Author

Addressed bot feedback in follow-up commit 934a3ff:

  • rebased model-only fallbacks onto compaction provider
  • recomputed memory-flush CLI gating from resolved compaction provider/model
  • added focused tests for compaction fallback behavior

I also replied to/resolved applicable bot threads. This is ready for maintainer review.

@johnkhagler force-pushed the main branch 2 times, most recently from 1dcd841 to fb22d98 on March 25, 2026 06:14
@johnkhagler
Author

Refreshed this PR branch to current upstream main while preserving the memory-flush compaction-model fix.

What changed:

  • Replayed the fix commit on top of latest upstream main
  • PR now contains a single clean commit: fb22d98

Validation I ran locally in sandbox runtime:

  • Built a fresh tarball from tag v2026.3.23-2 + this fix
  • Installed it in NemoClaw sandbox
  • Verified compiled runtime includes the fix path:
    • resolveCompactionModelFallbackOptions
    • memory flush uses ...resolveCompactionModelFallbackOptions(params.followupRun.run)
  • Gateway/channel probes are healthy after install (Telegram + Slack)


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fb22d98f66


```diff
  try {
    await runWithModelFallback({
-     ...resolveModelFallbackOptions(params.followupRun.run),
+     ...resolveCompactionModelFallbackOptions(params.followupRun.run),
```


P2 Badge Gate memory flush using the compaction model window

This call now executes memory flush on agents.defaults.compaction.model, but flush gating is still computed earlier from params.followupRun.run.model via resolveMemoryFlushContextWindowTokens/shouldRunMemoryFlush. When the compaction override has a smaller context window than the primary model (e.g., primary 200k, compaction 32k), flush can trigger too late and this compaction run can overflow/fail before memory is written. The gating model/window should be derived from the same resolved compaction model used here.

Useful? React with 👍 / 👎.
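The gap can be illustrated with a small sketch (model names, token counts, and the threshold ratio below are made up for illustration, not the repo's values):

```typescript
// Flush gating should use the window of the model that will actually run
// the flush, not the primary model's window.
const CONTEXT_TOKENS: Record<string, number> = {
  "anthropic/primary-200k": 200_000,
  "openrouter/compaction-32k": 32_000,
};
const DEFAULT_CONTEXT_TOKENS = 128_000;

function contextWindowFor(modelKey: string): number {
  return CONTEXT_TOKENS[modelKey] ?? DEFAULT_CONTEXT_TOKENS;
}

function shouldRunMemoryFlush(usedTokens: number, windowTokens: number, ratio = 0.8): boolean {
  return usedTokens >= windowTokens * ratio;
}

// Gating on the primary model misses the smaller compaction window:
const used = 30_000;
const gatedOnPrimary = shouldRunMemoryFlush(used, contextWindowFor("anthropic/primary-200k"));
const gatedOnCompaction = shouldRunMemoryFlush(used, contextWindowFor("openrouter/compaction-32k"));
```

With these numbers, gating on the primary window does not trigger a flush while the compaction window already should — the scenario described above.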


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 15ed1fbaa2


Comment on lines +125 to +129:

```ts
const trimmed = fallback.trim();
if (!trimmed || trimmed.includes("/")) {
  return trimmed;
}
return `${provider}/${trimmed}`;
```


P2 Badge Preserve alias fallbacks when rebasing compaction provider

This rewrite prefixes every slash-less fallback with the compaction provider, which changes alias entries into literal provider/model strings. In runWithModelFallback, alias resolution only happens for raw fallback strings without /, so a config like fallbacks: ["cheap"] (where cheap is a configured alias) no longer resolves after compaction.model switches provider. In memory-flush fallback scenarios this can skip valid alias targets and fail over incorrectly.

Useful? React with 👍 / 👎.
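An alias-aware variant could look like this sketch (the alias set and function shape are illustrative assumptions; the repo's alias resolution lives in runWithModelFallback):

```typescript
// Only rebase slash-less fallbacks onto the compaction provider when they
// are not configured aliases, so alias resolution still applies later.
function rebaseFallback(fallback: string, provider: string, aliases: Set<string>): string {
  const trimmed = fallback.trim();
  if (!trimmed || trimmed.includes("/")) {
    return trimmed;
  }
  if (aliases.has(trimmed)) {
    return trimmed; // leave aliases like "cheap" untouched
  }
  return `${provider}/${trimmed}`;
}
```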

@johnkhagler
Author

Rebased onto latest upstream/main (38adeb8) as of 2026-03-27. No conflicts. Fix logic and tests are unchanged — the compaction model override for memory flush still applies cleanly. Happy to answer any questions.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b42eca97d


```diff
    (params.sessionKey ? params.sessionStore?.[params.sessionKey] : undefined);
  const contextWindowTokens = resolveMemoryFlushContextWindowTokens({
-   modelId: params.followupRun.run.model ?? params.defaultModel,
+   modelId: compactionFallbackOptions.model ?? params.defaultModel,
```


P2 Badge Normalize compaction override model before threshold lookup

This line feeds resolveMemoryFlushContextWindowTokens with compactionFallbackOptions.model, but that value is the raw suffix from agents.defaults.compaction.model (for example aurora-alpha from openrouter/aurora-alpha). lookupContextTokens is exact-key based, while the fallback runner normalizes some provider/model pairs (e.g. OpenRouter native ids become openrouter/aurora-alpha), so flush gating can miss the real context window and fall back to agentCfgContextTokens/DEFAULT_CONTEXT_TOKENS. In practice this can trigger memory flush too late (risking overflow on smaller-window models) or too early (extra flush churn) whenever the compaction override needs model normalization.

Useful? React with 👍 / 👎.
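A sketch of normalizing before the exact-key lookup (the normalization rule and the token value below are assumptions for illustration, not the repo's actual tables):

```typescript
// Rebuild a full provider/model key before the exact-key context lookup,
// so a raw suffix like "aurora-alpha" does not miss the table entry.
function normalizeModelKey(provider: string | undefined, model: string): string {
  if (model.includes("/")) return model; // already a full key
  return provider ? `${provider}/${model}` : model;
}

const CONTEXT_TOKENS: Record<string, number> = { "openrouter/aurora-alpha": 131_072 };

function lookupContextTokens(key: string): number | undefined {
  return CONTEXT_TOKENS[key]; // exact-key lookup, as described above
}
```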

jacobtomlinson and others added 14 commits on March 29, 2026 at 15:57
* Net: block missing IPv6 special-use ranges

* Tests: refresh public IPv6 pinning fixtures
* Extensions: require admin for config write commands

* Tests: cover phone control disarm auth
* Gateway: align chat.send reset scope checks

* Gateway: tighten chat.send reset regression test

* Gateway: honor internal provider reset scope
* Discord: gate text approvals by approver policy

* Discord: require approvers for plugin text approvals

* Discord: preserve legacy text approval fallback
* nextcloud-talk: throttle repeated webhook auth failures

Co-authored-by: Brian Mendonca <208517100+bmendonca3@users.noreply.github.com>

* nextcloud-talk: scope webhook auth limiter per server

* nextcloud-talk: limit repeated webhook auth failures only

---------

Co-authored-by: Brian Mendonca <208517100+bmendonca3@users.noreply.github.com>
* Discord: apply component interaction policy gates

Co-authored-by: nexrin <268879349+nexrin@users.noreply.github.com>

* Discord: pass carbon rate limit request

* Discord: reply to blocked component interactions

---------

Co-authored-by: nexrin <268879349+nexrin@users.noreply.github.com>
* Telegram: tighten media SSRF policy

* Telegram: restrict media downloads to configured hosts

* Telegram: preserve custom media apiRoot hosts
@openclaw-barnacle

Closing this PR because it looks dirty (too many unrelated or unexpected changes). This usually happens when a branch picks up unrelated commits or a merge went sideways. Please recreate the PR from a clean branch.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f8b4d6094


Comment on lines +126 to +129:

```ts
if (!trimmed || trimmed.includes("/")) {
  return trimmed;
}
return `${provider}/${trimmed}`;
```


P2 Badge Preserve alias fallbacks when rebasing compaction provider

Rewriting every slash-less fallback to ${provider}/${fallback} breaks configured model aliases during memory-flush fallback. In runWithModelFallback, alias resolution only runs for raw strings without a slash, so an alias like "cheap" now becomes "openrouter/cheap" and is treated as a literal model id instead of resolving through agents.defaults.models[*].alias. This makes valid compaction fallback chains fail specifically when agents.defaults.compaction.model switches providers and fallbacks include aliases.

Useful? React with 👍 / 👎.

```diff
  const canAttemptFlush = memoryFlushWritable && !params.isHeartbeat && !isCli;
  let entry =
    params.sessionEntry ??
    (params.sessionKey ? params.sessionStore?.[params.sessionKey] : undefined);
  const contextWindowTokens = resolveMemoryFlushContextWindowTokens({
-   modelId: params.followupRun.run.model ?? params.defaultModel,
+   modelId: compactionFallbackOptions.model ?? params.defaultModel,
```


P2 Badge Normalize compaction model before context-window lookup

Memory-flush gating now looks up context tokens from compactionFallbackOptions.model, but that value is the raw suffix from agents.defaults.compaction.model and may be unnormalized (for example "aurora-alpha" from "openrouter/aurora-alpha"). lookupContextTokens is exact-key based, while runtime model resolution normalizes some provider/model pairs (notably OpenRouter native IDs), so this can miss the true context window and compute a wrong flush threshold, causing late flushes (overflow risk) or premature flushes.

Useful? React with 👍 / 👎.
