fix: preserve runtime token budget in deferred context-engine maintenance#66820

Merged
jalehman merged 2 commits into openclaw:main from
jalehman:josh/pass-context-budget-to-deferred-maintenance
Apr 14, 2026

Conversation

Contributor

@jalehman jalehman commented Apr 14, 2026

Summary

  • Problem: deferred context-engine maintenance rebuilt runtimeContext without the active model's token budget, so maintenance fell back to a synthetic default budget instead of the real one from the turn that queued the work.
  • Why it matters: background maintenance could make prompt-size and compaction decisions against the wrong window, causing unnecessary maintenance pressure and inconsistent behavior between inline turn handling and deferred maintenance.
  • What changed: buildAfterTurnRuntimeContext() now carries forward tokenBudget, and it also forwards a best-effort currentTokenCount when the latest call usage total is available; deferred maintenance tests now lock in those fields.
  • What did NOT change (scope boundary): this does not change maintenance policy, cache heuristics, or compaction thresholds; it only preserves runtime context fidelity between the foreground turn path and the deferred maintenance path.
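
A minimal sketch of the handoff described above, for illustration only. The type and function names ending in `Sketch`, the `sessionId` field, and the `lastCallUsageTotal` input are assumptions, not the repository's actual shapes; only `tokenBudget`, `currentTokenCount`, and `contextTokenBudget` come from this PR description.

```typescript
// Hypothetical shapes; the real runtime-context type in the repo may differ.
interface RuntimeContextSketch {
  sessionId: string;
  tokenBudget?: number;
  currentTokenCount?: number;
}

interface AfterTurnInputSketch {
  sessionId: string;
  // Budget of the model that served the turn.
  contextTokenBudget?: number;
  // Assumed name for the latest call's usage total, when one was reported.
  lastCallUsageTotal?: number;
}

function buildAfterTurnRuntimeContextSketch(
  input: AfterTurnInputSketch,
): RuntimeContextSketch {
  const ctx: RuntimeContextSketch = { sessionId: input.sessionId };

  // Carry the real budget forward so deferred work does not need a fallback.
  if (input.contextTokenBudget !== undefined) {
    ctx.tokenBudget = input.contextTokenBudget;
  }

  // Best-effort: only forward a finite, positive total.
  const total = input.lastCallUsageTotal;
  if (typeof total === "number" && Number.isFinite(total) && total > 0) {
    ctx.currentTokenCount = total;
  }

  return ctx;
}
```

In this sketch a missing or non-finite usage total simply leaves `currentTokenCount` off the object, which matches the "best-effort" wording above.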

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: the runtime context builder used after a turn did not preserve contextTokenBudget, so deferred maintenance later ran with no budget in runtimeContext and fell back to the context engine's internal default.
  • Missing detection / guardrail: tests covered direct afterTurn() token budget wiring but did not assert that deferred maintenance received the same runtime budget context.
  • Contributing context (if known): deferred maintenance reuses the stored runtime context instead of the direct afterTurn() call arguments, so omissions in the runtime-context shape only show up once work is replayed later.
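
To illustrate why a dropped budget matters, here is a rough sketch of a fallback resolution. Everything in it is hypothetical: the function names, the "pressure" ratio, and the exact fallback value (the review comments in this PR mention a 128k default, written here as `128000` as an assumption about how it is defined).

```typescript
// Assumed fallback; the engine's internal default may be defined differently.
const FALLBACK_TOKEN_BUDGET_SKETCH = 128000;

interface MaintenanceContextSketch {
  tokenBudget?: number;
  currentTokenCount?: number;
}

// Use the budget from the runtime context when present, else the fallback.
function resolveTokenBudgetSketch(ctx: MaintenanceContextSketch): number {
  return ctx.tokenBudget ?? FALLBACK_TOKEN_BUDGET_SKETCH;
}

// Hypothetical pressure measure: fraction of the resolved budget in use.
function budgetPressureSketch(ctx: MaintenanceContextSketch): number | undefined {
  if (ctx.currentTokenCount === undefined) {
    return undefined;
  }
  return ctx.currentTokenCount / resolveTokenBudgetSketch(ctx);
}
```

Under these assumptions, 100000 tokens against a real 200000-token budget gives a ratio of 0.5, while the same count with the budget omitted is measured against the fallback and gives 0.78125, so the deferred path would look considerably closer to its limit than the foreground turn did.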

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/context-engine-maintenance.test.ts and src/agents/pi-embedded-runner/run/attempt.test.ts
  • Scenario the test should lock in: the runtime context produced after a turn includes tokenBudget and currentTokenCount, and deferred maintenance receives those same values when it invokes the context engine.
  • Why this is the smallest reliable guardrail: the bug is in the runtime-context handoff layer, so unit tests on the builder and maintenance caller exercise the exact omission without needing a full provider-backed run.
  • Existing test that already covers this (if any): direct afterTurn() budget handling was already exercised indirectly; the deferred maintenance path was not.
  • If no new test is added, why not: N/A
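
The scenario above can be sketched without a test framework, using a recording callback in place of the context engine. This is not the test added in this PR; `createDeferredQueueSketch` and the argument shape passed to `maintain` are hypothetical stand-ins for the real deferred-maintenance caller.

```typescript
interface RuntimeContextSketch {
  tokenBudget?: number;
  currentTokenCount?: number;
}

interface MaintainArgsSketch {
  runtimeContext: RuntimeContextSketch;
}

// Stores runtime contexts now and replays them into maintain() later,
// mimicking how deferred maintenance reuses the stored context.
function createDeferredQueueSketch(maintain: (args: MaintainArgsSketch) => void) {
  const pending: RuntimeContextSketch[] = [];
  return {
    enqueue(ctx: RuntimeContextSketch): void {
      // Copy so later mutation of the caller's object cannot leak in.
      pending.push({ ...ctx });
    },
    flush(): void {
      // splice(0) empties the queue and returns the removed entries.
      for (const ctx of pending.splice(0)) {
        maintain({ runtimeContext: ctx });
      }
    },
  };
}
```

A check in this style would enqueue a context with a non-default `tokenBudget`, call `flush()`, and assert that the recorded `maintain` arguments carry the same `tokenBudget` and `currentTokenCount`.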

User-visible / Behavior Changes

None.

Diagram (if applicable)

Before:
[turn completes with real model budget]
  -> [runtimeContext drops tokenBudget]
  -> [deferred maintenance runs later]
  -> [maintenance falls back to synthetic budget]

After:
[turn completes with real model budget]
  -> [runtimeContext preserves tokenBudget/currentTokenCount]
  -> [deferred maintenance runs later]
  -> [maintenance evaluates against the same budget context]

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node/Vitest
  • Model/provider: N/A
  • Integration/channel (if any): embedded runner / context engine
  • Relevant config (redacted): defaults

Steps

  1. Build the post-turn runtime context for a turn with a non-default contextTokenBudget.
  2. Queue deferred context-engine maintenance using that runtime context.
  3. Inspect the arguments passed into maintain().

Expected

  • Deferred maintenance receives the same runtime tokenBudget as the completed turn, plus currentTokenCount when available.

Actual

  • Before this change, deferred maintenance received no tokenBudget in runtimeContext and relied on the engine fallback.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran the targeted unit tests for the after-turn runtime-context builder and deferred context-engine maintenance handoff; verified the new fields are present and forwarded.
  • Edge cases checked: currentTokenCount is only included when a finite positive total is available; existing runtime-context fields remain intact.
  • What you did not verify: a full end-to-end provider-backed run in a live OpenClaw session.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: currentTokenCount could be stale or absent on some runs.
    • Mitigation: it is passed as best-effort metadata only when a finite positive total is already available; maintenance still behaves as before when it is missing.

@openclaw-barnacle openclaw-barnacle bot added the labels agents (Agent runtime and tooling), size: XS, and maintainer (Maintainer-authored PR) Apr 14, 2026
Contributor

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR fixes deferred context-engine maintenance falling back to a synthetic 128k token budget by threading tokenBudget (and a best-effort currentTokenCount) through buildAfterTurnRuntimeContext into the ContextEngineRuntimeContext that maintain() receives. The change is purely additive — two new optional fields on the function signature and on the ContextEngineRuntimeContext type — and is covered by focused new tests that verify the fields are propagated end-to-end through the deferred background-task path.

Confidence Score: 5/5

Safe to merge — focused, additive fix with no breaking changes and good test coverage of the previously-unthreaded token budget path.

All changed surfaces are purely additive (new optional parameters and explicit type fields), existing call-sites pass the values correctly, the spread propagation through buildContextEngineMaintenanceRuntimeContext is unchanged and correct, and the new tests directly verify the end-to-end deferred maintenance context. No P0 or P1 findings.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(context-engine): pass deferred maint..."

@jalehman jalehman changed the title from "fix: pass deferred maintenance token budget" to "fix: preserve runtime token budget in deferred context-engine maintenance" Apr 14, 2026
@jalehman jalehman self-assigned this Apr 14, 2026
Thread tokenBudget through the after-turn runtime context so background context-engine maintenance reuses the real model context window instead of falling back to 128k. Also pass through a best-effort currentTokenCount from the latest call total and make the runtime context type explicit about both fields.

Regeneration-Prompt: |
  OpenClaw already passed the real context token budget into direct context-engine calls like afterTurn and assemble, but deferred maintain() reused only the runtimeContext object and that object did not carry tokenBudget. Lossless Claw therefore fell back to 128k during background maintenance, which made budget-trigger fire much more aggressively than the live model context warranted. Thread the real contextTokenBudget into buildAfterTurnRuntimeContext so deferred maintenance receives the same budget, and pass a straightforward best-effort currentTokenCount from the latest call total while the relevant data is already in scope. Keep the change additive, update the runtime-context type, and cover the background maintenance/runtime-context behavior with focused tests.
@jalehman jalehman force-pushed the josh/pass-context-budget-to-deferred-maintenance branch from eabef29 to 95c5d3c April 14, 2026 22:19

aisle-research-bot bot commented Apr 14, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

#   Severity     Title
1   🟡 Medium    Unvalidated token usage telemetry in derivePromptTokens can cause incorrect/abusive token counts

1. 🟡 Unvalidated token usage telemetry in derivePromptTokens can cause incorrect/abusive token counts

Property    Value
Severity    Medium
CWE         CWE-20
Location    src/agents/usage.ts:181-194

Description

derivePromptTokens() sums token usage fields without validating they are finite, non-negative, or within reasonable bounds. Although normalizeUsage() uses asFiniteNumber(), it does not clamp cacheRead/cacheWrite (and allows negative finite values), and derivePromptTokens() itself accepts any numbers.

Security/robustness impact when provider/SDK telemetry is malformed or attacker-controlled:

  • Negative values (e.g., cacheRead: -1e9) can make the sum <= 0 and return undefined, potentially causing downstream maintenance/compaction bookkeeping to treat the current token count as unknown and skip/alter maintenance decisions.
  • Extremely large values (e.g., cacheWrite: 1e12) will be propagated as currentTokenCount, potentially triggering unexpected maintenance/compaction behavior (e.g., repeated compaction, excessive work) and creating a denial-of-service vector if an untrusted provider/plugin can influence usage objects.

Vulnerable code:

export function derivePromptTokens(usage?: {
  input?: number;
  cacheRead?: number;
  cacheWrite?: number;
}): number | undefined {
  if (!usage) {
    return undefined;
  }
  const input = usage.input ?? 0;
  const cacheRead = usage.cacheRead ?? 0;
  const cacheWrite = usage.cacheWrite ?? 0;
  const sum = input + cacheRead + cacheWrite;
  return sum > 0 ? sum : undefined;
}

Recommendation

Harden derivePromptTokens() (and/or normalizeUsage()) to treat usage as untrusted:

  • Coerce each field to a finite number
  • Clamp each component to >= 0
  • Optionally cap the final value to a reasonable maximum (e.g., tokenBudget or tokenBudget * k) to prevent pathological provider telemetry from causing excessive maintenance work.

Example fix:

function clampNonNegativeFinite(v: unknown): number {
  return typeof v === "number" && Number.isFinite(v) && v > 0 ? v : 0;
}

export function derivePromptTokens(usage?: {
  input?: unknown;
  cacheRead?: unknown;
  cacheWrite?: unknown;
}, opts?: { max?: number }): number | undefined {
  if (!usage) return undefined;

  const input = clampNonNegativeFinite(usage.input);
  const cacheRead = clampNonNegativeFinite(usage.cacheRead);
  const cacheWrite = clampNonNegativeFinite(usage.cacheWrite);

  let sum = input + cacheRead + cacheWrite;
  if (opts?.max !== undefined) sum = Math.min(sum, opts.max);

  return sum > 0 ? Math.floor(sum) : undefined;
}

If you prefer centralizing sanitation, clamp cacheRead/cacheWrite to >= 0 in normalizeUsage() as well, so other consumers cannot accidentally propagate negative token counts.
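
One possible shape for that centralized clamping, as a sketch only. The real `normalizeUsage()` signature is not shown in this PR, so `sanitizeUsageSketch` and its types are hypothetical and cover just the three fields that `derivePromptTokens()` reads.

```typescript
type UsageFieldSketch = "input" | "cacheRead" | "cacheWrite";
type UsageSketch = Partial<Record<UsageFieldSketch, number>>;

const USAGE_FIELDS_SKETCH: readonly UsageFieldSketch[] = [
  "input",
  "cacheRead",
  "cacheWrite",
];

// Drop non-numeric and non-finite fields; clamp negative counts to zero.
function sanitizeUsageSketch(
  raw: Partial<Record<UsageFieldSketch, unknown>>,
): UsageSketch {
  const clean: UsageSketch = {};
  for (const field of USAGE_FIELDS_SKETCH) {
    const value = raw[field];
    if (typeof value === "number" && Number.isFinite(value)) {
      clean[field] = Math.max(0, value);
    }
  }
  return clean;
}
```

With this approach, downstream consumers such as `derivePromptTokens()` would only ever see finite, non-negative components. An upper cap is left out here because a sensible maximum depends on the budget in scope at the call site.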


Analyzed PR: #66820 at commit 95c5d3c

Last updated on: 2026-04-14T22:21:47Z

@jalehman jalehman merged commit 75e7fc9 into openclaw:main Apr 14, 2026
43 checks passed

Labels

agents (Agent runtime and tooling), maintainer (Maintainer-authored PR), size: S
