
feat: add Agent.keepAlive() to prevent idle eviction#1029

Merged
threepointone merged 2 commits into main from keepalive
Mar 1, 2026

@threepointone

Summary

Add a public keepAlive() method to the base Agent class that prevents Durable Object eviction during long-running work. The method creates a 30-second heartbeat via the existing scheduling system and returns a disposer function to stop it.

Problem

Durable Objects are evicted after ~70–140 seconds of inactivity. During long-running operations — streaming LLM responses, waiting on tool calls, running multi-step workflows — the DO can be evicted mid-flight, losing in-progress work. Users currently have no built-in way to prevent this.

Solution

const dispose = await this.keepAlive();
try {
  // ... long-running work that must not be interrupted ...
} finally {
  dispose();
}

keepAlive() uses scheduleEvery() to create an interval schedule with an internal _cf_keepAliveHeartbeat callback. The alarm fires every 30 seconds, which is well within the eviction window. When the work is done, calling the disposer cancels the schedule.
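To make the mechanism concrete, here is a minimal in-memory sketch of the shape described above. It is not the code in packages/agents/src/index.ts: `SketchAgent`, the `IntervalSchedule` type, and the signatures of the `scheduleEvery()` and `cancelSchedule()` stand-ins are assumptions made for illustration, and a plain `Map` stands in for the SQLite-backed schedule table.

```typescript
// Hypothetical in-memory model of keepAlive(); NOT the Agents SDK implementation.
const KEEP_ALIVE_INTERVAL_MS = 30_000; // 30 seconds, as stated in this PR

interface IntervalSchedule {
  id: string;
  callback: string;
  intervalMs: number;
}

class SketchAgent {
  // Stand-in for the persisted schedule table.
  private schedules = new Map<string, IntervalSchedule>();
  private nextId = 0;

  // Stand-in for scheduleEvery(); the real signature may differ.
  async scheduleEvery(intervalMs: number, callback: string): Promise<IntervalSchedule> {
    const schedule: IntervalSchedule = { id: `sched-${this.nextId++}`, callback, intervalMs };
    this.schedules.set(schedule.id, schedule);
    return schedule;
  }

  // Stand-in for cancelSchedule(); resolves to whether a schedule was removed.
  async cancelSchedule(id: string): Promise<boolean> {
    return this.schedules.delete(id);
  }

  getSchedules(): IntervalSchedule[] {
    return Array.from(this.schedules.values());
  }

  // Creates one heartbeat schedule per call and returns an idempotent disposer.
  async keepAlive(): Promise<() => void> {
    const schedule = await this.scheduleEvery(KEEP_ALIVE_INTERVAL_MS, "_cf_keepAliveHeartbeat");
    let disposed = false;
    return () => {
      if (disposed) return; // a second call is a no-op
      disposed = true;
      void this.cancelSchedule(schedule.id); // fire-and-forget cleanup
    };
  }
}
```

In this model, two `keepAlive()` calls produce two independent schedules, and disposing one leaves the other running.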

Why use the scheduling system?

  • Survives hibernation — the schedule row persists in SQLite, so the heartbeat resumes even after a cold start
  • No alarm conflicts — the scheduling system already multiplexes multiple schedules through a single DO alarm slot. A raw setAlarm() approach would conflict with user-created schedules
  • Observability — heartbeats show up in getSchedules() and emit standard schedule:execute events
  • Multiple callers — each keepAlive() call gets its own schedule ID, so concurrent long-running operations don't interfere with each other
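The "no alarm conflicts" point can be illustrated with a small model of multiplexing many schedules through one alarm slot. This is a sketch of the general technique only; `AlarmMultiplexer` and its methods are hypothetical names, and the SDK's actual scheduler may be structured differently.

```typescript
// Hypothetical model: many interval schedules sharing a single alarm slot.
interface MuxSchedule {
  id: string;
  intervalMs: number;
  nextRunAt: number; // absolute time in milliseconds
}

class AlarmMultiplexer {
  private schedules = new Map<string, MuxSchedule>();

  // The single alarm slot: always the earliest pending run time, or null if idle.
  alarmAt: number | null = null;

  add(id: string, intervalMs: number, now: number): void {
    this.schedules.set(id, { id, intervalMs, nextRunAt: now + intervalMs });
    this.rearm();
  }

  remove(id: string): void {
    this.schedules.delete(id);
    this.rearm();
  }

  // Called when the alarm fires: run every due schedule, then re-arm once.
  fire(now: number): string[] {
    const ran: string[] = [];
    this.schedules.forEach((s) => {
      if (s.nextRunAt <= now) {
        ran.push(s.id);
        s.nextRunAt = now + s.intervalMs;
      }
    });
    this.rearm();
    return ran;
  }

  private rearm(): void {
    const times = Array.from(this.schedules.values(), (s) => s.nextRunAt);
    this.alarmAt = times.length > 0 ? Math.min(...times) : null;
  }
}
```

Because every schedule goes through `rearm()`, adding a heartbeat never overwrites a user schedule's wake-up time, which is the failure mode a raw setAlarm() call would risk.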

AIChatAgent integration

AIChatAgent._reply() now automatically wraps every LLM stream with keepAlive(). The heartbeat starts before streaming begins and is disposed in a .finally() block, so it cleans up on both success and error. This means every AI chat response is now protected from idle eviction by default — no user code changes needed.
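The acquire-then-always-dispose pattern can be sketched as a small wrapper. This is illustrative only: `withKeepAlive` and the `KeepAliveCapable` interface are hypothetical names, not part of the SDK, and the sketch uses try/finally where _reply() is described as using .finally(); both release the heartbeat on success and on error.

```typescript
// Hypothetical helper; the only assumption about the agent is the keepAlive()
// shape described in this PR (a promise resolving to a disposer function).
type Disposer = () => void;

interface KeepAliveCapable {
  keepAlive(): Promise<Disposer>;
}

async function withKeepAlive<T>(
  agent: KeepAliveCapable,
  work: () => Promise<T>
): Promise<T> {
  // Start the heartbeat before the long-running work begins.
  const dispose = await agent.keepAlive();
  try {
    return await work();
  } finally {
    // Runs whether work() resolved or threw.
    dispose();
  }
}
```

Inside an agent method, a call such as `await withKeepAlive(this, () => streamReply())` would cover the whole stream, where `streamReply` is a placeholder for whatever produces the response.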

Changes

| File | What changed |
| --- | --- |
| packages/agents/src/index.ts | Added KEEP_ALIVE_INTERVAL_MS (30s), keepAlive() method, _cf_keepAliveHeartbeat() no-op callback |
| packages/ai-chat/src/index.ts | _reply() acquires keepAlive() before streaming, disposes in .finally() |
| packages/agents/src/experimental/forever.ts | Removed duplicated keepAlive(); it is now inherited from Agent. Overrides _cf_keepAliveHeartbeat for internal recovery logic. Removed standalone keepAlive export (experimental, unstable API) |
| packages/agents/src/tests/keep-alive.test.ts | 4 new tests on a plain Agent (no mixins) |
| packages/agents/src/tests/agents/keep-alive.ts | Test agent for keepAlive |
| packages/agents/src/tests/agents/fiber.ts | Updated SQL query to match renamed callback |
| Test infra (worker.ts, wrangler.jsonc, index.ts) | Registered TestKeepAliveAgent |

Notes for reviewers

  • @experimental tag: keepAlive() is marked experimental. The API surface is small (returns a disposer), so it's unlikely to change much, but the tag gives us room.
  • 30-second interval — well within the ~70–140s eviction window, with enough margin to tolerate a missed tick. Not so frequent that the SQLite overhead matters.
  • The disposer is idempotent — calling it multiple times is safe (guarded by a disposed flag). The void prefix on cancelSchedule is intentional — fire-and-forget cleanup.
  • _cf_keepAliveHeartbeat is a no-op on the base Agent — subclasses can override it to add work on each heartbeat tick (the experimental module does this for recovery logic).
  • ai-chat integration is minimal — 3 lines added to _reply(). The .finally() ensures cleanup even if streaming errors out.
  • No new dependencies, no new exports: keepAlive() is a method on Agent, not a separate import.
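The margin claim for the 30-second interval can be checked with simple arithmetic. The sketch below assumes the idle timer restarts on every heartbeat and takes 70 seconds, the lower end of the approximate ~70–140s range quoted above, as the worst case; the function names are made up for illustration.

```typescript
// Back-of-the-envelope check; 70s is the lower bound of the approximate range above.
const KEEP_ALIVE_INTERVAL_MS = 30_000;
const MIN_EVICTION_WINDOW_MS = 70_000;

// Longest stretch without a heartbeat if `missedTicks` consecutive ticks are lost.
function worstCaseGapMs(intervalMs: number, missedTicks: number): number {
  return (missedTicks + 1) * intervalMs;
}

// True when that gap still fits inside the eviction window.
function survivesMissedTicks(
  intervalMs: number,
  windowMs: number,
  missedTicks: number
): boolean {
  return worstCaseGapMs(intervalMs, missedTicks) < windowMs;
}
```

With these numbers, one missed tick leaves a 60-second gap, which is inside the 70-second worst case, while two consecutive missed ticks leave a 90-second gap, which is not.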

Testing

  • 4 new tests covering: schedule creation (verifies callback name, type, interval), disposal, idempotent disposal, multiple concurrent calls
  • All 290+ existing tests pass (including existing scheduling and alarm tests)
  • Build passes for all packages

Introduce an experimental Agent.keepAlive API that schedules a 30s alarm heartbeat to keep Durable Objects alive and returns a disposer to cancel it. Add an internal no-op callback _cf_keepAliveHeartbeat and switch fiber-related cleanup and tests to use the new callback name. Remove the standalone keepAlive implementation from the experimental forever mixin and centralize the API on Agent. Update AIChatAgent to call keepAlive() during streaming to prevent idle eviction. Add TestKeepAliveAgent, tests (keep-alive.test.ts), and test worker/wrangler entries to validate schedule creation, disposal, idempotence, and concurrent calls. Includes a changeset describing the minor/patch bump.

changeset-bot bot commented Mar 1, 2026

🦋 Changeset detected

Latest commit: 1639dab

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| agents | Minor |
| @cloudflare/ai-chat | Patch |



pkg-pr-new bot commented Mar 1, 2026

npm i https://pkg.pr.new/agents@1029
npm i https://pkg.pr.new/@cloudflare/ai-chat@1029
npm i https://pkg.pr.new/@cloudflare/codemode@1029
npm i https://pkg.pr.new/hono-agents@1029

commit: a08d5d0

Add keepAlive to the Agent/AIChatAgent type picks and remove the local keepAlive noop/override in ai-chat. The ai-chat DurableChatAgent no longer imports or overrides keepAlive/_cf_streamKeepAlive and instead relies on the base Agent implementation, removing duplicate code and keeping types consistent.
@threepointone threepointone merged commit c898308 into main Mar 1, 2026
3 checks passed
@threepointone threepointone deleted the keepalive branch March 1, 2026 10:56
@github-actions github-actions bot mentioned this pull request Mar 1, 2026