feat(gastown): persist Mayor conversation across container restarts by jrf0110 · Pull Request #1494 · Kilo-Org/cloud

jrf0110 · 2026-03-24T18:56:36Z

Summary

Implements Tier 1 of #1236: when the container restarts, the Mayor now recovers its prior conversation history from AgentDO's persisted streaming events instead of starting a blank session.

Changes:

- Fix checkpoint propagation: sendMayorMessage and ensureMayor now pass agents.readCheckpoint() instead of checkpoint: null, so any checkpoint data survives restarts
- Conversation reconstruction (conversation.ts): New module that reassembles {role, content} turns from message_part.updated and message.updated events in rig_agent_events. Groups streaming deltas by part ID (keeping the latest), resolves message roles, and orders chronologically
- Prompt injection: The reconstructed transcript is formatted with <prior-conversation> XML tags and prepended to the agent's initial prompt via buildPrompt(), giving the Mayor semantic continuity
- Context window management: Truncates to the last 50 turns with a 40k character budget (~10k tokens, well within 20% of a 200k-token context window)
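
As an illustration of the reconstruction step described above, here is a minimal sketch. The event shape (`AgentEvent`, with `seq`, `messageId`, `partId`, `partType`, `text`) is hypothetical, since the real rig_agent_events schema is not shown in this PR text. It also assumes each message_part.updated event carries the full text of the part so far, and that unknown roles default to assistant.

```typescript
// Hypothetical event shape; field names are assumptions for illustration only.
type AgentEvent = {
  seq: number; // insertion order (assumed monotonically increasing)
  type: "message.updated" | "message_part.updated";
  messageId: string;
  role?: "user" | "assistant"; // assumed to be carried by message.updated
  partId?: string; // assumed to be carried by message_part.updated
  partType?: string; // e.g. "text", "tool", "reasoning"
  text?: string; // assumed snapshot of the part so far
};

type Turn = { role: "user" | "assistant"; content: string };

function reconstructConversation(events: AgentEvent[]): Turn[] {
  const roles = new Map<string, Turn["role"]>();
  const parts = new Map<string, { messageId: string; firstSeq: number; text: string }>();

  // Process events chronologically so the last write per part ID wins.
  for (const ev of [...events].sort((a, b) => a.seq - b.seq)) {
    if (ev.type === "message.updated") {
      if (ev.role) roles.set(ev.messageId, ev.role);
      continue;
    }
    // Lossy by design: only text parts are kept.
    if (ev.partType !== "text" || !ev.partId) continue;
    const prev = parts.get(ev.partId);
    parts.set(ev.partId, {
      messageId: ev.messageId,
      firstSeq: prev?.firstSeq ?? ev.seq,
      text: ev.text ?? "",
    });
  }

  // Group the surviving parts by message, ordered by first appearance.
  const byMessage = new Map<string, string[]>();
  for (const part of [...parts.values()].sort((a, b) => a.firstSeq - b.firstSeq)) {
    const texts = byMessage.get(part.messageId) ?? [];
    texts.push(part.text);
    byMessage.set(part.messageId, texts);
  }

  const turns: Turn[] = [];
  for (const [messageId, texts] of byMessage) {
    const content = texts.join("\n").trim();
    // Role inference fallback (an assumption): unknown roles are treated as assistant.
    if (content) turns.push({ role: roles.get(messageId) ?? "assistant", content });
  }
  return turns;
}
```

Keying the map by part ID is what collapses a stream of updates for the same part into a single entry, so only the latest text for each part reaches the transcript.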

This does not address Tier 1.5 (SIGTERM drain) or Tier 2 (DO-backed SDK persistence) — those are follow-up work.
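
The truncation rule from the summary (last 50 turns, 40k characters) could look roughly like the sketch below. The function name `truncateTurns` is hypothetical, and dropping whole turns from the oldest end is an assumption; the actual module may trim differently. The ~10k token figure follows from the common rule of thumb of about 4 characters per token, which is also an assumption here.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

const MAX_TURNS = 50; // from the PR summary
const MAX_CHARS = 40_000; // from the PR summary (~10k tokens at ~4 chars per token)

// Hypothetical helper: keep the newest turns that fit both limits.
function truncateTurns(turns: Turn[], maxTurns = MAX_TURNS, maxChars = MAX_CHARS): Turn[] {
  const recent = turns.slice(-maxTurns);
  const kept: Turn[] = [];
  let used = 0;
  // Walk from newest to oldest so the most recent context survives.
  for (const turn of [...recent].reverse()) {
    if (used + turn.content.length > maxChars) break;
    used += turn.content.length;
    kept.unshift(turn);
  }
  return kept;
}
```

Walking backwards from the newest turn means that, when the budget is exhausted, it is the oldest context that gets dropped.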

Verification

pnpm vitest run test/unit/conversation.test.ts — 19 tests pass (reconstruction, streaming delta dedup, role inference, legacy event types, truncation, prompt formatting)

Visual Changes

N/A

Reviewer Notes

- The reconstruction is intentionally lossy: it extracts text-type parts only (no tool calls, reasoning, etc.). The goal is semantic continuity, not byte-level replay. Tool state and reasoning are less valuable for continuation context and would bloat the prompt.
- User messages often lack message_part.updated events (the user's text comes from the prompt/beadTitle, not the SDK event stream). The reconstruction includes them only when text parts exist; otherwise the assistant's replies provide sufficient context.
- The buildPrompt change places conversation history before the bead title so the LLM sees prior context first, then the new instruction.
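
To make the ordering in the last note concrete, here is a small sketch. The `PromptParts` type and the `buildPromptSketch` name are hypothetical; the real buildPrompt signature is not shown in this PR text.

```typescript
// Hypothetical input shape; only the ordering is taken from the reviewer note.
type PromptParts = {
  priorConversation?: string; // already wrapped in <prior-conversation> tags
  beadTitle: string;
  beadBody?: string;
};

function buildPromptSketch(parts: PromptParts): string {
  // History first, then the new instruction, skipping empty sections.
  return [parts.priorConversation, parts.beadTitle, parts.beadBody]
    .filter((s): s is string => typeof s === "string" && s.length > 0)
    .join("\n\n");
}
```

When no history was reconstructed, the prompt reduces to the bead title and body alone, so a first-ever dispatch is unaffected.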

cloudflare-gastown/src/dos/Town.do.ts

cloudflare-gastown/src/dos/town/conversation.ts

kilo-code-bot · 2026-03-24T19:05:24Z

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (4 files)

cloudflare-gastown/src/dos/Agent.do.ts
cloudflare-gastown/src/dos/Town.do.ts
cloudflare-gastown/src/dos/town/conversation.ts
cloudflare-gastown/test/unit/conversation.test.ts

Reviewed by gpt-5.4-20260305 · 452,703 tokens

…ia AgentDO event reconstruction

Implement Tier 1 of #1236: reconstruct conversation history from persisted AgentDO streaming events and inject it into the Mayor's prompt on re-dispatch after container restart.

- Fix checkpoint propagation: sendMayorMessage and ensureMayor now read the Mayor's checkpoint instead of passing null
- Add conversation.ts module that reconstructs {role, content} turns from message_part.updated and message.updated events stored in rig_agent_events
- Inject formatted transcript into buildPrompt as prior-conversation context
- Truncate to last 50 turns with a 40k character budget (~10k tokens)
- Add 19 unit tests covering reconstruction, streaming delta deduplication, role inference, legacy event types, truncation, and prompt formatting

…g tags

Address PR review comments:

- sendMayorMessage cold-start path now uses combinedMessage (with system-reminder UI context) instead of raw message
- Escape </prior-conversation> in turn content to prevent XML injection from prior assistant output breaking the wrapper format

cloudflare-gastown/src/dos/town/conversation.ts

cloudflare-gastown/src/dos/Town.do.ts

…transcript

Address PR review comments:

- Move reconstructConversation into AgentDO so the event reduction runs in the agent's own DO instead of burdening the TownDO with fetching and processing thousands of events
- Switch from User:/Assistant: line protocol to JSON serialization inside the <prior-conversation> wrapper, eliminating both closing-tag and fake-turn injection vectors
- Escape </ in JSON payload to prevent literal closing tags
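
The JSON serialization with `</` escaping described in this commit can be sketched as follows. The function name `formatPriorConversation` is hypothetical. The sketch relies on the fact that `\/` is a valid escape for `/` inside a JSON string, so the payload still parses back to the original turns.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Hypothetical formatter: serialize turns as JSON inside the wrapper tags.
function formatPriorConversation(turns: Turn[]): string {
  if (turns.length === 0) return "";
  // Rewrite every "</" as "<\/" so no literal closing tag can appear in the payload.
  const json = JSON.stringify(turns).replace(/<\//g, "<\\/");
  return `<prior-conversation>\n${json}\n</prior-conversation>`;
}
```

Because turn boundaries are carried by JSON structure rather than by `User:` or `Assistant:` line prefixes, text inside a turn that imitates such a prefix stays inside its string and cannot start a new turn.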

Reconstruct conversation history from AgentDO events during model hot-swap, using the same mechanism as container restarts (PR #1494). The TownDO reconstructs the transcript and passes it through the PATCH /agents/:id/model endpoint to the container, where it is prepended to the startup prompt so the mayor retains context.

* feat(gastown): add role_models field to TownConfig schema and admin router
Add optional role_models field with per-role (mayor, refinery, polecat) model overrides to TownConfigSchema, TownConfigUpdateSchema, and the admin router's TownConfigRecord mirror. Backward compatible: existing towns without role_models continue to parse correctly. Refs: #1512

* feat(gastown): implement per-role model resolution in resolveModel()
Check role_models[role] before falling back to default_model. Priority: role_models[role] → default_model → hardcoded fallback. Rename _role to role now that the parameter is used.

* fix(gastown): remove 'as' cast from resolveModel, use widened type annotation
Replace unused _role parameter with active role-based lookup using Record<string, string | undefined> type annotation instead of 'as' cast. This is functionally identical but satisfies the coding style rule against TypeScript 'as' operator.

* feat(gastown): use per-role model resolution for mayor hot-swap
Update updateTownConfig to compare the mayor's effective model (resolved via role_models.mayor → default_model → fallback) instead of only comparing default_model. This ensures mayor session restarts when role_models.mayor is added, changed, or removed.

* feat(gastown): add per-role model selectors and max polecats slider to town settings
- Add per-role model overrides (mayor, refinery, polecat) in accordion UI
- Each role selector has 'Use default' placeholder and clear (X) button
- Save logic sends role_models with empty strings as undefined (fallback)
- Page reloads when the mayor's effective model changes
- Replace max polecats input with slider (1-50, 100% width)
- Update tRPC type declarations with role_models field

* fix(gastown): align slider max with Zod schema and regenerate trpc dist
- Update max_polecats_per_rig Zod validation from .max(20) to .max(50) in both TownConfigSchema and TownConfigUpdateSchema to match the UI slider range (1-50) per the feature spec
- Regenerate packages/trpc/dist/index.d.ts from feature branch source, removing previously included unrelated changes (Discord, billing promo, etc.) that were artifacts of a stale build

* test(gastown): add resolveModel per-role model resolution tests
Cover the full resolution priority chain (role override → default_model → hardcoded fallback), all eight specified test cases, and backward compatibility with legacy TownConfig objects that lack role_models.

* fix(gastown): pass plugin env vars during mayor model hot-swap
updateAgentModel was calling ensureSDKServer with an empty env dict, so the gastown plugin could not identify itself as a mayor agent and registered zero tools. Reconstruct the required GASTOWN_* env vars from the ManagedAgent record so the plugin initializes identically to the initial dispatch. Also fix accordion chevron direction: use ChevronRight rotating to 90° (down) on open, instead of ChevronDown rotating to 180° (up).

* fix(gastown): replay full startup env during model hot-swap
Instead of manually reconstructing a subset of env vars for the gastown plugin, store the complete buildAgentEnv dict on ManagedAgent at initial dispatch and replay it during model hot-swap. This preserves GIT_AUTHOR_*, GIT_COMMITTER_*, KILOCODE_TOKEN, GH_TOKEN, and all other env vars the SDK server needs. KILO_CONFIG_CONTENT and OPENCODE_CONFIG_CONTENT are excluded from the replay since updateAgentModel already rebuilds them with the new model.

* fix(gastown): use live container token during model hot-swap
GASTOWN_CONTAINER_TOKEN rotates via /refresh-token after initial dispatch. Prefer the current process.env value over the stale startupEnv snapshot when building the hot-swap env dict.

* feat(gastown): preserve conversation history across mayor model changes
Reconstruct conversation history from AgentDO events during model hot-swap, using the same mechanism as container restarts (PR #1494). The TownDO reconstructs the transcript and passes it through the PATCH /agents/:id/model endpoint to the container, where it is prepended to the startup prompt so the mayor retains context.

* fix(gastown): use live config env vars and re-derive GH_TOKEN during hot-swap
syncConfigToContainer updates process.env when settings change, but the model hot-swap was replaying the stale startupEnv snapshot. Now prefer the live process.env for all vars that syncConfigToContainer can update at runtime (GIT_TOKEN, GITHUB_CLI_PAT, GITLAB_TOKEN, git identity vars, etc.). Also re-derive GH_TOKEN from the live GITHUB_CLI_PAT > GIT_TOKEN > GITHUB_TOKEN priority chain, matching buildAgentEnv's logic. This fixes gh CLI auth loss after a model change when the user has a GitHub CLI PAT configured.

* fix(gastown): allow clearing settings values like GitHub CLI PAT
The UI was omitting empty fields from the config update, so the server merge logic preserved the old value. Now send empty strings for clearable fields (github_cli_pat, git_auth tokens, gitlab URL, default_model, git identity) so the server correctly clears them. Also add mask-preservation for github_cli_pat on the server side, matching the existing pattern for git_auth tokens and env_vars.

* fix(gastown): restart mayor SDK server when auth config changes
The kilo serve child process captures process.env at spawn time, so clearing or changing the GitHub CLI PAT, git tokens, etc. only takes effect after an SDK server restart. Now detect auth-relevant config changes (github_cli_pat, github_token, gitlab_token) and trigger updateMayorModel to restart the SDK server, even when the model itself hasn't changed. The hot-swap path re-derives GH_TOKEN from the live process.env, so the new kilo serve process gets the correct fallback to the integration token.

* fix(gastown): sync fresh config into container process.env before SDK restart
syncConfigToContainer updates the TownContainer DO's stored env vars, but those only take effect on the next container boot; the running container's process.env is not updated. When the SDK server restarts for auth config changes, it was still reading stale process.env. Now the PATCH /agents/:id/model endpoint receives fresh town config via X-Town-Config header and syncs it into process.env before the SDK server restart. This ensures the new kilo serve child process inherits the correct GITHUB_CLI_PAT, GIT_TOKEN, git identity, etc.

* fix(gastown): auto-reload page when auth credentials change
The server restarts the mayor's SDK server when auth config changes, creating a new session. But the UI only reloaded for model changes, leaving the frontend connected to the stale session. Now detect changes to github_cli_pat, github_token, and gitlab_token and trigger the same delayed page reload.

* fix(gastown): address review comments: stale GH_TOKEN, blank model, gitlab URL
- Delete GH_TOKEN from hotSwapEnv when all auth sources are cleared, preventing stale credentials from surviving auth removal
- Revert default_model to conditional spread so empty string doesn't override resolveModel()'s hardcoded fallback
- Include gitlab_instance_url in the auth change detection on both the server (tRPC router) and client (settings page reload) so switching GitLab hosts triggers an SDK server restart and page reload

* fix(gastown): allow clearing default_model back to hardcoded fallback
Send default_model: '' from the UI when the selector is blank, and normalize it to undefined server-side in updateTownConfig. This allows resolveModel()'s nullish-coalescing fallback to kick in, restoring the hardcoded default. Previously, omitting the field preserved the old value, and sending '' bypassed the ?? fallback.
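
The resolution order named in these commits (role_models[role] → default_model → hardcoded fallback) can be illustrated with a short sketch. The `TownConfig` shape below is reduced to the two fields involved, the `FALLBACK_MODEL` value is a placeholder, and the function signature is an assumption; only the priority chain, the `Record<string, string | undefined>` annotation, and the use of `??` come from the commit messages.

```typescript
// Reduced, hypothetical config shape for illustration.
type TownConfig = {
  default_model?: string;
  role_models?: { mayor?: string; refinery?: string; polecat?: string };
};

const FALLBACK_MODEL = "example/fallback-model"; // placeholder, not the real default

function resolveModel(config: TownConfig, role: string): string {
  // Widened annotation instead of an 'as' cast, as described in the commits.
  const roleModels: Record<string, string | undefined> = config.role_models ?? {};
  return roleModels[role] ?? config.default_model ?? FALLBACK_MODEL;
}
```

Since `??` only falls through on null or undefined, an empty string would be returned as a model name. That matches the last commit's point that '' has to be normalized to undefined before resolution.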

jrf0110 self-assigned this Mar 24, 2026

kilo-code-bot (bot) reviewed Mar 24, 2026

cloudflare-gastown/src/dos/Town.do.ts (outdated)

cloudflare-gastown/src/dos/town/conversation.ts (outdated)

jrf0110 force-pushed the 1236-persist-agent-conversation branch from f76d0f5 to d56bc03 March 24, 2026 19:07

kilo-code-bot (bot) reviewed Mar 24, 2026

cloudflare-gastown/src/dos/town/conversation.ts (outdated)

pandemicsyn approved these changes Mar 25, 2026

jrf0110 commented Mar 25, 2026

cloudflare-gastown/src/dos/Town.do.ts (outdated)

jrf0110 merged commit 258408f into main Mar 25, 2026
19 checks passed

jrf0110 deleted the 1236-persist-agent-conversation branch March 25, 2026 03:00

jrf0110 merged 3 commits into main from 1236-persist-agent-conversation

2 participants
