feat(server): resume sessions across agent URL query changes #3410

Merged
Sayt-0 merged 1 commit into docker:main from gtardif:allow_agent_switch_continue_session
Jul 2, 2026
@gtardif gtardif commented Jul 2, 2026

Contributor

Problem

Sessions started via docker-agent serve api <url> record the full URL-encoded agent reference as their source key, query parameters included.
When the server is relaunched with a different tag (e.g. a different version of an agent YAML), the exact key recorded by the client no longer exists in Sources. Resuming a stored session then fails:

500  error="failed to run session: agent not found: http..."

It works for the first message only because a live runtime is cached in runtimeSessions (checked before Sources); after a restart that cache is empty, so the lookup must rebuild from Sources and misses.

Note: viewing/listing old sessions was never affected, because those are keyed by session ID in the store. This PR fixes continuing them.

Fix

Add a resume-time fallback in the session manager. loadTeam / loadTeamWithConfig now share a resolveSource helper that:

  1. Prefers an exact key match — all existing behaviour (side-by-side variants, /api/agents, directory mode) is unchanged.
  2. On a miss, matches on a stable identity via a new config.StableSourceKey. For URL references the identity is the URL without its query string and fragment (scheme + host + path); both the query string and the fragment are treated as volatile. This is a deliberate rule rather than an enumerated denylist, so any current or future query parameter is ignored without further code changes.
  3. Only resolves when unambiguous — if several live sources share the requested stable identity, it declines (returns agent not found) rather than guessing, so side-by-side variants are never silently mis-selected.

task build, task lint (0 issues), and the affected packages' tests all pass.

Sessions record the full URL-encoded agent reference as their source
key, including query parameters such as gordonTag. When the API server
is relaunched with a different tag (e.g. v9-light -> v9-dev), the exact
key no longer exists in Sources, so resuming a stored session failed
with "agent not found" (HTTP 500), even though the underlying agent is
the same.

Add a resume-time fallback: resolveSource prefers an exact key match and,
on a miss, matches on a stable identity computed by StableSourceKey. For
URL references the identity is the path (scheme + host + path); the entire
query string and fragment are treated as volatile, so any current or
future query parameter is ignored without an enumerated denylist. The
fallback only resolves when unambiguous, so side-by-side variants are
never silently mis-selected.

Viewing/listing old sessions was already unaffected (keyed by session ID
in the store); this fixes continuing them.
@gtardif gtardif requested a review from a team as a code owner July 2, 2026 09:56

@docker-agent docker-agent left a comment

Contributor

Assessment: 🟢 APPROVE

The session-resumption fallback logic is well-designed and correctly implemented.

What was reviewed:

  • StableSourceKey in pkg/config/resolve.go: URL decoding + query/fragment stripping is symmetric — both stored map keys and client-supplied keys are url.QueryEscape-encoded, so both sides go through the same decode→strip path, producing equal stable keys. Non-URL keys (local files, OCI refs, builtins) fall through unchanged. Error handling on malformed inputs is sound (return key unchanged on any parse failure).
  • resolveSource in pkg/server/session_manager.go: Exact-match-first logic is correct. The ambiguity guard (matches > 1 → decline) prevents silent mis-selection of side-by-side variants. The matches == 0 path also correctly returns agent-not-found.
  • Integration into loadTeam / loadTeamWithConfig: straightforward delegation with unchanged error propagation.

No high or medium severity bugs were found in the changed code.

@aheritier aheritier added labels Jul 2, 2026: area/config (configuration parsing, YAML, environment variables), area/sessions (session lifecycle: resume, persistence, export), kind/feat (PR adds a new feature)
@Sayt-0 Sayt-0 merged commit 60a6500 into docker:main Jul 2, 2026
9 of 10 checks passed