
feat: unload local models when switching agents #2660

Closed
dgageot wants to merge 2 commits into docker:main from dgageot:unload-on-switch

Conversation

@dgageot (Member) commented May 6, 2026

Closes #2636.

What

Adds an opt-in mechanism for local inference engines (DMR, ollama, ramalama, ...) to release a model from memory when the active agent switches to one that uses a different model.

User-facing surface

Two new pieces of config (latest only — older versions are frozen):

  • `provider_opts.unload_on_switch: true` on a model — the runtime calls the engine's unload endpoint when switching away from any agent using this model.
  • `unload_api` on a `ProviderConfig` — the endpoint the runtime hits with `POST {scheme://host}{unload_api}` and a body of the form `{"model": "…"}` carrying the model name.

For DMR specifically, the default `/_unload` endpoint is auto-derived from the OpenAI base URL (mirroring how `_configure` is derived), so DMR users don't need to set `unload_api` unless they want to override it.

```yaml
providers:
  local_dmr:
    provider: dmr
    base_url: http://model-runner.docker.internal/engines/llama.cpp/v1
    unload_api: /engines/_unload # optional for DMR; defaults to /_unload

models:
  qwen:
    provider: local_dmr
    model: ai/qwen3
    provider_opts:
      unload_on_switch: true
```

A runnable example lives at `examples/unload_on_switch.yaml`.

Wiring

The unload runs at every agent-switch entry point:

  • `swapCurrentAgent` (transfer_task forward + return)
  • `handleHandoff` (handoff)
  • `SetCurrentAgent` (TUI agent picker — async, see below)

Best-effort: each call is wrapped in a 10s timeout, providers that don't implement `Unloader` are silently skipped, and any error is logged but never propagated, so a slow or unreachable engine cannot break agent switching.

Architecture

  • `latest.ProviderConfig.UnloadAPI` — new field; merged into `ModelConfig.ProviderOpts["unload_api"]` so provider implementations don't need a back-reference to the parent config.
  • `latest.ModelConfig.UnloadOnSwitch()` / `UnloadAPI()` — small accessor methods next to the existing `DisplayOrModel()`.
  • `provider.Unloader` — optional interface in `pkg/model/provider/provider.go` (sibling of `RerankingProvider`).
  • `base.PostUnloadModel` / `base.JoinHostAndPath` — shared HTTP + URL helpers in `pkg/model/provider/base`. DMR and OpenAI both already import `base`, so no new package and no import cycle.
  • `dmr.Client.Unload` — DMR-specific URL derivation (`/v1` → `/_unload`) plus the shared HTTP helper.
  • `openai.Client.Unload` — no-op when no `unload_api` is configured (cloud providers don't have one).
  • `runtime.LocalRuntime.unloadOnSwitch` — iterates the previous agent's configured models and calls `Unload` on opted-in ones.

TUI freeze fix

`SetCurrentAgent` is called from the bubbletea Update loop, so a synchronous unload would freeze the TUI for up to 10s. The runtime spawns the unload in a goroutine for that path only. Fire-and-forget is safe there because the new agent's model isn't loaded until the user sends a message, by which time the unload has almost certainly finished. The other two switch paths (`swapCurrentAgent`, `handleHandoff`) stay synchronous because a new model is loaded immediately on those paths and we want the unload to complete first.

Tests

  • `pkg/config/latest/unload_test.go` — `UnloadOnSwitch` / `UnloadAPI` accessors.
  • `pkg/model/provider/base/unload_test.go` — URL resolution + HTTP behaviour (happy path, non-2xx, nil client fallback).
  • `pkg/model/provider/dmr/unload_test.go` — default `_unload` derivation, custom path override, error propagation, no-op when nothing is configured.
  • `pkg/model/provider/openai/unload_test.go` — no-op without `unload_api`, happy path, error path.
  • `pkg/model/provider/provider_defaults_test.go` — three new cases for the `UnloadAPI` plumbing.
  • `pkg/runtime/unload_test.go` — opt-in respected, no-op when not opted in or same agent or nil prev, errors don't propagate, providers without `Unloader` are skipped, and `TestSetCurrentAgent_UnloadIsAsync` (uses a blocking unloader) asserts the picker returns before `Unload` completes. Runs cleanly under `-race`.

Validation

  • `mise build` ✓
  • `mise lint` ✓ (`golangci-lint` and the in-tree `./lint` cop both clean)
  • `mise test` ✓

dgageot added 2 commits May 6, 2026 11:42
- SetCurrentAgent runs Unload in a goroutine so the TUI agent picker,
  which calls it from the bubbletea Update loop, isn't frozen while
  the engine acknowledges the unload (up to 10s).
- unloadOnSwitch now skips nil providers defensively.
- New TestSetCurrentAgent_UnloadIsAsync asserts the picker returns
  before Unload completes.
@dgageot dgageot requested a review from a team as a code owner May 6, 2026 12:05
@rumpl (Member) commented May 6, 2026

Why isn't this implemented as a hook?

@dgageot (Member, Author) commented May 6, 2026

@rumpl I'll give it a try

@aheritier (Contributor) left a comment

Overall LGTM — clean architecture, safe defaults, and comprehensive tests. Left a few inline nits below.

```go
		return err
	}
	return base.PostUnloadModel(ctx, httpclient.NewHTTPClient(ctx), endpoint, c.ModelConfig.Model)
}
```

This creates a fresh *http.Client on every Unload call, which means any http_headers configured in provider_opts won't be forwarded to the unload endpoint. For the current use-case (unauthenticated local engines) this is fine, but it's asymmetric with the DMR path that reuses c.httpClient.

Worth a short comment so future maintainers don't wonder why custom headers aren't forwarded here:

```go
// httpclient.NewHTTPClient is used instead of reusing the SDK client because
// the openai.Client wraps its transport and there's no clean way to extract a
// raw *http.Client from it. For local engines (the only use-case for unload_api)
// this is fine — they typically don't require auth headers.
```

Comment thread pkg/runtime/unload.go
```go
unloader, ok := m.(provider.Unloader)
if !ok {
	continue
}
```

Minor: cancel() is called after Unload returns, which is correct, but defer cancel() would be the idiomatic Go pattern and would also handle a potential panic inside Unload without leaking the timer goroutine. Up to you — the current form is functionally fine since Unload implementations don't panic.

Comment thread pkg/runtime/unload.go
```go
		return
	}
	for _, m := range prev.ConfiguredModels() {
		if m == nil {
```

Nit: FallbackModels() isn't iterated here. That's probably intentional — a fallback model may never have been loaded — but worth a one-liner comment to make the deliberate choice clear:

```go
// Only ConfiguredModels are considered; FallbackModels are skipped because
// they may never have been loaded and unloading an absent model is harmless
// but wasteful.
```

Comment thread pkg/runtime/runtime.go
```go
	go r.unloadOnSwitch(context.Background(), prev, next)
	return nil
}
```


If the TUI agent picker fires two rapid switches (A→B→C), two goroutines can be in flight simultaneously: one unloading A, one unloading B. If A and C happen to share a model, that model gets unloaded just before C needs it, causing an extra reload. Not a correctness issue given the best-effort semantics, but worth a comment so it's an acknowledged trade-off rather than an oversight.

@aheritier added labels on May 6, 2026: kind/feat (PR adds a new feature, maps to the feat: commit prefix), area/providers/docker-model-runner (Docker Model Runner local inference), area/providers (LLM providers: Bedrock, LiteLLM, Qwen, custom, etc.), priority:medium (normal priority, standard sprint work)
@aheritier (Contributor) left a comment

LGTM. Clean architecture: optional Unloader interface, shared HTTP/URL helpers in base, best-effort semantics with 10s timeout, the async path for the TUI picker is a good catch and well-justified.

The four inline nits I left earlier are all non-blocking — feel free to address them in a follow-up if you prefer:

  • defer cancel() in unloadOnSwitch (idiom)
  • comment about ConfiguredModels() only, skipping FallbackModels()
  • comment about the rapid A→B→C switch race
  • comment about why the OpenAI path can't reuse the SDK client

CI all green, comprehensive test coverage including -race.

@dgageot (Member, Author) commented May 7, 2026

@rumpl do you prefer this one? #2684

@dgageot (Member, Author) commented May 7, 2026

Closing in favour of #2684

@dgageot dgageot closed this May 7, 2026


Development

Successfully merging this pull request may close these issues.

Unload Model from Memory on Agent Switch for Local Providers

3 participants