Compose shellout: missing --no-recreate flag destroys container state across docker daemon restarts #71

## Summary

The shellout compose backend (`runtime/docker/compose.go` → `docker compose up -d`) does not pass `--no-recreate`, which causes docker compose to **destroy and recreate** primary-service containers whenever it detects config drift — even when the caller passed `Recreate=false` and an existing container is present.

The container's writable layer (and anything in `$HOME` inside the container, e.g. `~/.claude/projects/<encoded>/<id>.jsonl`) is lost as a result.

Upstream [`devcontainers/cli`](https://github.com/devcontainers/cli/blob/main/src/spec-node/dockerCompose.ts) gates `--no-recreate` on whether a container already exists:

```ts
const args = ['--project-name', projectName, ...composeGlobalArgs];
args.push('up', '-d');
if (container || params.expectExistingContainer) {
    args.push('--no-recreate');
}
```

Our shellout path (`runtime/docker/compose.go:75`) builds only `up -d <services>` — never `--no-recreate`. The lib's outer code (`up.go:189-206`) already knows whether `existing != nil`, but never threads that signal into the compose argv.

## How we hit it

DAP workspaces are k8s pods with the docker data-root persisted on a PVC (`/workspace/docker`). When a session's pod is destroyed (idle timeout, deploy, eviction) and a new pod boots from the same PVC, dockerd restores the prior containers from the PVC and the runtime calls `Engine.Up` with `Recreate=false` (session resume). Expected behavior: the existing app container restarts and `~/.claude/projects/...` survives so the Claude SDK can resume the conversation.

Observed: the primary service container is recreated, its writable layer is gone, and the Claude SDK fails with `"No conversation found with session ID: <id>"`.

Sidecar services with a restart policy (e.g. mailcatcher) keep the same container ID across pod restarts because dockerd auto-starts them — so by the time the orchestrator inspects them they're already `Running` and config-hash drift doesn't trigger a recreate. The primary service has no restart policy, sits `Exited`, and gets caught by docker compose's default recreate-on-drift behavior.

## Reproduction (in DAP context, but mechanism is generic)

1. Cold-start a compose-based devcontainer workspace in a k8s pod with `/var/lib/docker` (or equivalent) on a PVC.
2. Touch a file in the container, e.g. `docker exec <primary> sh -c 'echo hi > /home/<user>/marker'`.
3. Kill the pod. Wait for a new pod to come up against the same PVC.
4. Call `Engine.Up` again with `Recreate=false`.
5. Observe: primary service container has a new ID; `/home/<user>/marker` is gone.

## Root cause

`runtime/docker/compose.go:75-80`:
```go
func buildUpArgs(spec runtime.ComposeUpSpec) []string {
    args := composeArgs(spec.ProjectName, spec.Files)
    args = append(args, "up", "-d")
    args = append(args, spec.Services...)
    return args
}
```

Without `--no-recreate`, compose recreates on any drift in:
- generated `dc-run.yaml` override content (any pod-scoped env in `ExtraEnvironment`)
- the resolved image digest stamped on the existing container vs the newly resolved one
- normalized project hash differences

Even when the user's intent (`opts.Recreate=false`) is unambiguous, the lib can't communicate it to compose.
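
To illustrate why pod-scoped values cause drift, here is a minimal sketch of a content hash over a service's image and environment. It is a stand-in for illustration only, not docker compose's actual hashing; `configHash` and the `POD_NAME` variable are hypothetical names. The point is that any value that changes per pod changes the hash, which compose then treats as a reason to recreate.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// configHash is a hypothetical stand-in for a config hash: it digests the
// image reference and the environment in a stable (sorted) key order.
func configHash(image string, env map[string]string) string {
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	fmt.Fprintf(h, "image=%s\n", image)
	for _, k := range keys {
		fmt.Fprintf(h, "%s=%s\n", k, env[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Same service definition, rendered by two different pods.
	before := configHash("app:latest", map[string]string{"POD_NAME": "pod-a"})
	after := configHash("app:latest", map[string]string{"POD_NAME": "pod-b"})

	fmt.Println("drift detected:", before != after)
}
```

Under this model, the only way to keep the existing container across pods is either to keep pod-scoped values out of the hashed config or to tell compose not to act on the drift, which is what `--no-recreate` does.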

## Fix

Mirror upstream:

1. Add `NoRecreate bool` to `runtime.ComposeUpSpec`.
2. `buildUpArgs` appends `--no-recreate` when `spec.NoRecreate` is set.
3. `upComposeShellout` (`up.go:597`) sets `NoRecreate: existing != nil`, where `existing` is the value already computed at `up.go:167`. This requires threading `existing` (or a bool derived from it) into `upComposeShellout`.
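
The steps above could look roughly like the following self-contained sketch. The `ComposeUpSpec` type here is a trimmed local stand-in for `runtime.ComposeUpSpec`, and the body of `composeArgs` is an assumption (the real helper is not shown in this issue); `hasFlag` is a hypothetical helper added only for checking the result.

```go
package main

import "fmt"

// ComposeUpSpec is a trimmed stand-in for runtime.ComposeUpSpec. Only
// NoRecreate is the proposed addition.
type ComposeUpSpec struct {
	ProjectName string
	Files       []string
	Services    []string
	NoRecreate  bool // proposed: set when an existing container should be kept
}

// composeArgs is an assumed reconstruction of the existing helper: it emits
// the global flags that precede the subcommand.
func composeArgs(projectName string, files []string) []string {
	args := []string{"compose", "--project-name", projectName}
	for _, f := range files {
		args = append(args, "-f", f)
	}
	return args
}

// buildUpArgs mirrors the current function, plus the conditional flag.
func buildUpArgs(spec ComposeUpSpec) []string {
	args := composeArgs(spec.ProjectName, spec.Files)
	args = append(args, "up", "-d")
	if spec.NoRecreate {
		// Keep the flag ahead of the service names, matching
		// `docker compose up [OPTIONS] [SERVICE...]`.
		args = append(args, "--no-recreate")
	}
	return append(args, spec.Services...)
}

// hasFlag reports whether flag appears anywhere in args.
func hasFlag(args []string, flag string) bool {
	for _, a := range args {
		if a == flag {
			return true
		}
	}
	return false
}

func main() {
	spec := ComposeUpSpec{
		ProjectName: "ws",
		Files:       []string{"docker-compose.yml"},
		Services:    []string{"app"},
	}
	fmt.Println(buildUpArgs(spec)) // first boot: no existing container

	spec.NoRecreate = true // resume: the caller saw existing != nil
	fmt.Println(buildUpArgs(spec))
}
```

Keeping the decision in the caller (`existing != nil`) and the flag in the spec matches the upstream split: the argv builder stays a pure function that is easy to unit-test.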

## Same class of bug on the native backend (not exercised yet, but worth fixing together)

`compose/orchestrator.go:460-470` decides reuse on three conditions:
```go
if details.Labels[LabelConfigHash] == hash &&
    details.Labels[LabelImageDigest] == imageDigest &&
    c.State == runtime.StateRunning {
    return c.ID, nil
}
// Different config or not running — recreate.
```
The `c.State == runtime.StateRunning` check fails after a daemon restart, because containers without a restart policy are restored in the `Exited` state, so the code falls through to stop, remove, and create. A stopped container whose config hash and image digest still match should be **started**, not recreated. This is the same class of bug as the shellout flag gap and should be fixed in the same PR so that it doesn't follow us when we flip the backend.
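
One possible shape for the start-if-stopped branch is a three-way decision instead of the current two-way one. This is a sketch under assumptions: `containerInfo`, `Action`, and `decide` are hypothetical names, and the `State` constants are local stand-ins for the `runtime` package's values.

```go
package main

import "fmt"

// State is a local stand-in for the runtime package's container state.
type State string

const (
	StateRunning State = "running"
	StateExited  State = "exited"
)

// Action is what the orchestrator should do with an existing container.
type Action string

const (
	ActionReuse    Action = "reuse"    // matched and running: return its ID
	ActionStart    Action = "start"    // matched but stopped: start it in place
	ActionRecreate Action = "recreate" // config or image drifted: replace it
)

// containerInfo holds the labels and state the decision depends on.
type containerInfo struct {
	ConfigHash  string
	ImageDigest string
	State       State
}

// decide separates "does the config match" from "is it running", so that a
// matched but stopped container is started rather than destroyed.
func decide(c containerInfo, wantHash, wantDigest string) Action {
	if c.ConfigHash != wantHash || c.ImageDigest != wantDigest {
		return ActionRecreate
	}
	if c.State == StateRunning {
		return ActionReuse
	}
	return ActionStart
}

func main() {
	// After a daemon restart: labels still match, but the container is Exited.
	restored := containerInfo{ConfigHash: "h1", ImageDigest: "d1", State: StateExited}
	fmt.Println(decide(restored, "h1", "d1")) // start

	// Genuine drift still recreates.
	fmt.Println(decide(restored, "h2", "d1")) // recreate
}
```

Only the `ActionStart` branch preserves the writable layer for the stopped container; the recreate path stays reserved for real config or image changes.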

## Scope

- `runtime/runtime.go` — add `NoRecreate` to `ComposeUpSpec`.
- `runtime/docker/compose.go` — append `--no-recreate` when set; update `buildUpArgs` test.
- `up.go` — set `NoRecreate: existing != nil` in `upComposeShellout`.
- `compose/orchestrator.go` — replace the `State == Running` gate with a start-if-stopped branch.
- Integration test: cold-start compose, write a marker into the container, stop the primary container and introduce config drift to mimic the post-daemon-restart state, call `Engine.Up` a second time with `Recreate=false`, then assert that the marker survives and the container ID is unchanged.
