Skip to content

feat: runtime apt detection + add docker to fast_provision experiment#3374

Closed
AhmedTMM wants to merge 3 commits intoOpenRouterTeam:mainfrom
AhmedTMM:docker-convenience-script
Closed

feat: runtime apt detection + add docker to fast_provision experiment#3374
AhmedTMM wants to merge 3 commits intoOpenRouterTeam:mainfrom
AhmedTMM:docker-convenience-script

Conversation

@AhmedTMM
Copy link
Copy Markdown
Collaborator

@AhmedTMM AhmedTMM commented Apr 29, 2026

Summary

`local.ts` — runtime apt detection (no PostHog flag). `ensureDocker()` now prefers `apt-get install docker.io` when `which apt-get` succeeds, and falls back to Docker's official convenience script (`curl -fsSL https://get.docker.com | sh`) otherwise. This unblocks Fedora/Arch/Alpine/openSUSE users without changing behavior on Debian/Ubuntu.

`index.ts` — extend the `fast_provision` experiment. The test variant previously pushed only `images`. It now pushes `images` + `docker`, so the experiment measures the full provisioning-speed bundle (marketplace image + Docker-preinstalled image). Control behaves as before.

Bumps CLI `1.0.27` → `1.0.28`.

Test plan

  • `bunx @biomejs/biome check src/` — clean
  • `bun test` — 2134 pass / 4 pre-existing failures (DO OAuth, hetzner createServer, cmdRun pipeline) — all unrelated
  • Verify the `fast_provision=test` cohort sees `SPAWN_BETA=images,docker` in PostHog event payloads
  • Manual: smoke-test `spawn claude local` on a non-apt distro (e.g., Fedora Docker container) and confirm get.docker.com path runs

🤖 Generated with Claude Code

The Linux branch of ensureDocker() assumed apt — fine on Debian/Ubuntu,
broken on Fedora/Arch/Alpine/openSUSE. Switch to Docker's official
convenience script, which handles distro detection, repo setup, and
service enablement in one call.

Bumps CLI 1.0.27 -> 1.0.28.
@AhmedTMM AhmedTMM marked this pull request as ready for review April 29, 2026 23:28
Restores apt-get as the default Linux install path and gates the
get.docker.com convenience script behind the `linux_docker_install`
PostHog flag (variant "test"). This lets us measure whether the
convenience script reduces install failures on non-apt distros
(Fedora/Arch/Alpine) before flipping the default.

Exposure event ($feature_flag_called) is captured automatically
by getFeatureFlag().
@AhmedTMM AhmedTMM changed the title fix(local): use get.docker.com instead of apt-get for Linux Docker install feat(local): experiment with get.docker.com via PostHog flag Apr 30, 2026
…riment

- local.ts: prefer apt-get if `which apt-get` succeeds; otherwise fall back
  to Docker's convenience script. Drops the linux_docker_install PostHog
  flag in favor of a simple capability check.
- index.ts: fast_provision test variant now pushes both "images" and
  "docker" — the experiment evaluates the full provisioning-speed bundle.
@AhmedTMM AhmedTMM changed the title feat(local): experiment with get.docker.com via PostHog flag feat: runtime apt detection + add docker to fast_provision experiment Apr 30, 2026
@la14-1
Copy link
Copy Markdown
Member

la14-1 commented Apr 30, 2026

Review

Code change is clean; the experiment design is what I'd push back on.

Code looks right

  • Fallback is safe: `getFeatureFlag("linux_docker_install", "control")` returns `"control"` if the flag is missing/unreachable, so the existing apt-get path is the default. No regression risk at rollout 0%.
  • Exposure events fire automatically via `getFeatureFlag()` — same pattern as `fast_provision`.
  • Version bump to `1.0.28` is correct (main is at 1.0.27 post-fix(local): install OrbStack from DMG when Homebrew is missing #3372).
  • `installCmd` is built from a booleans-only template literal; no injection surface.
  • The failure-hint message is switched per variant (generic docs page for `test`, existing apt hint for `control`). Good.
  • `initFeatureFlags()` is awaited before dispatch in `index.ts:873`, so flags are always loaded by the time `ensureDocker` runs.

The design concern

The flag buckets all Linux users regardless of distro on a stable hash of the install-id. That means the experiment lumps two completely different questions together:

  1. "Does the convenience script succeed on distros where apt doesn't exist?" — Fedora/Arch/Alpine/openSUSE users. We already know the answer: yes, because those users are at 0% success rate today.
  2. "Does the convenience script outperform apt-get on Debian/Ubuntu?" — the actual open question worth measuring.

With mixed bucketing:

  • Fedora user bucketed to `control` → `apt-get` invocation fails. No change from today. Stuck.
  • Ubuntu user bucketed to `test` → convenience script runs. Works, but changes the install footprint (CE repo, docker group, systemd unit) for a user who would have been fine with `docker.io`.

Net: this measures the non-Debian bug is real, which we already know, and it doesn't fix it for the ~50% of non-Debian users the flag routes to `control`.

Cleaner design: detect distro via `/etc/os-release` first. If it's Debian/Ubuntu, use the flag for A/B. Otherwise (Fedora/Arch/Alpine/openSUSE/unknown), bypass the flag and go straight to the convenience script — that path is strictly better than failing. Something like:

```ts
const isDebianFamily = existsSync("/etc/debian_version");
const useConvenienceScript = isDebianFamily
? getFeatureFlag("linux_docker_install", "control") === "test"
: true; // non-Debian: always use convenience script
```

That way:

  • Non-Debian users are fixed unconditionally (not randomized).
  • The experiment population is cleanly bounded to Debian, where the hypothesis "does get.docker.com improve install success" actually has a comparison baseline.
  • Exposure events still fire for the Debian slice — PostHog sees the right denominator.

Minor

  • `curl -fsSL` has no timeout — same nit I flagged on fix(local): install OrbStack from DMG when Homebrew is missing #3372. Flaky home network → CLI hangs indefinitely. `--connect-timeout 10 --max-time 120` fixes it.
  • `curl | sh` trust surface. Docker publishes `get.docker.com` as the official install path, so this is sanctioned, but some security-conscious users/orgs block it. Consider a `SPAWN_DOCKER_INSTALL=apt|script|manual` escape hatch for users who want to opt out of whichever variant they landed on.
  • Failure path is generic. If the convenience script fails mid-execution (half-installed state), users get a link to `docker/engine/install` docs but no hint that their system may be in a partial state. Not blocking — the convenience script is pretty robust.
  • Exposure only on non-root or docker-not-installed path. Users who already have Docker never hit `ensureDocker`'s install branch, so they don't trip the flag — which is correct, but worth noting the denominator in PostHog is "users who need Docker installed," not "Linux users overall."

PostHog prep

Test plan explicitly says "Configure `linux_docker_install` in PostHog with control/test variants before merging" — 👍. Confirm: does the PostHog project allow the flag key pattern with underscores? (`fast_provision` does, so this should too.) And set the rollout to 0% at merge, ramp after verifying exposure events land.

Verdict

LGTM on the code; request changes on the design — bypass the flag for non-Debian distros so the experiment population is actually comparable. The one-existsSync-check is the only delta from what's here.

Reviewed by SPA

@la14-1
Copy link
Copy Markdown
Member

la14-1 commented Apr 30, 2026

Closing per request. Design concerns in the review — experiment bucketing doesn't isolate the real question (Debian vs convenience-script on distros where both work). Reopen if you want to ship the distro-gated version.

@la14-1 la14-1 closed this Apr 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants