feat(cli): posthog feature flags + fast_provision experiment#3366
Merged
la14-1 merged 2 commits into OpenRouterTeam:main on Apr 28, 2026
Wires PostHog `/decide` into the CLI so we can A/B-test provisioning behaviors.

First experiment: `fast_provision`. For users who didn't pass `--beta` or `--fast` manually, the `test` variant turns on `tarball + images` by default. Hypothesis: faster provisioning → fewer drop-offs in the "VM ready → install completed" leg of the funnel.

What's added:
- `shared/install-id.ts`: stable per-machine UUID, persisted at `~/.config/spawn/.telemetry-id`. Reuses telemetry's existing path so existing users keep their PostHog identity. Falls back to an ephemeral UUID on disk-write failure.
- `shared/feature-flags.ts`: hand-rolled POST to PostHog `/decide` (no SDK dep). 1.5s timeout, fail-open. On-disk cache at `$SPAWN_HOME/feature-flags-cache.json` with 1h TTL so cold starts don't pay the network cost. `SPAWN_FEATURE_FLAGS_DISABLED=1` kill switch. Captures `$feature_flag_called` exposure events for both arms so PostHog can compute conversion.
- `shared/telemetry.ts`: moves user-id loading into `install-id.ts` so flags and events share the same `distinct_id`.
- `index.ts`: `await initFeatureFlags()` at the top of `main()`, then applies `fast_provision`'s `test` variant by appending `tarball,images` to `SPAWN_BETA`, but only if the user didn't pass `--beta` or `--fast` (those always win, so opt-out is free).

Why tarball+images and not all four (`+parallel,docker`): clean attribution. The hypothesis is about tarball/image; if we ship the full `--fast` bundle we can't tell which feature moved the metric. Keep `--fast` as the user-facing power-user knob.

Tests: 14 new (install-id roundtrip + format guard, feature-flags fetch/timeout/HTTP500/malformed/disabled/idempotent/stale-cache, exposure-event behavior). Full suite: 2183 pass, same 4 pre-existing failures as upstream/main.

Bumps CLI to 1.0.23.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nt real SWR
Two review-fix commits from PR feedback squashed into one:
1. Move `await initFeatureFlags()` below the `spawn pick` and
`spawn feedback` bypass clauses in `main()`. Both commands are called
from bash scripts and must stay fast; neither gates on a flag, so
there's no reason to pay up to 1.5s of network latency on cold cache.
2. Implement real stale-while-revalidate in `shared/feature-flags.ts`.
The prior implementation did a synchronous fetch on stale cache,
which contradicted the docstring and PR description. Now:
- fresh cache (<TTL) → use cache, no network
- stale cache (>=TTL) → use cache immediately, refresh in background
- no cache → await sync fetch (first run only)
Adds `_awaitBackgroundRefreshForTest()` so tests can deterministically
wait for the background refresh before asserting. Updated the existing
"stale cache" test to verify SWR semantics (stale served first, fresh
lands next invocation) and added a "fresh cache does not fetch" test.
All 2127 tests pass; biome clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Applied the two must-fix items from review in d2ec13d:
Test coverage:
All 2127 tests pass, biome clean.
la14-1 approved these changes on Apr 28, 2026
Review fixes applied: fast-path skip + real SWR. All checks green.
Summary
Wires PostHog `/decide` into the CLI so we can A/B-test provisioning behaviors with feature flags. First experiment: `fast_provision` — for users who didn't pass `--beta` or `--fast` manually, the `test` variant turns on `tarball + images` by default to see if faster provisioning lifts the late-funnel conversion rate.
The PostHog experiment was already created in the dashboard; this PR is the code side of it.
Design calls
Why `tarball,images` and not the full `--fast` set (`+parallel,docker`)? Clean attribution. The hypothesis is specifically about tarball/image; if we ship the full `--fast` bundle we can't tell which feature moved the metric. `--fast` stays as the power-user knob.
Why share `distinct_id` with telemetry? PostHog identity needs to match across telemetry events and flag decisions, otherwise the experiment's exposure events don't line up with the funnel events they're supposed to attribute. Telemetry already had a persistent user-id at `~/.config/spawn/.telemetry-id` — moved that into a shared `install-id.ts` module so feature flags reuse it. Existing users keep their bucket.
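To make the shared-identity idea concrete, here is a minimal sketch of how a persisted install id with an ephemeral fallback could be loaded. It is not the PR's actual `shared/install-id.ts`: the `IdStore` interface, `loadInstallId`, and the injected `generate` function are hypothetical names chosen so the logic can be shown without touching the real file system.

```typescript
// Hypothetical sketch; IdStore, loadInstallId and generate are illustrative
// names, not the PR's real API.
interface IdStore {
  read(): string | null; // file contents, or null when missing/unreadable
  write(id: string): void; // may throw on disk-write failure
}

// Format guard: only accept something that looks like a UUID.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function loadInstallId(store: IdStore, generate: () => string): string {
  const raw = store.read();
  const existing = raw === null ? "" : raw.trim();
  if (UUID_RE.test(existing)) {
    // Existing users keep the id telemetry already wrote, so their
    // PostHog identity (and experiment bucket) does not change.
    return existing;
  }
  const fresh = generate();
  try {
    store.write(fresh);
  } catch {
    // Disk write failed: the id is still returned, but it is ephemeral
    // and only valid for this run.
  }
  return fresh;
}
```

Both telemetry events and flag requests would then call the same loader and send its result as `distinct_id`.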
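On-disk cache with 1h TTL. Without a cache, every `spawn` invocation would pay for a network call that can take up to 1.5s. With stale-while-revalidate on the cache file, any run that finds a cache entry gets its variant near-instantly and refreshes happen lazily in the background; only the first run, with no cache at all, waits on the network.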
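The cache behavior reduces to a three-way decision (fresh, stale, missing), matching the cases listed in the review-fix commit. The sketch below shows only that decision as a pure function; `CacheEntry`, `CacheAction`, and `decideCacheAction` are hypothetical names, and the real module's cache file layout may differ.

```typescript
// Hypothetical sketch of the stale-while-revalidate decision; the types and
// function name are illustrative, not taken from shared/feature-flags.ts.
type FlagMap = Record<string, string | boolean>;

interface CacheEntry {
  fetchedAt: number; // epoch milliseconds when the flags were fetched
  flags: FlagMap;
}

type CacheAction =
  | { kind: "use-cache"; flags: FlagMap } // fresh: no network
  | { kind: "use-cache-and-refresh"; flags: FlagMap } // stale: serve now, refresh in background
  | { kind: "fetch-sync" }; // no cache: await the fetch (first run only)

const ONE_HOUR_MS = 60 * 60 * 1000;

function decideCacheAction(
  entry: CacheEntry | null,
  now: number,
  ttlMs: number = ONE_HOUR_MS,
): CacheAction {
  if (entry === null) {
    return { kind: "fetch-sync" };
  }
  const age = now - entry.fetchedAt;
  if (age < ttlMs) {
    return { kind: "use-cache", flags: entry.flags };
  }
  return { kind: "use-cache-and-refresh", flags: entry.flags };
}
```

Keeping the decision separate from the I/O is a design choice of this sketch: it lets the three branches be tested without a network or a clock.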
User-wins. If the user passes `--beta tarball` or `--fast`, the flag is bypassed entirely. `SPAWN_FEATURE_FLAGS_DISABLED=1` is a hard kill switch.
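The user-wins rule can be sketched as a pure function over the variant, the argument list, and the current `SPAWN_BETA` value. This is an illustration under assumptions, not the code in `index.ts`: `applyFastProvision` is a hypothetical name, and treating `--beta=<value>` as equivalent to `--beta <value>` is an assumption about the CLI's argument syntax.

```typescript
// Hypothetical sketch; applyFastProvision is an illustrative name and the
// "--beta=" form is an assumed argument syntax.
function applyFastProvision(
  variant: string | undefined,
  argv: readonly string[],
  currentBeta: string | undefined,
): string | undefined {
  // Explicit user choices always win, so passing either flag opts out.
  const userChose = argv.some(
    (arg) => arg === "--fast" || arg === "--beta" || arg.indexOf("--beta=") === 0,
  );
  if (userChose || variant !== "test") {
    return currentBeta; // control arm, or user override: leave SPAWN_BETA untouched
  }
  // Test arm: append tarball,images without duplicating existing entries.
  const features = (currentBeta ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  for (const feature of ["tarball", "images"]) {
    if (features.indexOf(feature) === -1) {
      features.push(feature);
    }
  }
  return features.join(",");
}
```

A caller would assign the result back to `SPAWN_BETA` only when it is defined, so the control arm leaves the environment exactly as it found it.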
Files
Rollout
Recommend ramping the PostHog flag at 5% → 25% → 50% → 100% on the `test` variant with 24h between bumps. The 1.5s fail-open timeout is itself a soft kill switch — if PostHog is down, every user gets control.
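The fail-open property is what makes the rollout safe: any failure has to resolve to control. A minimal sketch of that resolution step follows. `parseDecideBody` and `variantOrControl` are hypothetical helpers, and the `featureFlags` field name is an assumption about the shape of the `/decide` response, so it should be checked against the PostHog API docs before being relied on.

```typescript
// Hypothetical sketch of fail-open parsing; helper names are illustrative and
// the `featureFlags` response field is an assumption about /decide.
type FlagValue = string | boolean;

function parseDecideBody(status: number, body: string): Record<string, FlagValue> {
  if (status < 200 || status >= 300) {
    return {}; // HTTP error (e.g. 500): no flags, so every lookup yields control
  }
  try {
    const parsed: unknown = JSON.parse(body);
    if (typeof parsed !== "object" || parsed === null) {
      return {};
    }
    const raw = (parsed as { featureFlags?: unknown }).featureFlags;
    if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
      return {};
    }
    const source = raw as Record<string, unknown>;
    const flags: Record<string, FlagValue> = {};
    for (const key of Object.keys(source)) {
      const value = source[key];
      if (typeof value === "string" || typeof value === "boolean") {
        flags[key] = value;
      }
    }
    return flags;
  } catch {
    return {}; // malformed JSON: fail open
  }
}

// A missing or non-string flag value resolves to the control arm.
function variantOrControl(flags: Record<string, FlagValue>, key: string): string {
  const value = flags[key];
  return typeof value === "string" ? value : "control";
}
```

A timeout would be handled the same way by the caller: if the request does not finish within 1.5s, it proceeds with an empty flag map and the user lands in control.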
Test plan
Bumps CLI to 1.0.23.
🤖 Generated with Claude Code