fix(auth): structural gate falls through to first-org/first-project when stale IDs miss pendingOrgs by kelsonpw · Pull Request #797 · amplitude/wizard

kelsonpw · 2026-05-15T22:24:26Z

Summary

Fixes the recurring env-picker hang after git reset --hard on a previously-instrumented project. Six prior PRs (#747 / #760 / #762 / #775 / #778 / #780) each landed a gate that passed at the unit-test level, but the hang kept reproducing in live runs. This PR pins the actual gap.

Root cause

PR #780 added a structural fallback (needsEnvPickStillRequired → pendingOrgs[0].projects[0]) but only when BOTH selectedOrgId AND selectedProjectId are null. The live restart-after-reset sequence ends up with stale selectedOrgId/ProjectId (not null), so the fallback doesn't fire:

1. resolveCredentials populates pendingOrgs (first fetch) and defers on multi-env.
2. applyEnvSelectionDeferral sets pendingEnvSelection=true.
3. AuthScreen mounts; its single-org/single-project auto-resolve effect writes selectedOrgId='org-1', selectedProjectId='proj-1' from that first snapshot.
4. authTask runs OAuth, fetchAmplitudeUser fetches a SECOND time, setOAuthComplete REPLACES pendingOrgs with the fresh data. If IDs don't match across the two snapshots (re-fetch race, ordering, account changes, stale memory), the existing pendingOrgs.find(o => o.id === selectedOrgId) returns undefined.
5. The structural gate short-circuits to false. If anything then clobbers pendingEnvSelection=false, Auth.isComplete returns true and the router walks past Auth into the Setup-bucket screens. User sees ✓ Auth ─ ● Setup ← with no env picker.
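The sequence above can be replayed in a few lines. This is a minimal sketch, not the code in src/ui/tui/flows.ts: the `Session` shape, the `needsEnvPickPreFix` and `authIsComplete` helpers, the 'org-2'/'proj-2' IDs, and the rule that "more than one environment means a pick is required" are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real types in amplitude/wizard may differ.
interface Project { id: string; environments: string[] }
interface Org { id: string; projects: Project[] }
interface Session {
  pendingOrgs: Org[];
  selectedOrgId: string | null;
  selectedProjectId: string | null;
  pendingEnvSelection: boolean;
}

// Assumed stand-in for the gate as it stood after PR #780:
// null IDs fall back to the first org/project, stale IDs bail out.
function needsEnvPickPreFix(s: Session): boolean {
  if (s.selectedOrgId === null && s.selectedProjectId === null) {
    const first = s.pendingOrgs[0]?.projects[0];
    return first !== undefined && first.environments.length > 1;
  }
  const org = s.pendingOrgs.find((o) => o.id === s.selectedOrgId);
  const project = org?.projects.find((p) => p.id === s.selectedProjectId);
  if (project === undefined) return false; // the gap: stale IDs land here
  return project.environments.length > 1; // assumption: multi-env means "must pick"
}

// Assumed completion rule: Auth is done once neither signal asks for a pick.
function authIsComplete(s: Session): boolean {
  return !s.pendingEnvSelection && !needsEnvPickPreFix(s);
}

const firstSnapshot: Org[] = [
  { id: "org-1", projects: [{ id: "proj-1", environments: ["dev", "prod"] }] },
];
// Hypothetical second fetch whose IDs no longer match the first.
const secondSnapshot: Org[] = [
  { id: "org-2", projects: [{ id: "proj-2", environments: ["dev", "prod"] }] },
];

// Steps 1-3: deferral flag set, IDs written from the first snapshot.
const afterMount: Session = {
  pendingOrgs: firstSnapshot,
  selectedOrgId: "org-1",
  selectedProjectId: "proj-1",
  pendingEnvSelection: true,
};
// Step 4: pendingOrgs is replaced wholesale; the selected IDs are now stale.
const afterOAuth: Session = { ...afterMount, pendingOrgs: secondSnapshot };
// Step 5: something clears the flag.
const afterClobber: Session = { ...afterOAuth, pendingEnvSelection: false };

console.log(authIsComplete(afterMount));   // false
console.log(authIsComplete(afterOAuth));   // false (the flag alone still holds Auth)
console.log(authIsComplete(afterClobber)); // true: router would walk past Auth
```

In this sketch the flag is the only thing holding the router on Auth once the IDs go stale, which is why every earlier fix that relied on the flag staying true could still be bypassed.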

Fix

When stale selectedOrgId/ProjectId don't resolve a project in pendingOrgs, fall through to the same pendingOrgs[0].projects[0] heuristic PR #780 added — instead of bailing to false. The gate is now load-bearing across:

(a) no IDs picked yet (PR #780)
(b) IDs picked and valid in pendingOrgs (existing happy path)
(c) IDs picked but stale relative to fresh pendingOrgs (this PR)

Single point of change in src/ui/tui/flows.ts — no new defensive layer.
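The three states can be illustrated with a small sketch of a gate that falls through instead of bailing out. This is not the actual change in src/ui/tui/flows.ts: the `GateState` shape, the name `needsEnvPickSketch`, and the "more than one environment" rule are assumptions chosen to keep the example self-contained.

```typescript
// Hypothetical shapes for illustration; the real types in amplitude/wizard may differ.
interface GateProject { id: string; environments: string[] }
interface GateOrg { id: string; projects: GateProject[] }
interface GateState {
  pendingOrgs: GateOrg[];
  selectedOrgId: string | null;
  selectedProjectId: string | null;
}

// Sketch of a gate that treats "no IDs" and "stale IDs" the same way:
// if the lookup misses, use the first org's first project.
function needsEnvPickSketch(s: GateState): boolean {
  const org = s.pendingOrgs.find((o) => o.id === s.selectedOrgId);
  const picked = org?.projects.find((p) => p.id === s.selectedProjectId);
  // (a) nothing picked or (c) picked IDs are stale: fall through to the heuristic.
  const project = picked ?? s.pendingOrgs[0]?.projects[0];
  if (project === undefined) return false; // nothing fetched, nothing to pick from
  return project.environments.length > 1; // assumption: multi-env means "must pick"
}

const orgs: GateOrg[] = [
  { id: "org-1", projects: [{ id: "proj-1", environments: ["dev", "prod"] }] },
];

// (a) no IDs picked yet
const noIds: GateState = { pendingOrgs: orgs, selectedOrgId: null, selectedProjectId: null };
// (b) IDs picked and valid in pendingOrgs
const validIds: GateState = { pendingOrgs: orgs, selectedOrgId: "org-1", selectedProjectId: "proj-1" };
// (c) IDs picked but stale relative to pendingOrgs (hypothetical old IDs)
const staleIds: GateState = { pendingOrgs: orgs, selectedOrgId: "org-old", selectedProjectId: "proj-old" };

console.log(needsEnvPickSketch(noIds));    // true
console.log(needsEnvPickSketch(validIds)); // true
console.log(needsEnvPickSketch(staleIds)); // true
```

Because the stale case reuses the same heuristic as the null case, the gate keeps answering "a pick is still required" even if the deferral flag is cleared elsewhere.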

Regression test

New src/ui/tui/__tests__/env-picker-restart-after-reset-hang.test.ts drives the complete bin.ts startup sequence and asserts the router parks on Auth at every step. The dedicated stale-IDs test failed pre-fix (router resolved to data-setup) and passes post-fix.

A pre-existing test in env-picker-repro.test.ts that asserted the OPPOSITE behavior ("does not block when the resolved project is missing from pendingOrgs") was inverted — it was inadvertently encoding the bug as expected behavior.

Test plan

pnpm exec vitest run --pool=forks --maxWorkers=1 src/ui/tui/__tests__/env-picker-restart-after-reset-hang.test.ts (new test passes)
pnpm exec vitest run --pool=forks --maxWorkers=1 src/ui/tui/__tests__/env-picker-repro.test.ts src/ui/tui/__tests__/flow-invariants.test.ts src/ui/tui/__tests__/router.test.ts src/ui/tui/screens/__tests__/AuthScreen.snap.test.tsx (all neighbours pass)
pnpm tsc --noEmit clean
pnpm lint clean
pnpm test — all 4293 tests pass
src/utils/wizard-abort.ts untouched

🤖 Generated with Claude Code

Note

Medium Risk
Changes the wizard router’s Auth completion gating logic; mistakes could trap users on Auth or incorrectly skip the env picker, but the change is small and covered by new regression tests.

Overview
Fixes a regression where the TUI wizard could advance past Auth into Setup screens without rendering the environment picker when pendingEnvSelection was cleared and selectedOrgId/selectedProjectId no longer matched the refreshed pendingOrgs.

needsEnvPickStillRequired now falls back to pendingOrgs[0].projects[0] when the selected IDs are stale (not just null), and tests were updated/added to assert the router consistently parks on Auth across the restart-after-reset sequence and stale-ID scenarios.

Reviewed by Cursor Bugbot for commit 771af11. Bugbot is set up for automated code reviews on this repo.

…stale selectedOrgId/ProjectId

Prior PRs #747 / #760 / #762 / #775 / #778 / #780 each tried to fix the recurring "stuck on Setup with no env-picker" hang after `git reset --hard` on a previously-instrumented project. Each landed a gate that worked at the unit-test level, but the live repro kept reproducing. This pins the actual gap PR #780 left behind.

PR #780's structural fallback (`needsEnvPickStillRequired` → `pendingOrgs[0].projects[0]`) only fires when BOTH `selectedOrgId` AND `selectedProjectId` are null. In the real restart-after-reset sequence:

1. `resolveCredentials` populates `pendingOrgs` (first fetch) and defers on multi-env.
2. `applyEnvSelectionDeferral` sets `pendingEnvSelection=true`.
3. AuthScreen mounts and its single-org/single-project auto-resolve effect writes `selectedOrgId='org-1'`, `selectedProjectId='proj-1'` from that first snapshot.
4. `authTask` runs OAuth, `fetchAmplitudeUser` fetches a SECOND time, and `setOAuthComplete` REPLACES `pendingOrgs` with the fresh data. If the two snapshots' IDs don't match (re-fetch race, ordering, account changes, stale in-memory IDs), the existing `pendingOrgs.find(o => o.id === selectedOrgId)` lookup returns `undefined`.
5. The structural gate short-circuits to `false`. If anything clobbers `pendingEnvSelection=false` (the recurring failure mode the 6 prior PRs chased; every fix relied on the flag staying true), Auth.isComplete returns true and the router walks past Auth into the Setup-bucket screens. User sees `✓ Auth ─ ● Setup ←` with no env picker on screen.

Fix: when stale `selectedOrgId/ProjectId` don't resolve a project in `pendingOrgs`, fall through to the same `pendingOrgs[0].projects[0]` heuristic instead of bailing to `false`.

The structural gate is now load-bearing across all three states:

(a) no IDs picked yet (PR #780)
(b) IDs picked and valid in pendingOrgs (existing happy path)
(c) IDs picked but stale relative to fresh pendingOrgs (this PR)

Regression test: src/ui/tui/__tests__/env-picker-restart-after-reset-hang.test.ts drives the complete bin.ts startup sequence (buildSession → store.session → resolveCredentials multi-env defer → applyEnvSelectionDeferral → concludeIntro → setOAuthComplete → AuthScreen's setOrgAndProject) and asserts the router parks on Auth at every step. The new "stale IDs" test fails without the fix (router resolves to data-setup) and passes with it.

Pre-existing test in env-picker-repro.test.ts that asserted the OPPOSITE behavior ("does not block when the resolved project is missing from pendingOrgs") was inverted to match the new (correct) behavior. That test was inadvertently encoding the bug as expected behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… keys (#797 follow-up) (#805)

8 PRs deep into the recurring "frozen-after-reset" bug. Live integration render exposed the actual gap: when `pendingEnvSelection=true` and the fresh `setOAuthComplete` brings back pendingOrgs where one of the two envs has lost its `app.apiKey` (provisioning lag, role change, key rotation, transient server response; many production races land here), `selectableEnvs.length === 1`, AuthScreen's auto-select effect fires, the credential-loading effect takes the "selected env has apiKey" path, and the deferral flag gets cleared. The env picker NEVER renders, and the router walks past Auth into Run.

The user's "✓ Auth ─ ● Setup" stepper was actually `Screen.Run` (Run is in the "Setup" bucket of WIZARD_STEPS). Confirmed by the "Progress Logs Snake" tabs, which are RunScreen's tabs.

Fix:
- Gate the auto-select-when-1-env effect on `!pendingEnvSelection`. The resolver explicitly said "user must choose", and auto-selecting silently bypasses that intent.
- Widen `needsEnvPick` to render the picker even with a single-env list when `pendingEnvSelection=true`. Without this, gating the auto-select alone would freeze the user on Auth with no actionable surface: it fixes the auto-resolve bug but creates a new dead-screen bug.

Integration test boots App via ink-testing-library, drives the post-deferral state through `setOAuthComplete` with asymmetric env keys, and asserts the router stays on Auth. Confirmed FAILS without the fix (router resolves to Run) and PASSES with it.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
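The two halves of this follow-up fix can be sketched as a pair of pure predicates. This is an illustration under assumptions, not the AuthScreen code: the `EnvEntry` shape and the helper names `selectable`, `shouldAutoSelect`, and `needsEnvPickWidened` are hypothetical, and the real effects live inside React components rather than standalone functions.

```typescript
// Hypothetical env shape; the source only tells us each env carries `app.apiKey`.
interface EnvEntry { name: string; app: { apiKey?: string } | null }

// Assumed definition of "selectable": envs that still have an API key.
function selectable(envs: EnvEntry[]): EnvEntry[] {
  return envs.filter((e) => Boolean(e.app?.apiKey));
}

// Auto-select only when exactly one env is selectable AND the resolver
// did not explicitly defer the choice to the user.
function shouldAutoSelect(envs: EnvEntry[], pendingEnvSelection: boolean): boolean {
  return selectable(envs).length === 1 && !pendingEnvSelection;
}

// Widened picker condition: a deferred selection renders the picker even
// when only one env is left, so the user always has something to act on.
function needsEnvPickWidened(envs: EnvEntry[], pendingEnvSelection: boolean): boolean {
  const count = selectable(envs).length;
  return count > 1 || (pendingEnvSelection && count > 0);
}

// Asymmetric keys: the second env has lost its apiKey in the fresh fetch.
const asymmetric: EnvEntry[] = [
  { name: "dev", app: { apiKey: "key-dev" } },
  { name: "prod", app: {} },
];

console.log(shouldAutoSelect(asymmetric, true));     // false
console.log(needsEnvPickWidened(asymmetric, true));  // true
console.log(shouldAutoSelect(asymmetric, false));    // true
console.log(needsEnvPickWidened(asymmetric, false)); // false
```

The sketch shows why both changes are needed together: with `pendingEnvSelection=true`, blocking auto-select without widening the picker condition would leave neither path active.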

kelsonpw requested a review from a team as a code owner May 15, 2026 22:24

kelsonpw merged commit 37b3cc2 into main May 15, 2026
11 checks passed

amplitude-release-bot Bot mentioned this pull request May 15, 2026

chore(main): release wizard 1.18.0 #694

Open

This was referenced May 16, 2026

fix(auth): env picker renders when fresh fetch returns asymmetric env keys (#797 follow-up) #805

Merged

refactor(state): dedupe session-checkpoint helper boilerplate #825

Closed

kelsonpw merged 1 commit into main from fix/env-picker-actually-renders-after-reset
