Skip to content

Preserve session on auth failure: ExpiredSession, returnTo, draft persistence#263

Merged
91jaeminjo merged 23 commits into
mainfrom
fix/auth-failure-redirect
May 21, 2026
Merged

Preserve session on auth failure: ExpiredSession, returnTo, draft persistence#263
91jaeminjo merged 23 commits into
mainfrom
fix/auth-failure-redirect

Conversation

@91jaeminjo
Copy link
Copy Markdown
Collaborator

Summary

  • Introduce ExpiredSession — tokens are preserved across auth failures so a silent refresh can revive the session; the lobby keeps the row visible with an inline "sign in again" affordance instead of pruning it.
  • Split 401 vs 403 in the HTTP layer: a new PermissionDeniedException for 403 (renders inline, never funnels through markSessionExpired), while 401 funnels through the per-server auth funnel.
  • Wire a per-server route guard via connectionRevision so individual server flips re-evaluate routing even when the aggregate authState stays Authenticated.
  • Add PreAuthState.frontendReturnTo (30-min TTL, constructor-enforced relative-path guard) plumbed end-to-end from the route guard through ConnectFlow to the OIDC callback.
  • Persist composer drafts across the auth round-trip via ReturnToStorage (24h TTL, shared_preferences); restored on remount post-reauth.
  • Sharpen failure logging across the auth and refresh paths (SEVERE for invariant violations; WARNING for recoverable / explicit-rejection cases); include discoveryUrl in refresh-failure messages.
  • Lobby UI uses sealed switches throughout; new RoomsExpired panel widget test.

Test plan

  • flutter analyze — zero warnings
  • flutter test — 1453 tests pass
  • Auth callback rejects open-redirect targets (constructor + _safeReturnTo defense-in-depth)
  • Composer drafts persist across an auth round-trip and restore on remount
  • Lobby shows RoomsExpired panel for a flipped server while other rows stay loaded
  • In-flight history fetch is cancelled with reason 'auth expired' on auth flip
  • PermissionDeniedException (403) does not funnel through markSessionExpired
  • Manual smoke: sign in, force a 401 via token revoke at the IdP, confirm inline re-auth then draft restoration
  • Manual smoke: open two servers, expire one, confirm the other stays usable while the flipped row shows the inline sign-in affordance

🤖 Generated with Claude Code

91jaeminjo and others added 23 commits May 21, 2026 13:27
…ilures

Auth failures previously flipped sessions straight to NoSession, which
caused ServerManager persistence to overwrite the on-disk record with
KnownServer (no tokens) — destroying the refresh token even on
transient errors. The new ExpiredSession variant carries the
provider+tokens so the next interaction can attempt a silent refresh
and revive the session without a full OIDC round-trip.

- ExpiredSession(provider, tokens) joins the SessionState sealed family.
- AuthSession gains a parameterless markSessionExpired() funnel that
  flips Active → Expired; isAuthenticated stays strict-Active so
  ServerEntry.isConnected naturally returns false for Expired.
- _doRefresh now accepts both Active and Expired states; the post-await
  identical guard is relaxed to abort only on NoSession so a concurrent
  Active → Expired flip doesn't drop a successful refresh.
- invalidGrant and noRefreshToken now route through markSessionExpired
  instead of logout, keeping the refresh token on disk.
- ServerManager._onSessionChanged persists Expired with the same
  AuthenticatedServer(tokens) shape as Active.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Long-running streams (LLM generations, large uploads) can outlive a
1-minute proactive-refresh window. AuthException is non-retryable on
the SSE client, so a mid-stream access-token expiry violently aborts
the generation. Bumping the threshold to 5 minutes lets the proactive
refresh happen comfortably before any in-flight stream can cross the
expiry boundary.

Mid-stream refresh on the SSE client itself is the stricter fix; this
is the cheap version that lands the most common cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Throwing AuthException for both 401 and 403 forced every consumer to
disambiguate by status code and lumped two unrelated failure modes
together. 401 means the bearer token is missing or rejected — a
refresh or re-auth might fix it. 403 means the user is authenticated
but the server is refusing the action — re-auth with the same
identity won't help.

- New PermissionDeniedException sibling to AuthException; AuthException
  is now 401-only.
- HttpTransport throws PermissionDeniedException for 403 on both REST
  and stream responses.
- AguiStreamClient adds PermissionDeniedException to the non-retryable
  list.
- New FailureReason.permissionDenied; classifyError routes
  PermissionDeniedException and TransportError(403) to it instead of
  conflating with authExpired or serverError (which connotes 5xx).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GoRouter's refreshListenable was wired to the aggregate authState, a
'any server connected' computed signal. A per-server transition that
left another server authenticated kept the aggregate unchanged and
never fired the listenable — so the global redirect couldn't react to
single-server expiry while other servers remained active.

- New ServerManager.connectionRevision computed signal whose value is
  the list of every entry's session state. Bumps on any per-server
  transition and on server add/remove. No manual subscription
  bookkeeping — the signals package builds the dependency graph
  automatically through nested .value reads.
- refreshListenable now wraps connectionRevision.
- AuthAppModule.redirect restructured: public route → no-op; route
  names :serverAlias and that ServerEntry is not isConnected → redirect
  to homeWithUrl(serverUrl); else fall back to the global
  no-servers-authenticated check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every site that previously caught AuthException now routes through the
per-server funnel. The session flips to ExpiredSession, the route
guard reacts (redirecting per-server-aliased routes to homeWithUrl),
and the lobby drops the section via its existing auth subscription.
Inline 'Session expired' messages stay where they were so the user
sees them after re-auth.

- lobby_state.dart: AuthException on the rooms/profile fetch funnels
  and lets the existing _onAuthChanged subscription drop the section,
  instead of writing RoomsFailed. Non-auth errors still surface as
  RoomsFailed.
- upload_tracker.dart: AuthException after retries-exhausted calls
  _markFailed first (commit the persistent failed row) then funnels
  through the auth funnel. New PermissionDeniedException is naturally
  caught by the broader Object branch since it's its own exception type
  now.
- quiz_session_controller.dart: AuthException funnels; new
  PermissionDeniedException catch with distinct 'no permission' copy.
- thread_view_state.dart: FailedState(authExpired) funnels before
  surfacing the friendly banner.

DI plumbing:

- UploadTracker, QuizSessionController, and ThreadViewState now take
  AuthSession. Construction sites: upload_tracker_registry,
  quiz_screen, room_state (which forwards through to ThreadViewState).
- Existing tests bulk-updated to pass `auth` via setUp helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the auth session leaves ActiveSession (expired or signed out),
room/upload consumers must cancel their in-flight work. Without this,
the SSE client and upload POSTs reconnect-loop against a dead token —
flooding the backend with 401s and burning the user's bandwidth while
the route guard is busy redirecting them.

- ThreadViewState subscribes to auth.session in the constructor; on
  transition to ExpiredSession/NoSession, cancels the history fetch
  token and the active AgentSession.
- UploadTracker does the same and walks every scope's pending records
  to cancel their CancelTokens.
- LobbyState was already doing this via its existing _onAuthChanged
  subscription (line 111). ExpiredSession returning false for
  isConnected makes the existing path fire correctly; no new code.

Test scaffolding: room_state_test and upload_tracker_registry_test
were using a Fake stub for AuthSession that threw on the new session
getter access. Both swapped to real AuthSession(refreshService:
FakeTokenRefreshService()) — cheap to construct, matches the real
interface, no more stub maintenance.

This closes Layer 2 of the auth-failure-redirect work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The OIDC roundtrip needs to know where to send the user after a
successful re-auth. Stashing the original route into PreAuthState
piggybacks on the same storage cookie the existing flow already uses
for serverUrl/providerId, so the callback can read it without any
new infrastructure.

- New optional String? frontendReturnTo field. toJson emits it only
  when set, so older stored entries deserialize cleanly. The callback
  will reject absolute URLs (http://, https://, //) at the boundary
  in a follow-up commit; the field itself just carries the string.
- maxAge bumped from 5 → 30 minutes. The previous 5-minute window
  was hostile to real OIDC flows: password reset, MFA prompts, and
  magic-link emails routinely take longer than five minutes. 30
  minutes still bounds CSRF risk while matching typical IdP cookie
  lifetimes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After a successful re-auth, AuthCallbackScreen now navigates to the
preAuth.frontendReturnTo if one was stashed; otherwise falls back to
the lobby as before.

The validation rejects anything that isn't a clean relative path
starting with /: absolute URLs (http://, https://) and
protocol-relative URLs (//host/path) all fall back to the safe
default. PreAuthState is short-lived and same-origin-stored, but the
guard is defense-in-depth — closing any path where a crafted value
could otherwise become an open-redirect target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Threads an optional returnTo arg from connect() down to _authenticate()
where the PreAuthState is saved before the OIDC roundtrip. Held in a
_pendingReturnTo field rather than in each state variant so it survives
consent and provider-selection pauses without state-class churn. reset()
clears the field so a fresh connect() doesn't inherit a stale value.

Callers (HomeScreen's autoConnect path) will pass through what the
route guard stashed before redirecting to homeWithUrl. AuthCallbackScreen
already honors the round-tripped value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the return-to loop end-to-end. On a per-server route denial,
the guard now embeds the original matchedLocation as a returnTo query
param on the homeWithUrl redirect target. HomeScreen reads the query
param and forwards it to ConnectFlow.connect, which writes it into
PreAuthState.frontendReturnTo. After the OIDC callback, the user
lands back on the route they came from.

- AppRoutes.homeWithUrl gains an optional returnTo named arg that
  appends &returnTo=... when set. Existing callers continue to work
  without it.
- AuthAppModule's redirect passes state.matchedLocation as returnTo
  on the per-server denial path.
- HomeScreen gains an autoConnectReturnTo constructor parameter,
  read from the home route's query string and forwarded to _connect's
  ConnectFlow.connect call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When an auth failure interrupts a user mid-compose, the route guard
navigates them away to sign in and the freshly-mounted RoomScreen
loses their typed text. ReturnToStorage stashes the unsent draft in
shared_preferences keyed by (serverId, roomId); RoomScreen pre-fills
the composer from it on mount and clears the entry afterward.

- New lib/src/modules/auth/return_to_storage.dart: thin wrapper
  around the shared SharedPreferences singleton with a namespaced
  prefix and a 24-hour TTL. Drafts are user content — a short TTL
  would silently delete them, which is hostile. 24h bounds staleness
  without surprising the user.
- ThreadViewState subscribes to _lastSendError; persists only when
  there's both unsentText AND an AuthException underneath. Transient
  failures (network, server hiccup) keep the in-memory
  SendError.unsentText path since the screen stays mounted.
- RoomScreen.initState kicks off _restorePersistedComposer; pre-fills
  the controller if it's still empty when the async load resolves,
  then clears the entry. Subsequent mounts of the same room don't
  re-pre-fill.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The branch added FailureReason.permissionDenied but the enum-count
and order tests still asserted 8 values, breaking package-local CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
http_transport's Throws blocks still said AuthException for 401/403;
now name PermissionDeniedException for 403. Lobby room renderer maps
PermissionDeniedException to a permission-specific message instead of
showing the raw exception, and the rooms catch documents that 403
intentionally renders inline rather than firing the auth funnel (re-
auth wouldn't help). User-profile catch documents the silent-null
disposition for the same reason. Quiz screen adds the matching arm
to its switch, mirroring quiz_session_controller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three error paths on the auth-failure branch swallowed exceptions
silently: the unawaited composer-draft save in ThreadViewState, the
unawaited composer restore in RoomScreen, and AuthSession's refresh
catch arms (bare catch and the catch-all TokenRefreshFailure case).
All three exist to make the auth roundtrip reliable; silent failure
defeats the feature. Surface them via dev.log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two regressions could silently break the auth roundtrip without
existing tests catching them: ConnectFlow dropping the returnTo it
captures in connect() before writing PreAuthState, and ServerManager
mapping ExpiredSession back to KnownServer (losing refresh tokens
across app restarts). Add a ConnectFlow test for the returnTo plumb
(with and without a passed value) and a ServerManager test asserting
the AuthenticatedServer arm carries through markSessionExpired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_safeReturnTo silently fell back to the lobby when rejecting an
unsafe returnTo; for security-relevant fallbacks observability beats
silence, so log at level 900 on rejection. UploadTracker's
_onAuthChanged walks scope.pending only — that's intentional because
each queued job shares its CancelToken with the matching _Pending
record and the drain loop checks isCancelled before running. Document
the shared-token invariant inline so the next reader doesn't worry
queued jobs leak through the auth funnel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The rooms catch only stringified non-AuthException errors into
RoomsFailed without logging; the profile catch silently nulled out
the sidebar for everything that wasn't an AuthException. Network,
5xx, decode, and programmer errors were debuggable nowhere. Log
them at level 1000 (rooms — surfaced to the user) and 900 (profile
— silent UI fallback) with stack traces. PermissionDeniedException
is skipped because the UI already renders a permission-specific
message and the profile case is intentionally silent for it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A future refactor that lumps PermissionDeniedException under
AuthException (or otherwise routes 403 through the auth funnel)
would pass all existing tests while silently breaking the 401/403
split. Mirror the existing AuthException funnel test for the
PermissionDeniedException path: session stays ActiveSession and
the lobby section surfaces a RoomsFailed carrying the 403.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three review polish items rolled together:

- auth_session: log token-refresh failures at WARNING for the
  recoverable networkError and SEVERE for unknownError; print
  reason.name instead of the enum toString.
- auth_callback: split the returnTo rejection log into open-redirect
  (level 900) and malformed-path (level 800) cases. "lobby" is
  malformed, not unsafe; conflating them muddied the security signal.
- room_screen / thread_view_state: dartdoc the silent-swallow on
  composer restore and save so the API surface doesn't hide that
  storage failures are logged and dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NetworkInspector keeps a FIFO ring buffer of 1000 events; under
sustained traffic the oldest events evict first, which means a
request can be dropped while its later-arriving response remains.
The grouper then builds a group with only a response set, and
sort()'s comparator throws StateError from timestamp's getter —
red-screening the inspector UI.

Filter such orphans out before sorting. They stay invisibly in the
buffer until normal FIFO eviction removes them too; no leak, no
cleanup logic. A token-refresh loop on a recent dev session
generated enough traffic to surface this; the conceptual bug — a
ring buffer of events that doesn't keep request/response paired —
remains and is worth a follow-up issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a server's session expires, AuthSession flips it to
ExpiredSession (refresh tokens preserved). The lobby previously
treated this as identical to logout and pruned the row from both
roomsByServer and userProfiles — the server vanished from the main
content area while the sidebar still showed it with subtitle "Not
signed in," and the user had no inline way to recover.

Keep the row visible with a new RoomsExpired marker. The main lobby
section renders a "Session expired" panel with a "Sign in" button
that routes through homeWithUrl(serverUrl, returnTo: lobby), so the
user lands back at the lobby once re-auth completes. NoSession
(true logout) preserves the existing pruning behavior.

The sidebar subtitle now differentiates "Session expired" from
"Not signed in." This required the per-tile subtitle to .watch()
the entry's session signal directly — the parent only rebuilds on
the servers-map changing, which a session-state flip does not do.
Pinned by a regression test.

Drop the cached profile on the Active→Expired transition so a re-
auth as a different identity doesn't briefly show the previous
user's name; one frame of no-name placeholder is honest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lobby session-expired affordance dispatches in two places
(_onAuthChanged and _identityLabel) that switch on SessionState's
sealed family via if/is chains. The chains are exhaustive today
but the compiler can't enforce that — adding a fourth SessionState
subtype later would silently fall through to the wrong arm. Convert
both to switch expressions on the sealed type so the analyzer flags
missing arms.

The Sign in button in the RoomsExpired panel is the user's only
inline recovery path; removing it or mis-wiring its onPressed would
re-introduce the invisible-expired-server bug. Pin it with a widget
test that finds the panel by its description text, taps the button,
and asserts the GoRouter received /?url=...&returnTo=/lobby.

Tidy the profile-clear assertion comment to inline the WHY instead
of a soft "see test rationale above" back-reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- PreAuthState constructor rejects non-relative frontendReturnTo so the
  open-redirect guard is enforced by the type rather than only at the
  callback boundary; auth_callback_screen keeps _safeReturnTo as
  defense-in-depth against tampered storage.
- Refresh-failure logs include the IdP discoveryUrl; noRefreshToken
  logs at SEVERE (frontend invariant violation), invalidGrant at
  WARNING.
- ThreadViewState logs internalError / serverError FailedState at
  SEVERE so the underlying error and stack are debuggable beyond the
  user-facing string.
- auth_callback_screen catch-all log bumped to level 1000.
- Drop unused AuthSession.refreshToken getter and the matching tests.
- Pin: auth flip to ExpiredSession cancels an in-flight history fetch
  with reason "auth expired".
- Remove server_manager_test "fires when added or removed" — exercised
  the framework's computed re-evaluation, not our behavior.
- Comment cleanups: drop temporal/hypothetical baggage, fix the
  inaccurate markSessionExpired doc and the misleading "next microtask"
  remark in upload_tracker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@91jaeminjo 91jaeminjo merged commit 2f0524c into main May 21, 2026
6 checks passed
@91jaeminjo 91jaeminjo deleted the fix/auth-failure-redirect branch May 21, 2026 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant