fix(auth): refresh on 401 and stop writing bogus refresh tokens#359
fix(auth): refresh on 401 and stop writing bogus refresh tokens#359brycelelbach wants to merge 1 commit intobrevdev:mainfrom
Conversation
Users report being logged out of brev-cli multiple times per hour, forcing repeated `brev login --token`. Two bugs in the auth flow caused this: 1. `LoginWithToken` stored the literal string "auto-login" as the RefreshToken whenever it was handed a valid JWT access token (pkg/auth/auth.go). When that short-lived access token later expired, the refresh path in `GetFreshAccessTokenOrNil` tried to exchange "auto-login" with the IdP, which always failed, so the user was prompted to re-login every time the access token aged out — roughly hourly with typical NVIDIA KAS token lifetimes. 2. The auth-failure retry handler in pkg/store/http.go only fired on HTTP 403 Forbidden. APIs returning 401 Unauthorized (the more standard response for expired credentials) bypassed the refresh hook entirely and surfaced as a hard failure on the first such response, even when the refresh token was still good. This change: - Stores an empty RefreshToken when LoginWithToken receives a JWT, so the refresh path correctly recognizes there is nothing to refresh and falls through without a failed IdP round-trip. Emits a one-line notice on stderr so the user knows this session cannot be auto-renewed. - Normalizes the legacy "auto-login" sentinel to "" on read, so users with existing credentials.json files from older CLI versions benefit from the fix without re-running `brev login --token`. - Changes `GetFreshAccessTokenOrNil` to return "" (rather than an expired AccessToken) when there is no way to refresh, so callers prompt for re-login cleanly instead of shipping an expired JWT to the API. - Extends the refresh-and-retry handler in AuthHTTPStore to fire on both 401 and 403. Longer-term the website at brev.nvidia.com/settings/cli should hand out a real refresh token (or a short-lived exchange code) rather than a raw access JWT so `brev login --token` can produce a refreshable session. This PR is the CLI-only stop-the-bleeding fix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Fixes brev-cli auth churn by preventing invalid refresh-token storage during brev login --token and ensuring auth refresh/retry triggers on both 401 and 403 responses.
Changes:
- Stop writing the legacy
"auto-login"string as a refresh token; treat it as absent when reading existing credentials. - Update access-token retrieval to return
""(no token) when an access token is invalid/expired and no refresh token is available. - Broaden the HTTP refresh-and-retry hook to trigger on both
401 Unauthorizedand403 Forbidden.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/store/http.go | Expands auth-failure detection (401/403) for refresh-and-retry behavior via isAuthFailure. |
| pkg/auth/auth.go | Removes bogus refresh-token writes, normalizes legacy sentinel on read, and avoids returning expired/invalid access tokens without refresh. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| s.authHTTPClient.restyClient.OnAfterResponse(func(c *resty.Client, r *resty.Response) error { | ||
| if r.StatusCode() == http.StatusForbidden && r.Request.Attempt < attemptsThresh+1 { | ||
| if isAuthFailure(r.StatusCode()) && r.Request.Attempt < attemptsThresh+1 { | ||
| err := handler() |
There was a problem hiding this comment.
This change broadens the retry/refresh behavior to include 401 responses, but pkg/store/http_test.go currently only exercises the 403 path. Add a unit test covering 401 to ensure the handler runs and the request retries as expected.
| @@ -154,16 +161,20 @@ func (t Auth) GetFreshAccessTokenOrNil() (string, error) { | |||
| if err != nil { | |||
| return "", breverrors.WrapAndTrace(err) | |||
| } | |||
| if !isAccessTokenValid && tokens.RefreshToken != "" { | |||
| if !isAccessTokenValid { | |||
| if tokens.RefreshToken == "" { | |||
| // Access token is expired and we have no refresh token. Returning | |||
| // the expired token here would just cause a 401 on the next API | |||
| // call; return empty so callers can prompt for re-login instead. | |||
| return "", nil | |||
| } | |||
There was a problem hiding this comment.
GetFreshAccessTokenOrNil now (1) treats the legacy refresh-token sentinel "auto-login" as absent and (2) returns an empty token when the access token is invalid/expired and no refresh token is available. Please add/adjust unit tests in pkg/auth/auth_test.go to cover these new branches so regressions don't reintroduce the logout loop.
Problem
Users report being logged out of
brev-climultiple times per hour, forcing repeatedbrev login --token $TOKEN. Two independent bugs in the auth flow combined to produce this.Bug 1:
LoginWithTokenwrites a bogus refresh tokenpkg/auth/auth.goLoginWithToken(before this PR):When a user runs
brev login --token $JWT(e.g. the snippet from https://brev.nvidia.com/settings/cli), the CLI saves the JWT as the access token and the literal string"auto-login"as the refresh token.Later, when the short-lived access token expires,
GetFreshAccessTokenOrNilseesRefreshToken != ""and callsoauth.GetNewAuthTokensWithRefresh("auto-login"). That always fails (the IdP doesn’t recognize the sentinel), so the user is bumped back to re-login. With typical NVIDIA KAS access-token TTLs, that happens roughly hourly.Bug 2: auth-failure retry only fires on 403, not 401
pkg/store/http.goSetForbiddenStatusRetryHandler(before this PR) installed a refresh-and-retry handler that only triggered onhttp.StatusForbidden(403). APIs returning401 Unauthorized— the more standard response for expired credentials — bypassed the hook entirely and surfaced as a hard failure on the first such response, even when the refresh token was still valid.Fix
pkg/auth/auth.goLoginWithTokenno longer writes"auto-login"as the refresh token for JWT input. It stores an emptyRefreshTokenand prints a one-line notice on stderr so the user knows the session will not auto-renew.GetFreshAccessTokenOrNilnormalizes the legacy"auto-login"sentinel to""on read, so existing~/.brev/credentials.jsonfiles from older CLI versions benefit from the fix without re-runningbrev login --token.GetFreshAccessTokenOrNilnow returns""instead of the expired JWT. Callers prompt for re-login cleanly instead of shipping a doomed request.pkg/store/http.goisAuthFailure(code int) boolhelper. The public function name (SetForbiddenStatusRetryHandler) is left unchanged to avoid touching callers; the comment onisAuthFailuredocuments the broadened semantics.What this does and does not fix
Fixes: the hourly logout loop caused by the bogus-refresh + JWT-expiry interaction, and unhandled 401s that skipped the refresh path.
Does not fix:
brev login --tokenstill produces a non-refreshable session, because the website currently hands out a raw access JWT rather than a refresh token. Step 2 of the larger plan is a server-side change tobrev.nvidia.com/settings/cliso it issues a refresh token (or a short-lived exchange code) that the CLI can swap for{access, refresh}. That is out of scope for this PR.Test plan
go build ./...go vet ./pkg/store/... ./pkg/auth/...go test ./pkg/auth/...brev login --token $JWT, wait for the access token to expire, verify the CLI prompts for re-login once (not per-call) and that subsequent calls with a real{access, refresh}pair auto-refresh without re-prompting.🤖 Generated with Claude Code