Skip to content

fix(auth): refresh on 401 and stop writing bogus refresh tokens#359

Open
brycelelbach wants to merge 1 commit intobrevdev:mainfrom
brycelelbach:fix/auth-token-refresh
Open

fix(auth): refresh on 401 and stop writing bogus refresh tokens#359
brycelelbach wants to merge 1 commit intobrevdev:mainfrom
brycelelbach:fix/auth-token-refresh

Conversation

@brycelelbach
Copy link
Copy Markdown
Contributor

Problem

Users report being logged out of brev-cli multiple times per hour, forcing repeated brev login --token $TOKEN. Two independent bugs in the auth flow combined to produce this.

Bug 1: LoginWithToken writes a bogus refresh token

pkg/auth/auth.go LoginWithToken (before this PR):

if valid {
    err := t.authStore.SaveAuthTokens(entity.AuthTokens{
        AccessToken:  token,
        RefreshToken: "auto-login",   // <- literal string, not a token
    })
}

When a user runs brev login --token $JWT (e.g. the snippet from https://brev.nvidia.com/settings/cli), the CLI saves the JWT as the access token and the literal string "auto-login" as the refresh token.

Later, when the short-lived access token expires, GetFreshAccessTokenOrNil sees RefreshToken != "" and calls oauth.GetNewAuthTokensWithRefresh("auto-login"). That always fails (the IdP doesn’t recognize the sentinel), so the user is bumped back to re-login. With typical NVIDIA KAS access-token TTLs, that happens roughly hourly.

Bug 2: auth-failure retry only fires on 403, not 401

pkg/store/http.go SetForbiddenStatusRetryHandler (before this PR) installed a refresh-and-retry handler that only triggered on http.StatusForbidden (403). APIs returning 401 Unauthorized — the more standard response for expired credentials — bypassed the hook entirely and surfaced as a hard failure on the first such response, even when the refresh token was still valid.

Fix

pkg/auth/auth.go

  • LoginWithToken no longer writes "auto-login" as the refresh token for JWT input. It stores an empty RefreshToken and prints a one-line notice on stderr so the user knows the session will not auto-renew.
  • GetFreshAccessTokenOrNil normalizes the legacy "auto-login" sentinel to "" on read, so existing ~/.brev/credentials.json files from older CLI versions benefit from the fix without re-running brev login --token.
  • When the access token is expired and there is no refresh token, GetFreshAccessTokenOrNil now returns "" instead of the expired JWT. Callers prompt for re-login cleanly instead of shipping a doomed request.

pkg/store/http.go

  • The refresh-and-retry handler now fires on both 401 and 403, via a small isAuthFailure(code int) bool helper. The public function name (SetForbiddenStatusRetryHandler) is left unchanged to avoid touching callers; the comment on isAuthFailure documents the broadened semantics.

What this does and does not fix

Fixes: the hourly logout loop caused by the bogus-refresh + JWT-expiry interaction, and unhandled 401s that skipped the refresh path.

Does not fix: brev login --token still produces a non-refreshable session, because the website currently hands out a raw access JWT rather than a refresh token. Step 2 of the larger plan is a server-side change to brev.nvidia.com/settings/cli so it issues a refresh token (or a short-lived exchange code) that the CLI can swap for {access, refresh}. That is out of scope for this PR.

Test plan

  • go build ./...
  • go vet ./pkg/store/... ./pkg/auth/...
  • go test ./pkg/auth/...
  • Manual: brev login --token $JWT, wait for the access token to expire, verify the CLI prompts for re-login once (not per-call) and that subsequent calls with a real {access, refresh} pair auto-refresh without re-prompting.
  • Manual: verify that an API call returning 401 now triggers the refresh handler and retries once, rather than failing immediately.

🤖 Generated with Claude Code

Users report being logged out of brev-cli multiple times per hour,
forcing repeated `brev login --token`. Two bugs in the auth flow
caused this:

1. `LoginWithToken` stored the literal string "auto-login" as the
   RefreshToken whenever it was handed a valid JWT access token
   (pkg/auth/auth.go). When that short-lived access token later
   expired, the refresh path in `GetFreshAccessTokenOrNil` tried to
   exchange "auto-login" with the IdP, which always failed, so the
   user was prompted to re-login every time the access token aged
   out — roughly hourly with typical NVIDIA KAS token lifetimes.

2. The auth-failure retry handler in pkg/store/http.go only fired on
   HTTP 403 Forbidden. APIs returning 401 Unauthorized (the more
   standard response for expired credentials) bypassed the refresh
   hook entirely and surfaced as a hard failure on the first such
   response, even when the refresh token was still good.

This change:

- Stores an empty RefreshToken when LoginWithToken receives a JWT,
  so the refresh path correctly recognizes there is nothing to
  refresh and falls through without a failed IdP round-trip. Emits a
  one-line notice on stderr so the user knows this session cannot
  be auto-renewed.
- Normalizes the legacy "auto-login" sentinel to "" on read, so
  users with existing credentials.json files from older CLI versions
  benefit from the fix without re-running `brev login --token`.
- Changes `GetFreshAccessTokenOrNil` to return "" (rather than an
  expired AccessToken) when there is no way to refresh, so callers
  prompt for re-login cleanly instead of shipping an expired JWT to
  the API.
- Extends the refresh-and-retry handler in AuthHTTPStore to fire on
  both 401 and 403.

Longer-term the website at brev.nvidia.com/settings/cli should hand
out a real refresh token (or a short-lived exchange code) rather
than a raw access JWT so `brev login --token` can produce a
refreshable session. This PR is the CLI-only stop-the-bleeding fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@brycelelbach brycelelbach requested a review from a team as a code owner April 18, 2026 07:17
Copilot AI review requested due to automatic review settings April 18, 2026 07:17
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes brev-cli auth churn by preventing invalid refresh-token storage during brev login --token and ensuring auth refresh/retry triggers on both 401 and 403 responses.

Changes:

  • Stop writing the legacy "auto-login" string as a refresh token; treat it as absent when reading existing credentials.
  • Update access-token retrieval to return "" (no token) when an access token is invalid/expired and no refresh token is available.
  • Broaden the HTTP refresh-and-retry hook to trigger on both 401 Unauthorized and 403 Forbidden.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pkg/store/http.go Expands auth-failure detection (401/403) for refresh-and-retry behavior via isAuthFailure.
pkg/auth/auth.go Removes bogus refresh-token writes, normalizes legacy sentinel on read, and avoids returning expired/invalid access tokens without refresh.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/store/http.go
Comment on lines 106 to 108
s.authHTTPClient.restyClient.OnAfterResponse(func(c *resty.Client, r *resty.Response) error {
if r.StatusCode() == http.StatusForbidden && r.Request.Attempt < attemptsThresh+1 {
if isAuthFailure(r.StatusCode()) && r.Request.Attempt < attemptsThresh+1 {
err := handler()
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change broadens the retry/refresh behavior to include 401 responses, but pkg/store/http_test.go currently only exercises the 403 path. Add a unit test covering 401 to ensure the handler runs and the request retries as expected.

Copilot uses AI. Check for mistakes.
Comment thread pkg/auth/auth.go
Comment on lines 149 to +170
@@ -154,16 +161,20 @@ func (t Auth) GetFreshAccessTokenOrNil() (string, error) {
if err != nil {
return "", breverrors.WrapAndTrace(err)
}
if !isAccessTokenValid && tokens.RefreshToken != "" {
if !isAccessTokenValid {
if tokens.RefreshToken == "" {
// Access token is expired and we have no refresh token. Returning
// the expired token here would just cause a 401 on the next API
// call; return empty so callers can prompt for re-login instead.
return "", nil
}
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetFreshAccessTokenOrNil now (1) treats the legacy refresh-token sentinel "auto-login" as absent and (2) returns an empty token when the access token is invalid/expired and no refresh token is available. Please add/adjust unit tests in pkg/auth/auth_test.go to cover these new branches so regressions don't reintroduce the logout loop.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants