Use auth portal for better gcp and aws oauth UX#1992
Conversation
AWS: Use the auth portal url shortener, always use cross device workflow, but still allow press enter to open browser GCP: Use new server side oauth workflow, generate local libsodium asymmetric keys, where public key is used as the session state as well as encryption key by the server side, so no unencrypted credentials are stored on the auth server and only the local cli with the matching private key is able to decrypt and use the credentials
📝 WalkthroughWalkthroughRefactors auth flows: moves response parsing up from client to auth handlers, adds GetAuthorizeUrl, changes client.Poll to return raw bytes, replaces GCP PKCE flow with keypair-based encrypted-token flow, updates AWS to use new auth helpers and browser input, adds browser stdin utilities, and updates Go/Nix dependency pins. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant CLI
participant Browser
participant AuthServer
participant Keypair as Keypair (box)
User->>CLI: Start GCP interactive login
CLI->>Keypair: Generate keypair (pub, priv)
CLI->>AuthServer: Construct auth URL with base64(pub) as state
CLI->>Browser: Open browser at auth URL
User->>Browser: Authorize (server stores encrypted token)
CLI->>AuthServer: Poll endpoint with state
AuthServer->>CLI: Return base64(ciphertext)
CLI->>Keypair: Decode base64, decrypt with priv (box.OpenAnonymous)
CLI->>CLI: Unmarshal JSON token and build TokenSource
sequenceDiagram
actor User
participant CLI
participant BrowserLib as Browser Input
participant AuthServer
participant TokenService
User->>CLI: Start AWS cross-device login
CLI->>AuthServer: Get auth URL (includes state)
CLI->>BrowserLib: Open browser and await Enter/input
User->>Browser: Authorize
User->>CLI: Press Enter (with optional input)
BrowserLib->>CLI: Return buffered input
CLI->>AuthServer: Poll/receive code
CLI->>TokenService: Exchange code for credentials
CLI->>CLI: Persist credentials
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)level=warning msg="[linters_context] running gomodguard failed: unable to read module file go.mod: current working directory must have a go.mod file: if you are not using go modules it is suggested to disable this linter" Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
src/pkg/auth/auth.go (1)
135-137: Error message may be unclear whenerror_descriptionis empty.If the server returns an
errorfield but noerror_description, the error message will be "authentication failed: " with an empty description. Consider including the error code as a fallback:Suggested improvement
if errorMsg := query.Get("error"); errorMsg != "" { - return "", fmt.Errorf("authentication failed: %s", query.Get("error_description")) + desc := query.Get("error_description") + if desc == "" { + desc = errorMsg + } + return "", fmt.Errorf("authentication failed: %s", desc) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pkg/auth/auth.go` around lines 135 - 137, The error handling when query.Get("error") is present uses query.Get("error_description") directly and can produce an empty message; update the code that returns fmt.Errorf in that branch (the check for query.Get("error")) to use the error_description if present, otherwise fall back to the error code (query.Get("error")) so the returned error string always contains useful information; locate the branch using query.Get("error") and replace the formatted message to include the fallback logic.src/pkg/clouds/gcp/login_test.go (1)
416-441: Assert the decrypted token, not just that a source exists.A non-nil
TokenSourcestill passes if the decrypted JSON gets dropped or partially populated. Resolvets.Token()and compare it withwantTokenso this actually validates the end-to-end success path.Suggested assertion upgrade
gcp := &Gcp{} ts, err := gcp.InteractiveLogin(t.Context()) if err != nil { t.Fatalf("InteractiveLogin() error = %v", err) } if ts == nil { - t.Error("expected non-nil token source") + t.Fatal("expected non-nil token source") + } + gotToken, err := ts.Token() + if err != nil { + t.Fatalf("TokenSource.Token() error = %v", err) + } + if gotToken.AccessToken != wantToken.AccessToken || + gotToken.RefreshToken != wantToken.RefreshToken || + !gotToken.Expiry.Equal(wantToken.Expiry) { + t.Fatalf("token = %#v, want %#v", gotToken, wantToken) } }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pkg/clouds/gcp/login_test.go` around lines 416 - 441, The test currently only verifies a non-nil TokenSource from Gcp.InteractiveLogin but doesn't assert the actual token contents; call Token() on the returned ts (the TokenSource) and compare the resulting oauth2.Token fields against wantToken (or deep-equal the token) to ensure the decrypted/parsed token matches expected values, updating the test after the existing ts == nil check to resolve ts.Token(), check for error, and assert AccessToken/RefreshToken/Expiry (or use reflect.DeepEqual) against wantToken.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/pkg/auth/auth_test.go`:
- Around line 78-99: The test "5xx retries until context cancelled" doesn't
verify that Poll actually retries on 5xx; modify the test so it requires more
than one server call (e.g., assert calls > 1) or arrange cancellation after the
second request; specifically update the t.Run block that sets OpenAuthClient =
NewClient("test", server.URL) and calls Poll("state") so that the final
assertion checks calls > 1 (or cancel the ctx after the second request) to prove
Poll retries on HTTP 500 responses.
In `@src/pkg/clouds/gcp/login_test.go`:
- Around line 329-336: The fake poll server handler currently decodes the
"state" into pubKeyBytes and blindly copies into pubKey ([32]byte), allowing
truncation; change the handler in login_test.go to check that len(pubKeyBytes)
== 32 and return http.Error(w, "bad state: wrong length", http.StatusBadRequest)
(or include the actual length) if not, only then copy into pubKey; reference the
variables pubKeyB64, pubKeyBytes and the pubKey [32]byte and the copy(pubKey[:],
pubKeyBytes) call to locate where to add the exact-length validation.
In `@src/pkg/term/browser.go`:
- Around line 20-26: The double-close occurs because both the cleanup path
explicitly calls input.Close() then cancel(), and the context-cancellation
goroutine in the functions OpenBrowserWithInput and OpenBrowserWithInputOnEnter
also calls input.Close() on ctx.Done(); pick one consistent approach: either
remove the goroutine that closes input and keep calling input.Close() in
cleanup, or change the cleanup code to only call cancel() and let the existing
goroutine perform the Close() — update the cleanup in the code paths that call
input.Close() then cancel() to instead only call cancel() (or delete the
ctx.Done() goroutine), ensuring no redundant Close() calls and no ignored errors
from the second Close().
---
Nitpick comments:
In `@src/pkg/auth/auth.go`:
- Around line 135-137: The error handling when query.Get("error") is present
uses query.Get("error_description") directly and can produce an empty message;
update the code that returns fmt.Errorf in that branch (the check for
query.Get("error")) to use the error_description if present, otherwise fall back
to the error code (query.Get("error")) so the returned error string always
contains useful information; locate the branch using query.Get("error") and
replace the formatted message to include the fallback logic.
In `@src/pkg/clouds/gcp/login_test.go`:
- Around line 416-441: The test currently only verifies a non-nil TokenSource
from Gcp.InteractiveLogin but doesn't assert the actual token contents; call
Token() on the returned ts (the TokenSource) and compare the resulting
oauth2.Token fields against wantToken (or deep-equal the token) to ensure the
decrypted/parsed token matches expected values, updating the test after the
existing ts == nil check to resolve ts.Token(), check for error, and assert
AccessToken/RefreshToken/Expiry (or use reflect.DeepEqual) against wantToken.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 940339be-dd62-492b-9059-ba3bcd1b47d1
📒 Files selected for processing (10)
pkgs/defang/cli.nixsrc/go.modsrc/pkg/auth/auth.gosrc/pkg/auth/auth_test.gosrc/pkg/auth/client.gosrc/pkg/clouds/aws/login.gosrc/pkg/clouds/aws/login_test.gosrc/pkg/clouds/gcp/login.gosrc/pkg/clouds/gcp/login_test.gosrc/pkg/term/browser.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/pkg/auth/auth_test.go`:
- Around line 78-83: The test "5xx retries until context cancelled" is using
http.StatusRequestTimeout (408) instead of a 5xx code, so change the server
response in that test (the httptest server handler used in the t.Run block that
increments calls and returns an error) to return a 5xx status such as
http.StatusInternalServerError so Poll() hits the StatusCode >= 500 retry
branch; keep the rest of the test (calls counting, context cancellation)
unchanged so it properly validates the 5xx retry behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 983a99b6-bfd3-4965-8832-d0c2f6692d5c
📒 Files selected for processing (3)
src/pkg/auth/auth_test.gosrc/pkg/clouds/gcp/login_test.gosrc/pkg/term/browser.go
✅ Files skipped from review due to trivial changes (1)
- src/pkg/clouds/gcp/login_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- src/pkg/term/browser.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/pkg/auth/auth_test.go (1)
78-124: Retry subtests are time-coupled; prefer deterministic cancellation triggers.These assertions currently depend on wall-clock timing (
3s/100ms), which can make CI behavior noisy. Consider cancelling from the handler afterNcalls to prove retries without timing sensitivity.♻️ Deterministic retry-test pattern
t.Run("5xx retries until context cancelled", func(t *testing.T) { - calls := 0 + calls := 0 + ctx, cancel := context.WithCancel(context.Background()) + defer cancel() server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { calls++ + if calls >= 2 { + cancel() + } http.Error(w, "internal error", http.StatusInternalServerError) })) t.Cleanup(server.Close) orig := OpenAuthClient OpenAuthClient = NewClient("test", server.URL) t.Cleanup(func() { OpenAuthClient = orig }) - ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second) // Retry client retires per second - defer cancel() - _, err := Poll(ctx, "state") if err == nil { t.Error("expected error after context cancellation") } if calls < 2 { t.Error("expected server to be called at least twice") } })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pkg/auth/auth_test.go` around lines 78 - 124, The tests rely on timeouts; instead make cancellation deterministic by creating the context and its cancel function before calling Poll and invoking cancel() from the httptest handler after N requests: e.g., in both subtests create ctx, cancel := context.WithTimeout(context.Background(), time.Minute) (or context.Background()), pass ctx to Poll, and inside the server handler (the http.HandlerFunc used by httptest.NewServer) increment calls and when calls == 2 call cancel() so Poll returns due to cancelled context; keep using OpenAuthClient = NewClient("test", server.URL) and t.Cleanup to restore OpenAuthClient and server.Close. Ensure the handler still writes the intended status (500 or 408) before calling cancel so the retry logic runs deterministically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/pkg/auth/auth_test.go`:
- Around line 78-124: The tests rely on timeouts; instead make cancellation
deterministic by creating the context and its cancel function before calling
Poll and invoking cancel() from the httptest handler after N requests: e.g., in
both subtests create ctx, cancel := context.WithTimeout(context.Background(),
time.Minute) (or context.Background()), pass ctx to Poll, and inside the server
handler (the http.HandlerFunc used by httptest.NewServer) increment calls and
when calls == 2 call cancel() so Poll returns due to cancelled context; keep
using OpenAuthClient = NewClient("test", server.URL) and t.Cleanup to restore
OpenAuthClient and server.Close. Ensure the handler still writes the intended
status (500 or 408) before calling cancel so the retry logic runs
deterministically.
Description
Make use auth portal for better oauth UX:
Linked Issues
#1991
Checklist
Summary by CodeRabbit
New Features
Refactor
Tests
Chores