Use auth portal for better gcp and aws oauth UX #1992

Merged
edwardrf merged 6 commits into main from edw/auth-portal-oauth
Mar 19, 2026

@edwardrf edwardrf commented Mar 17, 2026

Description

Use the auth portal to improve the OAuth UX:

  1. GCP: use server-side OAuth with a polling model, so that running defang with the OAuth workflow inside a container still works
  2. AWS: use the auth server as an OAuth URL shortener

Linked Issues

#1991

Checklist

  • I have performed a self-review of my code
  • I have added appropriate tests
  • I have updated the Defang CLI docs and/or README to reflect my changes, if necessary

Summary by CodeRabbit

  • New Features

    • Improved interactive login flows for AWS and GCP, including a keypair-based GCP flow and streamlined cross-device AWS login
    • Browser prompt now accepts typed input before opening and better handles cancellation
  • Refactor

    • Centralized authorization URL construction and simplified polling for authentication responses
  • Tests

    • Added comprehensive tests covering auth URL construction, polling/retry behavior, and cloud login scenarios
  • Chores

    • Updated build/dependency pins and module dependency declarations

AWS: Use the auth portal URL shortener and always use the cross-device
workflow, but still allow pressing Enter to open the browser.
GCP: Use the new server-side OAuth workflow. The CLI generates a local
libsodium asymmetric keypair, and the server uses the public key both as
the session state and as the encryption key, so no unencrypted
credentials are stored on the auth server and only the local CLI holding
the matching private key can decrypt and use the credentials.
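
The "public key as session state" step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `newState` and `parseState` are hypothetical names, the unpadded URL-safe base64 encoding is an assumption, and `crypto/ecdh` X25519 keys are used as a standard-library stand-in for the libsodium keypair. The sealed-box decryption itself is omitted because it needs `golang.org/x/crypto/nacl/box`.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newState generates a fresh X25519 keypair and returns the private key
// together with the base64-encoded public key, which serves as the OAuth state.
// The encoding (unpadded, URL-safe) is an assumption made for this sketch.
func newState() (*ecdh.PrivateKey, string, error) {
	priv, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, "", fmt.Errorf("generating keypair: %w", err)
	}
	state := base64.RawURLEncoding.EncodeToString(priv.PublicKey().Bytes())
	return priv, state, nil
}

// parseState is what a server (or a fake server in a test) might do with the
// state: decode it and insist on exactly 32 bytes before treating it as a key.
func parseState(state string) ([32]byte, error) {
	var pub [32]byte
	raw, err := base64.RawURLEncoding.DecodeString(state)
	if err != nil {
		return pub, fmt.Errorf("bad state: %w", err)
	}
	if len(raw) != len(pub) {
		return pub, fmt.Errorf("bad state: got %d bytes, want %d", len(raw), len(pub))
	}
	copy(pub[:], raw)
	return pub, nil
}

func main() {
	priv, state, err := newState()
	if err != nil {
		panic(err)
	}
	pub, err := parseState(state)
	if err != nil {
		panic(err)
	}
	fmt.Println("state:", state)
	fmt.Println("round trip ok:", string(pub[:]) == string(priv.PublicKey().Bytes()))
}
```

Because the state is the public key, the server needs no separate session secret: whatever it encrypts to that key can only be opened by the CLI process that still holds the private half in memory.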

coderabbitai Bot commented Mar 17, 2026

Walkthrough

Refactors auth flows: moves response parsing up from client to auth handlers, adds GetAuthorizeUrl, changes client.Poll to return raw bytes, replaces GCP PKCE flow with keypair-based encrypted-token flow, updates AWS to use new auth helpers and browser input, adds browser stdin utilities, and updates Go/Nix dependency pins.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build & Modules: pkgs/defang/cli.nix, src/go.mod | Updated Nix vendorHash for buildGo124Module; promoted golang.org/x/crypto from indirect to direct in go.mod. |
| Auth core & client: src/pkg/auth/auth.go, src/pkg/auth/client.go | Added GetAuthorizeUrl(...); introduced exported Poll(ctx, key) ([]byte, error) returning raw response bytes; moved form parsing into pollForAuthCode; changed client.Poll signature to return []byte. |
| Auth tests: src/pkg/auth/auth_test.go | Added tests for GetAuthorizeUrl, Poll retry behavior, and pollForAuthCode parsing/decoding/error cases. |
| AWS provider: src/pkg/clouds/aws/login.go, src/pkg/clouds/aws/login_test.go | Switched AWS interactive/cross-device flows to use GetAuthorizeUrl and term browser input helper; removed REMOTE_CONTAINERS heuristic; added STS-mocking tests and a stored-credential test. |
| GCP provider: src/pkg/clouds/gcp/login.go, src/pkg/clouds/gcp/login_test.go | Replaced PKCE code-exchange with keypair-based flow (generate keypair, use base64(pub) as state, poll for base64 ciphertext, decrypt to JSON token); added httptest-based tests for decode/decrypt/JSON error paths and success. |
| Terminal/browser utilities: src/pkg/term/browser.go | Silenced upstream browser output; OpenBrowserOnEnter now returns cancel directly; added OpenBrowserWithInputOnEnter(ctx, url) that returns derived ctx, input channel, and cancel; improved stdin cancellation handling. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI
    participant Browser
    participant AuthServer
    participant Keypair as Keypair (box)

    User->>CLI: Start GCP interactive login
    CLI->>Keypair: Generate keypair (pub, priv)
    CLI->>AuthServer: Construct auth URL with base64(pub) as state
    CLI->>Browser: Open browser at auth URL
    User->>Browser: Authorize (server stores encrypted token)
    CLI->>AuthServer: Poll endpoint with state
    AuthServer->>CLI: Return base64(ciphertext)
    CLI->>Keypair: Decode base64, decrypt with priv (box.OpenAnonymous)
    CLI->>CLI: Unmarshal JSON token and build TokenSource
sequenceDiagram
    actor User
    participant CLI
    participant BrowserLib as Browser Input
    participant AuthServer
    participant TokenService

    User->>CLI: Start AWS cross-device login
    CLI->>AuthServer: Get auth URL (includes state)
    CLI->>BrowserLib: Open browser and await Enter/input
    User->>Browser: Authorize
    User->>CLI: Press Enter (with optional input)
    BrowserLib->>CLI: Return buffered input
    CLI->>AuthServer: Poll/receive code
    CLI->>TokenService: Exchange code for credentials
    CLI->>CLI: Persist credentials

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • jordanstephens
  • raphaeltm

Poem

🐇
I hop through keys and open doors,
I stitch the states and poll the shores,
A browser waits, an Enter hums,
Encrypted carrots, token drums—
The rabbit sings: the login's done.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 5.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly captures the main objective: integrating an auth portal to improve OAuth UX for GCP and AWS, which is reflected across the changed files (auth.go, login.go files, term/browser.go). |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

level=warning msg="[linters_context] running gomodguard failed: unable to read module file go.mod: current working directory must have a go.mod file: if you are not using go modules it is suggested to disable this linter"
level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


@edwardrf edwardrf changed the title from "Edw/auth portal oauth" to "Use auth portal for better gcp and aws oauth UX" Mar 17, 2026
@edwardrf edwardrf marked this pull request as ready for review March 19, 2026 17:35

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
src/pkg/auth/auth.go (1)

135-137: Error message may be unclear when error_description is empty.

If the server returns an error field but no error_description, the error message will be "authentication failed: " with an empty description. Consider including the error code as a fallback:

Suggested improvement
 if errorMsg := query.Get("error"); errorMsg != "" {
-    return "", fmt.Errorf("authentication failed: %s", query.Get("error_description"))
+    desc := query.Get("error_description")
+    if desc == "" {
+        desc = errorMsg
+    }
+    return "", fmt.Errorf("authentication failed: %s", desc)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pkg/auth/auth.go` around lines 135 - 137, The error handling when
query.Get("error") is present uses query.Get("error_description") directly and
can produce an empty message; update the code that returns fmt.Errorf in that
branch (the check for query.Get("error")) to use the error_description if
present, otherwise fall back to the error code (query.Get("error")) so the
returned error string always contains useful information; locate the branch
using query.Get("error") and replace the formatted message to include the
fallback logic.
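
As a runnable illustration of the fallback suggested above, the sketch below parses a form-encoded auth response and falls back to the error code when no description is present. `authCode` is a hypothetical helper written for this example (the real logic lives in `src/pkg/auth/auth.go`), and the `code` field name is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// authCode is a hypothetical stand-in for the parsing step: it extracts the
// auth code from a form-encoded response body, or builds an error that always
// carries some useful text.
func authCode(raw string) (string, error) {
	query, err := url.ParseQuery(raw)
	if err != nil {
		return "", fmt.Errorf("parsing auth response: %w", err)
	}
	if errorCode := query.Get("error"); errorCode != "" {
		desc := query.Get("error_description")
		if desc == "" {
			desc = errorCode // fall back to the error code itself
		}
		return "", fmt.Errorf("authentication failed: %s", desc)
	}
	return query.Get("code"), nil
}

func main() {
	for _, raw := range []string{
		"error=access_denied",
		"error=access_denied&error_description=user+cancelled",
		"code=abc123",
	} {
		code, err := authCode(raw)
		fmt.Printf("%q -> code=%q err=%v\n", raw, code, err)
	}
}
```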
src/pkg/clouds/gcp/login_test.go (1)

416-441: Assert the decrypted token, not just that a source exists.

A non-nil TokenSource still passes if the decrypted JSON gets dropped or partially populated. Resolve ts.Token() and compare it with wantToken so this actually validates the end-to-end success path.

Suggested assertion upgrade
 		gcp := &Gcp{}
 		ts, err := gcp.InteractiveLogin(t.Context())
 		if err != nil {
 			t.Fatalf("InteractiveLogin() error = %v", err)
 		}
 		if ts == nil {
-			t.Error("expected non-nil token source")
+			t.Fatal("expected non-nil token source")
+		}
+		gotToken, err := ts.Token()
+		if err != nil {
+			t.Fatalf("TokenSource.Token() error = %v", err)
+		}
+		if gotToken.AccessToken != wantToken.AccessToken ||
+			gotToken.RefreshToken != wantToken.RefreshToken ||
+			!gotToken.Expiry.Equal(wantToken.Expiry) {
+			t.Fatalf("token = %#v, want %#v", gotToken, wantToken)
 		}
 	})
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pkg/clouds/gcp/login_test.go` around lines 416 - 441, The test currently
only verifies a non-nil TokenSource from Gcp.InteractiveLogin but doesn't assert
the actual token contents; call Token() on the returned ts (the TokenSource) and
compare the resulting oauth2.Token fields against wantToken (or deep-equal the
token) to ensure the decrypted/parsed token matches expected values, updating
the test after the existing ts == nil check to resolve ts.Token(), check for
error, and assert AccessToken/RefreshToken/Expiry (or use reflect.DeepEqual)
against wantToken.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/pkg/auth/auth_test.go`:
- Around line 78-99: The test "5xx retries until context cancelled" doesn't
verify that Poll actually retries on 5xx; modify the test so it requires more
than one server call (e.g., assert calls > 1) or arrange cancellation after the
second request; specifically update the t.Run block that sets OpenAuthClient =
NewClient("test", server.URL) and calls Poll("state") so that the final
assertion checks calls > 1 (or cancel the ctx after the second request) to prove
Poll retries on HTTP 500 responses.

In `@src/pkg/clouds/gcp/login_test.go`:
- Around line 329-336: The fake poll server handler currently decodes the
"state" into pubKeyBytes and blindly copies into pubKey ([32]byte), allowing
truncation; change the handler in login_test.go to check that len(pubKeyBytes)
== 32 and return http.Error(w, "bad state: wrong length", http.StatusBadRequest)
(or include the actual length) if not, only then copy into pubKey; reference the
variables pubKeyB64, pubKeyBytes and the pubKey [32]byte and the copy(pubKey[:],
pubKeyBytes) call to locate where to add the exact-length validation.

In `@src/pkg/term/browser.go`:
- Around line 20-26: The double-close occurs because both the cleanup path
explicitly calls input.Close() then cancel(), and the context-cancellation
goroutine in the functions OpenBrowserWithInput and OpenBrowserWithInputOnEnter
also calls input.Close() on ctx.Done(); pick one consistent approach: either
remove the goroutine that closes input and keep calling input.Close() in
cleanup, or change the cleanup code to only call cancel() and let the existing
goroutine perform the Close() — update the cleanup in the code paths that call
input.Close() then cancel() to instead only call cancel() (or delete the
ctx.Done() goroutine), ensuring no redundant Close() calls and no ignored errors
from the second Close().
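
One way to apply the "only call cancel()" option from the browser.go comment above is to make the context the single owner of the resource. The sketch below is an assumption-laden illustration rather than the code in `src/pkg/term/browser.go`: `closeOnCancel` and `countingCloser` are hypothetical names, and a plain `io.Closer` stands in for the stdin-backed input.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"sync/atomic"
)

// countingCloser is a test double that counts how often Close is called.
type countingCloser struct{ closes atomic.Int32 }

func (c *countingCloser) Close() error {
	c.closes.Add(1)
	return nil
}

// closeOnCancel ties c to a derived context: the goroutine below is the only
// place that calls Close. The returned cleanup cancels the context and waits
// for that Close to finish, so callers never close c themselves.
func closeOnCancel(parent context.Context, c io.Closer) (cleanup func()) {
	ctx, cancel := context.WithCancel(parent)
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-ctx.Done()
		_ = c.Close() // single Close call; a real caller may want to log the error
	}()
	return func() {
		cancel() // idempotent
		<-done   // returns immediately on repeated calls
	}
}

// closesAfterRepeatedCleanup runs cleanup twice and reports the Close count.
func closesAfterRepeatedCleanup() int {
	c := &countingCloser{}
	cleanup := closeOnCancel(context.Background(), c)
	cleanup()
	cleanup()
	return int(c.closes.Load())
}

func main() {
	fmt.Println("Close calls:", closesAfterRepeatedCleanup())
}
```

Waiting on `done` inside cleanup is a design choice of this sketch: it guarantees the resource is closed by the time cleanup returns, at the cost of blocking if `Close` itself blocks.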


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 940339be-dd62-492b-9059-ba3bcd1b47d1

📥 Commits

Reviewing files that changed from the base of the PR and between c4b63e9 and 9e3e84a.

📒 Files selected for processing (10)
  • pkgs/defang/cli.nix
  • src/go.mod
  • src/pkg/auth/auth.go
  • src/pkg/auth/auth_test.go
  • src/pkg/auth/client.go
  • src/pkg/clouds/aws/login.go
  • src/pkg/clouds/aws/login_test.go
  • src/pkg/clouds/gcp/login.go
  • src/pkg/clouds/gcp/login_test.go
  • src/pkg/term/browser.go


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/pkg/auth/auth_test.go`:
- Around line 78-83: The test "5xx retries until context cancelled" is using
http.StatusRequestTimeout (408) instead of a 5xx code, so change the server
response in that test (the httptest server handler used in the t.Run block that
increments calls and returns an error) to return a 5xx status such as
http.StatusInternalServerError so Poll() hits the StatusCode >= 500 retry
branch; keep the rest of the test (calls counting, context cancellation)
unchanged so it properly validates the 5xx retry behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 983a99b6-bfd3-4965-8832-d0c2f6692d5c

📥 Commits

Reviewing files that changed from the base of the PR and between 9e3e84a and 6b70863.

📒 Files selected for processing (3)
  • src/pkg/auth/auth_test.go
  • src/pkg/clouds/gcp/login_test.go
  • src/pkg/term/browser.go
✅ Files skipped from review due to trivial changes (1)
  • src/pkg/clouds/gcp/login_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/pkg/term/browser.go


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/pkg/auth/auth_test.go (1)

78-124: Retry subtests are time-coupled; prefer deterministic cancellation triggers.

These assertions currently depend on wall-clock timing (3s / 100ms), which can make CI behavior noisy. Consider cancelling from the handler after N calls to prove retries without timing sensitivity.

♻️ Deterministic retry-test pattern
 t.Run("5xx retries until context cancelled", func(t *testing.T) {
-	calls := 0
+	calls := 0
+	ctx, cancel := context.WithCancel(context.Background())
+	defer cancel()
 	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 		calls++
+		if calls >= 2 {
+			cancel()
+		}
 		http.Error(w, "internal error", http.StatusInternalServerError)
 	}))
 	t.Cleanup(server.Close)

 	orig := OpenAuthClient
 	OpenAuthClient = NewClient("test", server.URL)
 	t.Cleanup(func() { OpenAuthClient = orig })

-	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second) // Retry client retires per second
-	defer cancel()
-
 	_, err := Poll(ctx, "state")
 	if err == nil {
 		t.Error("expected error after context cancellation")
 	}
 	if calls < 2 {
 		t.Error("expected server to be called at least twice")
 	}
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pkg/auth/auth_test.go` around lines 78 - 124, The tests rely on timeouts;
instead make cancellation deterministic by creating the context and its cancel
function before calling Poll and invoking cancel() from the httptest handler
after N requests: e.g., in both subtests create ctx, cancel :=
context.WithTimeout(context.Background(), time.Minute) (or
context.Background()), pass ctx to Poll, and inside the server handler (the
http.HandlerFunc used by httptest.NewServer) increment calls and when calls == 2
call cancel() so Poll returns due to cancelled context; keep using
OpenAuthClient = NewClient("test", server.URL) and t.Cleanup to restore
OpenAuthClient and server.Close. Ensure the handler still writes the intended
status (500 or 408) before calling cancel so the retry logic runs
deterministically.
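
The deterministic pattern described above can be tried in isolation with the sketch below. It does not use the CLI's `Poll` or `OpenAuthClient`: `pollUntilOK` is a hypothetical retry loop written for this example, with an assumed 10 ms delay between attempts, and the call counter is atomic because the handler runs on a different goroutine from the caller.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// pollUntilOK is a hypothetical stand-in for a polling client: it retries on
// 5xx responses until it sees a 200 or the context is cancelled.
func pollUntilOK(ctx context.Context, url string) error {
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return err // includes cancellation while the request is in flight
		}
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			return nil
		}
		if resp.StatusCode < 500 {
			return fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(10 * time.Millisecond): // assumed retry delay
		}
	}
}

// runRetryScenario serves HTTP 500 on every request and cancels the context
// from the handler on the second call, so the outcome does not depend on
// wall-clock timeouts. It returns the number of calls and the poll error.
func runRetryScenario() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var calls atomic.Int32
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		n := calls.Add(1)
		http.Error(w, "internal error", http.StatusInternalServerError)
		if n >= 2 {
			cancel() // a second request proves that a retry happened
		}
	}))
	defer server.Close()

	err := pollUntilOK(ctx, server.URL)
	return int(calls.Load()), err
}

func main() {
	calls, err := runRetryScenario()
	fmt.Printf("calls=%d err=%v\n", calls, err)
}
```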

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8c9ae364-c9a4-4269-ab4f-502509cfd495

📥 Commits

Reviewing files that changed from the base of the PR and between 6b70863 and bb9f0d5.

📒 Files selected for processing (1)
  • src/pkg/auth/auth_test.go

@edwardrf edwardrf merged commit 22cc444 into main Mar 19, 2026
14 checks passed
@edwardrf edwardrf deleted the edw/auth-portal-oauth branch March 19, 2026 18:56
@lionello lionello linked an issue Mar 20, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Impove oauth user experience

3 participants