Skip to content

fix: use per-message credentials in shared sessions#989

Merged
Gkrumbach07 merged 21 commits intomainfrom
fix/shared-session-credentials
Mar 24, 2026
Merged

fix: use per-message credentials in shared sessions#989
Gkrumbach07 merged 21 commits intomainfrom
fix/shared-session-credentials

Conversation

@Gkrumbach07
Copy link
Copy Markdown
Contributor

Summary

  • Security fix: Shared sessions previously used the session owner's credentials for all runs, allowing User B to impersonate User A's GitHub/Jira/Google/GitLab credentials. Each message now uses the sender's own credentials.
  • Backward compatible: API keys and scheduled sessions continue using creator's credentials via fallback pattern (effectiveUserID = currentUserID || ownerUserID)
  • Credential isolation: Credentials are cleared from environment after each turn to prevent cross-user leakage

Changes

Backend (runtime_credentials.go)

  • Extract getEffectiveUserID() and checkCredentialRBAC() helpers used by all 4 credential handlers (GitHub, Google, Jira, GitLab)
  • Resolve effective user from X-Runner-Current-User header, falling back to session owner
  • Add audit logging: CREDENTIAL_ACCESS log line on every credential fetch

Proxy (agui_proxy.go)

  • Forward X-Current-User-ID / X-Current-User-Name headers from backend to runner
  • Merged into existing proxyRunnerStream / connectToRunner functions (no duplication)

Runner (Python)

  • context.py: Added current_user_id, current_user_name fields + set_current_user() method
  • auth.py: Send X-Runner-Current-User header when fetching credentials; added clear_runtime_credentials() and sanitize_user_context()
  • endpoints/run.py: Extract user headers from request, set on context before credential fetch
  • bridge.py: Clear credentials after each turn (opt-out via KEEP_CREDENTIALS_PERSISTENT=true)

Tests

  • 17 Go tests (Ginkgo): shared sessions, API keys, scheduled sessions, RBAC violations
  • 13 Python tests (pytest): context management, credential clearing, header sending, lifecycle

Docs

  • features/session-sharing.md: User guide for credential behavior
  • guides/migrating-shared-sessions.md: Migration guide with rollback flag

Test plan

  • Verify shared session: User A creates session, User B sends message → B's credentials are used
  • Verify owner access: Session owner's messages still use owner's credentials
  • Verify API key sessions: API key sessions use creator's credentials (no header sent)
  • Verify scheduled sessions: Scheduled sessions use creator's credentials (fallback)
  • Verify credential clearing: After turn completes, env vars are wiped
  • Run go test ./handlers — all pass
  • Run uv run pytest tests/test_shared_session_credentials.py — all 13 pass

🤖 Generated with Claude Code

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 23, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Adds per-message credential scoping for shared sessions: session active-user tracking, centralized effective-user resolution and RBAC for credential endpoints, propagation of current-user/caller-token through websocket proxy into runners and bridges, per-run credential population/cleanup, ServiceAccount annotation backfill, tests, and docs.

Changes

Cohort / File(s) Summary
Backend credential helpers & handlers
components/backend/handlers/runtime_credentials.go, components/backend/handlers/runtime_credentials_test.go
Introduced session active-user map and exported SetSessionActiveUser/ClearSessionActiveUser. Added getEffectiveUserID, resolveAuthenticatedUser, enforceCredentialRBAC. Refactored session credential endpoints to use centralized RBAC, normalized denial responses, and added CREDENTIAL_ACCESS logs. Tests extended for shared-session resolution and helpers.
WebSocket proxy & runner endpoint
components/backend/websocket/agui_proxy.go, components/runners/ambient-runner/ambient_runner/endpoints/run.py
Proxy records per-session active user, reads sender/current user and caller token, logs credential-scoping, and forwards userID, userName, callerToken into runner proxy and bridge.run. Endpoint reads/sanitizes x-current-user-* and x-caller-token and forwards them.
Runner context & auth helpers
components/runners/ambient-runner/ambient_runner/platform/context.py, components/runners/ambient-runner/ambient_runner/platform/auth.py
Added RunnerContext fields current_user_id, current_user_name, caller_token and set_current_user(). _fetch_credential() now prefers caller_token, includes X-Runner-Current-User when present, retries with BOT_TOKEN on 401/403, concurrently fetches provider creds, and added clear_runtime_credentials() for cleanup.
Bridges: per-message user propagation & cleanup
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py, components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py, components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
ClaudeBridge.run() signature expanded to accept current_user_id, current_user_name, caller_token; added _initialize_run() and _rebuild_mcp_servers() to refresh creds and rebuild MCP config on user changes; ensures credential clearing in finally (respects KEEP_CREDENTIALS_PERSISTENT). Gemini and LangGraph run signatures accept **kwargs for compatibility.
Runner tests: shared-session credentials
components/runners/ambient-runner/tests/test_shared_session_credentials.py
New pytest suite validating RunnerContext set/clear, sanitize_user_context, _fetch_credential header propagation and fallback, and end-to-end populate/clear runtime-credentials lifecycle with ephemeral test servers.
Operator: ServiceAccount annotations
components/operator/internal/handlers/sessions.go
regenerateRunnerToken() now ensures ambient-code.io/created-by-user-id annotation on per-session ServiceAccounts, updating existing SAs when annotation differs and returning errors on fetch/update failures during backfill.
Docs & API examples
docs/astro.config.mjs, docs/src/content/docs/features/session-sharing.md, docs/src/content/docs/guides/migrating-shared-sessions.md, components/runners/ambient-runner/ambient_runner/bridges/claude/backend_tools.py
Added session-sharing and migration documentation and sidebar entries; clarified relative vs absolute URL usage in API examples and updated examples accordingly.
Websocket cleanup usage
components/backend/websocket/agui_proxy.go
cleanupStaleSessions now calls ClearSessionActiveUser to remove active-user mappings for pruned sessions.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Backend
    participant Proxy
    participant Runner
    participant Bridge
    participant Context

    Client->>Backend: GET /credentials/<type> (optional X-Runner-Current-User)
    Backend->>Backend: getEffectiveUserID(header, session.owner)
    Backend->>Backend: resolveAuthenticatedUser(middleware/token)
    Backend->>Backend: enforceCredentialRBAC(owner,effective,auth)
    alt RBAC fails
        Backend-->>Client: 403 {"error":"Access denied"} or 404
    else RBAC passes
        Backend-->>Client: 200 + credential (emit CREDENTIAL_ACCESS)
    end

    Client->>Proxy: WebSocket connect / run (session)
    Proxy->>Proxy: Read sender/current user from Gin context
    Proxy->>Proxy: SetSessionActiveUser(session, senderID)
    Proxy->>Runner: proxyRunnerStream(..., userID, userName, callerToken)
    Runner->>Bridge: bridge.run(input, current_user_id, current_user_name, caller_token)
    Bridge->>Context: set_current_user(user_id, user_name, token)
    Bridge->>Backend: _fetch_credential() (Authorization and X-Runner-Current-User)
    Backend->>Backend: enforceCredentialRBAC(...) for runtime fetch
    Backend-->>Bridge: credential scoped to effective user
    Bridge->>Bridge: execute turn
    Bridge->>Bridge: finally: clear_runtime_credentials() unless KEEP_CREDENTIALS_PERSISTENT=true
    Bridge-->>Runner: turn complete
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.04% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: use per-message credentials in shared sessions' directly and clearly summarizes the main security fix: changing credential handling in shared sessions from using owner credentials to using sender's per-message credentials.
Description check ✅ Passed The description is comprehensive and directly related to the changeset. It explains the security fix, backward compatibility approach, changes across all affected components, test coverage, and documentation updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/shared-session-credentials

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 174-205: The tests around GetGitHubTokenForSession only assert
non-401/403 and thus don’t prove which user’s credentials were used; update the
relevant specs (those creating contexts via createCredentialContext and calling
GetGitHubTokenForSession) to seed distinct credentials for owner and
collaborator (or mock/hook the provider getter used by GetGitHubTokenForSession)
and then assert the returned token/email matches the expected user's credential
instead of only checking status codes (use httpUtils.GetResponseRecorder to
inspect response body or the provider mock to capture which userID was
requested); also add a negative case where owner-supplied X-Runner-Current-User
differs from actual owner to ensure fallback/authorization behavior is correct.

In `@components/backend/handlers/runtime_credentials.go`:
- Line 109: The log.Printf call is emitting raw user identifiers
(effectiveUserID, ownerUserID, and c.GetString("userID")/auth_as) into general
application logs which may contain PII; update the credential-access logging to
redact or replace those values with opaque IDs or deterministic hashes before
logging (e.g., hashString(effectiveUserID)), and send full principal detail only
to the dedicated audit sink; change the log.Printf invocation in
runtime_credentials.go (the line with log.Printf(... effectiveUserID, project,
session, ownerUserID, c.GetString("userID"))) accordingly and apply the same
replacement to the other similar calls referenced (lines around the other
log.Printf occurrences) so general logs contain non-PII identifiers while full
details go to the secure audit path.
- Around line 39-47: The current checkCredentialRBAC function incorrectly allows
the session owner (authenticatedUserID == ownerUserID) to access credentials for
any effectiveUserID when the X-Runner-Current-User header is supplied; change
the logic in checkCredentialRBAC to: if c.GetString("userID") is empty return
true (BOT_TOKEN), if the request supplies an effectiveUserID header
(effectiveUserID != ""), require authenticatedUserID == effectiveUserID,
otherwise (no header) allow the owner fallback by returning authenticatedUserID
== ownerUserID; update the return conditions in checkCredentialRBAC accordingly
so owner fallback is only used when no current-user header is present.

In `@components/backend/websocket/agui_proxy.go`:
- Line 374: Run gofmt -w on agui_proxy.go to fix the formatting (or manually
remove the extra blank lines reported by the pipeline); specifically remove the
stray blank lines the pipeline flagged so the file compiles with gofmt (you can
run `gofmt -w agui_proxy.go` to automatically fix them).

In `@docs/src/content/docs/guides/migrating-shared-sessions.md`:
- Around line 55-57: The fenced code block containing the environment variable
declaration (`DISABLE_PER_USER_CREDENTIALS=true`) needs a language specifier for
proper rendering; update the markdown block around that line in
migrating-shared-sessions.md to use a bash/code block fence (e.g., change ``` to
```bash) so the snippet is rendered with correct syntax highlighting.
- Around line 51-57: Docs reference a non-existent rollback flag; implement the
DISABLE_PER_USER_CREDENTIALS toggle in the backend credential resolution so the
docs are accurate. Update the logic in runtime_credentials.go (specifically
getEffectiveUserID() or the credential resolution path) to read the
DISABLE_PER_USER_CREDENTIALS env var and, when true, bypass per-user resolution
and always return the session owner ID (mirroring previous behavior), ensure
this interacts correctly with KEEP_CREDENTIALS_PERSISTENT semantics, and
add/update tests to cover both enabled and disabled flag scenarios.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3d785d23-4e3f-4f9f-85fa-c7b373c0894b

📥 Commits

Reviewing files that changed from the base of the PR and between 7af078a and a26b7f1.

📒 Files selected for processing (11)
  • components/backend/handlers/runtime_credentials.go
  • components/backend/handlers/runtime_credentials_test.go
  • components/backend/websocket/agui_proxy.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py
  • docs/astro.config.mjs
  • docs/src/content/docs/features/session-sharing.md
  • docs/src/content/docs/guides/migrating-shared-sessions.md

Comment on lines +174 to +205
Context("when BOT_TOKEN calls without X-Runner-Current-User header", func() {
It("should use owner's userID for credential lookup", func() {
c := createCredentialContext("github")
// BOT_TOKEN: no userID in gin context (empty authenticatedUserID)
c.Set("userID", "")

GetGitHubTokenForSession(c)

// Handler will try to fetch GitHub token for ownerUserID.
// With fake K8s, git.GetGitHubToken may not find creds, but
// the session lookup and user resolution should succeed.
statusCode := httpUtils.GetResponseRecorder().Code
// Should NOT be 401 (auth) or 403 (RBAC) — user resolution worked
Expect(statusCode).NotTo(Equal(http.StatusUnauthorized))
Expect(statusCode).NotTo(Equal(http.StatusForbidden))
logger.Log("BOT_TOKEN without header correctly resolved to owner: status=%d", statusCode)
})
})

Context("when BOT_TOKEN calls with X-Runner-Current-User header", func() {
It("should use the current user's ID for credential lookup", func() {
c := createCredentialContext("github")
c.Set("userID", "")
c.Request.Header.Set("X-Runner-Current-User", "collaborator-user-abc")

GetGitHubTokenForSession(c)

statusCode := httpUtils.GetResponseRecorder().Code
// Should NOT be 401 or 403 — BOT_TOKEN is allowed to pass current user
Expect(statusCode).NotTo(Equal(http.StatusUnauthorized))
Expect(statusCode).NotTo(Equal(http.StatusForbidden))
logger.Log("BOT_TOKEN with current user header: status=%d", statusCode)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

These “success” cases don't prove which user's credentials were selected.

Each block only checks status != 401/403. That still passes when the handler ignores X-Runner-Current-User, falls back to the owner, or returns 404/500 because no credentials were found, so this suite won't catch the exact regression this PR is about. Seed distinct owner/collaborator credentials (or hook the provider getter) and assert the returned token/email comes from the expected user; also add a negative case where the owner sets X-Runner-Current-User to someone else.

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Also applies to: 209-220, 236-249, 254-279, 309-320, 338-349, 355-366

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials_test.go` around lines 174 -
205, The tests around GetGitHubTokenForSession only assert non-401/403 and thus
don’t prove which user’s credentials were used; update the relevant specs (those
creating contexts via createCredentialContext and calling
GetGitHubTokenForSession) to seed distinct credentials for owner and
collaborator (or mock/hook the provider getter used by GetGitHubTokenForSession)
and then assert the returned token/email matches the expected user's credential
instead of only checking status codes (use httpUtils.GetResponseRecorder to
inspect response body or the provider mock to capture which userID was
requested); also add a negative case where owner-supplied X-Runner-Current-User
differs from actual owner to ensure fallback/authorization behavior is correct.

Comment on lines +51 to +57
## Rollback

If you encounter issues, you can revert to the previous behavior (session owner credentials for all messages) by setting the environment variable:

```
DISABLE_PER_USER_CREDENTIALS=true
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for DISABLE_PER_USER_CREDENTIALS implementation in the codebase
echo "=== Searching for DISABLE_PER_USER_CREDENTIALS ==="
rg -n "DISABLE_PER_USER_CREDENTIALS" --type go --type py

echo ""
echo "=== Searching for KEEP_CREDENTIALS_PERSISTENT ==="
rg -n "KEEP_CREDENTIALS_PERSISTENT" --type go --type py

echo ""
echo "=== Check runtime_credentials.go for rollback logic ==="
fd "runtime_credentials.go" --exec cat {}

Repository: ambient-code/platform

Length of output: 20332


🏁 Script executed:

# Search for all credential-related files and handlers
echo "=== Searching for other credential handlers ==="
fd -e go -e py "credential" --type f | head -20

echo ""
echo "=== Search for environment variable checks that might disable per-user behavior ==="
rg -n "X-Runner-Current-User|getEffectiveUserID" --type go --type py

echo ""
echo "=== Check if there are git handlers or other modules ==="
fd -e go "handler" --type f components/

Repository: ambient-code/platform

Length of output: 4932


🏁 Script executed:

# Check what KEEP_CREDENTIALS_PERSISTENT does in the runner
echo "=== Check KEEP_CREDENTIALS_PERSISTENT context in bridge.py ==="
rg -B 5 -A 5 "KEEP_CREDENTIALS_PERSISTENT" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py

echo ""
echo "=== Check if runner has any logic to disable X-Runner-Current-User header ==="
rg -B 3 -A 3 "X-Runner-Current-User" components/runners/ambient-runner/ambient_runner/platform/auth.py

Repository: ambient-code/platform

Length of output: 1086


Documentation references non-existent rollback flag; backend has no mechanism to revert to session owner credentials.

The DISABLE_PER_USER_CREDENTIALS environment variable does not exist in the codebase. The backend credential handlers in runtime_credentials.go implement per-user credentials through hardcoded logic in getEffectiveUserID() with no environment variable toggle. KEEP_CREDENTIALS_PERSISTENT (found in the runner) only controls whether credentials are cleared from memory after a turn—it does not affect backend credential resolution. There is no rollback mechanism to revert to "session owner credentials for all messages" as the documentation claims.

Either implement the DISABLE_PER_USER_CREDENTIALS flag in the backend credential resolution logic, or remove the rollback section from the documentation.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 55-55: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/content/docs/guides/migrating-shared-sessions.md` around lines 51 -
57, Docs reference a non-existent rollback flag; implement the
DISABLE_PER_USER_CREDENTIALS toggle in the backend credential resolution so the
docs are accurate. Update the logic in runtime_credentials.go (specifically
getEffectiveUserID() or the credential resolution path) to read the
DISABLE_PER_USER_CREDENTIALS env var and, when true, bypass per-user resolution
and always return the session owner ID (mirroring previous behavior), ensure
this interacts correctly with KEEP_CREDENTIALS_PERSISTENT semantics, and
add/update tests to cover both enabled and disabled flag scenarios.

Ambient Code Bot and others added 2 commits March 23, 2026 11:56
Shared sessions previously used the session owner's credentials for all
runs regardless of who sent the message, allowing credential
impersonation. Each message now uses the sender's own credentials.

- Backend: extract effective user from X-Runner-Current-User header with
  fallback to session owner (supports API keys and scheduled sessions)
- Proxy: forward X-Current-User-ID/Name headers from backend to runner
- Runner: set per-message user context before credential fetch, clear
  credentials after each turn to prevent cross-user leakage
- Add getEffectiveUserID/checkCredentialRBAC helpers (DRY)
- 30 new tests (17 Go + 13 Python)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add guidance for when to use relative URLs (browser/dashboard context
with automatic auth) vs absolute URLs (CLI/scripts/external with
explicit Bearer token). Updates HTML examples to use relative URLs
and adds curl examples for external scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Gkrumbach07 Gkrumbach07 force-pushed the fix/shared-session-credentials branch from a26b7f1 to 47fb751 Compare March 23, 2026 16:57
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
components/backend/handlers/runtime_credentials_test.go (1)

173-366: ⚠️ Potential issue | 🟠 Major

These success cases still don't verify which user's credentials were selected.

Most of the new "success" paths only assert "not 401/403". They still pass if the handler ignores X-Runner-Current-User, falls back to the owner, or returns 404/500, so this suite won't catch the regression this PR is about. Seed distinct owner/collaborator credentials (or stub the provider getter) and assert the returned token/email belongs to the expected user, plus add the owner-impersonates-other-user negative case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials_test.go` around lines 173 -
366, Tests only assert "not 401/403" so they don't verify which user's
credentials were returned; update the credential tests (functions:
GetGitHubTokenForSession, GetJiraCredentialsForSession,
GetGoogleCredentialsForSession, GetGitLabTokenForSession) by seeding distinct
owner and collaborator credentials (or stub the provider getter used by
createCredentialContext) so requests return distinguishable tokens/emails, then
replace the weak Expect(...).NotTo(Equal(http.StatusUnauthorized/Forbidden))
checks with assertions that the response body (from
httpUtils.GetResponseRecorder()) contains the expected user's token/email for
each scenario (BOT_TOKEN-without-header -> owner, BOT_TOKEN-with-header ->
collaborator, owner direct -> owner, non-owner direct -> 403), and add a
negative test where owner attempts to impersonate another user and assert it
does not return the other user's credentials.
components/backend/handlers/runtime_credentials.go (2)

37-47: ⚠️ Potential issue | 🔴 Critical

Owner fallback still allows impersonation when the current-user header is present.

If the authenticated caller is the session owner, authenticatedUserID == ownerUserID makes this return true for any effectiveUserID. That lets the owner supply X-Runner-Current-User for another participant and fetch that participant's credentials; owner fallback must only apply when no current-user header was supplied.

🔒 Suggested fix
 func checkCredentialRBAC(c *gin.Context, ownerUserID, effectiveUserID string) bool {
 	authenticatedUserID := c.GetString("userID")
 	if authenticatedUserID == "" {
 		// BOT_TOKEN (session ServiceAccount) - allowed by K8s RBAC
 		return true
 	}
-	// Allow if authenticated user is either the owner or the effective user
-	return authenticatedUserID == ownerUserID || authenticatedUserID == effectiveUserID
+	if effectiveUserID == ownerUserID {
+		return authenticatedUserID == ownerUserID
+	}
+	return authenticatedUserID == effectiveUserID
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 37 - 47, In
checkCredentialRBAC, the owner fallback currently allows the session owner to
impersonate any effectiveUserID whenever the owner is authenticated; change the
logic so the owner is only allowed to act as the effective user when no
"current-user" header was supplied. Specifically, detect whether the
X-Runner-Current-User (or whatever header you use to derive effectiveUserID) is
present/non-empty from the gin.Context and, if it is present, require
authenticatedUserID == effectiveUserID (do not grant owner fallback); if the
header is absent, allow owner fallback by permitting authenticatedUserID ==
ownerUserID. Keep the existing BOT_TOKEN behavior (authenticatedUserID == ""
returns true).

109-109: ⚠️ Potential issue | 🟠 Major

Keep raw principals out of default credential-access logs.

These CREDENTIAL_ACCESS lines write effectiveUserID, ownerUserID, and auth_as on every fetch. In this codebase those IDs can be email-like, so this creates a high-volume PII path in ordinary application logs; prefer opaque IDs or hashes here and reserve full principals for a dedicated audit sink.

Also applies to: 191-191, 275-275, 341-341

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` at line 109, The
log.Printfs are emitting raw principals (effectiveUserID, ownerUserID, and
c.GetString("userID")) which may contain PII—change the logging to emit only
opaque identifiers by replacing raw values with a deterministic hash or masked
form (e.g., sha256/hex or a MaskPrincipal helper) before calling log.Printf;
update the occurrences that use log.Printf and those same symbols
(effectiveUserID, ownerUserID, c.GetString("userID")) at the other sites noted
(lines with the other CREDENTIAL_ACCESS calls) to use the same hashing/masking
helper so all credential-access logs contain only non-PII opaque IDs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 154-158: The cleanup of process-wide credentials must be made safe
and deterministic: move the clear_runtime_credentials() call into a finally
block in run(), and only clear if this turn set the credentials (e.g., record a
per-run marker or the exact env values you set via _context.get_env and compare
before clearing). Also synchronize access to os.environ mutations using the same
get_lock(thread_id) or a dedicated global lock so concurrent turns cannot race;
alternatively restore previous env values saved at setup time instead of
unconditionally calling clear_runtime_credentials(). Ensure
KEEP_CREDENTIALS_PERSISTENT handling remains respected.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py`:
- Around line 63-78: The code is mutating the shared RunnerContext by calling
bridge.context.set_current_user(...) which can race across concurrent requests;
instead keep identity per-run by not calling bridge.context.set_current_user and
passing the sanitized values through the local run scope (e.g., store
current_user_id/current_user_name in local variables or attach them to the
per-request run object) and update callers that need them (notably
platform/auth.py::_fetch_credential) to accept the per-run identity as
parameters; use sanitize_user_context(...) as you already do, remove the shared
mutation via bridge.context.set_current_user, and thread the per-run identity
into functions that require it (or into a per-request context object that is not
shared globally).

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 330-344: The current clear_runtime_credentials() only unsets a
fixed env list; extend it to also remove any dynamically-created MCP_*
environment variables (e.g., keys created by populate_mcp_server_credentials())
by scanning os.environ for keys starting with "MCP_" and popping them, and also
remove Google refresh credentials written by populate_runtime_credentials() by
deleting the workspace credential file (the
.google_workspace_mcp/credentials/credentials.json under the workspace path) and
its parent credentials directory if empty; ensure logging records which MCP_*
keys and whether the Google credential file was removed.

---

Duplicate comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 173-366: Tests only assert "not 401/403" so they don't verify
which user's credentials were returned; update the credential tests (functions:
GetGitHubTokenForSession, GetJiraCredentialsForSession,
GetGoogleCredentialsForSession, GetGitLabTokenForSession) by seeding distinct
owner and collaborator credentials (or stub the provider getter used by
createCredentialContext) so requests return distinguishable tokens/emails, then
replace the weak Expect(...).NotTo(Equal(http.StatusUnauthorized/Forbidden))
checks with assertions that the response body (from
httpUtils.GetResponseRecorder()) contains the expected user's token/email for
each scenario (BOT_TOKEN-without-header -> owner, BOT_TOKEN-with-header ->
collaborator, owner direct -> owner, non-owner direct -> 403), and add a
negative test where owner attempts to impersonate another user and assert it
does not return the other user's credentials.

In `@components/backend/handlers/runtime_credentials.go`:
- Around line 37-47: In checkCredentialRBAC, the owner fallback currently allows
the session owner to impersonate any effectiveUserID whenever the owner is
authenticated; change the logic so the owner is only allowed to act as the
effective user when no "current-user" header was supplied. Specifically, detect
whether the X-Runner-Current-User (or whatever header you use to derive
effectiveUserID) is present/non-empty from the gin.Context and, if it is
present, require authenticatedUserID == effectiveUserID (do not grant owner
fallback); if the header is absent, allow owner fallback by permitting
authenticatedUserID == ownerUserID. Keep the existing BOT_TOKEN behavior
(authenticatedUserID == "" returns true).
- Line 109: The log.Printfs are emitting raw principals (effectiveUserID,
ownerUserID, and c.GetString("userID")) which may contain PII—change the logging
to emit only opaque identifiers by replacing raw values with a deterministic
hash or masked form (e.g., sha256/hex or a MaskPrincipal helper) before calling
log.Printf; update the occurrences that use log.Printf and those same symbols
(effectiveUserID, ownerUserID, c.GetString("userID")) at the other sites noted
(lines with the other CREDENTIAL_ACCESS calls) to use the same hashing/masking
helper so all credential-access logs contain only non-PII opaque IDs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 70a21b05-9f89-4c0b-a597-3843b503dd52

📥 Commits

Reviewing files that changed from the base of the PR and between a26b7f1 and 47fb751.

📒 Files selected for processing (7)
  • components/backend/handlers/runtime_credentials.go
  • components/backend/handlers/runtime_credentials_test.go
  • components/backend/websocket/agui_proxy.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/backend_tools.py
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py

…n identity

Session ServiceAccounts created by the operator lacked the
ambient-code.io/created-by-user-id annotation, causing child sessions
created via the create-session tool to resolve the raw SA identity
(e.g. system-serviceaccount-...) instead of the human user. The backend
middleware (resolveServiceAccountFromToken) relies on this annotation to
map SA tokens back to the original user.

- Set annotation from spec.userContext.userId when creating the SA
- Patch annotation on pre-existing SAs during restarts
- Fix gofmt formatting in agui_proxy.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/backend/websocket/agui_proxy.go`:
- Around line 771-777: The current header-setting logic may set
X-Current-User-Name without X-Current-User-ID, creating partial identity
context; update the header logic so that X-Current-User-Name is only set when
userID is present (i.e., nest the userName check inside the userID check) by
adjusting the block that calls req.Header.Set using the userID and userName
variables so both headers are emitted together only when userID != "".
- Around line 336-340: The logging currently prints raw sensitive identifiers
via the log.Printf statements that reference currentUserID and currentUserName
(in the agui_proxy.go run handling around projectName/sessionName); change these
logs to avoid full identifiers by either logging a redacted/masked form (e.g.,
show only a fixed prefix/suffix or hash of currentUserID), or log presence/role
only (e.g., "user present" or "using session owner credentials") and include
projectName/sessionName for context; update the two log.Printf calls so they
never emit full currentUserID/currentUserName values and instead emit the
masked/hashed or boolean form.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fd676ba2-30b4-4e29-8f5f-ae709c44601f

📥 Commits

Reviewing files that changed from the base of the PR and between 47fb751 and 5ee9958.

📒 Files selected for processing (2)
  • components/backend/websocket/agui_proxy.go
  • components/operator/internal/handlers/sessions.go

- Fix checkCredentialRBAC: prevent owner from impersonating other users
  via X-Runner-Current-User header (only allow owner fallback when no
  header present)
- Fix runner concurrency: pass user context through bridge.run() params
  instead of mutating shared RunnerContext; set inside the lock
- Move clear_runtime_credentials() into finally block to ensure cleanup
  on errors/cancellation
- Extend credential cleanup to clear MCP_* env vars and Google
  Workspace credential files
- Remove PII (raw userIDs) from CREDENTIAL_ACCESS log lines
- Fix docs: replace non-existent DISABLE_PER_USER_CREDENTIALS flag with
  KEEP_CREDENTIALS_PERSISTENT
- Add **kwargs to gemini/langgraph bridges for forward compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)

89-127: ⚠️ Potential issue | 🔴 Critical

Set the per-run user before any credential refresh.

_setup_platform() populates runtime/MCP credentials later in this file, and those fetches only consult context.current_user_id in ambient_runner/platform/auth.py. Here the per-run user is not set until after _ensure_ready() / _refresh_credentials_if_stale() have already run, so the first credential hydration of a shared-session turn can still use the owner or previous user's identity.

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 89 - 127, The per-run user is being set too late in run — after
_ensure_ready() and _refresh_credentials_if_stale() — which allows credential
hydration to use the wrong user; move the call to
self._context.set_current_user(current_user_id, current_user_name) to the top of
run (before calling self._ensure_ready() and
self._refresh_credentials_if_stale()) so that platform/auth credential fetches
see the correct current_user_id; locate the change in the run method and update
the placement of _context.set_current_user accordingly.
components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py (1)

77-133: ⚠️ Potential issue | 🔴 Critical

Per-message credential scoping is still not wired into Gemini runs.

ambient_runner/platform/auth.py::_fetch_credential() only looks at RunnerContext.current_user_id, but this method never applies the new per-run user kwargs before _ensure_ready() / _refresh_credentials_if_stale() / _setup_platform() populate credentials. It also never clears runtime credentials in finally. In a shared Gemini session, that means a turn can still hydrate owner/default credentials and leave them behind for the next turn.

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py`
around lines 77 - 133, The Gemini run path doesn't scope per-message credentials
because ambient_runner/platform/auth.py::_fetch_credential reads
RunnerContext.current_user_id but the run method (run in bridge.py) calls
_ensure_ready/_refresh_credentials_if_stale before applying per-run user kwargs
and never clears runtime creds; fix it by setting the per-run user id into
RunnerContext (or calling the runtime-credential setter) from the incoming
RunAgentInput before calling
_ensure_ready/_refresh_credentials_if_stale/_setup_platform so _fetch_credential
resolves the right creds, and ensure you clear/restore runtime credentials in a
finally block at the end of run (so owner/default creds are not left behind
between threads); reference run, RunAgentInput, _ensure_ready,
_refresh_credentials_if_stale, and _fetch_credential when making these changes.
♻️ Duplicate comments (5)
docs/src/content/docs/guides/migrating-shared-sessions.md (2)

55-55: ⚠️ Potential issue | 🟡 Minor

Add language specifier to code block.

The fenced code block should specify bash for proper syntax highlighting.

📝 Proposed fix
-```
+```bash
 KEEP_CREDENTIALS_PERSISTENT=true
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/src/content/docs/guides/migrating-shared-sessions.md at line 55, The
fenced code block containing the environment variable example (the
triple-backtick block that wraps KEEP_CREDENTIALS_PERSISTENT=true) lacks a
language specifier; update that code fence to start with bash so the block reads bash followed by KEEP_CREDENTIALS_PERSISTENT=true and then the closing


51-59: ⚠️ Potential issue | 🔴 Critical

Rollback section is inaccurate and misleading.

The documentation claims that KEEP_CREDENTIALS_PERSISTENT=true will "preserve the previous session owner behavior" and that "the session owner's credentials remain active for all messages." This is incorrect. Based on the implementation:

  • KEEP_CREDENTIALS_PERSISTENT only prevents the runner from clearing credentials from memory after each turn
  • It does NOT affect backend credential resolution in getEffectiveUserID()
  • The backend still resolves the effective user from the X-Runner-Current-User header for each message
  • There is no mechanism to revert to using session owner credentials for all messages

This is a security documentation issue because it gives users false expectations about how to rollback the per-message credential feature.

Either implement a backend toggle (e.g., DISABLE_PER_USER_CREDENTIALS in runtime_credentials.go) that actually reverts credential resolution to session owner mode, or remove this rollback section entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/content/docs/guides/migrating-shared-sessions.md` around lines 51 -
59, The Rollback section is incorrect because KEEP_CREDENTIALS_PERSISTENT only
prevents in-memory credential cleanup and does not change backend resolution in
getEffectiveUserID(), so either remove the rollback paragraph entirely or
implement a true backend toggle; to fix, either delete the claims about
restoring "previous session owner behavior" and the KEEP_CREDENTIALS_PERSISTENT
example in the docs, or add a backend feature (e.g., a
DISABLE_PER_USER_CREDENTIALS flag in runtime_credentials.go) that forces
getEffectiveUserID() to return the session owner and then update the docs to
reference that exact flag and its behavior.
components/runners/ambient-runner/ambient_runner/endpoints/run.py (1)

68-75: ⚠️ Potential issue | 🟠 Major

Don't emit current_user_id in runner info logs.

current_user_id can be an email-like identifier. Logging it on every run creates another PII exhaust path in default runner logs; log presence or a hash instead if you need correlation.

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py` around
lines 68 - 75, The code logs a PII-bearing current_user_id (in run.py after
sanitize_user_context), which must be removed; replace the logger.info(f"Run
user context: {current_user_id}") call in the block that follows
sanitize_user_context with a non-PII alternative such as logging
presence/boolean or a deterministic hash of current_user_id (e.g., compute a
SHA‑256 or similar of the value and log that) so callers can correlate runs
without exposing the actual identifier; keep the sanitize_user_context call but
ensure only the hashed or "present"/"absent" value is passed to logger.info.
components/backend/handlers/runtime_credentials.go (1)

95-97: ⚠️ Potential issue | 🟠 Major

The new credential path still writes raw user identifiers into backend logs.

These branches log authenticatedUserID, ownerUserID, or effectiveUserID on denial, lookup failure, and Google refresh. In this codebase those IDs can be email-like, so this reintroduces the same PII exhaust path the PR is otherwise trying to close. Prefer provider/session metadata plus booleans, or hashed IDs if you need correlation.

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Also applies to: 110-113, 175-177, 189-214, 263-265, 273-274, 329-331, 339-339

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 95 - 97, The
log statements in runtime_credentials.go (e.g., the log.Printf in the
checkCredentialRBAC failure branch and similar logs around the same function)
are emitting raw user identifiers
(authenticatedUserID/ownerUserID/effectiveUserID); replace those prints with
non-PII diagnostics such as provider/session metadata and boolean flags, or log
a stable hashed ID (e.g., SHA256 of the userID) if correlation is required;
update every occurrence noted (the failure branch at the checkCredentialRBAC
call and the other logged branches around Google refresh/lookup) to stop writing
raw identifiers and instead emit either (1) provider/session info + action
outcome or (2) hashed user identifiers, keeping the log messages concise and
consistent with existing logging helpers.
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)

123-169: ⚠️ Potential issue | 🔴 Critical

Per-thread locking still doesn't make process-wide credential state safe.

populate_runtime_credentials() / clear_runtime_credentials() mutate os.environ and on-disk credential files, but this path only serializes by thread_id. Two concurrent turns on different threads can still overwrite or clear each other's credentials mid-run. This needs a process-wide lock or per-turn isolated env/context instead of shared process globals.

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 123 - 169, The credential setup/teardown is only guarded by
per-thread locks (get_lock(thread_id)) but
populate_runtime_credentials()/clear_runtime_credentials() modify process-global
state (os.environ and on-disk files), so you must serialize those calls with a
process-wide lock or use per-run isolated credential contexts; change the code
that calls populate_runtime_credentials()/clear_runtime_credentials() to acquire
a global credential lock (e.g., add or reuse a global asyncio.Lock or
threading.Lock exposed from ambient_runner.platform.auth like CREDENTIAL_LOCK)
before populating and hold it until credentials are cleared in the finally
block, or refactor adapter.run to accept an explicit credentials/context object
instead of relying on process env and remove the global env mutation; ensure the
lock is acquired around both populate_runtime_credentials and
clear_runtime_credentials and reference the symbols
populate_runtime_credentials, clear_runtime_credentials, get_lock, and
adapter.run to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 354-367: The cleanup logic uses WORKSPACE_PATH directly which can
diverge from where populate_runtime_credentials() actually wrote the
user-specific Google refresh-token file; update both the writer and the remover
to use a single shared helper/constant (e.g., a
get_workspace_mcp_credentials_path() helper or a WORKSPACE_MCP_DIR constant) so
that the code that writes the refresh token in populate_runtime_credentials()
and the cleanup code that builds google_cred_file reference the exact same path;
replace direct uses of os.environ.get("WORKSPACE_PATH", "/workspace") and manual
Path construction in the cleanup block with that shared helper/constant and
ensure the helper honors any overridden WORKSPACE_PATH.
- Around line 348-352: The cleanup loop currently removes every MCP_* env var
(mcp_keys), including static config like MCP_CONFIG_FILE; instead, change
populate_mcp_server_credentials to record and return only the dynamically
injected credential names (e.g., return a list or set of keys, or store them in
a module-level variable like injected_mcp_keys), then update this cleanup code
to iterate only over that returned/injected_mcp_keys to pop and append to
cleared; keep static MCP_* vars untouched so populate_mcp_server_credentials can
still read MCP_CONFIG_FILE and other static settings.

---

Outside diff comments:
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 89-127: The per-run user is being set too late in run — after
_ensure_ready() and _refresh_credentials_if_stale() — which allows credential
hydration to use the wrong user; move the call to
self._context.set_current_user(current_user_id, current_user_name) to the top of
run (before calling self._ensure_ready() and
self._refresh_credentials_if_stale()) so that platform/auth credential fetches
see the correct current_user_id; locate the change in the run method and update
the placement of _context.set_current_user accordingly.

In
`@components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py`:
- Around line 77-133: The Gemini run path doesn't scope per-message credentials
because ambient_runner/platform/auth.py::_fetch_credential reads
RunnerContext.current_user_id but the run method (run in bridge.py) calls
_ensure_ready/_refresh_credentials_if_stale before applying per-run user kwargs
and never clears runtime creds; fix it by setting the per-run user id into
RunnerContext (or calling the runtime-credential setter) from the incoming
RunAgentInput before calling
_ensure_ready/_refresh_credentials_if_stale/_setup_platform so _fetch_credential
resolves the right creds, and ensure you clear/restore runtime credentials in a
finally block at the end of run (so owner/default creds are not left behind
between threads); reference run, RunAgentInput, _ensure_ready,
_refresh_credentials_if_stale, and _fetch_credential when making these changes.

---

Duplicate comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 95-97: The log statements in runtime_credentials.go (e.g., the
log.Printf in the checkCredentialRBAC failure branch and similar logs around the
same function) are emitting raw user identifiers
(authenticatedUserID/ownerUserID/effectiveUserID); replace those prints with
non-PII diagnostics such as provider/session metadata and boolean flags, or log
a stable hashed ID (e.g., SHA256 of the userID) if correlation is required;
update every occurrence noted (the failure branch at the checkCredentialRBAC
call and the other logged branches around Google refresh/lookup) to stop writing
raw identifiers and instead emit either (1) provider/session info + action
outcome or (2) hashed user identifiers, keeping the log messages concise and
consistent with existing logging helpers.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 123-169: The credential setup/teardown is only guarded by
per-thread locks (get_lock(thread_id)) but
populate_runtime_credentials()/clear_runtime_credentials() modify process-global
state (os.environ and on-disk files), so you must serialize those calls with a
process-wide lock or use per-run isolated credential contexts; change the code
that calls populate_runtime_credentials()/clear_runtime_credentials() to acquire
a global credential lock (e.g., add or reuse a global asyncio.Lock or
threading.Lock exposed from ambient_runner.platform.auth like CREDENTIAL_LOCK)
before populating and hold it until credentials are cleared in the finally
block, or refactor adapter.run to accept an explicit credentials/context object
instead of relying on process env and remove the global env mutation; ensure the
lock is acquired around both populate_runtime_credentials and
clear_runtime_credentials and reference the symbols
populate_runtime_credentials, clear_runtime_credentials, get_lock, and
adapter.run to locate the change.

In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py`:
- Around line 68-75: The code logs a PII-bearing current_user_id (in run.py
after sanitize_user_context), which must be removed; replace the
logger.info(f"Run user context: {current_user_id}") call in the block that
follows sanitize_user_context with a non-PII alternative such as logging
presence/boolean or a deterministic hash of current_user_id (e.g., compute a
SHA‑256 or similar of the value and log that) so callers can correlate runs
without exposing the actual identifier; keep the sanitize_user_context call but
ensure only the hashed or "present"/"absent" value is passed to logger.info.

In `@docs/src/content/docs/guides/migrating-shared-sessions.md`:
- Line 55: The fenced code block containing the environment variable example
(the triple-backtick block that wraps KEEP_CREDENTIALS_PERSISTENT=true) lacks a
language specifier; update that code fence to start with ```bash so the block
reads ```bash followed by KEEP_CREDENTIALS_PERSISTENT=true and then the closing
``` to enable proper Bash syntax highlighting.
- Around line 51-59: The Rollback section is incorrect because
KEEP_CREDENTIALS_PERSISTENT only prevents in-memory credential cleanup and does
not change backend resolution in getEffectiveUserID(), so either remove the
rollback paragraph entirely or implement a true backend toggle; to fix, either
delete the claims about restoring "previous session owner behavior" and the
KEEP_CREDENTIALS_PERSISTENT example in the docs, or add a backend feature (e.g.,
a DISABLE_PER_USER_CREDENTIALS flag in runtime_credentials.go) that forces
getEffectiveUserID() to return the session owner and then update the docs to
reference that exact flag and its behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 167bc726-8b8e-4f0e-b638-ea8fd6764cca

📥 Commits

Reviewing files that changed from the base of the PR and between 5ee9958 and 58f46ed.

📒 Files selected for processing (7)
  • components/backend/handlers/runtime_credentials.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py
  • components/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • docs/src/content/docs/guides/migrating-shared-sessions.md

Ambient Code Bot and others added 13 commits March 23, 2026 14:58
- Remove raw userID from agui_proxy run logs
- Only send X-Current-User-Name header when X-Current-User-ID is present
- Narrow MCP_* cleanup to credential keys only (preserve MCP_CONFIG_FILE)
- Use hardcoded /workspace path for Google credential cleanup to match
  populate_runtime_credentials write path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…run user

Users inside a shared Claude session have access to the BOT_TOKEN
(ServiceAccount token) via environment variables. Previously, a user
could craft a curl request with X-Runner-Current-User set to another
user's ID and fetch their credentials (GitHub, Jira, Google, GitLab).

Fix: the AG-UI proxy now records which authenticated user initiated each
run in SessionActiveUserMap. The credential RBAC check validates that
BOT_TOKEN callers can only request credentials for the user who started
the current run — not arbitrary users.

- Add SessionActiveUserMap (sync.Map) to track active user per session
- Update checkCredentialRBAC to validate BOT_TOKEN requests against map
- Fall back to owner-only when no active user tracked (scheduled/API)
- Update tests to pre-populate active user for BOT_TOKEN test cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…scoping

Replace BOT_TOKEN + X-Runner-Current-User header approach with
forwarding the actual user's bearer token to the runner. This
eliminates the credential impersonation vector where users inside a
shared session could use the BOT_TOKEN (accessible via $BOT_TOKEN env
var) to fetch other users' credentials.

New flow:
1. Backend proxy extracts caller's Authorization header
2. Forwards it to runner as X-Caller-Token
3. Runner uses caller's token (not BOT_TOKEN) for credential requests
4. Backend authenticates the request as the actual user
5. Token cleared in finally block after each turn

BOT_TOKEN callers can now only access the session owner's credentials
(fallback for automated/scheduled sessions). Per-user scoping is
handled entirely by the forwarded caller token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the runner uses the caller's bearer token, it must also set
X-Runner-Current-User so the backend resolves the effective user to the
caller (not the owner). Without this, non-owner users in shared sessions
would get 403 because the RBAC check compared their authenticated
identity against the owner.

The security model remains intact: authenticatedUserID (from token) must
match effectiveUserID (from header), so impersonation is impossible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The caller token was being set on the context INSIDE the lock, but
credential population happened BEFORE the lock. This meant credentials
were always fetched with BOT_TOKEN, never with the caller's token.

Move set_current_user() to before _ensure_ready() and
_refresh_credentials_if_stale() so populate_runtime_credentials() uses
the caller's bearer token for per-user credential scoping.

The refresh_credentials MCP tool also benefits — it calls
populate_runtime_credentials(context) which now has the caller token,
so mid-run refreshes use the correct user's identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 60-second credential cache meant that when User B sent a message
right after User A, User A's credentials were still active. Now
credentials are always re-fetched when the caller_user_id changes,
bypassing the staleness cache. Same-user consecutive messages still
benefit from the cache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cached credentials (env vars like JIRA_API_TOKEN) persisted across runs,
so a child session or subsequent user would inherit the previous user's
tokens. Now every run clears all credentials first, then fetches fresh
ones for the current caller. This prevents cross-user credential leakage
in shared and child sessions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
External MCP servers (Jira, Google Workspace) capture credentials at
process start and don't refresh when env vars change. When a different
user sends a message, the existing Claude Code worker is destroyed and
a new one created. The session ID is preserved so --resume picks up the
conversation seamlessly with the new user's credentials and fresh MCP
servers.

Also populates MCP server credentials on every run (not just first)
to ensure dynamic MCP_* env vars are set before the Claude client
starts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the runner forwards the caller's bearer token (X-Caller-Token),
the backend middleware doesn't resolve the user identity because the
request bypasses the OAuth proxy. The RBAC check saw an empty
authenticatedUserID and treated it as BOT_TOKEN, rejecting non-owner
credential requests with 403.

Fix: checkCredentialRBAC now calls resolveTokenIdentity() via
SelfSubjectReview when the middleware didn't set userID but a valid
K8s client exists. This correctly identifies the caller from their
forwarded token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… change

- Extract enforceCredentialRBAC() helper to deduplicate RBAC boilerplate
  across 4 credential endpoints (GitHub, Google, Jira, GitLab)
- Cache SelfSubjectReview result per-request via resolveAuthenticatedUser()
  to avoid 4 redundant K8s API calls per turn
- Skip SA annotation update when already correct (avoid no-op K8s writes)
- Rebuild MCP server config when user changes so .mcp.json env blocks
  (e.g., ${JIRA_API_TOKEN}) re-expand with the new user's credentials
- Remove duplicate CREDENTIAL_ACCESS log lines (now in shared helper)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate user context, credential refresh, worker restart, and MCP
rebuild into a single _initialize_run() method. Makes run() cleaner
and gives a single entry point for spinning up/down the runtime per user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --resume is used, Claude Code restores the previous session's MCP
server state including baked-in credentials. This meant that when User B
sent a message after User A, the Jira MCP server still had User A's
API token despite the runner clearing and re-fetching credentials.

Fix: when the user changes, destroy the worker WITHOUT saving the
session ID. The new Claude Code process starts fresh with no --resume,
getting MCP servers initialized with the current user's credentials.

Conversation history is lost on user switch, but credential isolation
is guaranteed. A future improvement could use the SDK's
add_mcp_server/remove_mcp_server to dynamically swap servers without
losing the session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert the "no resume" approach. Each ClaudeSDKClient creates a new CLI
subprocess that spawns fresh MCP servers from the current os.environ —
--resume only restores conversation state, not MCP server processes.

When the user changes:
1. Destroy the worker (saves session ID for --resume)
2. Rebuild MCP server config with current env vars
3. Set adapter to None so _ensure_adapter rebuilds with new mcp_servers
4. New worker created via get_or_create with --resume + fresh MCP config

Conversation history preserved, credentials isolated per user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ambient-code ambient-code bot added this to the Review Queue milestone Mar 24, 2026
Ambient Code Bot and others added 2 commits March 24, 2026 08:44
OAuth tokens typically expire after ~1 hour. For long-running sessions
where the same user stays active, the caller token expires and credential
refresh fails with 401/403. Now falls back to BOT_TOKEN which fetches
the session owner's credentials — correct for single-user sessions
since the owner IS the active user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the caller's OAuth token expires during a long run, the runner
falls back to BOT_TOKEN. Previously this could only access the owner's
credentials. Now the backend tracks which user initiated the current
run (sessionActiveUserMap, set by the AG-UI proxy) and allows BOT_TOKEN
callers to access that user's credentials.

Flow:
1. User B sends message → proxy records activeUser = "user-B"
2. Runner uses caller token → works while OAuth valid
3. Token expires → runner falls back to BOT_TOKEN + X-Runner-Current-User
4. Backend checks: effectiveUserID == activeUser → allowed
5. Returns User B's credentials (not owner's)

Security: only the proxy can set the active user map (not the runner or
Claude). An attacker with BOT_TOKEN can only access the user that the
proxy last authenticated — not arbitrary users.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/runners/ambient-runner/ambient_runner/platform/auth.py (1)

90-147: ⚠️ Potential issue | 🔴 Critical

Validate BACKEND_API_URL scheme and host before credential fetches.

BACKEND_API_URL is read from the environment without validation and passed directly to urlopen(). Since runner containers accept user-provided environmentVariables that can override this variable, a session can redirect credential fetches to an attacker-controlled host and exfiltrate caller_token or BOT_TOKEN. Additionally, urllib.request.urlopen() accepts non-HTTP schemes like file://, which could enable further exploitation. Enforce that the URL uses http:// or https:// and targets only trusted backend endpoints before issuing the request.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around
lines 90 - 147, The code in _fetch_credential reads BACKEND_API_URL into base
and constructs url then calls urlopen without validating scheme/host, allowing
attacker-controlled environment to exfiltrate tokens; fix by validating base
(from BACKEND_API_URL) before constructing url: parse base with
urllib.parse.urlparse and ensure scheme is "http" or "https" and the netloc
(hostname) matches the expected trusted backend host or a configured allowlist,
raise/log and return {} if validation fails, and only then proceed to build url
and perform the request (this validation should occur before creating req and
before any use of context.caller_token or BOT_TOKEN and applies to both the
primary request and the BOT_TOKEN fallback in _do_req).
♻️ Duplicate comments (4)
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)

113-117: ⚠️ Potential issue | 🔴 Critical

Serialize credential env/file mutation across all runs.

These calls mutate process-wide os.environ and the shared Google credential file, but the only synchronization here is get_lock(thread_id). A second thread can clear or overwrite credentials while another turn is still using them. Use a bridge-global lock around populate/clear, or stop storing per-user secrets in process-global env/file state.

Also applies to: 206-211

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 113 - 117, The credential mutation sequence
(clear_runtime_credentials, populate_runtime_credentials,
populate_mcp_server_credentials) is not serialized across threads and must be
protected by a bridge-global lock; introduce a module-level asyncio.Lock or
threading.Lock in the claude bridge and wrap the entire sequence (including
setting self._last_creds_refresh) inside that lock in the function where these
calls occur (and likewise for the other occurrence around lines 206-211) so that
clear_runtime_credentials and the two populate_* calls execute atomically for
the bridge instance instead of relying only on get_lock(thread_id).
components/backend/websocket/agui_proxy.go (1)

373-377: ⚠️ Potential issue | 🟠 Major

Don't reintroduce raw user IDs into proxy logs.

logSuffix prints the full userID again, and in this codebase that identifier can be email-like PII. Log presence or a redacted/hash form instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/websocket/agui_proxy.go` around lines 373 - 377, The log
currently appends the raw userID via logSuffix (userID) when printing "AGUI
Proxy: connecting to runner" which may expose PII; change the code around
logSuffix/userID so it does not emit the full identifier—either log a boolean
like "user=present" or a non-reversible short hash/redaction of userID (e.g.
first/last char masked or an HMAC/sha256 snippet) and include that sanitized
value instead of userID in the log message that uses runnerURL and logSuffix;
update any helper or inline code that constructs logSuffix to call the sanitizer
(referencing logSuffix, userID, and the log.Printf call) so raw IDs are never
written to logs.
components/runners/ambient-runner/ambient_runner/platform/context.py (1)

29-31: ⚠️ Potential issue | 🔴 Critical

Keep per-run identity off the shared RunnerContext.

This object is created once and shared, but these fields make request identity process-global. components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py mutates them per run and components/runners/ambient-runner/ambient_runner/platform/auth.py::_fetch_credential() reads them for auth headers, so concurrent turns can fetch the wrong user's credentials and stale IDs can leak into later turns.

Also applies to: 55-59

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/context.py` around
lines 29 - 31, RunnerContext currently holds per-run identity fields
(current_user_id, current_user_name, caller_token) which are mutated by
bridges/claude/bridge.py and read by platform/auth.py::_fetch_credential(),
causing cross-request leakage; remove these fields from RunnerContext and
instead propagate per-request identity via a new ephemeral object or explicit
parameters: create a small PerRunIdentity (or pass identity args) that
bridges/claude/bridge.py instantiates per run and pass it into calls that need
auth (e.g., platform/auth.py::_fetch_credential(identity) or other auth call
sites) so no global RunnerContext state is mutated. Ensure you update all
locations that read/write current_user_id/current_user_name/caller_token to use
the new per-request carrier (or function params) and delete the shared fields
from RunnerContext to prevent concurrent leaks.
components/backend/handlers/runtime_credentials_test.go (1)

175-190: ⚠️ Potential issue | 🟠 Major

Success tests don't verify which user's credentials were selected.

These test cases only assert status != 401/403, which still passes if the handler ignores X-Runner-Current-User, falls back to the wrong user, or returns 404/500. For a security feature ensuring per-message credential scoping, the tests should:

  1. Seed distinct credentials for owner vs collaborator (or mock the credential provider)
  2. Assert the returned token/identity matches the expected user
  3. Add a negative case where owner supplies X-Runner-Current-User pointing to someone else whose credentials don't exist

Without this, the test suite won't catch regressions where the wrong user's credentials are returned.

Also applies to: 208-220, 236-249, 254-265, 352-363

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials_test.go` around lines 175 -
190, The tests in runtime_credentials_test.go (the It blocks using
createCredentialContext and calling GetGitHubTokenForSession) only assert
non-401/403 and don't verify which user's credentials were returned; update
these tests to seed distinct credentials for owner vs collaborator (or mock the
credential provider), call GetGitHubTokenForSession (using
createCredentialContext and httpUtils.GetResponseRecorder()) and assert the
returned token/identity equals the expected owner's token, then add a negative
case where X-Runner-Current-User is set to a different user whose creds do not
exist and assert the handler does not return the owner's token (or returns the
expected error/404); apply the same changes to the other similar It blocks in
this file that test credential selection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 28-38: The current global sessionActiveUserMap and
SetSessionActiveUser use only sessionName as the key which can collide across
projects and concurrent runs; change the keying to include project and run
context (e.g., projectID/sessionName/runID or bind to the runner/request token)
so entries are unique, update all accesses (including enforceCredentialRBAC
where the BOT_TOKEN fallback reads sessionActiveUserMap) to use the new
composite key or request-scoped binding, and remove reliance on a bare
sessionName to prevent wrong non-owner authorization.

In `@components/operator/internal/handlers/sessions.go`:
- Around line 2431-2437: The code currently proceeds to mint a new runner token
even if updating/reading the target ServiceAccount annotations fails, which can
leave BOT_TOKEN attribution tied to the previous owner; update the handler so
that after computing saAnnotations (from
unstructured.NestedString(session.Object, "spec", "userContext", "userId")) you
must reconcile the ServiceAccount annotations (ensure the ServiceAccount has
ambient-code.io/created-by-user-id set when ownerUID is present and remove that
annotation when ownerUID is empty) and treat any failure to read or update the
ServiceAccount as a hard error that aborts before calling CreateToken(...);
specifically change the path around ServiceAccount annotation lookup/update (the
code manipulating saAnnotations and the ServiceAccount resource) to perform a
guaranteed successful reconciliation or return the error, then only call
CreateToken(...) when reconciliation succeeded.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 123-133: The code currently skips rebuilding when current_user_id
becomes the empty-owner string because user_changed checks prev_user != "";
change user_changed to simply detect any change in owner id by using
user_changed = current_user_id != prev_user (or equivalent) so transitions like
"userB" -> "" trigger the rebuild path; keep the existing guard that checks
self._session_manager.get_existing(thread_id) and continue to call
self._session_manager.destroy(thread_id), self._rebuild_mcp_servers(), and set
self._adapter = None so MCP subprocesses and adapter are rebuilt with owner
credentials.

---

Outside diff comments:
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 90-147: The code in _fetch_credential reads BACKEND_API_URL into
base and constructs url then calls urlopen without validating scheme/host,
allowing attacker-controlled environment to exfiltrate tokens; fix by validating
base (from BACKEND_API_URL) before constructing url: parse base with
urllib.parse.urlparse and ensure scheme is "http" or "https" and the netloc
(hostname) matches the expected trusted backend host or a configured allowlist,
raise/log and return {} if validation fails, and only then proceed to build url
and perform the request (this validation should occur before creating req and
before any use of context.caller_token or BOT_TOKEN and applies to both the
primary request and the BOT_TOKEN fallback in _do_req).

---

Duplicate comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 175-190: The tests in runtime_credentials_test.go (the It blocks
using createCredentialContext and calling GetGitHubTokenForSession) only assert
non-401/403 and don't verify which user's credentials were returned; update
these tests to seed distinct credentials for owner vs collaborator (or mock the
credential provider), call GetGitHubTokenForSession (using
createCredentialContext and httpUtils.GetResponseRecorder()) and assert the
returned token/identity equals the expected owner's token, then add a negative
case where X-Runner-Current-User is set to a different user whose creds do not
exist and assert the handler does not return the owner's token (or returns the
expected error/404); apply the same changes to the other similar It blocks in
this file that test credential selection.

In `@components/backend/websocket/agui_proxy.go`:
- Around line 373-377: The log currently appends the raw userID via logSuffix
(userID) when printing "AGUI Proxy: connecting to runner" which may expose PII;
change the code around logSuffix/userID so it does not emit the full
identifier—either log a boolean like "user=present" or a non-reversible short
hash/redaction of userID (e.g. first/last char masked or an HMAC/sha256 snippet)
and include that sanitized value instead of userID in the log message that uses
runnerURL and logSuffix; update any helper or inline code that constructs
logSuffix to call the sanitizer (referencing logSuffix, userID, and the
log.Printf call) so raw IDs are never written to logs.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 113-117: The credential mutation sequence
(clear_runtime_credentials, populate_runtime_credentials,
populate_mcp_server_credentials) is not serialized across threads and must be
protected by a bridge-global lock; introduce a module-level asyncio.Lock or
threading.Lock in the claude bridge and wrap the entire sequence (including
setting self._last_creds_refresh) inside that lock in the function where these
calls occur (and likewise for the other occurrence around lines 206-211) so that
clear_runtime_credentials and the two populate_* calls execute atomically for
the bridge instance instead of relying only on get_lock(thread_id).

In `@components/runners/ambient-runner/ambient_runner/platform/context.py`:
- Around line 29-31: RunnerContext currently holds per-run identity fields
(current_user_id, current_user_name, caller_token) which are mutated by
bridges/claude/bridge.py and read by platform/auth.py::_fetch_credential(),
causing cross-request leakage; remove these fields from RunnerContext and
instead propagate per-request identity via a new ephemeral object or explicit
parameters: create a small PerRunIdentity (or pass identity args) that
bridges/claude/bridge.py instantiates per run and pass it into calls that need
auth (e.g., platform/auth.py::_fetch_credential(identity) or other auth call
sites) so no global RunnerContext state is mutated. Ensure you update all
locations that read/write current_user_id/current_user_name/caller_token to use
the new per-request carrier (or function params) and delete the shared fields
from RunnerContext to prevent concurrent leaks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9c9c9b4c-32c7-46c7-bd21-752986bff29a

📥 Commits

Reviewing files that changed from the base of the PR and between 58f46ed and c018a31.

📒 Files selected for processing (9)
  • components/backend/handlers/runtime_credentials.go
  • components/backend/handlers/runtime_credentials_test.go
  • components/backend/websocket/agui_proxy.go
  • components/operator/internal/handlers/sessions.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/backend/handlers/runtime_credentials.go (1)

143-157: ⚠️ Potential issue | 🔴 Critical

Don't re-expose the legacy project GitHub token to non-owners.

After enforceCredentialRBAC admits a non-owner effectiveUserID, this handler still calls git.GetGitHubToken; the helper in components/backend/handlers/git/operations.go falls back to ambient-non-vertex-integrations/GITHUB_TOKEN when that user has no PAT/App. In a shared session, that gives any participant the same project-wide token and undoes the per-message credential isolation this PR is adding. Have enforceCredentialRBAC return sameAsOwner/ownerUserID and suppress the legacy fallback when it is false.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 143 - 157,
enforceCredentialRBAC currently only returns ok; update it to also return a flag
or ownerUserID (e.g., sameAsOwner bool / ownerUserID string) so callers know
whether the effectiveUserID is treated as the project owner, then in
runtime_credentials.go change the call site to capture that flag and, when
sameAsOwner==false, call git.GetGitHubToken in a mode that disables the legacy
project-wide fallback (or pass ownerUserID so GetGitHubToken skips
ambient-non-vertex-integrations/GITHUB_TOKEN fallback); specifically, modify
enforceCredentialRBAC and the call in runtime_credentials.go (where
enforceCredentialRBAC and git.GetGitHubToken are used) so non-owners cannot
receive the legacy project GitHub token.
♻️ Duplicate comments (7)
components/backend/handlers/runtime_credentials_test.go (1)

174-190: ⚠️ Potential issue | 🟠 Major

These “success” cases still don't prove which user's credentials were selected.

status != 401/403 also passes when the handler ignores X-Runner-Current-User, falls back to the owner, or just returns 404/500 because no credentials were found. Since the test setup uses generic provider stubs, this suite still misses the main regression this PR is about. Seed distinct owner/collaborator credentials or capture the requested effectiveUserID, then assert the response body matches the expected user.

Also applies to: 208-219, 235-248, 253-264, 350-363

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials_test.go` around lines 174 -
190, The test only checks for non-401/403 but not which user's credentials were
used; update the tests around createCredentialContext and
GetGitHubTokenForSession to assert the effective user explicitly by either (a)
seeding the fake K8s/provider with distinct owner vs collaborator GitHub token
records and asserting the response body or returned token matches the expected
user's token, or (b) instrumenting/mocking the git.GetGitHubToken (or the
provider stub used by GetGitHubTokenForSession) to capture the effectiveUserID
passed in and assert it equals the owner or collaborator as intended; apply the
same change pattern to the other affected cases (the blocks noted at lines
~208-219, 235-248, 253-264, 350-363) so each test validates which user was
selected rather than only checking status codes.
components/operator/internal/handlers/sessions.go (1)

2431-2473: ⚠️ Potential issue | 🔴 Critical

Make ServiceAccount attribution reconciliation a hard prerequisite before minting a new token.

Line 2454 skips reconciliation entirely when ownerUID becomes empty, so a reused ServiceAccount can keep a stale ambient-code.io/created-by-user-id. And Lines 2468-2470 only warn on update failure before CreateToken(...) still runs. That can keep BOT_TOKEN attribution pinned to the previous owner in the restart path this PR is hardening.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/operator/internal/handlers/sessions.go` around lines 2431 - 2473,
Ensure ServiceAccount attribution reconciliation always runs before minting a
token: when handling the existing SA (variables saAnnotations, ownerUID,
existingSA) always compare and reconcile the ambient-code.io/created-by-user-id
annotation — if ownerUID is empty remove the annotation if present, otherwise
set it to ownerUID; do this regardless of ownerUID being empty or not (don’t
skip when len(saAnnotations)==0). Also treat an Update failure as a hard error
(return the updErr) instead of only logging a warning so CreateToken (or any
subsequent token minting) cannot proceed with stale attribution.
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (2)

107-117: ⚠️ Potential issue | 🔴 Critical

Shared user/credential state is still racing across concurrent turns.

Line 109 writes request identity onto the shared RunnerContext, Lines 114-116 repopulate process-global env, and Lines 208-211 clear it again, but the only lock here is the per-thread lock acquired later at Line 165. Two overlapping turns can therefore overwrite or clear each other's identity and credentials before worker setup or _fetch_credential() reads them. Either make this state per-run, or guard the full setup→run→cleanup lifecycle with process-wide synchronization.

Also applies to: 165-211

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 107 - 117, The code races shared user/credential state because
set_current_user on self._context and calls to clear_runtime_credentials /
populate_runtime_credentials / populate_mcp_server_credentials run outside the
process-wide synchronization used later; to fix, either make credentials and
identity per-run (pass a run-scoped credential/identity object into
_ensure_ready/_fetch_credential and into any helper that reads env) or wrap the
whole setup→run→cleanup lifecycle (the sequence that calls
self._context.set_current_user, await self._ensure_ready(),
clear_runtime_credentials(), populate_runtime_credentials(...),
populate_mcp_server_credentials(...), and the later cleanup) in a process-wide
lock so no two turns can interleave; move the set_current_user and
_last_creds_refresh assignment inside that protected section (or change helpers
to accept run-scoped state) and ensure _fetch_credential reads from the
run-scoped object rather than global process env.

119-133: ⚠️ Potential issue | 🔴 Critical

Rebuild on collaborator → owner/API-key transitions too.

With Line 123, prev_user="userB" and current_user_id="" yields user_changed == False, so the existing worker and MCP subprocesses survive the transition and keep user B's env-derived credentials.

Suggested fix
-        user_changed = current_user_id and current_user_id != prev_user and prev_user != ""
+        user_changed = current_user_id != prev_user
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 119 - 133, The current user-change detection misses transitions
where current_user_id is empty (owner/API-key) but prev_user was a collaborator;
update the user_changed logic so it treats an empty current_user_id as a change
when prev_user is non-empty. Replace the current expression using
current_user_id and prev_user with something like: user_changed = prev_user !=
"" and (not current_user_id or current_user_id != prev_user) so that when
prev_user is set and current_user_id is empty or different it triggers the
existing destroy/rebuild path (affecting
self._session_manager.get_existing(thread_id), await
self._session_manager.destroy(thread_id), self._rebuild_mcp_servers(), and
resetting self._adapter).
components/runners/ambient-runner/ambient_runner/platform/auth.py (1)

372-377: ⚠️ Potential issue | 🟠 Major

Only clear MCP keys that were injected for this turn.

The heuristic here still deletes any preexisting MCP_{SERVER}_{FIELD} env var. That includes static per-server config supplied by the image/operator/user, so the first cleanup can break MCP servers even when no credential was injected. Track the keys set by populate_mcp_server_credentials() and clear only that set.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around
lines 372 - 377, The current cleanup loop removes any env var matching
MCP_{SERVER}_{FIELD} which can delete preexisting static config; change
populate_mcp_server_credentials to record and return the exact keys it injects
(or append them to a module-level list like injected_mcp_keys), then in the
cleanup section use that returned/recorded list instead of computing
mcp_cred_keys via os.environ scan; specifically update
populate_mcp_server_credentials(...) to return List[str] of injected keys (or
push names into injected_mcp_keys), and replace the mcp_cred_keys comprehension
and loop with iterating over that returned/recorded list and popping only those
keys and appending them to cleared.
components/backend/handlers/runtime_credentials.go (2)

28-42: ⚠️ Potential issue | 🔴 Critical

Don't key BOT_TOKEN fallback by bare sessionName.

sessionActiveUserMap still collapses all active-user state to a single session string, so foo in project A can overwrite foo in project B, and overlapping runs in one session race each other. Downstream at Lines 112-113 that can authorize the wrong non-owner or clear the wrong entry. Key it by at least project/session/run, or bind it to the runner/request token instead of a global map.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 28 - 42,
sessionActiveUserMap currently keys only by sessionName which allows collisions
across projects and concurrent runs; update SetSessionActiveUser,
ClearSessionActiveUser and the consumers (e.g., enforceCredentialRBAC) to use a
stronger composite key (for example "projectID:sessionName:runID" or the
runner/request token) instead of plain sessionName, store and delete entries
using that composite key in sessionActiveUserMap, and update lookup logic in
enforceCredentialRBAC to read by the same composite key so entries cannot be
overwritten or misattributed across projects or concurrent runs.

121-123: ⚠️ Potential issue | 🟠 Major

Redact principal IDs from ordinary backend logs.

These paths still emit raw authenticatedUserID / effectiveUserID values on RBAC denials and credential fetches. That recreates a general-log trail of sensitive principal identifiers outside the dedicated audit sink; hash or redact principals here and keep full identities only in the audit channel.

Also applies to: 159-159, 207-230, 262-262, 299-299

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 121 - 123,
The RBAC-denial logs currently print raw principal IDs (authenticatedUserID,
effectiveUserID, ownerUserID) which must be redacted; update the handlers that
log RBAC denials (the block that prints "RBAC violation..." and the other
occurrences around the credential fetch paths) to replace or hash principal
identifiers before calling log.Printf or c.JSON logs—e.g., compute a short
deterministic hash or redact string for authenticatedUserID and effectiveUserID
and log those instead, while ensuring the full principal values continue to be
sent only to the existing audit sink; locate uses of authenticatedUserID,
effectiveUserID, ownerUserID in the runtime credential handlers and substitute
redacted/hashed values in all ordinary backend logs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 56-75: resolveAuthenticatedUser currently swallows
resolveTokenIdentity failures and returns "" which is treated as BOT_TOKEN
downstream; change resolveAuthenticatedUser to return an explicit result and
error (e.g., (string, error) or a small enum) so callers can distinguish three
outcomes: 1) resolved user id, 2) explicit BOT_TOKEN/service-account marker when
token is a caller SA, and 3) an error for any lookup failure; update
resolveAuthenticatedUser to propagate resolveTokenIdentity errors (don’t convert
them to ""), set the cache key "resolvedUserID" only on successful or BOT_TOKEN
results, and modify callers (the code paths around the existing usage in
resolveAuthenticatedUser callers at the area noted ~lines 104-119) to deny the
request when an error is returned instead of treating "" as BOT_TOKEN.

In `@components/backend/websocket/agui_proxy.go`:
- Line 100: The ClearSessionActiveUser call (and corresponding
SetSessionActiveUser uses) currently keys active-user state by sessionName only,
which causes collisions across namespaces; change the keying to a composite key
that includes the project/namespace plus session (e.g., projectID + "/" +
sessionName or a structured {project,session} key) and update all uses in this
file and the credential-handler to consistently pass and consume that composite
key (replace handlers.ClearSessionActiveUser(sessionName) and
handlers.SetSessionActiveUser(sessionName, ...) with
handlers.ClearSessionActiveUser(compositeKey) /
handlers.SetSessionActiveUser(compositeKey, ...)), ensuring the BOT_TOKEN
fallback validation uses the same composite key.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 109-145: The code is attaching context.caller_token and BOT_TOKEN
to requests built from a user-overridable BACKEND_API_URL; before adding
Authorization headers in the block that prepares req (and fallback_req),
validate the request target (the variable url / req) against a
non-user-overridable, operator-controlled backend origin or an allowlist of
trusted hostnames resolved from secure config; if the url's origin does not
match the trusted origin, do not add context.caller_token or BOT_TOKEN and
either reject the request or log/warn and return an error/empty response. Update
the logic around use_caller_token / fallback_req to perform this host-origin
check first (use the same check when constructing the initial req and the
fallback_req) so tokens are never sent to untrusted endpoints.

---

Outside diff comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 143-157: enforceCredentialRBAC currently only returns ok; update
it to also return a flag or ownerUserID (e.g., sameAsOwner bool / ownerUserID
string) so callers know whether the effectiveUserID is treated as the project
owner, then in runtime_credentials.go change the call site to capture that flag
and, when sameAsOwner==false, call git.GetGitHubToken in a mode that disables
the legacy project-wide fallback (or pass ownerUserID so GetGitHubToken skips
ambient-non-vertex-integrations/GITHUB_TOKEN fallback); specifically, modify
enforceCredentialRBAC and the call in runtime_credentials.go (where
enforceCredentialRBAC and git.GetGitHubToken are used) so non-owners cannot
receive the legacy project GitHub token.

---

Duplicate comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 174-190: The test only checks for non-401/403 but not which user's
credentials were used; update the tests around createCredentialContext and
GetGitHubTokenForSession to assert the effective user explicitly by either (a)
seeding the fake K8s/provider with distinct owner vs collaborator GitHub token
records and asserting the response body or returned token matches the expected
user's token, or (b) instrumenting/mocking the git.GetGitHubToken (or the
provider stub used by GetGitHubTokenForSession) to capture the effectiveUserID
passed in and assert it equals the owner or collaborator as intended; apply the
same change pattern to the other affected cases (the blocks noted at lines
~208-219, 235-248, 253-264, 350-363) so each test validates which user was
selected rather than only checking status codes.

In `@components/backend/handlers/runtime_credentials.go`:
- Around line 28-42: sessionActiveUserMap currently keys only by sessionName
which allows collisions across projects and concurrent runs; update
SetSessionActiveUser, ClearSessionActiveUser and the consumers (e.g.,
enforceCredentialRBAC) to use a stronger composite key (for example
"projectID:sessionName:runID" or the runner/request token) instead of plain
sessionName, store and delete entries using that composite key in
sessionActiveUserMap, and update lookup logic in enforceCredentialRBAC to read
by the same composite key so entries cannot be overwritten or misattributed
across projects or concurrent runs.
- Around line 121-123: The RBAC-denial logs currently print raw principal IDs
(authenticatedUserID, effectiveUserID, ownerUserID) which must be redacted;
update the handlers that log RBAC denials (the block that prints "RBAC
violation..." and the other occurrences around the credential fetch paths) to
replace or hash principal identifiers before calling log.Printf or c.JSON
logs—e.g., compute a short deterministic hash or redact string for
authenticatedUserID and effectiveUserID and log those instead, while ensuring
the full principal values continue to be sent only to the existing audit sink;
locate uses of authenticatedUserID, effectiveUserID, ownerUserID in the runtime
credential handlers and substitute redacted/hashed values in all ordinary
backend logs.

In `@components/operator/internal/handlers/sessions.go`:
- Around line 2431-2473: Ensure ServiceAccount attribution reconciliation always
runs before minting a token: when handling the existing SA (variables
saAnnotations, ownerUID, existingSA) always compare and reconcile the
ambient-code.io/created-by-user-id annotation — if ownerUID is empty remove the
annotation if present, otherwise set it to ownerUID; do this regardless of
ownerUID being empty or not (don’t skip when len(saAnnotations)==0). Also treat
an Update failure as a hard error (return the updErr) instead of only logging a
warning so CreateToken (or any subsequent token minting) cannot proceed with
stale attribution.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 107-117: The code races shared user/credential state because
set_current_user on self._context and calls to clear_runtime_credentials /
populate_runtime_credentials / populate_mcp_server_credentials run outside the
process-wide synchronization used later; to fix, either make credentials and
identity per-run (pass a run-scoped credential/identity object into
_ensure_ready/_fetch_credential and into any helper that reads env) or wrap the
whole setup→run→cleanup lifecycle (the sequence that calls
self._context.set_current_user, await self._ensure_ready(),
clear_runtime_credentials(), populate_runtime_credentials(...),
populate_mcp_server_credentials(...), and the later cleanup) in a process-wide
lock so no two turns can interleave; move the set_current_user and
_last_creds_refresh assignment inside that protected section (or change helpers
to accept run-scoped state) and ensure _fetch_credential reads from the
run-scoped object rather than global process env.
- Around line 119-133: The current user-change detection misses transitions
where current_user_id is empty (owner/API-key) but prev_user was a collaborator;
update the user_changed logic so it treats an empty current_user_id as a change
when prev_user is non-empty. Replace the current expression using
current_user_id and prev_user with something like: user_changed = prev_user !=
"" and (not current_user_id or current_user_id != prev_user) so that when
prev_user is set and current_user_id is empty or different it triggers the
existing destroy/rebuild path (affecting
self._session_manager.get_existing(thread_id), await
self._session_manager.destroy(thread_id), self._rebuild_mcp_servers(), and
resetting self._adapter).

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 372-377: The current cleanup loop removes any env var matching
MCP_{SERVER}_{FIELD} which can delete preexisting static config; change
populate_mcp_server_credentials to record and return the exact keys it injects
(or append them to a module-level list like injected_mcp_keys), then in the
cleanup section use that returned/recorded list instead of computing
mcp_cred_keys via os.environ scan; specifically update
populate_mcp_server_credentials(...) to return List[str] of injected keys (or
push names into injected_mcp_keys), and replace the mcp_cred_keys comprehension
and loop with iterating over that returned/recorded list and popping only those
keys and appending them to cleared.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2cbcb335-8236-4854-89f6-ff2957170466

📥 Commits

Reviewing files that changed from the base of the PR and between 58f46ed and c018a31.

📒 Files selected for processing (9)
  • components/backend/handlers/runtime_credentials.go
  • components/backend/handlers/runtime_credentials_test.go
  • components/backend/websocket/agui_proxy.go
  • components/operator/internal/handlers/sessions.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/endpoints/run.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py
  • components/runners/ambient-runner/ambient_runner/platform/context.py
  • components/runners/ambient-runner/tests/test_shared_session_credentials.py

Comment on lines +56 to +75
// resolveAuthenticatedUser returns the authenticated user ID for the request.
// Prefers the middleware-resolved userID; falls back to SelfSubjectReview for
// forwarded caller tokens that bypass the OAuth proxy. Caches the result on
// the gin context to avoid repeated K8s API calls within the same request.
func resolveAuthenticatedUser(c *gin.Context, reqK8s kubernetes.Interface) string {
if uid := c.GetString("userID"); uid != "" {
return uid
}
// Check cache from a previous call in this request
if uid := c.GetString("resolvedUserID"); uid != "" {
return uid
}
if reqK8s != nil {
if resolved, err := resolveTokenIdentity(c.Request.Context(), reqK8s); err == nil {
uid := strings.ReplaceAll(resolved, ":", "-")
c.Set("resolvedUserID", uid)
return uid
}
}
return ""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fail closed when caller identity lookup errors.

Lines 69-75 convert every resolveTokenIdentity failure into "", and Lines 106-114 treat "" as the BOT_TOKEN path. A transient SelfSubjectReview failure can therefore fall through to owner/active-user credentials instead of being rejected. Return an explicit BOT_TOKEN/service-account state from resolveAuthenticatedUser and deny on any other resolution error.

Also applies to: 104-119

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 56 - 75,
resolveAuthenticatedUser currently swallows resolveTokenIdentity failures and
returns "" which is treated as BOT_TOKEN downstream; change
resolveAuthenticatedUser to return an explicit result and error (e.g., (string,
error) or a small enum) so callers can distinguish three outcomes: 1) resolved
user id, 2) explicit BOT_TOKEN/service-account marker when token is a caller SA,
and 3) an error for any lookup failure; update resolveAuthenticatedUser to
propagate resolveTokenIdentity errors (don’t convert them to ""), set the cache
key "resolvedUserID" only on successful or BOT_TOKEN results, and modify callers
(the code paths around the existing usage in resolveAuthenticatedUser callers at
the area noted ~lines 104-119) to deny the request when an error is returned
instead of treating "" as BOT_TOKEN.

sessionLastSeen.Delete(sessionName)
sessionPortMap.Delete(sessionName)
sessionProjectMap.Delete(sessionName)
handlers.ClearSessionActiveUser(sessionName)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Key active-user state by project/session, not sessionName alone.

These writes use only sessionName, but AgenticSession names are namespace-scoped. projectA/foo and projectB/foo will overwrite or clear each other's active user in handlers.SetSessionActiveUser(...) / ClearSessionActiveUser(...), so the BOT_TOKEN fallback can validate against the wrong human. Use a composite key consistently on both the proxy and credential-handler sides.

Also applies to: 261-265

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/websocket/agui_proxy.go` at line 100, The
ClearSessionActiveUser call (and corresponding SetSessionActiveUser uses)
currently keys active-user state by sessionName only, which causes collisions
across namespaces; change the keying to a composite key that includes the
project/namespace plus session (e.g., projectID + "/" + sessionName or a
structured {project,session} key) and update all uses in this file and the
credential-handler to consistently pass and consume that composite key (replace
handlers.ClearSessionActiveUser(sessionName) and
handlers.SetSessionActiveUser(sessionName, ...) with
handlers.ClearSessionActiveUser(compositeKey) /
handlers.SetSessionActiveUser(compositeKey, ...)), ensuring the BOT_TOKEN
fallback validation uses the same composite key.

Ambient Code Bot and others added 2 commits March 24, 2026 09:38
…validation

- Key sessionActiveUserMap by project/session (not bare sessionName)
  to prevent cross-namespace collisions
- Distinguish BOT_TOKEN from resolution failure in resolveAuthenticatedUser
  (fail closed on SelfSubjectReview errors, open on confirmed SA tokens)
- Make SA annotation update a hard failure (blocks token minting if
  annotation can't be reconciled)
- Validate BACKEND_API_URL is a cluster-internal host before sending
  bearer tokens (prevents token exfiltration via user-overridden env)
- Fix user_changed detection to trigger on all transitions including
  non-owner → owner and owner → non-owner

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nstant

- Fix unchecked type assertion on sessionActiveUserMap value (could panic)
- Parallelize credential fetches with asyncio.gather (4 sequential HTTP
  calls → concurrent, reduces latency from ~4x to ~1x timeout)
- Extract _GOOGLE_WORKSPACE_CREDS_FILE constant to keep populate and
  clear paths in sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)

113-117: ⚠️ Potential issue | 🔴 Critical

Process-wide credential env is still unsynchronized across concurrent turns.

clear_runtime_credentials() / populate_*() mutate os.environ, but they run before the per-thread lock is acquired and are cleared again under that same per-thread lock. Two different thread_ids can still overwrite or delete each other's credentials mid-run. Guard the whole credential lifecycle with a process-wide lock, or snapshot/restore per-run state instead of mutating shared process env.

Also applies to: 166-211

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 113 - 117, Concurrent runs mutate os.environ via
clear_runtime_credentials(), populate_runtime_credentials(), and
populate_mcp_server_credentials() outside the per-thread lock causing
credentials races; protect the entire credential lifecycle by either (A)
introducing and acquiring a process-wide lock (e.g., credential_lock) before
calling clear_runtime_credentials(), populate_runtime_credentials(),
populate_mcp_server_credentials() and hold it until credentials are no longer
needed (also update where _last_creds_refresh is set), or (B) avoid mutating
shared env by snapshotting os.environ before modifying it and restoring it after
the run (wrap snapshot/modify/restore around those calls); update all similar
blocks (including the 166-211 region) to use the same strategy and reference the
same functions so concurrent thread_id runs can’t overwrite each other’s
credentials.
components/backend/handlers/runtime_credentials.go (1)

80-89: ⚠️ Potential issue | 🔴 Critical

SelfSubjectReview failures still fail open as BOT token access.

This branch converts any resolveTokenIdentity(...) error into isBotToken=true whenever X-Forwarded-User is absent. A transient SSR failure or a direct bearer-token request that bypasses the OAuth proxy can then fall through to owner/active-user credential access. Only return BOT token status after positive ServiceAccount identification; deny on lookup errors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/handlers/runtime_credentials.go` around lines 80 - 89, The
code currently treats any resolveTokenIdentity/SelfSubjectReview error as
isBotToken=true when X-Forwarded-User is empty; change that so you only return
isBotToken=true after a positive ServiceAccount identification (e.g., when the
resolved identity explicitly indicates kind "ServiceAccount" or similar), and on
lookup/SSR errors without confirmation you must deny (return "", false). Update
the branch in resolveTokenIdentity (the block checking
c.GetHeader("X-Forwarded-User") and the SelfSubjectReview error handling) to: if
X-Forwarded-User is empty and the resolved subject is confirmed ServiceAccount
then return "", true; otherwise log the error and return "", false so transient
SSR failures do not fall back to BOT_TOKEN.
components/operator/internal/handlers/sessions.go (1)

2452-2474: ⚠️ Potential issue | 🔴 Critical

Remove the stale created-by-user-id annotation when the session owner is now empty.

This reconciliation only runs when saAnnotations is non-empty. If spec.userContext.userId has been cleared, the old ambient-code.io/created-by-user-id stays on the existing ServiceAccount, and the freshly minted BOT token is still attributed to the previous owner. Reconcile deletion as well before calling CreateToken(...).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/operator/internal/handlers/sessions.go` around lines 2452 - 2474,
The current logic only updates annotations when saAnnotations is non-empty, so a
cleared session owner leaves the stale ambient-code.io/created-by-user-id on the
ServiceAccount; change reconciliation to always fetch the existing
ServiceAccount (call CoreV1().ServiceAccounts(...).Get(...) regardless of
len(saAnnotations)) and then compare existingSA.Annotations to saAnnotations:
for each key in saAnnotations set/overwrite as now, and for any annotation key
present in existingSA.Annotations but missing from saAnnotations (specifically
"ambient-code.io/created-by-user-id") delete that key from
existingSA.Annotations, set needsUpdate = true, and call Update(...) when
needsUpdate; keep using the same existingSA/saAnnotations/needsUpdate variables
and ensure CreateToken(...) is called after this cleanup.
components/runners/ambient-runner/ambient_runner/platform/auth.py (1)

106-113: ⚠️ Potential issue | 🔴 Critical

Loopback should not be trusted here.

BACKEND_API_URL is still user-overridable in components/operator/internal/handlers/sessions.go, Lines 1203-1205. Allowing localhost / 127.0.0.1 means a session can repoint credential fetches to a listener inside the runner pod and receive caller_token or BOT_TOKEN. Restrict this to the operator-controlled service DNS (or another non-user-overridable allowlist).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around
lines 106 - 113, Current code in auth.py uses parsed = urlparse(base) and allows
localhost/127.0.0.1 which lets a user override BACKEND_API_URL to exfiltrate
tokens; remove "localhost" and "127.0.0.1" from the allowed-host check so only
cluster-controlled hosts are permitted (e.g.,
parsed.hostname.endswith(".svc.cluster.local") or an explicit
operator-controlled DNS constant/setting), update the condition around
parsed.hostname accordingly, and ensure the rejection path (the logger.error at
"Refusing to send credentials to external host: {parsed.hostname}") actually
prevents sending credentials (e.g., return/raise or otherwise stop credential
use) so loopback addresses can no longer be targeted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/backend/websocket/agui_proxy.go`:
- Around line 263-267: The code sets a session-scoped fallback via
handlers.SetSessionActiveUser(projectName, sessionName, senderUserID) but never
clears it at turn end; update proxyRunnerStream to accept run-scoped context
(e.g., projectName + runID or sessionRunID) and set the active-user using that
run-scoped key instead of session-only, then ensure you clear or rotate the
entry when the proxied turn finishes (use a defer/cleanup in proxyRunnerStream
exit path) so enforceCredentialRBAC cannot authorize a BOT_TOKEN after the turn
boundary; apply the same change to the other occurrence noted (around lines
359-360) to make the fallback run-scoped and removed on exit.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 119-133: The current flow calls populate_mcp_server_credentials()
each run but only calls _rebuild_mcp_servers() when the user changes, leaving
self._mcp_servers stale for same-user turns; change the logic so that after
fetching fresh credentials you always refresh the MCP server definitions (call
_rebuild_mcp_servers()) before any adapter reuse or agent spawn so MCP
subprocesses get up-to-date tokens, and keep the existing forced adapter rebuild
(self._adapter = None) when appropriate; update the block around
user_changed/current_user_id handling to invoke _rebuild_mcp_servers() for both
user-changed and same-user runs (or alternatively defer credential expansion
into the code that spawns MCP subprocesses) so that
populate_mcp_server_credentials(), _rebuild_mcp_servers(), self._mcp_servers and
self._adapter are kept in sync.

---

Duplicate comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 80-89: The code currently treats any
resolveTokenIdentity/SelfSubjectReview error as isBotToken=true when
X-Forwarded-User is empty; change that so you only return isBotToken=true after
a positive ServiceAccount identification (e.g., when the resolved identity
explicitly indicates kind "ServiceAccount" or similar), and on lookup/SSR errors
without confirmation you must deny (return "", false). Update the branch in
resolveTokenIdentity (the block checking c.GetHeader("X-Forwarded-User") and the
SelfSubjectReview error handling) to: if X-Forwarded-User is empty and the
resolved subject is confirmed ServiceAccount then return "", true; otherwise log
the error and return "", false so transient SSR failures do not fall back to
BOT_TOKEN.

In `@components/operator/internal/handlers/sessions.go`:
- Around line 2452-2474: The current logic only updates annotations when
saAnnotations is non-empty, so a cleared session owner leaves the stale
ambient-code.io/created-by-user-id on the ServiceAccount; change reconciliation
to always fetch the existing ServiceAccount (call
CoreV1().ServiceAccounts(...).Get(...) regardless of len(saAnnotations)) and
then compare existingSA.Annotations to saAnnotations: for each key in
saAnnotations set/overwrite as now, and for any annotation key present in
existingSA.Annotations but missing from saAnnotations (specifically
"ambient-code.io/created-by-user-id") delete that key from
existingSA.Annotations, set needsUpdate = true, and call Update(...) when
needsUpdate; keep using the same existingSA/saAnnotations/needsUpdate variables
and ensure CreateToken(...) is called after this cleanup.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 113-117: Concurrent runs mutate os.environ via
clear_runtime_credentials(), populate_runtime_credentials(), and
populate_mcp_server_credentials() outside the per-thread lock causing
credentials races; protect the entire credential lifecycle by either (A)
introducing and acquiring a process-wide lock (e.g., credential_lock) before
calling clear_runtime_credentials(), populate_runtime_credentials(),
populate_mcp_server_credentials() and hold it until credentials are no longer
needed (also update where _last_creds_refresh is set), or (B) avoid mutating
shared env by snapshotting os.environ before modifying it and restoring it after
the run (wrap snapshot/modify/restore around those calls); update all similar
blocks (including the 166-211 region) to use the same strategy and reference the
same functions so concurrent thread_id runs can’t overwrite each other’s
credentials.

In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 106-113: Current code in auth.py uses parsed = urlparse(base) and
allows localhost/127.0.0.1 which lets a user override BACKEND_API_URL to
exfiltrate tokens; remove "localhost" and "127.0.0.1" from the allowed-host
check so only cluster-controlled hosts are permitted (e.g.,
parsed.hostname.endswith(".svc.cluster.local") or an explicit
operator-controlled DNS constant/setting), update the condition around
parsed.hostname accordingly, and ensure the rejection path (the logger.error at
"Refusing to send credentials to external host: {parsed.hostname}") actually
prevents sending credentials (e.g., return/raise or otherwise stop credential
use) so loopback addresses can no longer be targeted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f4aff76b-f053-4e99-8fe6-004bcc4b51e8

📥 Commits

Reviewing files that changed from the base of the PR and between c018a31 and 3e0a854.

📒 Files selected for processing (5)
  • components/backend/handlers/runtime_credentials.go
  • components/backend/websocket/agui_proxy.go
  • components/operator/internal/handlers/sessions.go
  • components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
  • components/runners/ambient-runner/ambient_runner/platform/auth.py

Comment on lines +263 to +267
// Track active user so credential endpoints can validate BOT_TOKEN
// fallback requests when the caller token expires during long runs.
if senderUserID != "" {
handlers.SetSessionActiveUser(projectName, sessionName, senderUserID)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Expire the active-user fallback when the proxied turn ends.

This stores the non-owner fallback user, but nothing clears it when the proxied turn finishes; the only cleanup is the 1-hour stale-session sweeper. Since enforceCredentialRBAC() uses this map to authorize BOT_TOKEN requests for non-owners, the last collaborator can remain authorized after the turn boundary. Pass project/run context into proxyRunnerStream and clear or rotate the entry on exit; if overlapping turns are possible, make it run-scoped instead of session-scoped.

Also applies to: 359-360

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/websocket/agui_proxy.go` around lines 263 - 267, The code
sets a session-scoped fallback via handlers.SetSessionActiveUser(projectName,
sessionName, senderUserID) but never clears it at turn end; update
proxyRunnerStream to accept run-scoped context (e.g., projectName + runID or
sessionRunID) and set the active-user using that run-scoped key instead of
session-only, then ensure you clear or rotate the entry when the proxied turn
finishes (use a defer/cleanup in proxyRunnerStream exit path) so
enforceCredentialRBAC cannot authorize a BOT_TOKEN after the turn boundary;
apply the same change to the other occurrence noted (around lines 359-360) to
make the fallback run-scoped and removed on exit.

Comment on lines +119 to +133
# If the caller changed, destroy the worker and rebuild MCP servers +
# adapter so the new ClaudeSDKClient gets fresh mcp_servers config.
# The session ID is preserved — --resume works because each SDK client
# is a new CLI subprocess that spawns fresh MCP servers from os.environ.
user_changed = current_user_id != prev_user
if user_changed and self._session_manager.get_existing(thread_id):
logger.info(
f"User changed for thread={thread_id}, "
"rebuilding MCP servers and adapter with new credentials"
)
await self._session_manager.destroy(thread_id)
self._rebuild_mcp_servers()
# Force adapter rebuild so ClaudeAgentOptions uses new mcp_servers
self._adapter = None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and find relevant files
find . -type f -name "*.py" | grep -E "(bridge|credential|auth|mcp)" | head -20

Repository: ambient-code/platform

Length of output: 1627


🏁 Script executed:

# Get more context around the reviewed code section
cd components/runners/ambient-runner/ambient_runner/bridges/claude
wc -l bridge.py

Repository: ambient-code/platform

Length of output: 78


🏁 Script executed:

# Look at the full context of the bridge file, especially around credentials and rebuilding
head -150 bridge.py | tail -80

Repository: ambient-code/platform

Length of output: 133


🏁 Script executed:

# Search for credential/token refresh mechanisms
rg -n "credential|token|refresh|env|environ" bridge.py -A 2 -B 2 | head -100

Repository: ambient-code/platform

Length of output: 114


🏁 Script executed:

cat -n components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py | head -150 | tail -100

Repository: ambient-code/platform

Length of output: 4700


🏁 Script executed:

# Search for _rebuild_mcp_servers and credential handling
rg -n "_rebuild_mcp_servers|credential|token|refresh" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -A 3 -B 2

Repository: ambient-code/platform

Length of output: 4468


🏁 Script executed:

# Check the auth.py file for credential refresh logic
cat -n components/runners/ambient-runner/ambient_runner/bridges/claude/auth.py

Repository: ambient-code/platform

Length of output: 5370


🏁 Script executed:

# Check mcp.py to understand how MCP servers are spawned
head -100 components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.py

Repository: ambient-code/platform

Length of output: 3226


🏁 Script executed:

# Check _ensure_adapter to see if it rebuilds the adapter for same-user turns
rg -n "_ensure_adapter|_adapter" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 2 -A 5 | head -80

Repository: ambient-code/platform

Length of output: 3037


🏁 Script executed:

# Check session manager to understand when workers are reused
rg -n "session_manager|get_existing|destroy" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 1 -A 3

Repository: ambient-code/platform

Length of output: 4308


🏁 Script executed:

# Check how populate_mcp_server_credentials sets credentials
rg -n "populate_mcp_server_credentials|populate_runtime_credentials" components/runners/ambient-runner/ambient_runner/platform/auth.py -B 2 -A 10 | head -100

Repository: ambient-code/platform

Length of output: 2323


🏁 Script executed:

# Let's understand how the adapter's mcp_servers are used and when they're set
rg -n "mcp_servers|ClaudeAgentOptions" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 2 -A 3 | head -100

Repository: ambient-code/platform

Length of output: 3397


🏁 Script executed:

# Check how credentials flow through the MCP config expansion
cat -n components/runners/ambient-runner/ambient_runner/platform/auth.py | sed -n '412,500p'

Repository: ambient-code/platform

Length of output: 4192


🏁 Script executed:

# Check how the adapter uses mcp_servers when building options
rg -n "build_mcp_servers|load_mcp_config" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 2 -A 8

Repository: ambient-code/platform

Length of output: 2642


Credentials refreshed per-turn don't reach MCP servers on same-user turns.

You call populate_mcp_server_credentials() every run to fetch fresh tokens and set them in os.environ, but _rebuild_mcp_servers() only rebuilds the _mcp_servers dict when the user changes. For same-user turns, the adapter reuses the stale dict from the previous expansion, and any short-lived credentials (e.g., JIRA tokens) become expired when MCP subprocesses spawn. Rebuild _mcp_servers on every run, or defer credential expansion from dict-build-time to MCP-spawn-time.

🧰 Tools
🪛 Ruff (0.15.6)

[warning] 126-127: Logging statement uses f-string

(G004)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 119 - 133, The current flow calls populate_mcp_server_credentials()
each run but only calls _rebuild_mcp_servers() when the user changes, leaving
self._mcp_servers stale for same-user turns; change the logic so that after
fetching fresh credentials you always refresh the MCP server definitions (call
_rebuild_mcp_servers()) before any adapter reuse or agent spawn so MCP
subprocesses get up-to-date tokens, and keep the existing forced adapter rebuild
(self._adapter = None) when appropriate; update the block around
user_changed/current_user_id handling to invoke _rebuild_mcp_servers() for both
user-changed and same-user runs (or alternatively defer credential expansion
into the code that spawns MCP subprocesses) so that
populate_mcp_server_credentials(), _rebuild_mcp_servers(), self._mcp_servers and
self._adapter are kept in sync.

@Gkrumbach07 Gkrumbach07 merged commit 0d78e86 into main Mar 24, 2026
37 of 38 checks passed
@Gkrumbach07 Gkrumbach07 deleted the fix/shared-session-credentials branch March 24, 2026 14:58
@ambient-code ambient-code bot removed this from the Review Queue milestone Mar 26, 2026
jeremyeder pushed a commit to jeremyeder/platform that referenced this pull request Mar 26, 2026
## Summary

- **Security fix**: Shared sessions previously used the session owner's
credentials for all runs, allowing User B to impersonate User A's
GitHub/Jira/Google/GitLab credentials. Each message now uses the
sender's own credentials.
- **Backward compatible**: API keys and scheduled sessions continue
using creator's credentials via fallback pattern (`effectiveUserID =
currentUserID || ownerUserID`)
- **Credential isolation**: Credentials are cleared from environment
after each turn to prevent cross-user leakage

## Changes

### Backend (`runtime_credentials.go`)
- Extract `getEffectiveUserID()` and `checkCredentialRBAC()` helpers
used by all 4 credential handlers (GitHub, Google, Jira, GitLab)
- Resolve effective user from `X-Runner-Current-User` header, falling
back to session owner
- Add audit logging: `CREDENTIAL_ACCESS` log line on every credential
fetch

### Proxy (`agui_proxy.go`)
- Forward `X-Current-User-ID` / `X-Current-User-Name` headers from
backend to runner
- Merged into existing `proxyRunnerStream` / `connectToRunner` functions
(no duplication)

### Runner (Python)
- `context.py`: Added `current_user_id`, `current_user_name` fields +
`set_current_user()` method
- `auth.py`: Send `X-Runner-Current-User` header when fetching
credentials; added `clear_runtime_credentials()` and
`sanitize_user_context()`
- `endpoints/run.py`: Extract user headers from request, set on context
before credential fetch
- `bridge.py`: Clear credentials after each turn (opt-out via
`KEEP_CREDENTIALS_PERSISTENT=true`)

### Tests
- **17 Go tests** (Ginkgo): shared sessions, API keys, scheduled
sessions, RBAC violations
- **13 Python tests** (pytest): context management, credential clearing,
header sending, lifecycle

### Docs
- `features/session-sharing.md`: User guide for credential behavior
- `guides/migrating-shared-sessions.md`: Migration guide with rollback
flag

## Test plan

- [ ] Verify shared session: User A creates session, User B sends
message → B's credentials are used
- [ ] Verify owner access: Session owner's messages still use owner's
credentials
- [ ] Verify API key sessions: API key sessions use creator's
credentials (no header sent)
- [ ] Verify scheduled sessions: Scheduled sessions use creator's
credentials (fallback)
- [ ] Verify credential clearing: After turn completes, env vars are
wiped
- [ ] Run `go test ./handlers` — all pass
- [ ] Run `uv run pytest tests/test_shared_session_credentials.py` — all
13 pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Ambient Code Bot <bot@ambient-code.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant