fix: use per-message credentials in shared sessions#989
Conversation
|
Caution Review failedPull request was closed or merged during review WalkthroughAdds per-message credential scoping for shared sessions: session active-user tracking, centralized effective-user resolution and RBAC for credential endpoints, propagation of current-user/caller-token through websocket proxy into runners and bridges, per-run credential population/cleanup, ServiceAccount annotation backfill, tests, and docs. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Backend
participant Proxy
participant Runner
participant Bridge
participant Context
Client->>Backend: GET /credentials/<type> (optional X-Runner-Current-User)
Backend->>Backend: getEffectiveUserID(header, session.owner)
Backend->>Backend: resolveAuthenticatedUser(middleware/token)
Backend->>Backend: enforceCredentialRBAC(owner,effective,auth)
alt RBAC fails
Backend-->>Client: 403 {"error":"Access denied"} or 404
else RBAC passes
Backend-->>Client: 200 + credential (emit CREDENTIAL_ACCESS)
end
Client->>Proxy: WebSocket connect / run (session)
Proxy->>Proxy: Read sender/current user from Gin context
Proxy->>Proxy: SetSessionActiveUser(session, senderID)
Proxy->>Runner: proxyRunnerStream(..., userID, userName, callerToken)
Runner->>Bridge: bridge.run(input, current_user_id, current_user_name, caller_token)
Bridge->>Context: set_current_user(user_id, user_name, token)
Bridge->>Backend: _fetch_credential() (Authorization and X-Runner-Current-User)
Backend->>Backend: enforceCredentialRBAC(...) for runtime fetch
Backend-->>Bridge: credential scoped to effective user
Bridge->>Bridge: execute turn
Bridge->>Bridge: finally: clear_runtime_credentials() unless KEEP_CREDENTIALS_PERSISTENT=true
Bridge-->>Runner: turn complete
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 174-205: The tests around GetGitHubTokenForSession only assert
non-401/403 and thus don’t prove which user’s credentials were used; update the
relevant specs (those creating contexts via createCredentialContext and calling
GetGitHubTokenForSession) to seed distinct credentials for owner and
collaborator (or mock/hook the provider getter used by GetGitHubTokenForSession)
and then assert the returned token/email matches the expected user's credential
instead of only checking status codes (use httpUtils.GetResponseRecorder to
inspect response body or the provider mock to capture which userID was
requested); also add a negative case where owner-supplied X-Runner-Current-User
differs from actual owner to ensure fallback/authorization behavior is correct.
In `@components/backend/handlers/runtime_credentials.go`:
- Line 109: The log.Printf call is emitting raw user identifiers
(effectiveUserID, ownerUserID, and c.GetString("userID")/auth_as) into general
application logs which may contain PII; update the credential-access logging to
redact or replace those values with opaque IDs or deterministic hashes before
logging (e.g., hashString(effectiveUserID)), and send full principal detail only
to the dedicated audit sink; change the log.Printf invocation in
runtime_credentials.go (the line with log.Printf(... effectiveUserID, project,
session, ownerUserID, c.GetString("userID"))) accordingly and apply the same
replacement to the other similar calls referenced (lines around the other
log.Printf occurrences) so general logs contain non-PII identifiers while full
details go to the secure audit path.
- Around line 39-47: The current checkCredentialRBAC function incorrectly allows
the session owner (authenticatedUserID == ownerUserID) to access credentials for
any effectiveUserID when the X-Runner-Current-User header is supplied; change
the logic in checkCredentialRBAC to: if c.GetString("userID") is empty return
true (BOT_TOKEN), if the request supplies an effectiveUserID header
(effectiveUserID != ""), require authenticatedUserID == effectiveUserID,
otherwise (no header) allow the owner fallback by returning authenticatedUserID
== ownerUserID; update the return conditions in checkCredentialRBAC accordingly
so owner fallback is only used when no current-user header is present.
In `@components/backend/websocket/agui_proxy.go`:
- Line 374: Run gofmt -w on agui_proxy.go to fix the formatting (or manually
remove the extra blank lines reported by the pipeline); specifically remove the
stray blank lines the pipeline flagged so the file compiles with gofmt (you can
run `gofmt -w agui_proxy.go` to automatically fix them).
In `@docs/src/content/docs/guides/migrating-shared-sessions.md`:
- Around line 55-57: The fenced code block containing the environment variable
declaration (`DISABLE_PER_USER_CREDENTIALS=true`) needs a language specifier for
proper rendering; update the markdown block around that line in
migrating-shared-sessions.md to use a bash/code block fence (e.g., change ``` to
```bash) so the snippet is rendered with correct syntax highlighting.
- Around line 51-57: Docs reference a non-existent rollback flag; implement the
DISABLE_PER_USER_CREDENTIALS toggle in the backend credential resolution so the
docs are accurate. Update the logic in runtime_credentials.go (specifically
getEffectiveUserID() or the credential resolution path) to read the
DISABLE_PER_USER_CREDENTIALS env var and, when true, bypass per-user resolution
and always return the session owner ID (mirroring previous behavior), ensure
this interacts correctly with KEEP_CREDENTIALS_PERSISTENT semantics, and
add/update tests to cover both enabled and disabled flag scenarios.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3d785d23-4e3f-4f9f-85fa-c7b373c0894b
📒 Files selected for processing (11)
components/backend/handlers/runtime_credentials.gocomponents/backend/handlers/runtime_credentials_test.gocomponents/backend/websocket/agui_proxy.gocomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pycomponents/runners/ambient-runner/ambient_runner/endpoints/run.pycomponents/runners/ambient-runner/ambient_runner/platform/auth.pycomponents/runners/ambient-runner/ambient_runner/platform/context.pycomponents/runners/ambient-runner/tests/test_shared_session_credentials.pydocs/astro.config.mjsdocs/src/content/docs/features/session-sharing.mddocs/src/content/docs/guides/migrating-shared-sessions.md
| Context("when BOT_TOKEN calls without X-Runner-Current-User header", func() { | ||
| It("should use owner's userID for credential lookup", func() { | ||
| c := createCredentialContext("github") | ||
| // BOT_TOKEN: no userID in gin context (empty authenticatedUserID) | ||
| c.Set("userID", "") | ||
|
|
||
| GetGitHubTokenForSession(c) | ||
|
|
||
| // Handler will try to fetch GitHub token for ownerUserID. | ||
| // With fake K8s, git.GetGitHubToken may not find creds, but | ||
| // the session lookup and user resolution should succeed. | ||
| statusCode := httpUtils.GetResponseRecorder().Code | ||
| // Should NOT be 401 (auth) or 403 (RBAC) — user resolution worked | ||
| Expect(statusCode).NotTo(Equal(http.StatusUnauthorized)) | ||
| Expect(statusCode).NotTo(Equal(http.StatusForbidden)) | ||
| logger.Log("BOT_TOKEN without header correctly resolved to owner: status=%d", statusCode) | ||
| }) | ||
| }) | ||
|
|
||
| Context("when BOT_TOKEN calls with X-Runner-Current-User header", func() { | ||
| It("should use the current user's ID for credential lookup", func() { | ||
| c := createCredentialContext("github") | ||
| c.Set("userID", "") | ||
| c.Request.Header.Set("X-Runner-Current-User", "collaborator-user-abc") | ||
|
|
||
| GetGitHubTokenForSession(c) | ||
|
|
||
| statusCode := httpUtils.GetResponseRecorder().Code | ||
| // Should NOT be 401 or 403 — BOT_TOKEN is allowed to pass current user | ||
| Expect(statusCode).NotTo(Equal(http.StatusUnauthorized)) | ||
| Expect(statusCode).NotTo(Equal(http.StatusForbidden)) | ||
| logger.Log("BOT_TOKEN with current user header: status=%d", statusCode) |
There was a problem hiding this comment.
These “success” cases don't prove which user's credentials were selected.
Each block only checks status != 401/403. That still passes when the handler ignores X-Runner-Current-User, falls back to the owner, or returns 404/500 because no credentials were found, so this suite won't catch the exact regression this PR is about. Seed distinct owner/collaborator credentials (or hook the provider getter) and assert the returned token/email comes from the expected user; also add a negative case where the owner sets X-Runner-Current-User to someone else.
As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Also applies to: 209-220, 236-249, 254-279, 309-320, 338-349, 355-366
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/handlers/runtime_credentials_test.go` around lines 174 -
205, The tests around GetGitHubTokenForSession only assert non-401/403 and thus
don’t prove which user’s credentials were used; update the relevant specs (those
creating contexts via createCredentialContext and calling
GetGitHubTokenForSession) to seed distinct credentials for owner and
collaborator (or mock/hook the provider getter used by GetGitHubTokenForSession)
and then assert the returned token/email matches the expected user's credential
instead of only checking status codes (use httpUtils.GetResponseRecorder to
inspect response body or the provider mock to capture which userID was
requested); also add a negative case where owner-supplied X-Runner-Current-User
differs from actual owner to ensure fallback/authorization behavior is correct.
| ## Rollback | ||
|
|
||
| If you encounter issues, you can revert to the previous behavior (session owner credentials for all messages) by setting the environment variable: | ||
|
|
||
| ``` | ||
| DISABLE_PER_USER_CREDENTIALS=true | ||
| ``` |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for DISABLE_PER_USER_CREDENTIALS implementation in the codebase
echo "=== Searching for DISABLE_PER_USER_CREDENTIALS ==="
rg -n "DISABLE_PER_USER_CREDENTIALS" --type go --type py
echo ""
echo "=== Searching for KEEP_CREDENTIALS_PERSISTENT ==="
rg -n "KEEP_CREDENTIALS_PERSISTENT" --type go --type py
echo ""
echo "=== Check runtime_credentials.go for rollback logic ==="
fd "runtime_credentials.go" --exec cat {}Repository: ambient-code/platform
Length of output: 20332
🏁 Script executed:
# Search for all credential-related files and handlers
echo "=== Searching for other credential handlers ==="
fd -e go -e py "credential" --type f | head -20
echo ""
echo "=== Search for environment variable checks that might disable per-user behavior ==="
rg -n "X-Runner-Current-User|getEffectiveUserID" --type go --type py
echo ""
echo "=== Check if there are git handlers or other modules ==="
fd -e go "handler" --type f components/Repository: ambient-code/platform
Length of output: 4932
🏁 Script executed:
# Check what KEEP_CREDENTIALS_PERSISTENT does in the runner
echo "=== Check KEEP_CREDENTIALS_PERSISTENT context in bridge.py ==="
rg -B 5 -A 5 "KEEP_CREDENTIALS_PERSISTENT" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
echo ""
echo "=== Check if runner has any logic to disable X-Runner-Current-User header ==="
rg -B 3 -A 3 "X-Runner-Current-User" components/runners/ambient-runner/ambient_runner/platform/auth.pyRepository: ambient-code/platform
Length of output: 1086
Documentation references non-existent rollback flag; backend has no mechanism to revert to session owner credentials.
The DISABLE_PER_USER_CREDENTIALS environment variable does not exist in the codebase. The backend credential handlers in runtime_credentials.go implement per-user credentials through hardcoded logic in getEffectiveUserID() with no environment variable toggle. KEEP_CREDENTIALS_PERSISTENT (found in the runner) only controls whether credentials are cleared from memory after a turn—it does not affect backend credential resolution. There is no rollback mechanism to revert to "session owner credentials for all messages" as the documentation claims.
Either implement the DISABLE_PER_USER_CREDENTIALS flag in the backend credential resolution logic, or remove the rollback section from the documentation.
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 55-55: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/src/content/docs/guides/migrating-shared-sessions.md` around lines 51 -
57, Docs reference a non-existent rollback flag; implement the
DISABLE_PER_USER_CREDENTIALS toggle in the backend credential resolution so the
docs are accurate. Update the logic in runtime_credentials.go (specifically
getEffectiveUserID() or the credential resolution path) to read the
DISABLE_PER_USER_CREDENTIALS env var and, when true, bypass per-user resolution
and always return the session owner ID (mirroring previous behavior), ensure
this interacts correctly with KEEP_CREDENTIALS_PERSISTENT semantics, and
add/update tests to cover both enabled and disabled flag scenarios.
Shared sessions previously used the session owner's credentials for all runs regardless of who sent the message, allowing credential impersonation. Each message now uses the sender's own credentials. - Backend: extract effective user from X-Runner-Current-User header with fallback to session owner (supports API keys and scheduled sessions) - Proxy: forward X-Current-User-ID/Name headers from backend to runner - Runner: set per-message user context before credential fetch, clear credentials after each turn to prevent cross-user leakage - Add getEffectiveUserID/checkCredentialRBAC helpers (DRY) - 30 new tests (17 Go + 13 Python) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add guidance for when to use relative URLs (browser/dashboard context with automatic auth) vs absolute URLs (CLI/scripts/external with explicit Bearer token). Updates HTML examples to use relative URLs and adds curl examples for external scripts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
a26b7f1 to
47fb751
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
components/backend/handlers/runtime_credentials_test.go (1)
173-366:⚠️ Potential issue | 🟠 MajorThese success cases still don't verify which user's credentials were selected.
Most of the new "success" paths only assert "not 401/403". They still pass if the handler ignores
X-Runner-Current-User, falls back to the owner, or returns 404/500, so this suite won't catch the regression this PR is about. Seed distinct owner/collaborator credentials (or stub the provider getter) and assert the returned token/email belongs to the expected user, plus add the owner-impersonates-other-user negative case.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials_test.go` around lines 173 - 366, Tests only assert "not 401/403" so they don't verify which user's credentials were returned; update the credential tests (functions: GetGitHubTokenForSession, GetJiraCredentialsForSession, GetGoogleCredentialsForSession, GetGitLabTokenForSession) by seeding distinct owner and collaborator credentials (or stub the provider getter used by createCredentialContext) so requests return distinguishable tokens/emails, then replace the weak Expect(...).NotTo(Equal(http.StatusUnauthorized/Forbidden)) checks with assertions that the response body (from httpUtils.GetResponseRecorder()) contains the expected user's token/email for each scenario (BOT_TOKEN-without-header -> owner, BOT_TOKEN-with-header -> collaborator, owner direct -> owner, non-owner direct -> 403), and add a negative test where owner attempts to impersonate another user and assert it does not return the other user's credentials.components/backend/handlers/runtime_credentials.go (2)
37-47:⚠️ Potential issue | 🔴 CriticalOwner fallback still allows impersonation when the current-user header is present.
If the authenticated caller is the session owner,
authenticatedUserID == ownerUserIDmakes this return true for anyeffectiveUserID. That lets the owner supplyX-Runner-Current-Userfor another participant and fetch that participant's credentials; owner fallback must only apply when no current-user header was supplied.🔒 Suggested fix
func checkCredentialRBAC(c *gin.Context, ownerUserID, effectiveUserID string) bool { authenticatedUserID := c.GetString("userID") if authenticatedUserID == "" { // BOT_TOKEN (session ServiceAccount) - allowed by K8s RBAC return true } - // Allow if authenticated user is either the owner or the effective user - return authenticatedUserID == ownerUserID || authenticatedUserID == effectiveUserID + if effectiveUserID == ownerUserID { + return authenticatedUserID == ownerUserID + } + return authenticatedUserID == effectiveUserID }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` around lines 37 - 47, In checkCredentialRBAC, the owner fallback currently allows the session owner to impersonate any effectiveUserID whenever the owner is authenticated; change the logic so the owner is only allowed to act as the effective user when no "current-user" header was supplied. Specifically, detect whether the X-Runner-Current-User (or whatever header you use to derive effectiveUserID) is present/non-empty from the gin.Context and, if it is present, require authenticatedUserID == effectiveUserID (do not grant owner fallback); if the header is absent, allow owner fallback by permitting authenticatedUserID == ownerUserID. Keep the existing BOT_TOKEN behavior (authenticatedUserID == "" returns true).
109-109:⚠️ Potential issue | 🟠 MajorKeep raw principals out of default credential-access logs.
These
CREDENTIAL_ACCESSlines writeeffectiveUserID,ownerUserID, andauth_ason every fetch. In this codebase those IDs can be email-like, so this creates a high-volume PII path in ordinary application logs; prefer opaque IDs or hashes here and reserve full principals for a dedicated audit sink.Also applies to: 191-191, 275-275, 341-341
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` at line 109, The log.Printfs are emitting raw principals (effectiveUserID, ownerUserID, and c.GetString("userID")) which may contain PII—change the logging to emit only opaque identifiers by replacing raw values with a deterministic hash or masked form (e.g., sha256/hex or a MaskPrincipal helper) before calling log.Printf; update the occurrences that use log.Printf and those same symbols (effectiveUserID, ownerUserID, c.GetString("userID")) at the other sites noted (lines with the other CREDENTIAL_ACCESS calls) to use the same hashing/masking helper so all credential-access logs contain only non-PII opaque IDs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 154-158: The cleanup of process-wide credentials must be made safe
and deterministic: move the clear_runtime_credentials() call into a finally
block in run(), and only clear if this turn set the credentials (e.g., record a
per-run marker or the exact env values you set via _context.get_env and compare
before clearing). Also synchronize access to os.environ mutations using the same
get_lock(thread_id) or a dedicated global lock so concurrent turns cannot race;
alternatively restore previous env values saved at setup time instead of
unconditionally calling clear_runtime_credentials(). Ensure
KEEP_CREDENTIALS_PERSISTENT handling remains respected.
In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py`:
- Around line 63-78: The code is mutating the shared RunnerContext by calling
bridge.context.set_current_user(...) which can race across concurrent requests;
instead keep identity per-run by not calling bridge.context.set_current_user and
passing the sanitized values through the local run scope (e.g., store
current_user_id/current_user_name in local variables or attach them to the
per-request run object) and update callers that need them (notably
platform/auth.py::_fetch_credential) to accept the per-run identity as
parameters; use sanitize_user_context(...) as you already do, remove the shared
mutation via bridge.context.set_current_user, and thread the per-run identity
into functions that require it (or into a per-request context object that is not
shared globally).
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 330-344: The current clear_runtime_credentials() only unsets a
fixed env list; extend it to also remove any dynamically-created MCP_*
environment variables (e.g., keys created by populate_mcp_server_credentials())
by scanning os.environ for keys starting with "MCP_" and popping them, and also
remove Google refresh credentials written by populate_runtime_credentials() by
deleting the workspace credential file (the
.google_workspace_mcp/credentials/credentials.json under the workspace path) and
its parent credentials directory if empty; ensure logging records which MCP_*
keys and whether the Google credential file was removed.
---
Duplicate comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 173-366: Tests only assert "not 401/403" so they don't verify
which user's credentials were returned; update the credential tests (functions:
GetGitHubTokenForSession, GetJiraCredentialsForSession,
GetGoogleCredentialsForSession, GetGitLabTokenForSession) by seeding distinct
owner and collaborator credentials (or stub the provider getter used by
createCredentialContext) so requests return distinguishable tokens/emails, then
replace the weak Expect(...).NotTo(Equal(http.StatusUnauthorized/Forbidden))
checks with assertions that the response body (from
httpUtils.GetResponseRecorder()) contains the expected user's token/email for
each scenario (BOT_TOKEN-without-header -> owner, BOT_TOKEN-with-header ->
collaborator, owner direct -> owner, non-owner direct -> 403), and add a
negative test where owner attempts to impersonate another user and assert it
does not return the other user's credentials.
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 37-47: In checkCredentialRBAC, the owner fallback currently allows
the session owner to impersonate any effectiveUserID whenever the owner is
authenticated; change the logic so the owner is only allowed to act as the
effective user when no "current-user" header was supplied. Specifically, detect
whether the X-Runner-Current-User (or whatever header you use to derive
effectiveUserID) is present/non-empty from the gin.Context and, if it is
present, require authenticatedUserID == effectiveUserID (do not grant owner
fallback); if the header is absent, allow owner fallback by permitting
authenticatedUserID == ownerUserID. Keep the existing BOT_TOKEN behavior
(authenticatedUserID == "" returns true).
- Line 109: The log.Printfs are emitting raw principals (effectiveUserID,
ownerUserID, and c.GetString("userID")) which may contain PII—change the logging
to emit only opaque identifiers by replacing raw values with a deterministic
hash or masked form (e.g., sha256/hex or a MaskPrincipal helper) before calling
log.Printf; update the occurrences that use log.Printf and those same symbols
(effectiveUserID, ownerUserID, c.GetString("userID")) at the other sites noted
(lines with the other CREDENTIAL_ACCESS calls) to use the same hashing/masking
helper so all credential-access logs contain only non-PII opaque IDs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 70a21b05-9f89-4c0b-a597-3843b503dd52
📒 Files selected for processing (7)
components/backend/handlers/runtime_credentials.gocomponents/backend/handlers/runtime_credentials_test.gocomponents/backend/websocket/agui_proxy.gocomponents/runners/ambient-runner/ambient_runner/bridges/claude/backend_tools.pycomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pycomponents/runners/ambient-runner/ambient_runner/endpoints/run.pycomponents/runners/ambient-runner/ambient_runner/platform/auth.py
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
Outdated
Show resolved
Hide resolved
components/runners/ambient-runner/ambient_runner/endpoints/run.py
Outdated
Show resolved
Hide resolved
…n identity Session ServiceAccounts created by the operator lacked the ambient-code.io/created-by-user-id annotation, causing child sessions created via the create-session tool to resolve the raw SA identity (e.g. system-serviceaccount-...) instead of the human user. The backend middleware (resolveServiceAccountFromToken) relies on this annotation to map SA tokens back to the original user. - Set annotation from spec.userContext.userId when creating the SA - Patch annotation on pre-existing SAs during restarts - Fix gofmt formatting in agui_proxy.go Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/websocket/agui_proxy.go`:
- Around line 771-777: The current header-setting logic may set
X-Current-User-Name without X-Current-User-ID, creating partial identity
context; update the header logic so that X-Current-User-Name is only set when
userID is present (i.e., nest the userName check inside the userID check) by
adjusting the block that calls req.Header.Set using the userID and userName
variables so both headers are emitted together only when userID != "".
- Around line 336-340: The logging currently prints raw sensitive identifiers
via the log.Printf statements that reference currentUserID and currentUserName
(in the agui_proxy.go run handling around projectName/sessionName); change these
logs to avoid full identifiers by either logging a redacted/masked form (e.g.,
show only a fixed prefix/suffix or hash of currentUserID), or log presence/role
only (e.g., "user present" or "using session owner credentials") and include
projectName/sessionName for context; update the two log.Printf calls so they
never emit full currentUserID/currentUserName values and instead emit the
masked/hashed or boolean form.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fd676ba2-30b4-4e29-8f5f-ae709c44601f
📒 Files selected for processing (2)
components/backend/websocket/agui_proxy.gocomponents/operator/internal/handlers/sessions.go
- Fix checkCredentialRBAC: prevent owner from impersonating other users via X-Runner-Current-User header (only allow owner fallback when no header present) - Fix runner concurrency: pass user context through bridge.run() params instead of mutating shared RunnerContext; set inside the lock - Move clear_runtime_credentials() into finally block to ensure cleanup on errors/cancellation - Extend credential cleanup to clear MCP_* env vars and Google Workspace credential files - Remove PII (raw userIDs) from CREDENTIAL_ACCESS log lines - Fix docs: replace non-existent DISABLE_PER_USER_CREDENTIALS flag with KEEP_CREDENTIALS_PERSISTENT - Add **kwargs to gemini/langgraph bridges for forward compatibility Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)
89-127:⚠️ Potential issue | 🔴 CriticalSet the per-run user before any credential refresh.
_setup_platform()populates runtime/MCP credentials later in this file, and those fetches only consultcontext.current_user_idinambient_runner/platform/auth.py. Here the per-run user is not set until after_ensure_ready()/_refresh_credentials_if_stale()have already run, so the first credential hydration of a shared-session turn can still use the owner or previous user's identity.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py` around lines 89 - 127, The per-run user is being set too late in run — after _ensure_ready() and _refresh_credentials_if_stale() — which allows credential hydration to use the wrong user; move the call to self._context.set_current_user(current_user_id, current_user_name) to the top of run (before calling self._ensure_ready() and self._refresh_credentials_if_stale()) so that platform/auth credential fetches see the correct current_user_id; locate the change in the run method and update the placement of _context.set_current_user accordingly.components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py (1)
77-133:⚠️ Potential issue | 🔴 CriticalPer-message credential scoping is still not wired into Gemini runs.
ambient_runner/platform/auth.py::_fetch_credential()only looks atRunnerContext.current_user_id, but this method never applies the new per-run user kwargs before_ensure_ready()/_refresh_credentials_if_stale()/_setup_platform()populate credentials. It also never clears runtime credentials infinally. In a shared Gemini session, that means a turn can still hydrate owner/default credentials and leave them behind for the next turn.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py` around lines 77 - 133, The Gemini run path doesn't scope per-message credentials because ambient_runner/platform/auth.py::_fetch_credential reads RunnerContext.current_user_id but the run method (run in bridge.py) calls _ensure_ready/_refresh_credentials_if_stale before applying per-run user kwargs and never clears runtime creds; fix it by setting the per-run user id into RunnerContext (or calling the runtime-credential setter) from the incoming RunAgentInput before calling _ensure_ready/_refresh_credentials_if_stale/_setup_platform so _fetch_credential resolves the right creds, and ensure you clear/restore runtime credentials in a finally block at the end of run (so owner/default creds are not left behind between threads); reference run, RunAgentInput, _ensure_ready, _refresh_credentials_if_stale, and _fetch_credential when making these changes.
♻️ Duplicate comments (5)
docs/src/content/docs/guides/migrating-shared-sessions.md (2)
55-55:⚠️ Potential issue | 🟡 MinorAdd language specifier to code block.
The fenced code block should specify
bashfor proper syntax highlighting.📝 Proposed fix
-``` +```bash KEEP_CREDENTIALS_PERSISTENT=true</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@docs/src/content/docs/guides/migrating-shared-sessions.mdat line 55, The
fenced code block containing the environment variable example (the
triple-backtick block that wraps KEEP_CREDENTIALS_PERSISTENT=true) lacks a
language specifier; update that code fence to start withbash so the block readsbash followed by KEEP_CREDENTIALS_PERSISTENT=true and then the closing
51-59:⚠️ Potential issue | 🔴 CriticalRollback section is inaccurate and misleading.
The documentation claims that
KEEP_CREDENTIALS_PERSISTENT=truewill "preserve the previous session owner behavior" and that "the session owner's credentials remain active for all messages." This is incorrect. Based on the implementation:
KEEP_CREDENTIALS_PERSISTENTonly prevents the runner from clearing credentials from memory after each turn- It does NOT affect backend credential resolution in
getEffectiveUserID()- The backend still resolves the effective user from the
X-Runner-Current-Userheader for each message- There is no mechanism to revert to using session owner credentials for all messages
This is a security documentation issue because it gives users false expectations about how to rollback the per-message credential feature.
Either implement a backend toggle (e.g.,
DISABLE_PER_USER_CREDENTIALSinruntime_credentials.go) that actually reverts credential resolution to session owner mode, or remove this rollback section entirely.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/content/docs/guides/migrating-shared-sessions.md` around lines 51 - 59, The Rollback section is incorrect because KEEP_CREDENTIALS_PERSISTENT only prevents in-memory credential cleanup and does not change backend resolution in getEffectiveUserID(), so either remove the rollback paragraph entirely or implement a true backend toggle; to fix, either delete the claims about restoring "previous session owner behavior" and the KEEP_CREDENTIALS_PERSISTENT example in the docs, or add a backend feature (e.g., a DISABLE_PER_USER_CREDENTIALS flag in runtime_credentials.go) that forces getEffectiveUserID() to return the session owner and then update the docs to reference that exact flag and its behavior.components/runners/ambient-runner/ambient_runner/endpoints/run.py (1)
68-75:⚠️ Potential issue | 🟠 MajorDon't emit
current_user_idin runner info logs.
current_user_idcan be an email-like identifier. Logging it on every run creates another PII exhaust path in default runner logs; log presence or a hash instead if you need correlation.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py` around lines 68 - 75, The code logs a PII-bearing current_user_id (in run.py after sanitize_user_context), which must be removed; replace the logger.info(f"Run user context: {current_user_id}") call in the block that follows sanitize_user_context with a non-PII alternative such as logging presence/boolean or a deterministic hash of current_user_id (e.g., compute a SHA‑256 or similar of the value and log that) so callers can correlate runs without exposing the actual identifier; keep the sanitize_user_context call but ensure only the hashed or "present"/"absent" value is passed to logger.info.components/backend/handlers/runtime_credentials.go (1)
95-97:⚠️ Potential issue | 🟠 MajorThe new credential path still writes raw user identifiers into backend logs.
These branches log
authenticatedUserID,ownerUserID, oreffectiveUserIDon denial, lookup failure, and Google refresh. In this codebase those IDs can be email-like, so this reintroduces the same PII exhaust path the PR is otherwise trying to close. Prefer provider/session metadata plus booleans, or hashed IDs if you need correlation.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.Also applies to: 110-113, 175-177, 189-214, 263-265, 273-274, 329-331, 339-339
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` around lines 95 - 97, The log statements in runtime_credentials.go (e.g., the log.Printf in the checkCredentialRBAC failure branch and similar logs around the same function) are emitting raw user identifiers (authenticatedUserID/ownerUserID/effectiveUserID); replace those prints with non-PII diagnostics such as provider/session metadata and boolean flags, or log a stable hashed ID (e.g., SHA256 of the userID) if correlation is required; update every occurrence noted (the failure branch at the checkCredentialRBAC call and the other logged branches around Google refresh/lookup) to stop writing raw identifiers and instead emit either (1) provider/session info + action outcome or (2) hashed user identifiers, keeping the log messages concise and consistent with existing logging helpers.components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)
123-169:⚠️ Potential issue | 🔴 CriticalPer-thread locking still doesn't make process-wide credential state safe.
populate_runtime_credentials()/clear_runtime_credentials()mutateos.environand on-disk credential files, but this path only serializes bythread_id. Two concurrent turns on different threads can still overwrite or clear each other's credentials mid-run. This needs a process-wide lock or per-turn isolated env/context instead of shared process globals.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py` around lines 123 - 169, The credential setup/teardown is only guarded by per-thread locks (get_lock(thread_id)) but populate_runtime_credentials()/clear_runtime_credentials() modify process-global state (os.environ and on-disk files), so you must serialize those calls with a process-wide lock or use per-run isolated credential contexts; change the code that calls populate_runtime_credentials()/clear_runtime_credentials() to acquire a global credential lock (e.g., add or reuse a global asyncio.Lock or threading.Lock exposed from ambient_runner.platform.auth like CREDENTIAL_LOCK) before populating and hold it until credentials are cleared in the finally block, or refactor adapter.run to accept an explicit credentials/context object instead of relying on process env and remove the global env mutation; ensure the lock is acquired around both populate_runtime_credentials and clear_runtime_credentials and reference the symbols populate_runtime_credentials, clear_runtime_credentials, get_lock, and adapter.run to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 354-367: The cleanup logic uses WORKSPACE_PATH directly which can
diverge from where populate_runtime_credentials() actually wrote the
user-specific Google refresh-token file; update both the writer and the remover
to use a single shared helper/constant (e.g., a
get_workspace_mcp_credentials_path() helper or a WORKSPACE_MCP_DIR constant) so
that the code that writes the refresh token in populate_runtime_credentials()
and the cleanup code that builds google_cred_file reference the exact same path;
replace direct uses of os.environ.get("WORKSPACE_PATH", "/workspace") and manual
Path construction in the cleanup block with that shared helper/constant and
ensure the helper honors any overridden WORKSPACE_PATH.
- Around line 348-352: The cleanup loop currently removes every MCP_* env var
(mcp_keys), including static config like MCP_CONFIG_FILE; instead, change
populate_mcp_server_credentials to record and return only the dynamically
injected credential names (e.g., return a list or set of keys, or store them in
a module-level variable like injected_mcp_keys), then update this cleanup code
to iterate only over that returned/injected_mcp_keys to pop and append to
cleared; keep static MCP_* vars untouched so populate_mcp_server_credentials can
still read MCP_CONFIG_FILE and other static settings.
---
Outside diff comments:
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 89-127: The per-run user is being set too late in run — after
_ensure_ready() and _refresh_credentials_if_stale() — which allows credential
hydration to use the wrong user; move the call to
self._context.set_current_user(current_user_id, current_user_name) to the top of
run (before calling self._ensure_ready() and
self._refresh_credentials_if_stale()) so that platform/auth credential fetches
see the correct current_user_id; locate the change in the run method and update
the placement of _context.set_current_user accordingly.
In
`@components/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.py`:
- Around line 77-133: The Gemini run path doesn't scope per-message credentials
because ambient_runner/platform/auth.py::_fetch_credential reads
RunnerContext.current_user_id but the run method (run in bridge.py) calls
_ensure_ready/_refresh_credentials_if_stale before applying per-run user kwargs
and never clears runtime creds; fix it by setting the per-run user id into
RunnerContext (or calling the runtime-credential setter) from the incoming
RunAgentInput before calling
_ensure_ready/_refresh_credentials_if_stale/_setup_platform so _fetch_credential
resolves the right creds, and ensure you clear/restore runtime credentials in a
finally block at the end of run (so owner/default creds are not left behind
between threads); reference run, RunAgentInput, _ensure_ready,
_refresh_credentials_if_stale, and _fetch_credential when making these changes.
---
Duplicate comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 95-97: The log statements in runtime_credentials.go (e.g., the
log.Printf in the checkCredentialRBAC failure branch and similar logs around the
same function) are emitting raw user identifiers
(authenticatedUserID/ownerUserID/effectiveUserID); replace those prints with
non-PII diagnostics such as provider/session metadata and boolean flags, or log
a stable hashed ID (e.g., SHA256 of the userID) if correlation is required;
update every occurrence noted (the failure branch at the checkCredentialRBAC
call and the other logged branches around Google refresh/lookup) to stop writing
raw identifiers and instead emit either (1) provider/session info + action
outcome or (2) hashed user identifiers, keeping the log messages concise and
consistent with existing logging helpers.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 123-169: The credential setup/teardown is only guarded by
per-thread locks (get_lock(thread_id)) but
populate_runtime_credentials()/clear_runtime_credentials() modify process-global
state (os.environ and on-disk files), so you must serialize those calls with a
process-wide lock or use per-run isolated credential contexts; change the code
that calls populate_runtime_credentials()/clear_runtime_credentials() to acquire
a global credential lock (e.g., add or reuse a global asyncio.Lock or
threading.Lock exposed from ambient_runner.platform.auth like CREDENTIAL_LOCK)
before populating and hold it until credentials are cleared in the finally
block, or refactor adapter.run to accept an explicit credentials/context object
instead of relying on process env and remove the global env mutation; ensure the
lock is acquired around both populate_runtime_credentials and
clear_runtime_credentials and reference the symbols
populate_runtime_credentials, clear_runtime_credentials, get_lock, and
adapter.run to locate the change.
In `@components/runners/ambient-runner/ambient_runner/endpoints/run.py`:
- Around line 68-75: The code logs a PII-bearing current_user_id (in run.py
after sanitize_user_context), which must be removed; replace the
logger.info(f"Run user context: {current_user_id}") call in the block that
follows sanitize_user_context with a non-PII alternative such as logging
presence/boolean or a deterministic hash of current_user_id (e.g., compute a
SHA‑256 or similar of the value and log that) so callers can correlate runs
without exposing the actual identifier; keep the sanitize_user_context call but
ensure only the hashed or "present"/"absent" value is passed to logger.info.
In `@docs/src/content/docs/guides/migrating-shared-sessions.md`:
- Line 55: The fenced code block containing the environment variable example
(the triple-backtick block that wraps KEEP_CREDENTIALS_PERSISTENT=true) lacks a
language specifier; update that code fence to start with ```bash so the block
reads ```bash followed by KEEP_CREDENTIALS_PERSISTENT=true and then the closing
``` to enable proper Bash syntax highlighting.
- Around line 51-59: The Rollback section is incorrect because
KEEP_CREDENTIALS_PERSISTENT only prevents in-memory credential cleanup and does
not change backend resolution in getEffectiveUserID(), so either remove the
rollback paragraph entirely or implement a true backend toggle; to fix, either
delete the claims about restoring "previous session owner behavior" and the
KEEP_CREDENTIALS_PERSISTENT example in the docs, or add a backend feature (e.g.,
a DISABLE_PER_USER_CREDENTIALS flag in runtime_credentials.go) that forces
getEffectiveUserID() to return the session owner and then update the docs to
reference that exact flag and its behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 167bc726-8b8e-4f0e-b638-ea8fd6764cca
📒 Files selected for processing (7)
components/backend/handlers/runtime_credentials.gocomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pycomponents/runners/ambient-runner/ambient_runner/bridges/gemini_cli/bridge.pycomponents/runners/ambient-runner/ambient_runner/bridges/langgraph/bridge.pycomponents/runners/ambient-runner/ambient_runner/endpoints/run.pycomponents/runners/ambient-runner/ambient_runner/platform/auth.pydocs/src/content/docs/guides/migrating-shared-sessions.md
components/runners/ambient-runner/ambient_runner/platform/auth.py
Outdated
Show resolved
Hide resolved
components/runners/ambient-runner/ambient_runner/platform/auth.py
Outdated
Show resolved
Hide resolved
- Remove raw userID from agui_proxy run logs - Only send X-Current-User-Name header when X-Current-User-ID is present - Narrow MCP_* cleanup to credential keys only (preserve MCP_CONFIG_FILE) - Use hardcoded /workspace path for Google credential cleanup to match populate_runtime_credentials write path Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…run user Users inside a shared Claude session have access to the BOT_TOKEN (ServiceAccount token) via environment variables. Previously, a user could craft a curl request with X-Runner-Current-User set to another user's ID and fetch their credentials (GitHub, Jira, Google, GitLab). Fix: the AG-UI proxy now records which authenticated user initiated each run in SessionActiveUserMap. The credential RBAC check validates that BOT_TOKEN callers can only request credentials for the user who started the current run — not arbitrary users. - Add SessionActiveUserMap (sync.Map) to track active user per session - Update checkCredentialRBAC to validate BOT_TOKEN requests against map - Fall back to owner-only when no active user tracked (scheduled/API) - Update tests to pre-populate active user for BOT_TOKEN test cases Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…scoping Replace BOT_TOKEN + X-Runner-Current-User header approach with forwarding the actual user's bearer token to the runner. This eliminates the credential impersonation vector where users inside a shared session could use the BOT_TOKEN (accessible via $BOT_TOKEN env var) to fetch other users' credentials. New flow: 1. Backend proxy extracts caller's Authorization header 2. Forwards it to runner as X-Caller-Token 3. Runner uses caller's token (not BOT_TOKEN) for credential requests 4. Backend authenticates the request as the actual user 5. Token cleared in finally block after each turn BOT_TOKEN callers can now only access the session owner's credentials (fallback for automated/scheduled sessions). Per-user scoping is handled entirely by the forwarded caller token. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the runner uses the caller's bearer token, it must also set X-Runner-Current-User so the backend resolves the effective user to the caller (not the owner). Without this, non-owner users in shared sessions would get 403 because the RBAC check compared their authenticated identity against the owner. The security model remains intact: authenticatedUserID (from token) must match effectiveUserID (from header), so impersonation is impossible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The caller token was being set on the context INSIDE the lock, but credential population happened BEFORE the lock. This meant credentials were always fetched with BOT_TOKEN, never with the caller's token. Move set_current_user() to before _ensure_ready() and _refresh_credentials_if_stale() so populate_runtime_credentials() uses the caller's bearer token for per-user credential scoping. The refresh_credentials MCP tool also benefits — it calls populate_runtime_credentials(context) which now has the caller token, so mid-run refreshes use the correct user's identity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 60-second credential cache meant that when User B sent a message right after User A, User A's credentials were still active. Now credentials are always re-fetched when the caller_user_id changes, bypassing the staleness cache. Same-user consecutive messages still benefit from the cache. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cached credentials (env vars like JIRA_API_TOKEN) persisted across runs, so a child session or subsequent user would inherit the previous user's tokens. Now every run clears all credentials first, then fetches fresh ones for the current caller. This prevents cross-user credential leakage in shared and child sessions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
External MCP servers (Jira, Google Workspace) capture credentials at process start and don't refresh when env vars change. When a different user sends a message, the existing Claude Code worker is destroyed and a new one created. The session ID is preserved so --resume picks up the conversation seamlessly with the new user's credentials and fresh MCP servers. Also populates MCP server credentials on every run (not just first) to ensure dynamic MCP_* env vars are set before the Claude client starts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the runner forwards the caller's bearer token (X-Caller-Token), the backend middleware doesn't resolve the user identity because the request bypasses the OAuth proxy. The RBAC check saw an empty authenticatedUserID and treated it as BOT_TOKEN, rejecting non-owner credential requests with 403. Fix: checkCredentialRBAC now calls resolveTokenIdentity() via SelfSubjectReview when the middleware didn't set userID but a valid K8s client exists. This correctly identifies the caller from their forwarded token. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… change
- Extract enforceCredentialRBAC() helper to deduplicate RBAC boilerplate
across 4 credential endpoints (GitHub, Google, Jira, GitLab)
- Cache SelfSubjectReview result per-request via resolveAuthenticatedUser()
to avoid 4 redundant K8s API calls per turn
- Skip SA annotation update when already correct (avoid no-op K8s writes)
- Rebuild MCP server config when user changes so .mcp.json env blocks
(e.g., ${JIRA_API_TOKEN}) re-expand with the new user's credentials
- Remove duplicate CREDENTIAL_ACCESS log lines (now in shared helper)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate user context, credential refresh, worker restart, and MCP rebuild into a single _initialize_run() method. Makes run() cleaner and gives a single entry point for spinning up/down the runtime per user. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --resume is used, Claude Code restores the previous session's MCP server state including baked-in credentials. This meant that when User B sent a message after User A, the Jira MCP server still had User A's API token despite the runner clearing and re-fetching credentials. Fix: when the user changes, destroy the worker WITHOUT saving the session ID. The new Claude Code process starts fresh with no --resume, getting MCP servers initialized with the current user's credentials. Conversation history is lost on user switch, but credential isolation is guaranteed. A future improvement could use the SDK's add_mcp_server/remove_mcp_server to dynamically swap servers without losing the session. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert the "no resume" approach. Each ClaudeSDKClient creates a new CLI subprocess that spawns fresh MCP servers from the current os.environ — --resume only restores conversation state, not MCP server processes. When the user changes: 1. Destroy the worker (saves session ID for --resume) 2. Rebuild MCP server config with current env vars 3. Set adapter to None so _ensure_adapter rebuilds with new mcp_servers 4. New worker created via get_or_create with --resume + fresh MCP config Conversation history preserved, credentials isolated per user. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OAuth tokens typically expire after ~1 hour. For long-running sessions where the same user stays active, the caller token expires and credential refresh fails with 401/403. Now falls back to BOT_TOKEN which fetches the session owner's credentials — correct for single-user sessions since the owner IS the active user. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the caller's OAuth token expires during a long run, the runner falls back to BOT_TOKEN. Previously this could only access the owner's credentials. Now the backend tracks which user initiated the current run (sessionActiveUserMap, set by the AG-UI proxy) and allows BOT_TOKEN callers to access that user's credentials. Flow: 1. User B sends message → proxy records activeUser = "user-B" 2. Runner uses caller token → works while OAuth valid 3. Token expires → runner falls back to BOT_TOKEN + X-Runner-Current-User 4. Backend checks: effectiveUserID == activeUser → allowed 5. Returns User B's credentials (not owner's) Security: only the proxy can set the active user map (not the runner or Claude). An attacker with BOT_TOKEN can only access the user that the proxy last authenticated — not arbitrary users. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/runners/ambient-runner/ambient_runner/platform/auth.py (1)
90-147:⚠️ Potential issue | 🔴 CriticalValidate BACKEND_API_URL scheme and host before credential fetches.
BACKEND_API_URLis read from the environment without validation and passed directly tourlopen(). Since runner containers accept user-providedenvironmentVariablesthat can override this variable, a session can redirect credential fetches to an attacker-controlled host and exfiltratecaller_tokenorBOT_TOKEN. Additionally,urllib.request.urlopen()accepts non-HTTP schemes likefile://, which could enable further exploitation. Enforce that the URL useshttp://orhttps://and targets only trusted backend endpoints before issuing the request.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around lines 90 - 147, The code in _fetch_credential reads BACKEND_API_URL into base and constructs url then calls urlopen without validating scheme/host, allowing attacker-controlled environment to exfiltrate tokens; fix by validating base (from BACKEND_API_URL) before constructing url: parse base with urllib.parse.urlparse and ensure scheme is "http" or "https" and the netloc (hostname) matches the expected trusted backend host or a configured allowlist, raise/log and return {} if validation fails, and only then proceed to build url and perform the request (this validation should occur before creating req and before any use of context.caller_token or BOT_TOKEN and applies to both the primary request and the BOT_TOKEN fallback in _do_req).
♻️ Duplicate comments (4)
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)
113-117:⚠️ Potential issue | 🔴 CriticalSerialize credential env/file mutation across all runs.
These calls mutate process-wide
os.environand the shared Google credential file, but the only synchronization here isget_lock(thread_id). A second thread can clear or overwrite credentials while another turn is still using them. Use a bridge-global lock around populate/clear, or stop storing per-user secrets in process-global env/file state.Also applies to: 206-211
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py` around lines 113 - 117, The credential mutation sequence (clear_runtime_credentials, populate_runtime_credentials, populate_mcp_server_credentials) is not serialized across threads and must be protected by a bridge-global lock; introduce a module-level asyncio.Lock or threading.Lock in the claude bridge and wrap the entire sequence (including setting self._last_creds_refresh) inside that lock in the function where these calls occur (and likewise for the other occurrence around lines 206-211) so that clear_runtime_credentials and the two populate_* calls execute atomically for the bridge instance instead of relying only on get_lock(thread_id).components/backend/websocket/agui_proxy.go (1)
373-377:⚠️ Potential issue | 🟠 MajorDon't reintroduce raw user IDs into proxy logs.
logSuffixprints the fulluserIDagain, and in this codebase that identifier can be email-like PII. Log presence or a redacted/hash form instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/websocket/agui_proxy.go` around lines 373 - 377, The log currently appends the raw userID via logSuffix (userID) when printing "AGUI Proxy: connecting to runner" which may expose PII; change the code around logSuffix/userID so it does not emit the full identifier—either log a boolean like "user=present" or a non-reversible short hash/redaction of userID (e.g. first/last char masked or an HMAC/sha256 snippet) and include that sanitized value instead of userID in the log message that uses runnerURL and logSuffix; update any helper or inline code that constructs logSuffix to call the sanitizer (referencing logSuffix, userID, and the log.Printf call) so raw IDs are never written to logs.components/runners/ambient-runner/ambient_runner/platform/context.py (1)
29-31:⚠️ Potential issue | 🔴 CriticalKeep per-run identity off the shared
RunnerContext.This object is created once and shared, but these fields make request identity process-global.
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pymutates them per run andcomponents/runners/ambient-runner/ambient_runner/platform/auth.py::_fetch_credential()reads them for auth headers, so concurrent turns can fetch the wrong user's credentials and stale IDs can leak into later turns.Also applies to: 55-59
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/platform/context.py` around lines 29 - 31, RunnerContext currently holds per-run identity fields (current_user_id, current_user_name, caller_token) which are mutated by bridges/claude/bridge.py and read by platform/auth.py::_fetch_credential(), causing cross-request leakage; remove these fields from RunnerContext and instead propagate per-request identity via a new ephemeral object or explicit parameters: create a small PerRunIdentity (or pass identity args) that bridges/claude/bridge.py instantiates per run and pass it into calls that need auth (e.g., platform/auth.py::_fetch_credential(identity) or other auth call sites) so no global RunnerContext state is mutated. Ensure you update all locations that read/write current_user_id/current_user_name/caller_token to use the new per-request carrier (or function params) and delete the shared fields from RunnerContext to prevent concurrent leaks.components/backend/handlers/runtime_credentials_test.go (1)
175-190:⚠️ Potential issue | 🟠 MajorSuccess tests don't verify which user's credentials were selected.
These test cases only assert
status != 401/403, which still passes if the handler ignoresX-Runner-Current-User, falls back to the wrong user, or returns404/500. For a security feature ensuring per-message credential scoping, the tests should:
- Seed distinct credentials for owner vs collaborator (or mock the credential provider)
- Assert the returned token/identity matches the expected user
- Add a negative case where owner supplies
X-Runner-Current-Userpointing to someone else whose credentials don't existWithout this, the test suite won't catch regressions where the wrong user's credentials are returned.
Also applies to: 208-220, 236-249, 254-265, 352-363
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials_test.go` around lines 175 - 190, The tests in runtime_credentials_test.go (the It blocks using createCredentialContext and calling GetGitHubTokenForSession) only assert non-401/403 and don't verify which user's credentials were returned; update these tests to seed distinct credentials for owner vs collaborator (or mock the credential provider), call GetGitHubTokenForSession (using createCredentialContext and httpUtils.GetResponseRecorder()) and assert the returned token/identity equals the expected owner's token, then add a negative case where X-Runner-Current-User is set to a different user whose creds do not exist and assert the handler does not return the owner's token (or returns the expected error/404); apply the same changes to the other similar It blocks in this file that test credential selection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 28-38: The current global sessionActiveUserMap and
SetSessionActiveUser use only sessionName as the key which can collide across
projects and concurrent runs; change the keying to include project and run
context (e.g., projectID/sessionName/runID or bind to the runner/request token)
so entries are unique, update all accesses (including enforceCredentialRBAC
where the BOT_TOKEN fallback reads sessionActiveUserMap) to use the new
composite key or request-scoped binding, and remove reliance on a bare
sessionName to prevent wrong non-owner authorization.
In `@components/operator/internal/handlers/sessions.go`:
- Around line 2431-2437: The code currently proceeds to mint a new runner token
even if updating/reading the target ServiceAccount annotations fails, which can
leave BOT_TOKEN attribution tied to the previous owner; update the handler so
that after computing saAnnotations (from
unstructured.NestedString(session.Object, "spec", "userContext", "userId")) you
must reconcile the ServiceAccount annotations (ensure the ServiceAccount has
ambient-code.io/created-by-user-id set when ownerUID is present and remove that
annotation when ownerUID is empty) and treat any failure to read or update the
ServiceAccount as a hard error that aborts before calling CreateToken(...);
specifically change the path around ServiceAccount annotation lookup/update (the
code manipulating saAnnotations and the ServiceAccount resource) to perform a
guaranteed successful reconciliation or return the error, then only call
CreateToken(...) when reconciliation succeeded.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 123-133: The code currently skips rebuilding when current_user_id
becomes the empty-owner string because user_changed checks prev_user != "";
change user_changed to simply detect any change in owner id by using
user_changed = current_user_id != prev_user (or equivalent) so transitions like
"userB" -> "" trigger the rebuild path; keep the existing guard that checks
self._session_manager.get_existing(thread_id) and continue to call
self._session_manager.destroy(thread_id), self._rebuild_mcp_servers(), and set
self._adapter = None so MCP subprocesses and adapter are rebuilt with owner
credentials.
---
Outside diff comments:
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 90-147: The code in _fetch_credential reads BACKEND_API_URL into
base and constructs url then calls urlopen without validating scheme/host,
allowing attacker-controlled environment to exfiltrate tokens; fix by validating
base (from BACKEND_API_URL) before constructing url: parse base with
urllib.parse.urlparse and ensure scheme is "http" or "https" and the netloc
(hostname) matches the expected trusted backend host or a configured allowlist,
raise/log and return {} if validation fails, and only then proceed to build url
and perform the request (this validation should occur before creating req and
before any use of context.caller_token or BOT_TOKEN and applies to both the
primary request and the BOT_TOKEN fallback in _do_req).
---
Duplicate comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 175-190: The tests in runtime_credentials_test.go (the It blocks
using createCredentialContext and calling GetGitHubTokenForSession) only assert
non-401/403 and don't verify which user's credentials were returned; update
these tests to seed distinct credentials for owner vs collaborator (or mock the
credential provider), call GetGitHubTokenForSession (using
createCredentialContext and httpUtils.GetResponseRecorder()) and assert the
returned token/identity equals the expected owner's token, then add a negative
case where X-Runner-Current-User is set to a different user whose creds do not
exist and assert the handler does not return the owner's token (or returns the
expected error/404); apply the same changes to the other similar It blocks in
this file that test credential selection.
In `@components/backend/websocket/agui_proxy.go`:
- Around line 373-377: The log currently appends the raw userID via logSuffix
(userID) when printing "AGUI Proxy: connecting to runner" which may expose PII;
change the code around logSuffix/userID so it does not emit the full
identifier—either log a boolean like "user=present" or a non-reversible short
hash/redaction of userID (e.g. first/last char masked or an HMAC/sha256 snippet)
and include that sanitized value instead of userID in the log message that uses
runnerURL and logSuffix; update any helper or inline code that constructs
logSuffix to call the sanitizer (referencing logSuffix, userID, and the
log.Printf call) so raw IDs are never written to logs.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 113-117: The credential mutation sequence
(clear_runtime_credentials, populate_runtime_credentials,
populate_mcp_server_credentials) is not serialized across threads and must be
protected by a bridge-global lock; introduce a module-level asyncio.Lock or
threading.Lock in the claude bridge and wrap the entire sequence (including
setting self._last_creds_refresh) inside that lock in the function where these
calls occur (and likewise for the other occurrence around lines 206-211) so that
clear_runtime_credentials and the two populate_* calls execute atomically for
the bridge instance instead of relying only on get_lock(thread_id).
In `@components/runners/ambient-runner/ambient_runner/platform/context.py`:
- Around line 29-31: RunnerContext currently holds per-run identity fields
(current_user_id, current_user_name, caller_token) which are mutated by
bridges/claude/bridge.py and read by platform/auth.py::_fetch_credential(),
causing cross-request leakage; remove these fields from RunnerContext and
instead propagate per-request identity via a new ephemeral object or explicit
parameters: create a small PerRunIdentity (or pass identity args) that
bridges/claude/bridge.py instantiates per run and pass it into calls that need
auth (e.g., platform/auth.py::_fetch_credential(identity) or other auth call
sites) so no global RunnerContext state is mutated. Ensure you update all
locations that read/write current_user_id/current_user_name/caller_token to use
the new per-request carrier (or function params) and delete the shared fields
from RunnerContext to prevent concurrent leaks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9c9c9b4c-32c7-46c7-bd21-752986bff29a
📒 Files selected for processing (9)
components/backend/handlers/runtime_credentials.gocomponents/backend/handlers/runtime_credentials_test.gocomponents/backend/websocket/agui_proxy.gocomponents/operator/internal/handlers/sessions.gocomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pycomponents/runners/ambient-runner/ambient_runner/endpoints/run.pycomponents/runners/ambient-runner/ambient_runner/platform/auth.pycomponents/runners/ambient-runner/ambient_runner/platform/context.pycomponents/runners/ambient-runner/tests/test_shared_session_credentials.py
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/backend/handlers/runtime_credentials.go (1)
143-157:⚠️ Potential issue | 🔴 CriticalDon't re-expose the legacy project GitHub token to non-owners.
After
enforceCredentialRBACadmits a non-ownereffectiveUserID, this handler still callsgit.GetGitHubToken; the helper incomponents/backend/handlers/git/operations.gofalls back toambient-non-vertex-integrations/GITHUB_TOKENwhen that user has no PAT/App. In a shared session, that gives any participant the same project-wide token and undoes the per-message credential isolation this PR is adding. HaveenforceCredentialRBACreturnsameAsOwner/ownerUserIDand suppress the legacy fallback when it is false.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` around lines 143 - 157, enforceCredentialRBAC currently only returns ok; update it to also return a flag or ownerUserID (e.g., sameAsOwner bool / ownerUserID string) so callers know whether the effectiveUserID is treated as the project owner, then in runtime_credentials.go change the call site to capture that flag and, when sameAsOwner==false, call git.GetGitHubToken in a mode that disables the legacy project-wide fallback (or pass ownerUserID so GetGitHubToken skips ambient-non-vertex-integrations/GITHUB_TOKEN fallback); specifically, modify enforceCredentialRBAC and the call in runtime_credentials.go (where enforceCredentialRBAC and git.GetGitHubToken are used) so non-owners cannot receive the legacy project GitHub token.
♻️ Duplicate comments (7)
components/backend/handlers/runtime_credentials_test.go (1)
174-190:⚠️ Potential issue | 🟠 MajorThese “success” cases still don't prove which user's credentials were selected.
status != 401/403also passes when the handler ignoresX-Runner-Current-User, falls back to the owner, or just returns404/500because no credentials were found. Since the test setup uses generic provider stubs, this suite still misses the main regression this PR is about. Seed distinct owner/collaborator credentials or capture the requestedeffectiveUserID, then assert the response body matches the expected user.Also applies to: 208-219, 235-248, 253-264, 350-363
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials_test.go` around lines 174 - 190, The test only checks for non-401/403 but not which user's credentials were used; update the tests around createCredentialContext and GetGitHubTokenForSession to assert the effective user explicitly by either (a) seeding the fake K8s/provider with distinct owner vs collaborator GitHub token records and asserting the response body or returned token matches the expected user's token, or (b) instrumenting/mocking the git.GetGitHubToken (or the provider stub used by GetGitHubTokenForSession) to capture the effectiveUserID passed in and assert it equals the owner or collaborator as intended; apply the same change pattern to the other affected cases (the blocks noted at lines ~208-219, 235-248, 253-264, 350-363) so each test validates which user was selected rather than only checking status codes.components/operator/internal/handlers/sessions.go (1)
2431-2473:⚠️ Potential issue | 🔴 CriticalMake ServiceAccount attribution reconciliation a hard prerequisite before minting a new token.
Line 2454 skips reconciliation entirely when
ownerUIDbecomes empty, so a reused ServiceAccount can keep a staleambient-code.io/created-by-user-id. And Lines 2468-2470 only warn on update failure beforeCreateToken(...)still runs. That can keep BOT_TOKEN attribution pinned to the previous owner in the restart path this PR is hardening.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/operator/internal/handlers/sessions.go` around lines 2431 - 2473, Ensure ServiceAccount attribution reconciliation always runs before minting a token: when handling the existing SA (variables saAnnotations, ownerUID, existingSA) always compare and reconcile the ambient-code.io/created-by-user-id annotation — if ownerUID is empty remove the annotation if present, otherwise set it to ownerUID; do this regardless of ownerUID being empty or not (don’t skip when len(saAnnotations)==0). Also treat an Update failure as a hard error (return the updErr) instead of only logging a warning so CreateToken (or any subsequent token minting) cannot proceed with stale attribution.components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (2)
107-117:⚠️ Potential issue | 🔴 CriticalShared user/credential state is still racing across concurrent turns.
Line 109 writes request identity onto the shared
RunnerContext, Lines 114-116 repopulate process-global env, and Lines 208-211 clear it again, but the only lock here is the per-thread lock acquired later at Line 165. Two overlapping turns can therefore overwrite or clear each other's identity and credentials before worker setup or_fetch_credential()reads them. Either make this state per-run, or guard the full setup→run→cleanup lifecycle with process-wide synchronization.Also applies to: 165-211
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py` around lines 107 - 117, The code races shared user/credential state because set_current_user on self._context and calls to clear_runtime_credentials / populate_runtime_credentials / populate_mcp_server_credentials run outside the process-wide synchronization used later; to fix, either make credentials and identity per-run (pass a run-scoped credential/identity object into _ensure_ready/_fetch_credential and into any helper that reads env) or wrap the whole setup→run→cleanup lifecycle (the sequence that calls self._context.set_current_user, await self._ensure_ready(), clear_runtime_credentials(), populate_runtime_credentials(...), populate_mcp_server_credentials(...), and the later cleanup) in a process-wide lock so no two turns can interleave; move the set_current_user and _last_creds_refresh assignment inside that protected section (or change helpers to accept run-scoped state) and ensure _fetch_credential reads from the run-scoped object rather than global process env.
119-133:⚠️ Potential issue | 🔴 CriticalRebuild on collaborator → owner/API-key transitions too.
With Line 123,
prev_user="userB"andcurrent_user_id=""yieldsuser_changed == False, so the existing worker and MCP subprocesses survive the transition and keep user B's env-derived credentials.Suggested fix
- user_changed = current_user_id and current_user_id != prev_user and prev_user != "" + user_changed = current_user_id != prev_user🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py` around lines 119 - 133, The current user-change detection misses transitions where current_user_id is empty (owner/API-key) but prev_user was a collaborator; update the user_changed logic so it treats an empty current_user_id as a change when prev_user is non-empty. Replace the current expression using current_user_id and prev_user with something like: user_changed = prev_user != "" and (not current_user_id or current_user_id != prev_user) so that when prev_user is set and current_user_id is empty or different it triggers the existing destroy/rebuild path (affecting self._session_manager.get_existing(thread_id), await self._session_manager.destroy(thread_id), self._rebuild_mcp_servers(), and resetting self._adapter).components/runners/ambient-runner/ambient_runner/platform/auth.py (1)
372-377:⚠️ Potential issue | 🟠 MajorOnly clear MCP keys that were injected for this turn.
The heuristic here still deletes any preexisting
MCP_{SERVER}_{FIELD}env var. That includes static per-server config supplied by the image/operator/user, so the first cleanup can break MCP servers even when no credential was injected. Track the keys set bypopulate_mcp_server_credentials()and clear only that set.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around lines 372 - 377, The current cleanup loop removes any env var matching MCP_{SERVER}_{FIELD} which can delete preexisting static config; change populate_mcp_server_credentials to record and return the exact keys it injects (or append them to a module-level list like injected_mcp_keys), then in the cleanup section use that returned/recorded list instead of computing mcp_cred_keys via os.environ scan; specifically update populate_mcp_server_credentials(...) to return List[str] of injected keys (or push names into injected_mcp_keys), and replace the mcp_cred_keys comprehension and loop with iterating over that returned/recorded list and popping only those keys and appending them to cleared.components/backend/handlers/runtime_credentials.go (2)
28-42:⚠️ Potential issue | 🔴 CriticalDon't key BOT_TOKEN fallback by bare
sessionName.
sessionActiveUserMapstill collapses all active-user state to a single session string, sofooin project A can overwritefooin project B, and overlapping runs in one session race each other. Downstream at Lines 112-113 that can authorize the wrong non-owner or clear the wrong entry. Key it by at least project/session/run, or bind it to the runner/request token instead of a global map.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` around lines 28 - 42, sessionActiveUserMap currently keys only by sessionName which allows collisions across projects and concurrent runs; update SetSessionActiveUser, ClearSessionActiveUser and the consumers (e.g., enforceCredentialRBAC) to use a stronger composite key (for example "projectID:sessionName:runID" or the runner/request token) instead of plain sessionName, store and delete entries using that composite key in sessionActiveUserMap, and update lookup logic in enforceCredentialRBAC to read by the same composite key so entries cannot be overwritten or misattributed across projects or concurrent runs.
121-123:⚠️ Potential issue | 🟠 MajorRedact principal IDs from ordinary backend logs.
These paths still emit raw
authenticatedUserID/effectiveUserIDvalues on RBAC denials and credential fetches. That recreates a general-log trail of sensitive principal identifiers outside the dedicated audit sink; hash or redact principals here and keep full identities only in the audit channel.Also applies to: 159-159, 207-230, 262-262, 299-299
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` around lines 121 - 123, The RBAC-denial logs currently print raw principal IDs (authenticatedUserID, effectiveUserID, ownerUserID) which must be redacted; update the handlers that log RBAC denials (the block that prints "RBAC violation..." and the other occurrences around the credential fetch paths) to replace or hash principal identifiers before calling log.Printf or c.JSON logs—e.g., compute a short deterministic hash or redact string for authenticatedUserID and effectiveUserID and log those instead, while ensuring the full principal values continue to be sent only to the existing audit sink; locate uses of authenticatedUserID, effectiveUserID, ownerUserID in the runtime credential handlers and substitute redacted/hashed values in all ordinary backend logs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 56-75: resolveAuthenticatedUser currently swallows
resolveTokenIdentity failures and returns "" which is treated as BOT_TOKEN
downstream; change resolveAuthenticatedUser to return an explicit result and
error (e.g., (string, error) or a small enum) so callers can distinguish three
outcomes: 1) resolved user id, 2) explicit BOT_TOKEN/service-account marker when
token is a caller SA, and 3) an error for any lookup failure; update
resolveAuthenticatedUser to propagate resolveTokenIdentity errors (don’t convert
them to ""), set the cache key "resolvedUserID" only on successful or BOT_TOKEN
results, and modify callers (the code paths around the existing usage in
resolveAuthenticatedUser callers at the area noted ~lines 104-119) to deny the
request when an error is returned instead of treating "" as BOT_TOKEN.
In `@components/backend/websocket/agui_proxy.go`:
- Line 100: The ClearSessionActiveUser call (and corresponding
SetSessionActiveUser uses) currently keys active-user state by sessionName only,
which causes collisions across namespaces; change the keying to a composite key
that includes the project/namespace plus session (e.g., projectID + "/" +
sessionName or a structured {project,session} key) and update all uses in this
file and the credential-handler to consistently pass and consume that composite
key (replace handlers.ClearSessionActiveUser(sessionName) and
handlers.SetSessionActiveUser(sessionName, ...) with
handlers.ClearSessionActiveUser(compositeKey) /
handlers.SetSessionActiveUser(compositeKey, ...)), ensuring the BOT_TOKEN
fallback validation uses the same composite key.
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 109-145: The code is attaching context.caller_token and BOT_TOKEN
to requests built from a user-overridable BACKEND_API_URL; before adding
Authorization headers in the block that prepares req (and fallback_req),
validate the request target (the variable url / req) against a
non-user-overridable, operator-controlled backend origin or an allowlist of
trusted hostnames resolved from secure config; if the url's origin does not
match the trusted origin, do not add context.caller_token or BOT_TOKEN and
either reject the request or log/warn and return an error/empty response. Update
the logic around use_caller_token / fallback_req to perform this host-origin
check first (use the same check when constructing the initial req and the
fallback_req) so tokens are never sent to untrusted endpoints.
---
Outside diff comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 143-157: enforceCredentialRBAC currently only returns ok; update
it to also return a flag or ownerUserID (e.g., sameAsOwner bool / ownerUserID
string) so callers know whether the effectiveUserID is treated as the project
owner, then in runtime_credentials.go change the call site to capture that flag
and, when sameAsOwner==false, call git.GetGitHubToken in a mode that disables
the legacy project-wide fallback (or pass ownerUserID so GetGitHubToken skips
ambient-non-vertex-integrations/GITHUB_TOKEN fallback); specifically, modify
enforceCredentialRBAC and the call in runtime_credentials.go (where
enforceCredentialRBAC and git.GetGitHubToken are used) so non-owners cannot
receive the legacy project GitHub token.
---
Duplicate comments:
In `@components/backend/handlers/runtime_credentials_test.go`:
- Around line 174-190: The test only checks for non-401/403 but not which user's
credentials were used; update the tests around createCredentialContext and
GetGitHubTokenForSession to assert the effective user explicitly by either (a)
seeding the fake K8s/provider with distinct owner vs collaborator GitHub token
records and asserting the response body or returned token matches the expected
user's token, or (b) instrumenting/mocking the git.GetGitHubToken (or the
provider stub used by GetGitHubTokenForSession) to capture the effectiveUserID
passed in and assert it equals the owner or collaborator as intended; apply the
same change pattern to the other affected cases (the blocks noted at lines
~208-219, 235-248, 253-264, 350-363) so each test validates which user was
selected rather than only checking status codes.
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 28-42: sessionActiveUserMap currently keys only by sessionName
which allows collisions across projects and concurrent runs; update
SetSessionActiveUser, ClearSessionActiveUser and the consumers (e.g.,
enforceCredentialRBAC) to use a stronger composite key (for example
"projectID:sessionName:runID" or the runner/request token) instead of plain
sessionName, store and delete entries using that composite key in
sessionActiveUserMap, and update lookup logic in enforceCredentialRBAC to read
by the same composite key so entries cannot be overwritten or misattributed
across projects or concurrent runs.
- Around line 121-123: The RBAC-denial logs currently print raw principal IDs
(authenticatedUserID, effectiveUserID, ownerUserID) which must be redacted;
update the handlers that log RBAC denials (the block that prints "RBAC
violation..." and the other occurrences around the credential fetch paths) to
replace or hash principal identifiers before calling log.Printf or c.JSON
logs—e.g., compute a short deterministic hash or redact string for
authenticatedUserID and effectiveUserID and log those instead, while ensuring
the full principal values continue to be sent only to the existing audit sink;
locate uses of authenticatedUserID, effectiveUserID, ownerUserID in the runtime
credential handlers and substitute redacted/hashed values in all ordinary
backend logs.
In `@components/operator/internal/handlers/sessions.go`:
- Around line 2431-2473: Ensure ServiceAccount attribution reconciliation always
runs before minting a token: when handling the existing SA (variables
saAnnotations, ownerUID, existingSA) always compare and reconcile the
ambient-code.io/created-by-user-id annotation — if ownerUID is empty remove the
annotation if present, otherwise set it to ownerUID; do this regardless of
ownerUID being empty or not (don’t skip when len(saAnnotations)==0). Also treat
an Update failure as a hard error (return the updErr) instead of only logging a
warning so CreateToken (or any subsequent token minting) cannot proceed with
stale attribution.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 107-117: The code races shared user/credential state because
set_current_user on self._context and calls to clear_runtime_credentials /
populate_runtime_credentials / populate_mcp_server_credentials run outside the
process-wide synchronization used later; to fix, either make credentials and
identity per-run (pass a run-scoped credential/identity object into
_ensure_ready/_fetch_credential and into any helper that reads env) or wrap the
whole setup→run→cleanup lifecycle (the sequence that calls
self._context.set_current_user, await self._ensure_ready(),
clear_runtime_credentials(), populate_runtime_credentials(...),
populate_mcp_server_credentials(...), and the later cleanup) in a process-wide
lock so no two turns can interleave; move the set_current_user and
_last_creds_refresh assignment inside that protected section (or change helpers
to accept run-scoped state) and ensure _fetch_credential reads from the
run-scoped object rather than global process env.
- Around line 119-133: The current user-change detection misses transitions
where current_user_id is empty (owner/API-key) but prev_user was a collaborator;
update the user_changed logic so it treats an empty current_user_id as a change
when prev_user is non-empty. Replace the current expression using
current_user_id and prev_user with something like: user_changed = prev_user !=
"" and (not current_user_id or current_user_id != prev_user) so that when
prev_user is set and current_user_id is empty or different it triggers the
existing destroy/rebuild path (affecting
self._session_manager.get_existing(thread_id), await
self._session_manager.destroy(thread_id), self._rebuild_mcp_servers(), and
resetting self._adapter).
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 372-377: The current cleanup loop removes any env var matching
MCP_{SERVER}_{FIELD} which can delete preexisting static config; change
populate_mcp_server_credentials to record and return the exact keys it injects
(or append them to a module-level list like injected_mcp_keys), then in the
cleanup section use that returned/recorded list instead of computing
mcp_cred_keys via os.environ scan; specifically update
populate_mcp_server_credentials(...) to return List[str] of injected keys (or
push names into injected_mcp_keys), and replace the mcp_cred_keys comprehension
and loop with iterating over that returned/recorded list and popping only those
keys and appending them to cleared.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2cbcb335-8236-4854-89f6-ff2957170466
📒 Files selected for processing (9)
components/backend/handlers/runtime_credentials.gocomponents/backend/handlers/runtime_credentials_test.gocomponents/backend/websocket/agui_proxy.gocomponents/operator/internal/handlers/sessions.gocomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pycomponents/runners/ambient-runner/ambient_runner/endpoints/run.pycomponents/runners/ambient-runner/ambient_runner/platform/auth.pycomponents/runners/ambient-runner/ambient_runner/platform/context.pycomponents/runners/ambient-runner/tests/test_shared_session_credentials.py
| // resolveAuthenticatedUser returns the authenticated user ID for the request. | ||
| // Prefers the middleware-resolved userID; falls back to SelfSubjectReview for | ||
| // forwarded caller tokens that bypass the OAuth proxy. Caches the result on | ||
| // the gin context to avoid repeated K8s API calls within the same request. | ||
| func resolveAuthenticatedUser(c *gin.Context, reqK8s kubernetes.Interface) string { | ||
| if uid := c.GetString("userID"); uid != "" { | ||
| return uid | ||
| } | ||
| // Check cache from a previous call in this request | ||
| if uid := c.GetString("resolvedUserID"); uid != "" { | ||
| return uid | ||
| } | ||
| if reqK8s != nil { | ||
| if resolved, err := resolveTokenIdentity(c.Request.Context(), reqK8s); err == nil { | ||
| uid := strings.ReplaceAll(resolved, ":", "-") | ||
| c.Set("resolvedUserID", uid) | ||
| return uid | ||
| } | ||
| } | ||
| return "" |
There was a problem hiding this comment.
Fail closed when caller identity lookup errors.
Lines 69-75 convert every resolveTokenIdentity failure into "", and Lines 106-114 treat "" as the BOT_TOKEN path. A transient SelfSubjectReview failure can therefore fall through to owner/active-user credentials instead of being rejected. Return an explicit BOT_TOKEN/service-account state from resolveAuthenticatedUser and deny on any other resolution error.
Also applies to: 104-119
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/handlers/runtime_credentials.go` around lines 56 - 75,
resolveAuthenticatedUser currently swallows resolveTokenIdentity failures and
returns "" which is treated as BOT_TOKEN downstream; change
resolveAuthenticatedUser to return an explicit result and error (e.g., (string,
error) or a small enum) so callers can distinguish three outcomes: 1) resolved
user id, 2) explicit BOT_TOKEN/service-account marker when token is a caller SA,
and 3) an error for any lookup failure; update resolveAuthenticatedUser to
propagate resolveTokenIdentity errors (don’t convert them to ""), set the cache
key "resolvedUserID" only on successful or BOT_TOKEN results, and modify callers
(the code paths around the existing usage in resolveAuthenticatedUser callers at
the area noted ~lines 104-119) to deny the request when an error is returned
instead of treating "" as BOT_TOKEN.
| sessionLastSeen.Delete(sessionName) | ||
| sessionPortMap.Delete(sessionName) | ||
| sessionProjectMap.Delete(sessionName) | ||
| handlers.ClearSessionActiveUser(sessionName) |
There was a problem hiding this comment.
Key active-user state by project/session, not sessionName alone.
These writes use only sessionName, but AgenticSession names are namespace-scoped. projectA/foo and projectB/foo will overwrite or clear each other's active user in handlers.SetSessionActiveUser(...) / ClearSessionActiveUser(...), so the BOT_TOKEN fallback can validate against the wrong human. Use a composite key consistently on both the proxy and credential-handler sides.
Also applies to: 261-265
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/websocket/agui_proxy.go` at line 100, The
ClearSessionActiveUser call (and corresponding SetSessionActiveUser uses)
currently keys active-user state by sessionName only, which causes collisions
across namespaces; change the keying to a composite key that includes the
project/namespace plus session (e.g., projectID + "/" + sessionName or a
structured {project,session} key) and update all uses in this file and the
credential-handler to consistently pass and consume that composite key (replace
handlers.ClearSessionActiveUser(sessionName) and
handlers.SetSessionActiveUser(sessionName, ...) with
handlers.ClearSessionActiveUser(compositeKey) /
handlers.SetSessionActiveUser(compositeKey, ...)), ensuring the BOT_TOKEN
fallback validation uses the same composite key.
…validation - Key sessionActiveUserMap by project/session (not bare sessionName) to prevent cross-namespace collisions - Distinguish BOT_TOKEN from resolution failure in resolveAuthenticatedUser (fail closed on SelfSubjectReview errors, open on confirmed SA tokens) - Make SA annotation update a hard failure (blocks token minting if annotation can't be reconciled) - Validate BACKEND_API_URL is a cluster-internal host before sending bearer tokens (prevents token exfiltration via user-overridden env) - Fix user_changed detection to trigger on all transitions including non-owner → owner and owner → non-owner Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nstant - Fix unchecked type assertion on sessionActiveUserMap value (could panic) - Parallelize credential fetches with asyncio.gather (4 sequential HTTP calls → concurrent, reduces latency from ~4x to ~1x timeout) - Extract _GOOGLE_WORKSPACE_CREDS_FILE constant to keep populate and clear paths in sync Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py (1)
113-117:⚠️ Potential issue | 🔴 CriticalProcess-wide credential env is still unsynchronized across concurrent turns.
clear_runtime_credentials()/populate_*()mutateos.environ, but they run before the per-thread lock is acquired and are cleared again under that same per-thread lock. Two differentthread_ids can still overwrite or delete each other's credentials mid-run. Guard the whole credential lifecycle with a process-wide lock, or snapshot/restore per-run state instead of mutating shared process env.Also applies to: 166-211
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py` around lines 113 - 117, Concurrent runs mutate os.environ via clear_runtime_credentials(), populate_runtime_credentials(), and populate_mcp_server_credentials() outside the per-thread lock causing credentials races; protect the entire credential lifecycle by either (A) introducing and acquiring a process-wide lock (e.g., credential_lock) before calling clear_runtime_credentials(), populate_runtime_credentials(), populate_mcp_server_credentials() and hold it until credentials are no longer needed (also update where _last_creds_refresh is set), or (B) avoid mutating shared env by snapshotting os.environ before modifying it and restoring it after the run (wrap snapshot/modify/restore around those calls); update all similar blocks (including the 166-211 region) to use the same strategy and reference the same functions so concurrent thread_id runs can’t overwrite each other’s credentials.components/backend/handlers/runtime_credentials.go (1)
80-89:⚠️ Potential issue | 🔴 CriticalSelfSubjectReview failures still fail open as BOT token access.
This branch converts any
resolveTokenIdentity(...)error intoisBotToken=truewheneverX-Forwarded-Useris absent. A transient SSR failure or a direct bearer-token request that bypasses the OAuth proxy can then fall through to owner/active-user credential access. Only return BOT token status after positive ServiceAccount identification; deny on lookup errors.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/backend/handlers/runtime_credentials.go` around lines 80 - 89, The code currently treats any resolveTokenIdentity/SelfSubjectReview error as isBotToken=true when X-Forwarded-User is empty; change that so you only return isBotToken=true after a positive ServiceAccount identification (e.g., when the resolved identity explicitly indicates kind "ServiceAccount" or similar), and on lookup/SSR errors without confirmation you must deny (return "", false). Update the branch in resolveTokenIdentity (the block checking c.GetHeader("X-Forwarded-User") and the SelfSubjectReview error handling) to: if X-Forwarded-User is empty and the resolved subject is confirmed ServiceAccount then return "", true; otherwise log the error and return "", false so transient SSR failures do not fall back to BOT_TOKEN.components/operator/internal/handlers/sessions.go (1)
2452-2474:⚠️ Potential issue | 🔴 CriticalRemove the stale
created-by-user-idannotation when the session owner is now empty.This reconciliation only runs when
saAnnotationsis non-empty. Ifspec.userContext.userIdhas been cleared, the oldambient-code.io/created-by-user-idstays on the existingServiceAccount, and the freshly minted BOT token is still attributed to the previous owner. Reconcile deletion as well before callingCreateToken(...).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/operator/internal/handlers/sessions.go` around lines 2452 - 2474, The current logic only updates annotations when saAnnotations is non-empty, so a cleared session owner leaves the stale ambient-code.io/created-by-user-id on the ServiceAccount; change reconciliation to always fetch the existing ServiceAccount (call CoreV1().ServiceAccounts(...).Get(...) regardless of len(saAnnotations)) and then compare existingSA.Annotations to saAnnotations: for each key in saAnnotations set/overwrite as now, and for any annotation key present in existingSA.Annotations but missing from saAnnotations (specifically "ambient-code.io/created-by-user-id") delete that key from existingSA.Annotations, set needsUpdate = true, and call Update(...) when needsUpdate; keep using the same existingSA/saAnnotations/needsUpdate variables and ensure CreateToken(...) is called after this cleanup.components/runners/ambient-runner/ambient_runner/platform/auth.py (1)
106-113:⚠️ Potential issue | 🔴 CriticalLoopback should not be trusted here.
BACKEND_API_URLis still user-overridable incomponents/operator/internal/handlers/sessions.go, Lines 1203-1205. Allowinglocalhost/127.0.0.1means a session can repoint credential fetches to a listener inside the runner pod and receivecaller_tokenorBOT_TOKEN. Restrict this to the operator-controlled service DNS (or another non-user-overridable allowlist).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/runners/ambient-runner/ambient_runner/platform/auth.py` around lines 106 - 113, Current code in auth.py uses parsed = urlparse(base) and allows localhost/127.0.0.1 which lets a user override BACKEND_API_URL to exfiltrate tokens; remove "localhost" and "127.0.0.1" from the allowed-host check so only cluster-controlled hosts are permitted (e.g., parsed.hostname.endswith(".svc.cluster.local") or an explicit operator-controlled DNS constant/setting), update the condition around parsed.hostname accordingly, and ensure the rejection path (the logger.error at "Refusing to send credentials to external host: {parsed.hostname}") actually prevents sending credentials (e.g., return/raise or otherwise stop credential use) so loopback addresses can no longer be targeted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/websocket/agui_proxy.go`:
- Around line 263-267: The code sets a session-scoped fallback via
handlers.SetSessionActiveUser(projectName, sessionName, senderUserID) but never
clears it at turn end; update proxyRunnerStream to accept run-scoped context
(e.g., projectName + runID or sessionRunID) and set the active-user using that
run-scoped key instead of session-only, then ensure you clear or rotate the
entry when the proxied turn finishes (use a defer/cleanup in proxyRunnerStream
exit path) so enforceCredentialRBAC cannot authorize a BOT_TOKEN after the turn
boundary; apply the same change to the other occurrence noted (around lines
359-360) to make the fallback run-scoped and removed on exit.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 119-133: The current flow calls populate_mcp_server_credentials()
each run but only calls _rebuild_mcp_servers() when the user changes, leaving
self._mcp_servers stale for same-user turns; change the logic so that after
fetching fresh credentials you always refresh the MCP server definitions (call
_rebuild_mcp_servers()) before any adapter reuse or agent spawn so MCP
subprocesses get up-to-date tokens, and keep the existing forced adapter rebuild
(self._adapter = None) when appropriate; update the block around
user_changed/current_user_id handling to invoke _rebuild_mcp_servers() for both
user-changed and same-user runs (or alternatively defer credential expansion
into the code that spawns MCP subprocesses) so that
populate_mcp_server_credentials(), _rebuild_mcp_servers(), self._mcp_servers and
self._adapter are kept in sync.
---
Duplicate comments:
In `@components/backend/handlers/runtime_credentials.go`:
- Around line 80-89: The code currently treats any
resolveTokenIdentity/SelfSubjectReview error as isBotToken=true when
X-Forwarded-User is empty; change that so you only return isBotToken=true after
a positive ServiceAccount identification (e.g., when the resolved identity
explicitly indicates kind "ServiceAccount" or similar), and on lookup/SSR errors
without confirmation you must deny (return "", false). Update the branch in
resolveTokenIdentity (the block checking c.GetHeader("X-Forwarded-User") and the
SelfSubjectReview error handling) to: if X-Forwarded-User is empty and the
resolved subject is confirmed ServiceAccount then return "", true; otherwise log
the error and return "", false so transient SSR failures do not fall back to
BOT_TOKEN.
In `@components/operator/internal/handlers/sessions.go`:
- Around line 2452-2474: The current logic only updates annotations when
saAnnotations is non-empty, so a cleared session owner leaves the stale
ambient-code.io/created-by-user-id on the ServiceAccount; change reconciliation
to always fetch the existing ServiceAccount (call
CoreV1().ServiceAccounts(...).Get(...) regardless of len(saAnnotations)) and
then compare existingSA.Annotations to saAnnotations: for each key in
saAnnotations set/overwrite as now, and for any annotation key present in
existingSA.Annotations but missing from saAnnotations (specifically
"ambient-code.io/created-by-user-id") delete that key from
existingSA.Annotations, set needsUpdate = true, and call Update(...) when
needsUpdate; keep using the same existingSA/saAnnotations/needsUpdate variables
and ensure CreateToken(...) is called after this cleanup.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`:
- Around line 113-117: Concurrent runs mutate os.environ via
clear_runtime_credentials(), populate_runtime_credentials(), and
populate_mcp_server_credentials() outside the per-thread lock causing
credentials races; protect the entire credential lifecycle by either (A)
introducing and acquiring a process-wide lock (e.g., credential_lock) before
calling clear_runtime_credentials(), populate_runtime_credentials(),
populate_mcp_server_credentials() and hold it until credentials are no longer
needed (also update where _last_creds_refresh is set), or (B) avoid mutating
shared env by snapshotting os.environ before modifying it and restoring it after
the run (wrap snapshot/modify/restore around those calls); update all similar
blocks (including the 166-211 region) to use the same strategy and reference the
same functions so concurrent thread_id runs can’t overwrite each other’s
credentials.
In `@components/runners/ambient-runner/ambient_runner/platform/auth.py`:
- Around line 106-113: Current code in auth.py uses parsed = urlparse(base) and
allows localhost/127.0.0.1 which lets a user override BACKEND_API_URL to
exfiltrate tokens; remove "localhost" and "127.0.0.1" from the allowed-host
check so only cluster-controlled hosts are permitted (e.g.,
parsed.hostname.endswith(".svc.cluster.local") or an explicit
operator-controlled DNS constant/setting), update the condition around
parsed.hostname accordingly, and ensure the rejection path (the logger.error at
"Refusing to send credentials to external host: {parsed.hostname}") actually
prevents sending credentials (e.g., return/raise or otherwise stop credential
use) so loopback addresses can no longer be targeted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f4aff76b-f053-4e99-8fe6-004bcc4b51e8
📒 Files selected for processing (5)
components/backend/handlers/runtime_credentials.gocomponents/backend/websocket/agui_proxy.gocomponents/operator/internal/handlers/sessions.gocomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.pycomponents/runners/ambient-runner/ambient_runner/platform/auth.py
| // Track active user so credential endpoints can validate BOT_TOKEN | ||
| // fallback requests when the caller token expires during long runs. | ||
| if senderUserID != "" { | ||
| handlers.SetSessionActiveUser(projectName, sessionName, senderUserID) | ||
| } |
There was a problem hiding this comment.
Expire the active-user fallback when the proxied turn ends.
This stores the non-owner fallback user, but nothing clears it when the proxied turn finishes; the only cleanup is the 1-hour stale-session sweeper. Since enforceCredentialRBAC() uses this map to authorize BOT_TOKEN requests for non-owners, the last collaborator can remain authorized after the turn boundary. Pass project/run context into proxyRunnerStream and clear or rotate the entry on exit; if overlapping turns are possible, make it run-scoped instead of session-scoped.
Also applies to: 359-360
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/websocket/agui_proxy.go` around lines 263 - 267, The code
sets a session-scoped fallback via handlers.SetSessionActiveUser(projectName,
sessionName, senderUserID) but never clears it at turn end; update
proxyRunnerStream to accept run-scoped context (e.g., projectName + runID or
sessionRunID) and set the active-user using that run-scoped key instead of
session-only, then ensure you clear or rotate the entry when the proxied turn
finishes (use a defer/cleanup in proxyRunnerStream exit path) so
enforceCredentialRBAC cannot authorize a BOT_TOKEN after the turn boundary;
apply the same change to the other occurrence noted (around lines 359-360) to
make the fallback run-scoped and removed on exit.
| # If the caller changed, destroy the worker and rebuild MCP servers + | ||
| # adapter so the new ClaudeSDKClient gets fresh mcp_servers config. | ||
| # The session ID is preserved — --resume works because each SDK client | ||
| # is a new CLI subprocess that spawns fresh MCP servers from os.environ. | ||
| user_changed = current_user_id != prev_user | ||
| if user_changed and self._session_manager.get_existing(thread_id): | ||
| logger.info( | ||
| f"User changed for thread={thread_id}, " | ||
| "rebuilding MCP servers and adapter with new credentials" | ||
| ) | ||
| await self._session_manager.destroy(thread_id) | ||
| self._rebuild_mcp_servers() | ||
| # Force adapter rebuild so ClaudeAgentOptions uses new mcp_servers | ||
| self._adapter = None | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and find relevant files
find . -type f -name "*.py" | grep -E "(bridge|credential|auth|mcp)" | head -20Repository: ambient-code/platform
Length of output: 1627
🏁 Script executed:
# Get more context around the reviewed code section
cd components/runners/ambient-runner/ambient_runner/bridges/claude
wc -l bridge.pyRepository: ambient-code/platform
Length of output: 78
🏁 Script executed:
# Look at the full context of the bridge file, especially around credentials and rebuilding
head -150 bridge.py | tail -80Repository: ambient-code/platform
Length of output: 133
🏁 Script executed:
# Search for credential/token refresh mechanisms
rg -n "credential|token|refresh|env|environ" bridge.py -A 2 -B 2 | head -100Repository: ambient-code/platform
Length of output: 114
🏁 Script executed:
cat -n components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py | head -150 | tail -100Repository: ambient-code/platform
Length of output: 4700
🏁 Script executed:
# Search for _rebuild_mcp_servers and credential handling
rg -n "_rebuild_mcp_servers|credential|token|refresh" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -A 3 -B 2Repository: ambient-code/platform
Length of output: 4468
🏁 Script executed:
# Check the auth.py file for credential refresh logic
cat -n components/runners/ambient-runner/ambient_runner/bridges/claude/auth.pyRepository: ambient-code/platform
Length of output: 5370
🏁 Script executed:
# Check mcp.py to understand how MCP servers are spawned
head -100 components/runners/ambient-runner/ambient_runner/bridges/claude/mcp.pyRepository: ambient-code/platform
Length of output: 3226
🏁 Script executed:
# Check _ensure_adapter to see if it rebuilds the adapter for same-user turns
rg -n "_ensure_adapter|_adapter" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 2 -A 5 | head -80Repository: ambient-code/platform
Length of output: 3037
🏁 Script executed:
# Check session manager to understand when workers are reused
rg -n "session_manager|get_existing|destroy" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 1 -A 3Repository: ambient-code/platform
Length of output: 4308
🏁 Script executed:
# Check how populate_mcp_server_credentials sets credentials
rg -n "populate_mcp_server_credentials|populate_runtime_credentials" components/runners/ambient-runner/ambient_runner/platform/auth.py -B 2 -A 10 | head -100Repository: ambient-code/platform
Length of output: 2323
🏁 Script executed:
# Let's understand how the adapter's mcp_servers are used and when they're set
rg -n "mcp_servers|ClaudeAgentOptions" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 2 -A 3 | head -100Repository: ambient-code/platform
Length of output: 3397
🏁 Script executed:
# Check how credentials flow through the MCP config expansion
cat -n components/runners/ambient-runner/ambient_runner/platform/auth.py | sed -n '412,500p'Repository: ambient-code/platform
Length of output: 4192
🏁 Script executed:
# Check how the adapter uses mcp_servers when building options
rg -n "build_mcp_servers|load_mcp_config" components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py -B 2 -A 8Repository: ambient-code/platform
Length of output: 2642
Credentials refreshed per-turn don't reach MCP servers on same-user turns.
You call populate_mcp_server_credentials() every run to fetch fresh tokens and set them in os.environ, but _rebuild_mcp_servers() only rebuilds the _mcp_servers dict when the user changes. For same-user turns, the adapter reuses the stale dict from the previous expansion, and any short-lived credentials (e.g., JIRA tokens) become expired when MCP subprocesses spawn. Rebuild _mcp_servers on every run, or defer credential expansion from dict-build-time to MCP-spawn-time.
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 126-127: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py`
around lines 119 - 133, The current flow calls populate_mcp_server_credentials()
each run but only calls _rebuild_mcp_servers() when the user changes, leaving
self._mcp_servers stale for same-user turns; change the logic so that after
fetching fresh credentials you always refresh the MCP server definitions (call
_rebuild_mcp_servers()) before any adapter reuse or agent spawn so MCP
subprocesses get up-to-date tokens, and keep the existing forced adapter rebuild
(self._adapter = None) when appropriate; update the block around
user_changed/current_user_id handling to invoke _rebuild_mcp_servers() for both
user-changed and same-user runs (or alternatively defer credential expansion
into the code that spawns MCP subprocesses) so that
populate_mcp_server_credentials(), _rebuild_mcp_servers(), self._mcp_servers and
self._adapter are kept in sync.
## Summary - **Security fix**: Shared sessions previously used the session owner's credentials for all runs, allowing User B to impersonate User A's GitHub/Jira/Google/GitLab credentials. Each message now uses the sender's own credentials. - **Backward compatible**: API keys and scheduled sessions continue using creator's credentials via fallback pattern (`effectiveUserID = currentUserID || ownerUserID`) - **Credential isolation**: Credentials are cleared from environment after each turn to prevent cross-user leakage ## Changes ### Backend (`runtime_credentials.go`) - Extract `getEffectiveUserID()` and `checkCredentialRBAC()` helpers used by all 4 credential handlers (GitHub, Google, Jira, GitLab) - Resolve effective user from `X-Runner-Current-User` header, falling back to session owner - Add audit logging: `CREDENTIAL_ACCESS` log line on every credential fetch ### Proxy (`agui_proxy.go`) - Forward `X-Current-User-ID` / `X-Current-User-Name` headers from backend to runner - Merged into existing `proxyRunnerStream` / `connectToRunner` functions (no duplication) ### Runner (Python) - `context.py`: Added `current_user_id`, `current_user_name` fields + `set_current_user()` method - `auth.py`: Send `X-Runner-Current-User` header when fetching credentials; added `clear_runtime_credentials()` and `sanitize_user_context()` - `endpoints/run.py`: Extract user headers from request, set on context before credential fetch - `bridge.py`: Clear credentials after each turn (opt-out via `KEEP_CREDENTIALS_PERSISTENT=true`) ### Tests - **17 Go tests** (Ginkgo): shared sessions, API keys, scheduled sessions, RBAC violations - **13 Python tests** (pytest): context management, credential clearing, header sending, lifecycle ### Docs - `features/session-sharing.md`: User guide for credential behavior - `guides/migrating-shared-sessions.md`: Migration guide with rollback flag ## Test plan - [ ] Verify shared session: User A creates session, User B sends message → B's credentials are used - [ ] Verify owner access: Session owner's messages still use owner's credentials - [ ] Verify API key sessions: API key sessions use creator's credentials (no header sent) - [ ] Verify scheduled sessions: Scheduled sessions use creator's credentials (fallback) - [ ] Verify credential clearing: After turn completes, env vars are wiped - [ ] Run `go test ./handlers` — all pass - [ ] Run `uv run pytest tests/test_shared_session_credentials.py` — all 13 pass 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Ambient Code Bot <bot@ambient-code.local> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
effectiveUserID = currentUserID || ownerUserID)Changes
Backend (
runtime_credentials.go)getEffectiveUserID()andcheckCredentialRBAC()helpers used by all 4 credential handlers (GitHub, Google, Jira, GitLab)X-Runner-Current-Userheader, falling back to session ownerCREDENTIAL_ACCESSlog line on every credential fetchProxy (
agui_proxy.go)X-Current-User-ID/X-Current-User-Nameheaders from backend to runnerproxyRunnerStream/connectToRunnerfunctions (no duplication)Runner (Python)
context.py: Addedcurrent_user_id,current_user_namefields +set_current_user()methodauth.py: SendX-Runner-Current-Userheader when fetching credentials; addedclear_runtime_credentials()andsanitize_user_context()endpoints/run.py: Extract user headers from request, set on context before credential fetchbridge.py: Clear credentials after each turn (opt-out viaKEEP_CREDENTIALS_PERSISTENT=true)Tests
Docs
features/session-sharing.md: User guide for credential behaviorguides/migrating-shared-sessions.md: Migration guide with rollback flagTest plan
go test ./handlers— all passuv run pytest tests/test_shared_session_credentials.py— all 13 pass🤖 Generated with Claude Code