feat(auth): external trusted JWT auth + JIT user mapping #13293
erichare wants to merge 3 commits
Adds the OSS half of trusted external identity support:

- New EXTERNAL_AUTH_* AuthSettings (off by default) covering provider key, token transport (header/cookie), JWKS or trusted-decode, claim mapping, and a pluggable EXTERNAL_AUTH_IDENTITY_RESOLVER import path.
- New services/auth/external.py with JWT/JWKS validation, identity resolver protocol, and token extraction helpers.
- AuthService.get_or_create_user_from_claims + extract_user_info_from_claims implement the existing BaseAuthService JIT hook through SSOUserProfile - no new tables.
- _authenticate_with_token falls back to external resolution when the native JWT path fails, so Authorization-header callers transparently upgrade to external auth.
- Token extractors in services/auth/utils.py consult the configured external header/cookie after the native JWT path on session, WebSocket, SSE, and optional-user dependencies.
- /api/v1/session catches AuthenticationError so external-credential failures resolve to authenticated=False rather than 500.

Tests: 14 unit tests for external.py (JWT decode, claim mapping, custom resolver) plus 3 integration tests in test_login.py exercising the session endpoint JIT path (header + cookie + expired-token).

Co-Authored-By: phact <estevezsebastian@gmail.com>
Co-Authored-By: Lucas Oliveira <62335616+lucaseduoli@users.noreply.github.com>
Based-On: #13280
Walkthrough
This PR introduces external trusted-identity authentication with just-in-time user provisioning. Langflow can now accept external JWTs from upstream identity providers, resolve them to
Changes
External Trusted-Identity Authentication with JIT Provisioning
✅ Test Coverage Advisor
No source changes detected without accompanying tests. Thanks for keeping coverage up! 🎉
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## release-1.10.0 #13293 +/- ##
==================================================
- Coverage 55.90% 55.65% -0.25%
==================================================
Files 2180 2180
Lines 206712 206034 -678
Branches 30620 31102 +482
==================================================
- Hits 115561 114670 -891
- Misses 89824 90036 +212
- Partials 1327 1328 +1
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/backend/base/langflow/services/auth/utils.py (1)
43-57: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
Keep the external credential available when a native token is present but invalid.
These helpers reduce auth to a single token string. If a browser sends a stale `access_token_lf` cookie and a valid external header/cookie, the stale local token wins here, and `AuthService`'s external fallback never sees the real upstream credential. That blocks the advertised “native first, external second” behavior until the old cookie is cleared. Consider threading both candidates through auth, or re-resolving the external credential on native JWT failure.
Also applies to: 212-216, 238-239, 308-309
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/backend/base/langflow/services/auth/utils.py` around lines 43 - 57, The current __call__ in the token resolver returns the first present token (Authorization header or access_token_lf cookie) and never preserves the external credential returned by _get_external_token, which prevents AuthService from falling back when the native token is present but later determined invalid; change the logic so both candidates are made available to AuthService (e.g., return a small struct/tuple or attach the external token to the request context) rather than returning a single string: keep the native token as the primary candidate (from get_authorization_scheme_param or access_token_lf) but also capture/return the external value from _get_external_token (or re-resolve it on native JWT failure) so AuthService can attempt native validation first and then use the external credential if native validation fails; update all analogous spots (lines referenced: 212-216, 238-239, 308-309) to follow the same dual-token pattern and ensure AuthService consumes the combined result.
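The "thread both candidates through auth" option can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `TokenCandidates`, `authenticate`, and the two callables are hypothetical names, the local `AuthenticationError` is a stand-in for the service's own exception, and the real code path is async while this sketch is synchronous for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class AuthenticationError(Exception):
    """Stand-in for the service's own authentication error (assumption)."""


@dataclass(frozen=True)
class TokenCandidates:
    """Hypothetical container holding both credentials seen on a request."""

    native: Optional[str] = None
    external: Optional[str] = None


def authenticate(
    candidates: TokenCandidates,
    validate_native: Callable[[str], str],
    resolve_external: Callable[[str], str],
) -> str:
    """Try the native token first, then fall back to the external credential."""
    if candidates.native:
        try:
            return validate_native(candidates.native)
        except AuthenticationError:
            # A stale or invalid native token must not hide the external one.
            pass
    if candidates.external:
        return resolve_external(candidates.external)
    raise AuthenticationError("no valid credential supplied")
```

With this shape, a stale native cookie no longer masks a valid upstream credential, because the extractor returns both values instead of choosing one up front.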
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2b1360dd-f11f-4349-b3c8-af9f60eb1000
📒 Files selected for processing (8)
- .secrets.baseline
- src/backend/base/langflow/api/v1/login.py
- src/backend/base/langflow/services/auth/external.py
- src/backend/base/langflow/services/auth/service.py
- src/backend/base/langflow/services/auth/utils.py
- src/backend/tests/unit/services/auth/test_external_auth.py
- src/backend/tests/unit/test_login.py
- src/lfx/src/lfx/services/settings/auth.py
async def _fetch_jwks(jwks_url: str) -> dict[str, Any]:
    cached = _jwks_cache.get(jwks_url)
    now = time.monotonic()
    if cached and cached[0] > now:
        return cached[1]

    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.get(jwks_url)
        response.raise_for_status()
        jwks = response.json()

    _jwks_cache[jwks_url] = (now + JWKS_CACHE_TTL_SECONDS, jwks)
    return jwks
Refresh JWKS once on kid misses.
With the current cache behavior, an IdP key rotation can break every new token for up to 300 seconds: _fetch_jwks() returns the stale set, _select_jwk() cannot find the new kid, and we fail without reloading. Please force a one-time JWKS refresh before rejecting the token.
Also applies to: 203-205
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/backend/base/langflow/services/auth/external.py` around lines 134 - 146,
The JWKS cache can serve a stale key after IdP rotation; modify the logic so
that when _select_jwk(...) fails to find a matching kid you perform a one-time
forced refresh of the JWKS and retry before rejecting the token. Implement this
by adding a cache-bypass or force_refresh argument to _fetch_jwks(jwks_url,
force_refresh=False) (or by deleting _jwks_cache[jwks_url] before calling), then
in the code path that selects the JWK (the _select_jwk or token verification
flow) if the kid lookup returns None: clear or bypass the cached entry, call
_fetch_jwks(..., force_refresh=True) to reload, and attempt the kid lookup once
more; only if it still misses then return the original error.
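The refresh-once-on-`kid`-miss behavior requested above might look like the sketch below. It is an illustration under assumptions rather than the PR's implementation: the HTTP call is replaced by an injected `fetcher` coroutine so the example stays self-contained, and `fetch_jwks`, `select_jwk`, and `find_key` are simplified stand-ins for `_fetch_jwks`, `_select_jwk`, and the verification flow.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Optional

JWKS_CACHE_TTL_SECONDS = 300.0  # the 300-second window mentioned in the comment
_jwks_cache: dict[str, tuple[float, dict[str, Any]]] = {}

# Assumption: the network call is injected instead of using httpx directly.
Fetcher = Callable[[str], Awaitable[dict[str, Any]]]


async def fetch_jwks(
    url: str, fetcher: Fetcher, *, force_refresh: bool = False
) -> dict[str, Any]:
    """Return the cached JWKS unless it expired or a refresh is forced."""
    cached = _jwks_cache.get(url)
    now = time.monotonic()
    if not force_refresh and cached and cached[0] > now:
        return cached[1]
    jwks = await fetcher(url)
    _jwks_cache[url] = (now + JWKS_CACHE_TTL_SECONDS, jwks)
    return jwks


def select_jwk(jwks: dict[str, Any], kid: str) -> Optional[dict[str, Any]]:
    """Pick the key whose kid matches, or None when it is absent."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None


async def find_key(url: str, kid: str, fetcher: Fetcher) -> Optional[dict[str, Any]]:
    """Look up kid; on a miss, reload the JWKS exactly once and retry."""
    jwk = select_jwk(await fetch_jwks(url, fetcher), kid)
    if jwk is None:
        jwk = select_jwk(await fetch_jwks(url, fetcher, force_refresh=True), kid)
    return jwk
```

One design consideration: because every unknown `kid` triggers an upstream fetch, a deployment may want to rate-limit forced refreshes so that tokens with random `kid` values cannot be used to hammer the IdP.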
    scheme, _, token = stripped.partition(" ")
    if scheme.lower() == "bearer" and token:
        return token.strip() or None
    return stripped
Treat bare Bearer headers as missing credentials.
Authorization: Bearer currently falls through and returns "Bearer" as the token. That makes us validate the scheme name itself and, in the header-first path, it also prevents a valid cookie credential from being tried.
Proposed fix
 def extract_bearer_or_raw_token(value: str | None) -> str | None:
     """Strip a 'Bearer ' prefix if present and return the credential."""
     if not value:
         return None
     stripped = value.strip()
     if not stripped:
         return None
     scheme, _, token = stripped.partition(" ")
-    if scheme.lower() == "bearer" and token:
-        return token.strip() or None
+    if scheme.lower() == "bearer":
+        return token.strip() or None
     return stripped
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/backend/base/langflow/services/auth/external.py` around lines 306 - 309,
The current extraction logic returns the whole stripped header when a bare
"Bearer" is provided (scheme == "bearer" and token is empty), causing the string
"Bearer" to be treated as a credential; change the branch so that when
scheme.lower() == "bearer" and token is empty you return None instead of falling
through to return stripped. Locate the code that assigns scheme, _, token =
stripped.partition(" ") (using variables scheme, token, stripped) and update the
conditional to explicitly return None for bare Bearer headers so downstream
logic can try alternate credentials (e.g., cookies).
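To sanity-check the proposed change in isolation, the patched helper can be copied into a standalone snippet and exercised against the cases discussed here. The function body follows the proposed fix; the only deviation is `Optional[str]` in place of `str | None`, used so the snippet also runs on older Python versions.

```python
from typing import Optional


def extract_bearer_or_raw_token(value: Optional[str]) -> Optional[str]:
    """Strip a 'Bearer ' prefix if present and return the credential."""
    if not value:
        return None
    stripped = value.strip()
    if not stripped:
        return None
    scheme, _, token = stripped.partition(" ")
    if scheme.lower() == "bearer":
        # A bare "Bearer" yields an empty token and is treated as missing.
        return token.strip() or None
    # No Bearer scheme: treat the whole value as a raw credential.
    return stripped


# Cases from the review comment.
print(extract_bearer_or_raw_token("Bearer abc"))  # token is returned
print(extract_bearer_or_raw_token("Bearer"))      # bare scheme is rejected
print(extract_bearer_or_raw_token("raw-token"))   # raw values pass through
```

Returning `None` for the bare scheme is what lets the header-first path move on to a cookie credential.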
now = datetime.now(timezone.utc)
profile.email = identity.email
profile.sso_last_login_at = now
profile.updated_at = now
Don't clear the stored email when the upstream token omits it.
ExternalIdentity.email is optional, so this assignment erases a previously known address on any later login where the resolver does not emit email. Preserve the existing value unless a non-null email is supplied.
Proposed fix
 now = datetime.now(timezone.utc)
-profile.email = identity.email
+if identity.email is not None:
+    profile.email = identity.email
 profile.sso_last_login_at = now
 profile.updated_at = now
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/backend/base/langflow/services/auth/service.py` around lines 292 - 295,
The code currently unconditionally assigns profile.email = identity.email which
can erase a previously stored email when ExternalIdentity.email is omitted;
modify the logic in the authentication/update block that handles
ExternalIdentity (the code that sets profile.email, profile.sso_last_login_at,
profile.updated_at) so that you only overwrite profile.email when identity.email
is non-null/non-empty (e.g., check identity.email is not None/empty before
assigning), while still updating profile.sso_last_login_at and
profile.updated_at as before.
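The effect of the guarded assignment can be shown with a small standalone sketch. `Profile` and `Identity` are hypothetical stand-ins for `SSOUserProfile` and `ExternalIdentity`, reduced to the fields touched here, and `touch_profile` is an illustrative name rather than a function from the PR.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Profile:
    """Hypothetical stand-in for SSOUserProfile (only the fields used here)."""

    email: Optional[str] = None
    sso_last_login_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None


@dataclass
class Identity:
    """Hypothetical stand-in for ExternalIdentity; email is optional."""

    email: Optional[str] = None


def touch_profile(profile: Profile, identity: Identity) -> Profile:
    """Record a login without erasing a known email when none is supplied."""
    now = datetime.now(timezone.utc)
    if identity.email is not None:
        profile.email = identity.email
    profile.sso_last_login_at = now
    profile.updated_at = now
    return profile
```

Note that an `is not None` check still lets an empty string overwrite the stored address; a truthiness check (`if identity.email:`) would cover the "non-empty" wording in the prompt above, if that stricter behavior is wanted.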
Summary
Adds OSS support for trusted external identity: accept a JWT (or other credential) issued or validated by an upstream IdP / proxy, and JIT-provision a local Langflow user via the existing `SSOUserProfile` table. Off by default; opt in with `LANGFLOW_EXTERNAL_AUTH_ENABLED=true`.

This PR is independent of #13153 / #13290 (RBAC work); it targets `release-1.10.0` directly. Authentication and authorization are separable concerns; RBAC doesn't require this, and this works with or without RBAC. Originally proposed as part of #13280 by @lucaseduoli and @phact.

What's in this PR
- New `EXTERNAL_AUTH_*` `AuthSettings` (off by default) covering provider key, token transport (header / cookie), JWKS or trusted-decode, claim mapping, and a pluggable `EXTERNAL_AUTH_IDENTITY_RESOLVER` import path.
- New `services/auth/external.py` with JWT/JWKS validation, identity resolver protocol, and token-extraction helpers.
- `AuthService.get_or_create_user_from_claims` + `extract_user_info_from_claims` implement the existing `BaseAuthService` JIT hook through `SSOUserProfile`; no new tables.
- `_authenticate_with_token` falls back to external resolution when the native JWT path fails, so `Authorization`-header callers transparently upgrade to external auth.
- Token extractors in `services/auth/utils.py` consult the configured external header / cookie after the native JWT path on session, WebSocket, SSE, and optional-user dependencies.
- `/api/v1/session` catches `AuthenticationError` so external-credential failures resolve to `authenticated=False` rather than 500.

Defaults & safety
- `LANGFLOW_EXTERNAL_AUTH_ENABLED=false` by default. Zero behavior change unless flipped on.
- `EXTERNAL_AUTH_TRUSTED_JWT_DECODE=false` by default. The "decode without signature verification" path exists for deployments where a trusted upstream proxy already validates the JWT; operators must opt in explicitly. JWKS-backed validation via `EXTERNAL_AUTH_JWKS_URL` is the recommended production path.
- `SSOUserProfile(sso_provider, sso_user_id)`. Subsequent requests reuse the same user.

Why ship separately
Originally this was bundled with the RBAC work in #13290. Splitting it out has three benefits:
Configuration
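As a rough illustration of the pluggable resolver hook configured through `LANGFLOW_EXTERNAL_AUTH_IDENTITY_RESOLVER`, a custom resolver for an opaque (non-JWT) credential might look like the sketch below. Only the `resolve(token, auth_settings)` signature comes from this PR description; the class name, the dict keys (`sub`, `email`), the lookup table, the `ValueError` on failure, and exposing a module-level instance as the `attr` target are all assumptions made for the example.

```python
import asyncio
from typing import Any


class StaticTokenResolver:
    """Hypothetical resolver that maps opaque tokens to identity dicts."""

    def __init__(self, known: dict[str, dict[str, Any]]) -> None:
        self._known = dict(known)

    async def resolve(self, token: str, auth_settings: Any) -> dict[str, Any]:
        # auth_settings is accepted to match the signature but unused here.
        identity = self._known.get(token)
        if identity is None:
            # Assumption: how failures should be signalled is not specified.
            raise ValueError("unknown external credential")
        return dict(identity)


# Hypothetical module-level object an import path could point at.
resolver = StaticTokenResolver(
    {"opaque-123": {"sub": "user-42", "email": "user42@example.com"}}
)

if __name__ == "__main__":
    print(asyncio.run(resolver.resolve("opaque-123", None)))
```

A real resolver would typically call an introspection endpoint or a session store instead of an in-memory table.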
For unusual credential formats, point `LANGFLOW_EXTERNAL_AUTH_IDENTITY_RESOLVER` at a `module:attr` import path implementing the resolver Protocol (`async def resolve(token, auth_settings) -> ExternalIdentity | dict`).

Test plan
Green locally:
- `uv run pytest src/backend/tests/unit/services/auth/test_external_auth.py`: 14 passed (JWT decode, claim mapping, custom resolver, header/cookie extraction)
- `uv run pytest src/backend/tests/unit/test_login.py`: 9 passed, includes 3 new integration tests exercising the `/api/v1/session` JIT path:
  - `SSOUserProfile` row written + second request reuses it.
  - `authenticated=False` (no 500).

Pre-merge follow-ups (manual smoke):
- `LANGFLOW_EXTERNAL_AUTH_ENABLED=true LANGFLOW_EXTERNAL_AUTH_TRUSTED_JWT_DECODE=true LANGFLOW_EXTERNAL_AUTH_PROVIDER=test-provider` and `curl -H "Authorization: Bearer <jwt>" http://localhost:7860/api/v1/session` to confirm the JIT path end-to-end.
- `LANGFLOW_EXTERNAL_AUTH_JWKS_URL` pointed at a real IdP, exercise the full signature-verification path.

What's intentionally not here
- `principal_roles` / `principal_groups`). Out of scope for this auth-only PR; the authorization layer (feat: OSS authorization foundations for enterprise RBAC #13153) determines what users can do, and the standard enterprise pattern is to sync IdP roles/groups into `authz_role_assignment` / `authz_team_member` via a plugin `PolicySyncService`.
- `MANAGE` action / share-backed prefilter. Lives in feat(authz): MANAGE action + share-backed list prefilter #13290 / feat: OSS authorization foundations for enterprise RBAC #13153.

🤖 Generated with Claude Code