fix(security): carry SAML pending-session id in RelayState #28760
The SAML callback from the IdP is a cross-site POST, so the SameSite=Lax OM_SESSION cookie set at login is dropped by the browser. The server-side pending session introduced by #26314 can no longer be found on that callback, and login fails with "No pending session" for any real external IdP on non-secure transport. The localhost-only SSO nightly cannot reproduce this because a co-located IdP is same-site.

Carry the pending-session id in the SAML RelayState (POST body, immune to SameSite/transport rules), the same mechanism MCP-over-SAML already uses:
- handleLogin now issues auth.login(pendingSession.getId())
- handleCallback resolves the pending session from RelayState first via the new cookie-free SessionService.getPendingSessionById, falling back to the cookie for backward compatibility

Add a cross-site (127.0.0.1 IdP <-> localhost SP) nightly variant that drops the Lax cookie so this regression cannot return. Unit tests cover the RelayState wiring (SamlAuthServletHandlerTest) and the cookie-free lookup (SessionServiceTest).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
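For readers skimming the commit, the lookup order it describes (RelayState first, cookie as fallback) can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `PendingSessionResolver`, `addPending`, `resolve`, and the in-memory map are hypothetical stand-ins for `SamlAuthServletHandler` and `SessionService`.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the lookup order; not the OpenMetadata classes. */
final class PendingSessionResolver {
  // Stand-in for the server-side pending-session store: session id -> stored value.
  private final Map<String, String> pendingById = new ConcurrentHashMap<>();

  void addPending(String sessionId, String redirectUri) {
    pendingById.put(sessionId, redirectUri);
  }

  /**
   * Tries the RelayState value first (it travels in the POST body, so SameSite rules do not
   * apply), then falls back to the cookie value for same-site and in-flight logins.
   */
  Optional<String> resolve(String relayState, String cookieSessionId) {
    Optional<String> fromRelayState = lookup(relayState);
    if (fromRelayState.isPresent()) {
      return fromRelayState;
    }
    return lookup(cookieSessionId);
  }

  // Absent, blank, or unknown ids all resolve to "no pending session".
  private Optional<String> lookup(String id) {
    if (id == null || id.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.ofNullable(pendingById.get(id));
  }
}
```

In this sketch a stale RelayState value is simply a miss, so the cookie still gets a chance; when both are missing the caller would surface the "No pending session" error.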
✅ PR checks passed
The linked issue has a description and all required Shipping project fields set. Thanks!
🔴 Playwright Results: 1 failure(s), 9 flaky
✅ 4272 passed · ❌ 1 failed · 🟡 9 flaky · ⏭️ 88 skipped
Genuine Failures (failed on all attempts) ❌
Adds a mock SAML 2.0 Identity Provider that mints signed SAML Responses which OpenMetadata's onelogin Service Provider accepts, as the reusable building block for SAML SSO integration tests. It is a pure assertion factory (no HTTP server), since a non-browser test drives the ACS POST itself and simulates the cross-site cookie drop by controlling the cookie jar.

Responses are signed with onelogin's own Util.addSign (RSA-SHA256) so there is no algorithm/canonicalization mismatch with the validating SP. MockSamlIdpTest runs the real SamlResponse.isValid path that SamlAuthServletHandler.handleCallback uses, proving the mock's signed responses are accepted and that unsigned and tampered responses are rejected.

The keypair under test/resources/saml is a self-signed test fixture only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
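The three outcomes MockSamlIdpTest is described as checking (signed accepted, unsigned rejected, tampered rejected) can be illustrated with plain JDK RSA-SHA256 signatures. This is a simplified sketch under stated assumptions: it signs raw bytes with `java.security.Signature` rather than producing an XML signature, it does not use onelogin's `Util.addSign` or `SamlResponse.isValid`, and `SignedPayloadSketch` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

/** Hypothetical sketch: raw-byte RSA-SHA256 signing, not XML-DSig and not the mock IdP itself. */
final class SignedPayloadSketch {
  /** Generates a throwaway RSA keypair, playing the role of a test-only fixture. */
  static KeyPair newTestKeyPair() {
    try {
      KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
      generator.initialize(2048);
      return generator.generateKeyPair();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Signs the UTF-8 bytes of the payload with SHA256withRSA. */
  static byte[] sign(String payload, PrivateKey key) {
    try {
      Signature signer = Signature.getInstance("SHA256withRSA");
      signer.initSign(key);
      signer.update(payload.getBytes(StandardCharsets.UTF_8));
      return signer.sign();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Returns true only when a signature is present and matches the payload bytes. */
  static boolean isValid(String payload, byte[] signature, PublicKey key) {
    if (signature == null || signature.length == 0) {
      return false; // unsigned payloads are rejected outright
    }
    try {
      Signature verifier = Signature.getInstance("SHA256withRSA");
      verifier.initVerify(key);
      verifier.update(payload.getBytes(StandardCharsets.UTF_8));
      return verifier.verify(signature);
    } catch (SignatureException e) {
      return false; // malformed signature bytes count as invalid
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A real SAML Response adds XML canonicalization on top of this, which is the reason the commit gives for signing with the same library the SP validates with.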
… accessor

Address review on samlAuthCallback(): hoist the "/auth/callback" route literal into AUTH_CALLBACK_PATH and extract a shared samlSp() accessor so samlAuthCallback() and samlSpCallback() no longer duplicate the SamlConfiguration/getSp() null-guard. Both stay single-return. Behavior-preserving; SamlAuthServletHandlerTest (15) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
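The shape of this refactor (one named constant, one shared null-guarded accessor, single-return callers) might look roughly like the sketch below. Only the names `AUTH_CALLBACK_PATH`, `samlSp()`, `samlAuthCallback()`, and `samlSpCallback()` come from the commit message; the `SamlConfig` and `SpConfig` types, the `serverUrl` field, and what each method returns are assumptions made purely for illustration.

```java
import java.util.Optional;

/** Hypothetical sketch of the refactor's shape; return values and types are assumed. */
final class SamlCallbackPaths {
  // Route literal hoisted into one named constant.
  static final String AUTH_CALLBACK_PATH = "/auth/callback";

  /** Assumed stand-in for the SP section of the SAML configuration. */
  static final class SpConfig {
    final String callback;

    SpConfig(String callback) {
      this.callback = callback;
    }
  }

  /** Assumed stand-in for the SAML configuration; sp may be null. */
  static final class SamlConfig {
    final SpConfig sp;

    SamlConfig(SpConfig sp) {
      this.sp = sp;
    }
  }

  private final SamlConfig samlConfig; // null when SAML is not configured
  private final String serverUrl;

  SamlCallbackPaths(SamlConfig samlConfig, String serverUrl) {
    this.samlConfig = samlConfig;
    this.serverUrl = serverUrl;
  }

  /** Single place that guards against a missing configuration or missing SP section. */
  private Optional<SpConfig> samlSp() {
    return Optional.ofNullable(samlConfig).map(config -> config.sp);
  }

  /** Single-return: the SP callback when configured, otherwise null. */
  String samlSpCallback() {
    return samlSp().map(sp -> sp.callback).orElse(null);
  }

  /** Single-return: server URL plus the shared route constant when an SP is configured. */
  String samlAuthCallback() {
    return samlSp().map(sp -> serverUrl + AUTH_CALLBACK_PATH).orElse(null);
  }
}
```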
Code Review ✅ Approved: 2 resolved / 2 findings
Implements a cookie-free SAML pending-session lookup using RelayState to resolve cross-site authentication failures. The implementation addresses potential security concerns by validating the relay state and refactoring redundant SP configuration logic.
✅ 2 resolved
✅ Security: RelayState pending-id is attacker-supplied, not browser-bound
✅ Quality: Magic string and duplicated SP-config lookup in samlAuthCallback
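Because the RelayState value arrives from the caller, a lookup by id alone has to treat it as untrusted input. The sketch below shows one way such a lookup could reject malformed, unknown, non-pending, and expired ids. It is an assumption-laden illustration, not `SessionService.getPendingSessionById`: the UUID id format, the `Status` enum, the `Session` class, and the in-memory store are all hypothetical.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a cookie-free lookup that treats the id as untrusted input. */
final class PendingSessionLookup {
  enum Status {
    PENDING,
    ACTIVE
  }

  static final class Session {
    final String id;
    final Status status;
    final Instant expiresAt;

    Session(String id, Status status, Instant expiresAt) {
      this.id = id;
      this.status = status;
      this.expiresAt = expiresAt;
    }
  }

  // Stand-in for the server-side session store.
  private final Map<String, Session> store = new ConcurrentHashMap<>();

  void save(Session session) {
    store.put(session.id, session);
  }

  /** Returns the session only if the id parses, is known, is PENDING, and has not expired. */
  Optional<Session> getPendingById(String id, Instant now) {
    if (!isWellFormed(id)) {
      return Optional.empty();
    }
    Session session = store.get(id);
    if (session == null || session.status != Status.PENDING) {
      return Optional.empty();
    }
    if (!now.isBefore(session.expiresAt)) {
      return Optional.empty();
    }
    return Optional.of(session);
  }

  // Assumes ids are UUID strings; anything that does not parse is rejected before the store is hit.
  private static boolean isWellFormed(String id) {
    if (id == null) {
      return false;
    }
    try {
      UUID.fromString(id);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }
}
```

Passing `now` in explicitly keeps the expiry check deterministic in tests.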



Describe your changes:
Fixes #28763
SAML SSO logins fail with `401 "No pending session"` on the IdP callback because that callback is a cross-site POST, so the browser drops the `SameSite=Lax` `OM_SESSION` cookie that the server-side pending session (introduced by #26314) depends on. I carry the pending-session id in the SAML `RelayState` instead (a POST-body field immune to SameSite/transport rules, already used by the MCP-over-SAML flow), falling back to the cookie for backward compatibility. This is the second of the two SAML regressions from #26314; the redirect-allowlist one is fixed separately in #28695.
Type of change:
High-level design:
- `handleLogin` now passes the pending-session id as the SAML `RelayState` (`auth.login(pendingSession.getId())`).
- `handleCallback` resolves the pending session from `RelayState` first, via a new cookie-free `SessionService.getPendingSessionById`, and only falls back to the `OM_SESSION` cookie when RelayState is absent/stale, so same-site deployments and in-flight logins keep working.
- The `keycloak-azure-saml-crosssite` nightly variant fronts the IdP on `127.0.0.1` while OM stays on `localhost`. That is a different site (SameSite ignores port), so the callback POST is genuinely cross-site and the cookie drops. The existing localhost-only job is same-site and structurally cannot reproduce the bug.
Tests:
Use cases covered
- Cross-site SAML login (the callback arrives without the `OM_SESSION` cookie) now completes instead of failing with "No pending session".
Unit tests
- `SamlAuthServletHandlerTest` (13/13 pass): `handleLogin` carries the pending id in RelayState; `resolvePendingSession` prefers RelayState (cookie-free) and falls back to the cookie.
- `SessionServiceTest` (28/28 pass): `getPendingSessionById` resolves a PENDING session by id alone (no cookie/request) and rejects active/expired/unknown/malformed ids.
Backend integration tests
Ingestion integration tests
Playwright (UI) tests
- `keycloak-azure-saml-crosssite` cross-site nightly variant (`playwright-sso-login-nightly.yml`, `playwright/utils/sso-providers/index.ts`).
Manual testing performed
- `mvn -pl openmetadata-service test -Dtest=SamlAuthServletHandlerTest,SessionServiceTest`: all pass.
- `mvn -pl openmetadata-service spotless:check` and `actionlint` on the workflow: clean.
UI screen recording / screenshots:
Not applicable — no UI changes; the only change under the UI tree is a Playwright test-config registration.
Checklist:
I have read the CONTRIBUTING document.
My PR title is `Fixes <issue-number>: <short explanation>`
My PR is linked to a GitHub issue via `Fixes #28763` above.
I have commented on my code, particularly in hard-to-understand areas.
I have added tests (unit / Playwright as applicable) and listed them above.
I have added a test that covers the exact scenario we are fixing.