feat(mcp): add detailed JWT error messages and default auth factory fallback#37972
feat(mcp): add detailed JWT error messages and default auth factory fallback#37972aminghadersohi wants to merge 3 commits intoapache:masterfrom
Conversation
…allback Replace generic "invalid_token" error with specific failure reasons (algorithm mismatch, expired token, issuer/audience mismatch, bad signature, missing scopes) by introducing DetailedJWTVerifier that overrides load_access_token() with step-by-step validation. - Add DetailedJWTVerifier with contextvar-based error reporting - Add DetailedBearerAuthBackend that raises AuthenticationError with reason - Add JSON error handler returning structured 401 responses - Update create_default_mcp_auth_factory to use DetailedJWTVerifier - Add server.py fallback to default factory when MCP_AUTH_ENABLED=True - Improve get_user_from_request() diagnostic error messages - Add 21 unit tests covering all failure modes
superset/mcp_service/jwt_verifier.py
Outdated
| return JSONResponse( | ||
| status_code=401, | ||
| content={ | ||
| "error": "invalid_token", | ||
| "error_description": str(exc), | ||
| }, | ||
| headers={ | ||
| "WWW-Authenticate": f'Bearer error="invalid_token", ' | ||
| f'error_description="{exc}"', |
There was a problem hiding this comment.
Suggestion: The error message is interpolated directly into the WWW-Authenticate header without sanitization, so a crafted token value that surfaces in the message (e.g., issuer or algorithm) could inject newline characters or quotes and lead to HTTP header injection or malformed headers; you should sanitize or normalize the message before placing it in the header. [security]
Severity Level: Critical 🚨
- ❌ Attacker-controlled data injected into WWW-Authenticate header.
- ⚠️ Possible HTTP response splitting or cache poisoning.
- ⚠️ Risk of 500s from invalid response headers.| return JSONResponse( | |
| status_code=401, | |
| content={ | |
| "error": "invalid_token", | |
| "error_description": str(exc), | |
| }, | |
| headers={ | |
| "WWW-Authenticate": f'Bearer error="invalid_token", ' | |
| f'error_description="{exc}"', | |
| reason = str(exc) | |
| # Sanitize reason for use in HTTP header to prevent header injection | |
| safe_reason = reason.replace("\r", " ").replace("\n", " ").replace('"', "'") | |
| return JSONResponse( | |
| status_code=401, | |
| content={ | |
| "error": "invalid_token", | |
| "error_description": reason, | |
| }, | |
| headers={ | |
| "WWW-Authenticate": f'Bearer error="invalid_token", ' | |
| f'error_description="{safe_reason}"', |
Steps of Reproduction ✅
1. The MCP service configures authentication using `DetailedJWTVerifier.get_middleware()`
at `superset/mcp_service/jwt_verifier.py:265-280`, which registers Starlette's
`AuthenticationMiddleware` with `on_error=_json_auth_error_handler`.
2. A client sends an HTTP request to any MCP endpoint protected by this middleware with an
`Authorization: Bearer <token>` header where the token contains an issuer (`iss`) or
audience (`aud`) claim including newline characters and quotes (e.g.,
`"evil\"\r\nX-Injected: value"`).
3. During token validation in `DetailedJWTVerifier.load_access_token()` (lines 119–263),
an issuer or audience mismatch occurs at lines 185–207 or 211–233, creating an error
string like `f"Issuer mismatch: token has '{iss}', expected '{self.issuer}'"` that
includes the attacker-controlled newline and quote characters.
4. `DetailedBearerAuthBackend.authenticate()` at lines 76–106 reads this reason from
`_jwt_failure_reason` and raises `AuthenticationError(reason)`, which is passed as `exc`
into `_json_auth_error_handler()` at lines 59–73; the handler interpolates `exc` directly
into the `WWW-Authenticate` header at line 71 without sanitization, allowing the crafted
CR/LF and quotes to break the header syntax and potentially inject additional HTTP headers
or cause malformed-header errors, depending on the ASGI server's enforcement.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/jwt_verifier.py
**Line:** 63:71
**Comment:**
*Security: The error message is interpolated directly into the WWW-Authenticate header without sanitization, so a crafted token value that surfaces in the message (e.g., issuer or algorithm) could inject newline characters or quotes and lead to HTTP header injection or malformed headers; you should sanitize or normalize the message before placing it in the header.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.There was a problem hiding this comment.
Good catch — fixed in 8708817. Added _sanitize_header_value() that strips CR/LF and replaces quotes before interpolating into the WWW-Authenticate header. The JSON body still has the raw reason for debugging, only the header is sanitized. Added test coverage for this.
superset/mcp_service/jwt_verifier.py
Outdated
| ) | ||
| header_b64 = parts[0] | ||
| # Add padding | ||
| header_b64 += "=" * (4 - len(header_b64) % 4) |
There was a problem hiding this comment.
Suggestion: The JWT header padding calculation always appends four '=' characters when the header length is already a multiple of 4, which can make otherwise valid JWTs fail to decode with a padding error and be incorrectly treated as having a malformed header. [logic error]
Severity Level: Critical 🚨
- ❌ Valid JWTs rejected as "Malformed token header".
- ❌ MCP JWT authentication fails for common HS256 tokens.
- ⚠️ Breaks all callers relying on DetailedJWTVerifier auth.| header_b64 += "=" * (4 - len(header_b64) % 4) | |
| # Add padding only if needed | |
| header_b64 += "=" * (-len(header_b64) % 4) |
Steps of Reproduction ✅
1. In a test module, import `DetailedJWTVerifier` and its helper `_decode_token_header`
from `superset/mcp_service/jwt_verifier.py` (class `DetailedJWTVerifier` is defined above
line 119, `_decode_token_header` at line 283).
2. Construct a structurally valid JWT whose header segment length is a multiple of 4, for
example using the common HS256 header `"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"` and forming
a dummy token `"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.foo.bar"`.
3. Call `await DetailedJWTVerifier(...).load_access_token(token)` from
`superset/mcp_service/jwt_verifier.py:119`, which immediately invokes
`_decode_token_header(token)` at line 132 and reaches the padding logic at line 291
(`header_b64 += "=" * (4 - len(header_b64) % 4)`).
4. Observe that for a header length already divisible by 4, four `"="` characters are
appended, and `base64.urlsafe_b64decode(header_b64)` at line 293 raises a `binascii.Error`
(`ValueError`) for incorrect padding, causing the inner `except (ValueError, DecodeError)`
at lines 133–135 in `load_access_token()` to treat the header as malformed and return
`None` with reason `"Malformed token header"` even though the JWT is valid.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/jwt_verifier.py
**Line:** 292:292
**Comment:**
*Logic Error: The JWT header padding calculation always appends four '=' characters when the header length is already a multiple of 4, which can make otherwise valid JWTs fail to decode with a padding error and be incorrectly treated as having a malformed header.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.There was a problem hiding this comment.
Good catch — fixed in 8708817. Changed (4 - len(header_b64) % 4) to (-len(header_b64) % 4) which correctly produces 0 when the length is already a multiple of 4. Added a test case that specifically exercises this edge case with a standard HS256/JWT header.
Address review feedback: - Sanitize error messages in WWW-Authenticate header to prevent header injection via CR/LF/quotes in attacker-controlled JWT claims - Fix base64 padding calculation that incorrectly appended 4 '=' when header length was already a multiple of 4 - Add tests for header sanitization and padding edge case
There was a problem hiding this comment.
Code Review Agent Run #2e51e6
Actionable Suggestions - 1
-
superset/mcp_service/jwt_verifier.py - 1
- Blind exception catch too broad · Line 258-260
Review Details
-
Files reviewed - 5 · Commit Range:
e994695..e994695- superset/mcp_service/auth.py
- superset/mcp_service/jwt_verifier.py
- superset/mcp_service/mcp_config.py
- superset/mcp_service/server.py
- tests/unit_tests/mcp_service/test_jwt_verifier.py
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
-
/review- Manually triggers a full AI review. -
/pause- Pauses automatic reviews on this pull request. -
/resume- Resumes automatic reviews. -
/resolve- Marks all Bito-posted review comments as resolved. -
/abort- Cancels all in-progress reviews.
Refer to the documentation for additional commands.
Configuration
This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.
Documentation & Help
| return None | ||
|
|
||
| # All validations passed |
There was a problem hiding this comment.
Avoid catching the broad Exception class on line 259. Catch specific exception types that are expected (e.g., ValueError, JoseError, KeyError) to prevent masking unexpected errors.
Code suggestion
Check the AI-generated fix before applying
| return None | |
| # All validations passed | |
| except (ValueError, JoseError, KeyError, AttributeError) as e: | |
| reason = f"Token validation failed: {e}" |
Code Review Run #2e51e6
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
There was a problem hiding this comment.
Fixed in 74354ee — narrowed to (ValueError, JoseError, KeyError, AttributeError, TypeError) which covers the specific exception types that can occur from JWT library operations, dict access, and type conversion.
There was a problem hiding this comment.
The exception handling in jwt_verifier.py has been narrowed to catch ValueError, JoseError, KeyError, AttributeError, and TypeError, covering JWT library operations, dict access, and type conversion errors.
superset/mcp_service/jwt_verifier.py
except (ValueError, JoseError, KeyError, AttributeError, TypeError) as e:
reason = f"Token validation failed: {e}"
_jwt_failure_reason.set(reason)
logger.warning(reason)
return None
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #37972 +/- ##
===========================================
+ Coverage 0 66.24% +66.24%
===========================================
Files 0 669 +669
Lines 0 51471 +51471
Branches 0 5793 +5793
===========================================
+ Hits 0 34096 +34096
- Misses 0 15989 +15989
- Partials 0 1386 +1386
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Replace broad Exception catch with specific exception types (ValueError, JoseError, KeyError, AttributeError, TypeError) per project coding standards.
Code Review Agent Run #359749Actionable Suggestions - 0Additional Suggestions - 1
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
SUMMARY
Replaces the generic
"invalid_token"JWT error response with specific, actionable error messages for each validation failure type in the MCP service.The Problem:
When JWT authentication fails, the MCP service returns the same generic error regardless of what actually went wrong:
{"error": "invalid_token", "error_description": "Authentication required"}This makes debugging JWT configuration nearly impossible — you can't tell if the token is expired, signed with the wrong algorithm, has the wrong issuer, etc.
The Solution:
A
DetailedJWTVerifierthat performs step-by-step JWT validation and returns the specific failure reason:{"error": "invalid_token", "error_description": "Algorithm mismatch: token uses 'RS256', expected 'HS256'"} {"error": "invalid_token", "error_description": "Token expired for client 'auth0|65b03fdcf56f87da5abb00ad'"} {"error": "invalid_token", "error_description": "Issuer mismatch: token has 'wrong-issuer', expected 'http://manager.local.preset.zone'"}How it works
flowchart TD A["Client sends Bearer token"] --> B["AuthenticationMiddleware"] B --> C["DetailedBearerAuthBackend.authenticate()"] C --> D["DetailedJWTVerifier.verify_token()"] D --> E["load_access_token() — step-by-step validation"] E --> F{"1. Decode header"} F -->|"Malformed"| FAIL1["Set reason: 'Malformed token header'"] F -->|"OK"| G{"2. Check algorithm"} G -->|"Mismatch"| FAIL2["Set reason: 'Algorithm mismatch: token uses X, expected Y'"] G -->|"OK"| H{"3. Get verification key"} H -->|"JWKS unreachable"| FAIL3["Set reason: 'Failed to get verification key'"] H -->|"OK"| I{"4. Verify signature"} I -->|"Bad signature"| FAIL4["Set reason: 'Signature verification failed'"] I -->|"OK"| J{"5. Check expiration"} J -->|"Expired"| FAIL5["Set reason: 'Token expired for client X'"] J -->|"OK"| K{"6. Validate issuer"} K -->|"Mismatch"| FAIL6["Set reason: 'Issuer mismatch'"] K -->|"OK"| L{"7. Validate audience"} L -->|"Mismatch"| FAIL7["Set reason: 'Audience mismatch'"] L -->|"OK"| M{"8. Check scopes"} M -->|"Missing"| FAIL8["Set reason: 'Missing required scopes'"] M -->|"OK"| SUCCESS["Return AccessToken ✅"] FAIL1 & FAIL2 & FAIL3 & FAIL4 & FAIL5 & FAIL6 & FAIL7 & FAIL8 --> CTX["Store reason in ContextVar"] CTX --> RETURN_NONE["Return None"] RETURN_NONE --> BACKEND["DetailedBearerAuthBackend reads ContextVar"] BACKEND --> RAISE["Raise AuthenticationError with specific reason"] RAISE --> HANDLER["JSON error handler → 401 with reason"]Before vs After
flowchart LR subgraph BEFORE["❌ Before — Base JWTVerifier"] direction TB B1["Any failure"] --> B2["Catch exception → return None"] B2 --> B3["DEBUG log only"] B3 --> B4["RequireAuthMiddleware"] B4 --> B5["'invalid_token' — always the same"] end subgraph AFTER["✅ After — DetailedJWTVerifier"] direction TB A1["Step-by-step validation"] A1 --> A2["Store specific reason in ContextVar"] A2 --> A3["WARNING log with details"] A3 --> A4["DetailedBearerAuthBackend"] A4 --> A5["JSON 401 with exact reason"] end BEFORE ~~~ AFTERClass hierarchy
classDiagram class JWTVerifier { +load_access_token(token) AccessToken|None +get_middleware() list -_get_verification_key(token) str -_extract_scopes(claims) list } class DetailedJWTVerifier { +load_access_token(token) AccessToken|None +get_middleware() list +_decode_token_header(token) dict } class BearerAuthBackend { +authenticate(conn) tuple|None } class DetailedBearerAuthBackend { +authenticate(conn) tuple|None } JWTVerifier <|-- DetailedJWTVerifier : extends BearerAuthBackend <|-- DetailedBearerAuthBackend : extends DetailedJWTVerifier ..> _jwt_failure_reason : writes DetailedBearerAuthBackend ..> _jwt_failure_reason : readsAuth factory fallback (server.py)
flowchart TD START["server.py: run_server()"] --> CHECK_FACTORY{"MCP_AUTH_FACTORY set?"} CHECK_FACTORY -->|"Yes"| USE_CUSTOM["Use custom factory"] CHECK_FACTORY -->|"No"| CHECK_ENABLED{"MCP_AUTH_ENABLED = True?"} CHECK_ENABLED -->|"Yes"| USE_DEFAULT["Use create_default_mcp_auth_factory()"] CHECK_ENABLED -->|"No"| NO_AUTH["No auth provider"] USE_DEFAULT --> DETAILED["Creates DetailedJWTVerifier"] USE_CUSTOM --> PROVIDER["Auth provider ready"] DETAILED --> PROVIDERFiles changed
superset/mcp_service/jwt_verifier.pyDetailedJWTVerifier,DetailedBearerAuthBackend, JSON error handler, ContextVarsuperset/mcp_service/mcp_config.pycreate_default_mcp_auth_factory()now usesDetailedJWTVerifiersuperset/mcp_service/server.pyMCP_AUTH_ENABLED=Truebut noMCP_AUTH_FACTORYsuperset/mcp_service/auth.pyget_user_from_request()shows diagnostic details on failuretests/unit_tests/mcp_service/test_jwt_verifier.pyBEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before — Every JWT failure returns this:
After — Each failure gets a specific message:
TESTING INSTRUCTIONS
1. Run unit tests (quick check)
pytest tests/unit_tests/mcp_service/test_jwt_verifier.py -v # Expected: 24 tests pass2. Set up local MCP server for manual testing
Step 2a: Configure
superset_config.pyfor HS256 testing (easiest)Add these lines to your
superset_config.py:Step 2b: Start the MCP server
You should see this log line confirming the new verifier is active:
Step 2c: Generate a valid test token
Use this Python snippet to generate an HS256 JWT token matching the config above:
Save the printed tokens — you'll use them in curl commands below.
3. Manual test: Expired token
Expected response:
{ "error": "invalid_token", "error_description": "Token expired for client 'admin'" }What to check: The
error_descriptionsays "Token expired" with the client name, NOT the generic "Authentication required".4. Manual test: Wrong issuer
Expected response:
{ "error": "invalid_token", "error_description": "Issuer mismatch: token has 'wrong-issuer', expected 'test-issuer'" }5. Manual test: Wrong audience
Expected response:
{ "error": "invalid_token", "error_description": "Audience mismatch: token has 'wrong-audience', expected 'test-audience'" }6. Manual test: Wrong algorithm
Expected response:
{ "error": "invalid_token", "error_description": "Algorithm mismatch: token uses 'RS256', expected 'HS256'" }7. Manual test: Completely garbage token
Expected response:
{ "error": "invalid_token", "error_description": "Malformed token header: ..." }8. Manual test: Valid token works
Expected: A successful JSON-RPC response with the list of tools (NOT a 401 error).
9. Manual test: No auth header (fallback to MCP_DEV_USERNAME)
Expected: Works normally — when no Bearer token is provided, the system falls back to
MCP_DEV_USERNAMEif configured.10. Check server logs
While running the curl tests above, check the MCP server terminal output. You should see
WARNINGlevel log lines like:ADDITIONAL INFORMATION