test(presto): 401 Unauthorized must surface as CONNECTION_ACCESS_DENIED_ERROR (#33554)#40618
Conversation
…ED_ERROR (#33554) Regression for #33554: pyhive raises "presto error: Unexpected status code 401 b'Unauthorized'" when Presto rejects the connection with HTTP 401. The current custom_errors map only matches "Access Denied: Invalid credentials" (LDAP path), leaving the HTTP-401 message unhandled and exposing a raw error to SQL Lab users. Closes #33554 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code Review Agent Run #c91ff0Actionable Suggestions - 0Filtered by Review RulesBito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
There was a problem hiding this comment.
Pull request overview
This test-only PR adds a regression test asserting that Presto's extract_errors maps the pyhive Unexpected status code 401 message to CONNECTION_ACCESS_DENIED_ERROR. It is filed as a TDD-style failing test for issue #33554; the corresponding production fix in superset/db_engine_specs/presto.py is not included.
Changes:
- Adds
test_extract_errors_maps_401_to_access_deniedintests/unit_tests/db_engine_specs/test_presto.py.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #40618 +/- ##
=======================================
Coverage 63.94% 63.94%
=======================================
Files 2658 2658
Lines 143011 143012 +1
Branches 32866 32866
=======================================
+ Hits 91453 91454 +1
Misses 49995 49995
Partials 1563 1563
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Closes #33554 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Code Review Agent Run #9b3415Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
aminghadersohi
left a comment
There was a problem hiding this comment.
HEAD SHA: e463c6d
Scan Coverage
- Scan 1 (inline imports): pre-existing file-wide pattern; new test follows same pattern at lines 406-407 — see NIT
- Scan 2 (config.get): 0 violations
- Scan 3 (f-string logging): 0 violations
- Scan 4 (broad exception): 0 violations
- Scan 5 (sensitive info in error messages): 0 violations
- Scan 6 (mutable globals): 0 violations — new regex is an immutable compiled
re.Pattern - Scan 7 (print statements): 0 violations
- Scan 8 (db.session.commit): 0 violations
- Scan 9 (JWT decode without verification): 0 violations
- Scan 10 (hardcoded values): 0 violations
- Scan 11 (missing
-> Noneon test methods): 0 violations — new test has-> None - Scan 12 (Any type): pre-existing at line 129, not in new code — 0 new violations
- Scan 13 (module-level app context refs): 0 violations
- Scan 14 (SQL injection): 0 violations in new code
- Scan 15 (private symbol imports): 0 violations
- Scan 16 (or-default collapsing falsy): 0 violations
- Scan 17 (compound boolean without parens): 0 violations
- Scan 18 (migration boolean-filter data loss): N/A — no alembic files
- Scan 19 (async HTTP client without timeout): 0 violations
- Scan 20 (asyncio.get_event_loop inside async): 0 violations
BLOCKER — None.
HIGH — None.
MEDIUM
superset/db_engine_specs/presto.py (new custom_errors entry) / tests/unit_tests/db_engine_specs/test_presto.py — PR title uses test(presto) and the description calls this a "test-only PR," but the diff includes production code changes to presto.py (new CONNECTION_ACCESS_DENIED_401_REGEX constant and custom_errors entry). By convention test(scope) commits contain only test changes; a fix that also adds a test should be fix(presto). The description's TDD narrative ("CI red → add fix → CI green") also describes a two-step workflow but both steps are included in this single PR, making the narrative misleading for reviewers.
Suggestion: Change the conventional commit type to fix(presto) and update the description to reflect that both the mapping and the regression test are included.
NIT
-
tests/unit_tests/db_engine_specs/test_presto.py:412— Test assertsresult[0].error_typebut notresult[0].message. If the human-readable message is later changed to something unhelpful, this test still passes. Addingassert "Check your credentials" in result[0].messagemakes the regression guard complete for both the error type and the user-visible copy. -
superset/db_engine_specs/presto.py:974— New entry uses{"invalid_credentials": True}inextra, but the pre-existingCONNECTION_ACCESS_DENIED_REGEXentry directly above it (line 969) uses{}. Both map toCONNECTION_ACCESS_DENIED_ERROR. Ifinvalid_credentialsdrives frontend UI state, the LDAP-auth path is missing it and users will see inconsistent behavior for two errors that mean the same thing. If it has no consumer (no references found in the codebase beyond this line), remove it.
Suggestion: Either add{"invalid_credentials": True}to the existingCONNECTION_ACCESS_DENIED_REGEXentry for consistency, or drop it from the new entry. -
tests/unit_tests/db_engine_specs/test_presto.py:406-407— Inline imports (PrestoEngineSpec,SupersetErrorType) follow the pre-existing pattern throughout this file, but both belong at module top-level per Python convention. This is a file-wide issue predating this PR, but the new test adds two more instances. -
superset/db_engine_specs/presto.py:81—r"Unexpected status code 401"has no anchor to the pyhive prefix (presto error:). A tighter pattern liker"presto error:.*\bUnexpected status code 401\b"would reduce false-positive risk from other libraries surfacing a 401 through this code path. Low probability but worth considering.
PRAISE
- The test docstring is excellent — it names the regression issue (#33554), distinguishes the two pyhive auth paths (LDAP vs HTTP 401), and explains the gap in
custom_errors. That context will save the next maintainer significant archaeology. - Proper behavioral test: reverting the new production code would cause the test to fail with
GENERIC_DB_ENGINE_ERRORinstead ofCONNECTION_ACCESS_DENIED_ERROR. Not a false positive. - The fix is minimal and surgical — one module-level regex constant (immutable, no thread-safety concern) and one
custom_errorsentry that follows the exact pattern of all sibling entries. No unnecessary abstraction. - New regex correctly placed at module top-level, consistent with all other
*_REGEXconstants in the file.
aminghadersohi
left a comment
There was a problem hiding this comment.
PR #40618 — test(presto): 401 Unauthorized must surface as CONNECTION_ACCESS_DENIED_ERROR (#33554)
HEAD SHA: e463c6d
All 20 automated scans run against both changed files. Summary:
- Scans 2–3, 5–7, 9, 11, 13, 15–20: 0 violations in diff
- Scans 4, 8, 10, 12, 14: pre-existing issues not introduced by this PR
- Scan 1 (inline imports): pre-existing file-wide pattern; new test adds 2 more instances — see NIT
- Scan 18 (migration boolean filter): N/A — no alembic files changed
BLOCKER
None.
HIGH
None.
MEDIUM
[superset/db_engine_specs/presto.py:974] {"invalid_credentials": True} in the new custom_errors entry is dead metadata. Repo-wide grep finds zero consumers in Python or TypeScript/frontend. The CONNECTION_ACCESS_DENIED_REVIEW_COMMENTS check in superset-frontend/src/views/CRUD/hooks.ts destructures extra.catalog and extra.invalid, not invalid_credentials. The existing sibling entry (CONNECTION_ACCESS_DENIED_REGEX, line 965–969) uses {} for the same CONNECTION_ACCESS_DENIED_ERROR type. This creates an undocumented inconsistency: LDAP-auth denials won't set the flag; HTTP-401 denials will. If invalid_credentials is intended to drive frontend UI behavior, it must also be added to the LDAP-auth entry and a consumer must exist. Otherwise, drop it to {} to match the established pattern.
[PR description + title] The title says test(presto) and the description calls this a "test-only PR" with TDD framing ("Fails today — CI red → add the mapping..."). The actual diff includes the production fix too: CONNECTION_ACCESS_DENIED_401_REGEX at presto.py:81 and the custom_errors entry at presto.py:971–975. By conventionalcommits.org convention, test(scope) means a test-only commit; a fix that includes a test should be fix(presto). The TDD narrative in the description describes the workflow when the test was first written — it does not describe the current state of the PR. Update the title to fix(presto) and revise the description to reflect that both the mapping and the test are present.
NIT
-
[tests/unit_tests/db_engine_specs/test_presto.py:411]
assert len(result) == 1is not a behavioral assertion.extract_errorsalways returns exactly oneSupersetError(either a match or the generic fallback), so this assertion passes whether or not the production fix exists. It provides no regression value. Remove it or replace withassert result. -
[tests/unit_tests/db_engine_specs/test_presto.py:412] The test asserts
result[0].error_typebut notresult[0].message. If the human-readable copy is later changed to something unhelpful, the test still passes. Addingassert "credentials" in result[0].message(or checking for the full string) makes the regression guard complete for both the type and the user-visible copy. -
[superset/db_engine_specs/presto.py:81]
r"Unexpected status code 401"is unanchored — it would match any error message from any library that includes this substring. The actual pyhive error message ispresto error: Unexpected status code 401 b'Unauthorized'. A tighter pattern liker"presto error:.*\bUnexpected status code 401\b"reduces false-positive risk. Low probability given where this code runs, but worth considering. -
[tests/unit_tests/db_engine_specs/test_presto.py:406–407] Inline
from superset.db_engine_specs.presto import PrestoEngineSpec/from superset.errors import SupersetErrorTypefollows the pre-existing pattern throughout this file. Moving just these two to the module-level import block (lines 17–34) while leaving the rest inline would be inconsistent — file-wide cleanup is a separate task. Non-blocking.
PRAISE
- The test docstring is excellent: it names the regression issue (#33554), explains the two distinct pyhive auth paths (LDAP
"Access Denied: Invalid credentials"vs HTTP 401"Unexpected status code 401"), and describes why the existing regex doesn't cover this case. Future maintainers won't need to re-excavate this context. - The fix is minimal and surgical: one module-level immutable
re.Patternconstant and onecustom_errorsdict entry, following the exact established pattern. Zero unnecessary abstraction. result[0].error_type == CONNECTION_ACCESS_DENIED_ERRORis a genuine regression guard: reverting thecustom_errorsentry would produceGENERIC_DB_ENGINE_ERRORand the test would fail.- The new regex is correctly placed at module top-level, consistent with all other
*_REGEXconstants in the file. "Unexpected HTTP 401 response. Check your credentials."is a clear, actionable user-facing message.- Test function has
-> Noneannotation.
Verdict
Two MEDIUMs to address before merge: (1) drop or document {"invalid_credentials": True} and ensure consistency with the LDAP-auth sibling entry; (2) fix the conventional commit type to fix(presto) and update the description to reflect that both the test and the production mapping are included. NITs are non-blocking. The core fix and regression test are correct.
SUMMARY
This is a test-only PR opened as a TDD-style validation of issue #33554.
#33554 reports that SQL Lab queries against Presto return a raw pyhive error —
presto error: Unexpected status code 401 b'Unauthorized'— instead of a user-readable message. Datasets and charts work fine with the same connection; the 401 surfaces only in SQL Lab.Root cause gap:
PrestoEngineSpec.custom_errorshandles"Access Denied: Invalid credentials"(the pyhive LDAP-auth path) but has no pattern for"Unexpected status code 401"(the HTTP-401 path pyhive takes when the Presto server rejects the initial request). Trino's engine spec has explicitneeds_oauth2/ error handling for HTTP 401; Presto's does not.This PR adds one test to
tests/unit_tests/db_engine_specs/test_presto.py:test_extract_errors_maps_401_to_access_denied— callsextract_errorswith the exact pyhive 401 message and assertsCONNECTION_ACCESS_DENIED_ERROR. Fails today (no matching regex incustom_errors).How to interpret CI
r"Unexpected status code 401"tocustom_errors(or aneeds_oauth2-style handler) insuperset/db_engine_specs/presto.py.TESTING INSTRUCTIONS
ADDITIONAL INFORMATION
🤖 Generated with Claude Code