fix(mcp): filter sensitive database columns from list_databases loaded-metadata#40771
aminghadersohi wants to merge 4 commits into
Conversation
…d-metadata
`ModelListCore._get_columns_to_load()` only stripped `USER_DIRECTORY_FIELDS` from caller-supplied `select_columns`, but did not enforce the `all_columns` allowlist that `get_database_columns()` already uses to exclude credential fields (`password`, `sqlalchemy_uri`, `encrypted_extra`, `server_cert`). A caller could therefore pass those names via `select_columns` and have them appear in `columns_loaded`/`columns_requested` in the response and be forwarded to the DAO query.
After `filter_user_directory_columns`, restrict the list to `self._all_columns` when it is set. For `list_databases` this means only columns surfaced in `columns_available` (which excludes the four credential fields) can be loaded.
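The two-step filtering described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `mcp_core.py`: the function name `columns_to_load`, the contents of `USER_DIRECTORY_FIELDS` and `DATABASE_ALLOWLIST`, and the `ValueError` raised when nothing remains are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Iterable, Optional

# Assumed values for illustration only; the real constants live in Superset.
USER_DIRECTORY_FIELDS = {"created_by", "changed_by", "owners"}
DATABASE_ALLOWLIST = ["id", "database_name", "backend", "expose_in_sqllab"]


def columns_to_load(
    select_columns: Iterable[str],
    all_columns: Optional[list[str]],
) -> list[str]:
    """Drop user-directory fields, then keep only allowlisted columns."""
    # Step 1 (pre-existing behaviour): strip user-directory fields.
    columns = [c for c in select_columns if c not in USER_DIRECTORY_FIELDS]
    # Step 2 (the new check): restrict to the allowlist when one is set.
    if all_columns:
        allowed = set(all_columns)
        columns = [c for c in columns if c in allowed]
    # Hypothetical error path: refuse a request with no valid columns left.
    if not columns:
        raise ValueError("No valid columns requested")
    return columns
```

With this shape, a request for `["id", "password", "sqlalchemy_uri"]` against the allowlist keeps only `id`, so the credential names never reach the response metadata or the query layer.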
Pull request overview
This PR hardens MCP list tooling by enforcing an all_columns allowlist when processing caller-supplied select_columns, preventing sensitive/credential-related ORM columns (e.g., password, sqlalchemy_uri, encrypted_extra, server_cert) from being requested, loaded, or reflected in MCP response metadata for tools like list_databases.
Changes:
- Enforce `self._all_columns` as an allowlist in `ModelListCore._get_columns_to_load()` after user-directory column filtering.
- Add unit tests ensuring non-allowlisted columns are dropped (or raise when none remain).
- Add an integration-style MCP unit test ensuring `list_databases` cannot surface credential columns via `select_columns`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| superset/mcp_service/mcp_core.py | Filters `select_columns` to the declared `all_columns` allowlist to prevent probing/loading excluded model fields. |
| tests/unit_tests/mcp_service/system/tool/test_mcp_core.py | Adds coverage for allowlist enforcement behavior in `ModelListCore` column selection. |
| tests/unit_tests/mcp_service/database/tool/test_database_tools.py | Verifies `list_databases` does not expose credential columns in requested/loaded/available metadata or row payloads. |
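As a rough illustration of the kind of check the database tool test performs, the sketch below uses a mocked DAO and asserts that no credential column reaches its `columns` keyword argument or the response metadata. The `list_items` function, the `ALLOWED` list, and the response shape are hypothetical stand-ins, not the actual test or tool code.

```python
from unittest.mock import MagicMock

SENSITIVE = {"password", "sqlalchemy_uri", "encrypted_extra", "server_cert"}
ALLOWED = ["id", "database_name", "backend"]  # assumed allowlist


def list_items(dao, select_columns):
    """Hypothetical tool body: filter columns, then query the DAO."""
    columns = [c for c in select_columns if c in ALLOWED]
    rows = dao.list(columns=columns)
    return {"columns_requested": columns, "columns_loaded": columns, "rows": rows}


def check_dao_never_sees_sensitive_columns():
    dao = MagicMock()
    dao.list.return_value = []
    response = list_items(dao, ["id", "password", "sqlalchemy_uri"])
    # Inspect the keyword arguments recorded for the DAO call.
    passed = set(dao.list.call_args.kwargs["columns"])
    assert not passed & SENSITIVE
    assert not set(response["columns_requested"]) & SENSITIVE
    assert not set(response["columns_loaded"]) & SENSITIVE
    return passed
```

Asserting on the DAO call, and not only on the response, is what covers the path where a filtered response could still hide a query that loaded the sensitive columns.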
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #40771 +/- ##
==========================================
- Coverage 64.19% 64.02% -0.17%
==========================================
Files 2666 2664 -2
Lines 143991 143596 -395
Branches 33108 32947 -161
==========================================
- Hits 92428 91932 -496
- Misses 49950 50051 +101
Partials 1613 1613
Address follow-up issues from code review:
1. `_has_explicit_all_columns` flag: `self._all_columns` is always populated (it falls back to `default_columns` when `all_columns` is omitted), so the previous `if self._all_columns:` guard in `_get_columns_to_load()` was always true and silently restricted `select_columns` to `default_columns` for any `ModelListCore` that did not declare an explicit allowlist. The fix adds `self._has_explicit_all_columns = all_columns is not None` and gates the intersection check on that flag, preserving the old passthrough behaviour for tools without a declared allowlist.
2. DAO columns assertion in the database integration test: directly verify that `DatabaseDAO.list` never receives sensitive column names in its `columns` kwarg, covering the original exploit path.
3. New unit test to pin the no-`all_columns` contract: confirms that non-default columns are selectable when `all_columns` is not declared.
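The truthiness problem and the flag-based fix can be illustrated with a small stand-in class. `ListCoreSketch` and its method name are hypothetical; only the fallback to `default_columns` and the `all_columns is not None` flag are taken from the commit message above.

```python
from __future__ import annotations


class ListCoreSketch:
    """Hypothetical stand-in for ModelListCore's constructor and column filter."""

    def __init__(self, default_columns, all_columns=None):
        self.default_columns = list(default_columns)
        # _all_columns is always populated: it falls back to default_columns,
        # so its truthiness cannot tell "declared" apart from "defaulted".
        self._all_columns = (
            list(all_columns) if all_columns is not None else list(default_columns)
        )
        # Record whether the caller actually declared an allowlist.
        self._has_explicit_all_columns = all_columns is not None

    def get_columns_to_load(self, select_columns):
        columns = list(select_columns)
        # Only intersect with the allowlist when one was explicitly declared.
        if self._has_explicit_all_columns:
            allowed = set(self._all_columns)
            columns = [c for c in columns if c in allowed]
        return columns
```

A tool constructed without `all_columns` passes `select_columns` through unchanged, while a tool with a declared allowlist drops anything outside it.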
`native_filters` is a relationship/property on the Dashboard model that is not picked up by `inspect(model_cls).columns`, so it was absent from `DASHBOARD_EXTRA_COLUMNS` and therefore missing from the `all_columns` allowlist passed to `ModelListCore`. The allowlist enforcement added in the preceding commits then silently dropped it from `columns_to_load`, breaking `test_list_dashboards_sanitizes_dashboard_descriptions_and_filter_text`. Add `native_filters` to `DASHBOARD_EXTRA_COLUMNS` so it is included in the advertised allowlist and selectable via `select_columns`.
Code Review Agent Run #383173
Actionable Suggestions - 0
rusackas left a comment
LGTM, but noting the native_filters entry added to DASHBOARD_EXTRA_COLUMNS is unrelated to "sensitive database columns", which is minor scope creep (harmless, opt-in non-default column).
@rusackas good catch — the |
SUMMARY
`ModelListCore._get_columns_to_load()` only stripped `USER_DIRECTORY_FIELDS` from caller-supplied `select_columns`, but did not enforce the `all_columns` allowlist that `get_database_columns()` already uses to exclude credential fields (`password`, `sqlalchemy_uri`, `encrypted_extra`, `server_cert`). A caller could therefore pass those names via `select_columns` and have them appear in `columns_loaded`/`columns_requested` in the MCP response metadata and be forwarded to the DAO query.
Fix: after `filter_user_directory_columns`, restrict the column list to `self._all_columns` when it is set. For `list_databases` this means only columns surfaced in `columns_available` (which already excludes the four credential fields via `DATABASE_EXCLUDE_COLUMNS`) can be loaded.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — backend-only change.
TESTING INSTRUCTIONS
New unit tests cover the fix:
ADDITIONAL INFORMATION