Skip to content

fix(mcp): filter sensitive database columns from list_databases loaded-metadata#40771

Open
aminghadersohi wants to merge 4 commits into
masterfrom
amin/mcp-list-db-columns-sensitive
Open

fix(mcp): filter sensitive database columns from list_databases loaded-metadata#40771
aminghadersohi wants to merge 4 commits into
masterfrom
amin/mcp-list-db-columns-sensitive

Conversation

@aminghadersohi
Copy link
Copy Markdown
Contributor

SUMMARY

ModelListCore._get_columns_to_load() only stripped USER_DIRECTORY_FIELDS
from caller-supplied select_columns, but did not enforce the all_columns
allowlist that get_database_columns() already uses to exclude credential
fields (password, sqlalchemy_uri, encrypted_extra, server_cert). A
caller could therefore pass those names via select_columns and have them
appear in columns_loaded/columns_requested in the MCP response metadata
and be forwarded to the DAO query.

Fix: after filter_user_directory_columns, restrict the column list to
self._all_columns when it is set. For list_databases this means only
columns surfaced in columns_available (which already excludes the four
credential fields via DATABASE_EXCLUDE_COLUMNS) can be loaded.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — backend-only change.

TESTING INSTRUCTIONS

New unit tests cover the fix:

pytest tests/unit_tests/mcp_service/system/tool/test_mcp_core.py \
  -k "test_model_list_tool_rejects_columns_not_in_all_columns or test_model_list_tool_rejects_only_excluded_columns_raises"

pytest tests/unit_tests/mcp_service/database/tool/test_database_tools.py \
  -k "test_list_databases_does_not_expose_sensitive_credential_columns"

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

…d-metadata

`ModelListCore._get_columns_to_load()` only stripped USER_DIRECTORY_FIELDS
from caller-supplied `select_columns`, but did not enforce the `all_columns`
allowlist that `get_database_columns()` already uses to exclude credential
fields (password, sqlalchemy_uri, encrypted_extra, server_cert). A caller
could therefore pass those names via `select_columns` and have them appear in
`columns_loaded`/`columns_requested` in the response and be forwarded to the
DAO query.

After `filter_user_directory_columns`, restrict the list to `self._all_columns`
when it is set. For `list_databases` this means only columns surfaced in
`columns_available` (which excludes the four credential fields) can be loaded.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens MCP list tooling by enforcing an all_columns allowlist when processing caller-supplied select_columns, preventing sensitive/credential-related ORM columns (e.g., password, sqlalchemy_uri, encrypted_extra, server_cert) from being requested, loaded, or reflected in MCP response metadata for tools like list_databases.

Changes:

  • Enforce self._all_columns as an allowlist in ModelListCore._get_columns_to_load() after user-directory column filtering.
  • Add unit tests ensuring non-allowlisted columns are dropped (or raise when none remain).
  • Add an integration-style MCP unit test ensuring list_databases cannot surface credential columns via select_columns.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
superset/mcp_service/mcp_core.py Filters select_columns to the declared all_columns allowlist to prevent probing/loading excluded model fields.
tests/unit_tests/mcp_service/system/tool/test_mcp_core.py Adds coverage for allowlist enforcement behavior in ModelListCore column selection.
tests/unit_tests/mcp_service/database/tool/test_database_tools.py Verifies list_databases does not expose credential columns in requested/loaded/available metadata or row payloads.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 4, 2026

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.02%. Comparing base (0984839) to head (81e4cde).
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/mcp_core.py 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40771      +/-   ##
==========================================
- Coverage   64.19%   64.02%   -0.17%     
==========================================
  Files        2666     2664       -2     
  Lines      143991   143596     -395     
  Branches    33108    32947     -161     
==========================================
- Hits        92428    91932     -496     
- Misses      49950    50051     +101     
  Partials     1613     1613              
Flag Coverage Δ
hive 39.76% <0.00%> (-0.01%) ⬇️
mysql 58.41% <0.00%> (-0.01%) ⬇️
postgres 58.47% <0.00%> (-0.01%) ⬇️
presto 41.35% <0.00%> (-0.01%) ⬇️
python 59.94% <0.00%> (-0.01%) ⬇️
sqlite 58.10% <0.00%> (-0.01%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Address two follow-up issues from code review:

1. _has_explicit_all_columns flag: self._all_columns is always populated
   (falls back to default_columns when all_columns is omitted), so the
   previous `if self._all_columns:` guard in _get_columns_to_load() was
   always True, silently restricting select_columns to default_columns for
   any ModelListCore that did not declare an explicit allowlist. The fix
   adds self._has_explicit_all_columns = all_columns is not None and gates
   the intersection check on that flag, preserving the old passthrough
   behaviour for tools without a declared allowlist.

2. DAO columns assertion in the database integration test: directly verify
   that DatabaseDAO.list never receives sensitive column names in its
   columns kwarg, covering the original exploit path.

3. New unit test to pin the no-all_columns contract: confirms that
   non-default columns are selectable when all_columns is not declared.
@pull-request-size pull-request-size Bot added size/L and removed size/M labels Jun 4, 2026
`native_filters` is a relationship/property on the Dashboard model that is
not picked up by `inspect(model_cls).columns`, so it was absent from
`DASHBOARD_EXTRA_COLUMNS` and therefore missing from the `all_columns`
allowlist passed to `ModelListCore`. The allowlist enforcement added in the
preceding commits then silently dropped it from `columns_to_load`, breaking
`test_list_dashboards_sanitizes_dashboard_descriptions_and_filter_text`.

Add `native_filters` to `DASHBOARD_EXTRA_COLUMNS` so it is included in the
advertised allowlist and selectable via `select_columns`.
@aminghadersohi aminghadersohi marked this pull request as ready for review June 4, 2026 17:37
@aminghadersohi aminghadersohi requested a review from rusackas June 4, 2026 17:41
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented Jun 4, 2026

Code Review Agent Run #383173

Actionable Suggestions - 0
Review Details
  • Files reviewed - 4 · Commit Range: 9a9c8a5..3ba404e
    • superset/mcp_service/common/schema_discovery.py
    • superset/mcp_service/mcp_core.py
    • tests/unit_tests/mcp_service/database/tool/test_database_tools.py
    • tests/unit_tests/mcp_service/system/tool/test_mcp_core.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Copy link
Copy Markdown
Member

@rusackas rusackas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but noting the native_filters entry added to DASHBOARD_EXTRA_COLUMNS is unrelated to "sensitive database columns", which is minor scope creep (harmless, opt-in non-default column).

@aminghadersohi
Copy link
Copy Markdown
Contributor Author

@rusackas good catch — the native_filters addition is indeed outside the intended scope, but it was a required regression fix rather than intentional feature work. The allowlist enforcement we added means any column absent from all_columns is silently dropped from select_columns; native_filters was missing from DASHBOARD_EXTRA_COLUMNS so get_all_column_names(get_dashboard_columns()) excluded it, and the existing test_list_dashboards_sanitizes_dashboard_descriptions_and_filter_text broke when it tried to select it. Adding it restores the pre-existing behaviour rather than introducing new capability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants