Skip to content

feat(mcp): add list and get tools for saved queries and query history#40346

Open
aminghadersohi wants to merge 3 commits into
masterfrom
amin/mcp-list-queries
Open

feat(mcp): add list and get tools for saved queries and query history#40346
aminghadersohi wants to merge 3 commits into
masterfrom
amin/mcp-list-queries

Conversation

@aminghadersohi
Copy link
Copy Markdown
Contributor

SUMMARY

Adds four new MCP tools across two new domains (saved_query/ and query/):

  • list_saved_queries — List saved SQL queries owned by the current user with filtering (label, db_id, schema), search, and pagination
  • get_saved_query_info — Get saved query details by numeric ID or UUID
  • list_queries — List SQL query history with filtering (status, database_id, schema), defaulting to most-recent-first (ordered by start_time desc, page_size 25)
  • get_query_info — Get query history details by numeric ID

Both domains follow the existing database/, dataset/, chart/, and dashboard/ patterns: ModelListCore/ModelGetInfoCore for reuse, Pydantic schemas with field-level serialization context, @tool decorators with RBAC, and event_logger instrumentation.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — backend API tools only

TESTING INSTRUCTIONS

# Run unit tests for the new tools
pytest tests/unit_tests/mcp_service/saved_query/
pytest tests/unit_tests/mcp_service/query/

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration
  • Introduces new feature or API
  • Removes existing feature or API

aminghadersohi and others added 2 commits May 21, 2026 20:44
Implements list_saved_queries, get_saved_query_info, list_queries, and
get_query_info MCP tools in new saved_query/ and query/ domains.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gabotorresruiz
Copy link
Copy Markdown
Contributor

Review: list/get tools for saved queries and query history

Thorough pass over the 4 new MCP tools. Summary: clean, well-tested, correctly secured, and faithful to the existing list_datasets / get_dataset_info pattern. Approve with minor comments — nothing blocking.

Verified (all good ✅)

  • RBAC / row-level security is correctly enforced. QueryDAO.base_filter = QueryFilter restricts results to Query.user_id == current_user unless the user has can_access_all_queries; SavedQueryDAO.base_filter = SavedQueryFilter restricts to created_by == current_user. BaseDAO.list and find_by_id (default skip_base_filter=False) both apply these, so the new tools cannot leak another user's queries. The docstrings ("...or all queries for admins") match the actual filter behavior.
  • class_permission_name="Query" / "SavedQuery" match the REST API view names exactly, and method_permission_name defaults to read → gates on can_read. Consistent with other read tools.
  • Schema fields map to real columns: Query has start_time/end_time (Numeric, epoch seconds), changed_on, schema, status, database_id. Query has no uuid, so get_query_info correctly accepts numeric ID only, while get_saved_query_info correctly supports ID or UUID.
  • Tests cover the meaningful paths: basic list, filters, search, empty results, pagination, not-found, default page size (25), default ordering (start_time desc), and the search+filters mutual-exclusion validator.

Suggested changes (none blocking)

1. sql in DEFAULT_QUERY_COLUMNS + page_size=25 produces heavy default payloads. DEFAULT_QUERY_COLUMNS includes sql and DEFAULT_QUERY_PAGE_SIZE = 25, so a bare list_queries() returns 25 full SQL bodies. This contrasts with DEFAULT_SAVED_QUERY_COLUMNS (no sql) and DEFAULT_DATASET_COLUMNS, whose comment explicitly says "Minimal defaults for reduced token usage." For an LLM-facing tool, consider dropping sql from the query default columns — callers can request it via select_columns or use get_query_info.

2. QueryError.create() / SavedQueryError.create() are dead code. Both classmethods are defined but never called: the get-tools build QueryError(...) directly, ModelGetInfoCore builds error_schema(...) directly, and the list-tools re-raise. Either drop them or use them in the except handlers for consistency.

3. changed_on is selectable but not sortable for queries. It is in ALL_QUERY_COLUMNS but not SORTABLE_QUERY_COLUMNS, despite Query having an indexed changed_on column (ti_user_id_changed_on). Minor asymmetry — consider adding it to sortable, or leave as-is since start_time is the natural recency sort.

4. Minor test-coverage gaps (optional). No test exercises select_columns projection (the context-based field filtering is a core feature), rejection of a non-sortable order_column, or the InternalError exception path in the get-tools.

5. app.py instruction-text inconsistency (trivial). The list_saved_queries line says "(1-based pagination)" while list_queries says "(most recent first)". Both are 1-based; cosmetic only.

Notes (consistent with existing code, not PR issues)

  • List tools annotate -> QueryList | QueryError but actually return a dict from model_dump() and re-raise on error rather than returning QueryError. This mirrors list_datasets, so it's an established pattern.
  • error_type="InternalError" (PascalCase) vs "not_found" (snake_case) inconsistency is pre-existing (get_dataset_info does the same).
  • populate_by_name=True in the schemas is unnecessary (no aliases) but harmless and matches the dataset schema.

Item #1 is the only one I'd suggest addressing before merge; the rest are nits.

…ies tools

- Drop sql from DEFAULT_QUERY_COLUMNS (25 rows × full SQL bodies is
  too heavy for default LLM responses; callers use select_columns or
  get_query_info to access SQL)
- Add changed_on to SORTABLE_QUERY_COLUMNS for queries (column is
  indexed on the model, same treatment as saved queries)
- Remove QueryError.create() and SavedQueryError.create() dead code
- Fix list_queries instruction text in app.py: use '(1-based pagination)'
  to match the wording used by all other list tools
- Add tests: select_columns field projection and invalid order_column rejection
@github-actions github-actions Bot removed the api Related to the REST API label May 22, 2026
@aminghadersohi
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review! Addressed in ecec0a0:

#1 — Addressed: Dropped sql from DEFAULT_QUERY_COLUMNS. A bare list_queries() no longer returns full SQL bodies by default; callers use select_columns or get_query_info.

#2 — Addressed: Removed both QueryError.create() and SavedQueryError.create() dead classmethods.

#3 — Addressed: Added changed_on to SORTABLE_QUERY_COLUMNS for queries. The column is indexed (ti_user_id_changed_on) and the DAO supports it, same treatment as saved queries.

#4 — Addressed (partially): Added a select_columns projection test and an invalid order_column rejection test. Skipped the InternalError path since the list tools re-raise and that path is already covered by the middleware tests.

#5 — Addressed: Fixed the list_queries entry in app.py to say "(1-based pagination)" consistent with all other list tools.

@netlify
Copy link
Copy Markdown

netlify Bot commented May 22, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit ecec0a0
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a10ba3285913f00088418c4
😎 Deploy Preview https://deploy-preview-40346--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

Comment on lines +125 to +129
return SavedQueryError(
error=f"Failed to get saved query info: {str(e)}",
error_type="InternalError",
timestamp=datetime.now(timezone.utc),
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The error response includes raw exception text (str(e)) in the payload returned to clients, which can disclose internal implementation details (database errors, query internals, stack context). Return a generic user-facing message instead and keep full exception details only in server logs. [security]

Severity Level: Critical 🚨
- ❌ MCP `get_saved_query_info` leaks raw backend exception text.
- ⚠️ Internal database or ORM errors exposed to MCP clients.
- ⚠️ LLM consumers may see sensitive implementation details.
Steps of Reproduction ✅
1. Start the MCP server, which imports and registers the saved query tools in
`superset/mcp_service/app.py` (see imports of `get_saved_query_info` and
`list_saved_queries` at `superset/mcp_service/app.py:628-651`, including the specific
`from superset.mcp_service.saved_query.tool import get_saved_query_info` line reported by
Grep at `app.py:648`).

2. An MCP client (e.g., an LLM tool consumer) invokes the `get_saved_query_info` tool by
name, which causes FastMCP to execute the `get_saved_query_info` coroutine defined with
the `@tool` decorator in
`superset/mcp_service/saved_query/tool/get_saved_query_info.py:43-54`.

3. Inside `get_saved_query_info`, the function constructs a `ModelGetInfoCore` with
`SavedQueryDAO` and calls `get_tool.run_tool(request.identifier)` within the `try` block
at `get_saved_query_info.py:82-95`; if this DAO call or any code in the `try` (for
example, a database connection failure or unexpected ORM error) raises an exception,
control flows into the `except Exception as e` block at `get_saved_query_info.py:115-123`.

4. In the `except` block, the function logs the error via `ctx.error` and then returns a
`SavedQueryError` instance at `get_saved_query_info.py:125-129` where the `error` field is
set to `f"Failed to get saved query info: {str(e)}"`, so the full raw exception message
`str(e)` is serialized according to the `SavedQueryError` schema in
`superset/mcp_service/saved_query/schemas.py:1-3` and sent back to the MCP client,
potentially exposing internal database error messages or other sensitive implementation
details to the caller.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/saved_query/tool/get_saved_query_info.py
**Line:** 125:129
**Comment:**
	*Security: The error response includes raw exception text (`str(e)`) in the payload returned to clients, which can disclose internal implementation details (database errors, query internals, stack context). Return a generic user-facing message instead and keep full exception details only in server logs.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 67.47405% with 94 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.11%. Comparing base (f09fd63) to head (ecec0a0).
⚠️ Report is 30 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/query/tool/list_queries.py 33.33% 20 Missing ⚠️
...mcp_service/saved_query/tool/list_saved_queries.py 33.33% 20 Missing ⚠️
superset/mcp_service/query/schemas.py 84.61% 13 Missing and 1 partial ⚠️
superset/mcp_service/saved_query/schemas.py 83.52% 13 Missing and 1 partial ⚠️
superset/mcp_service/query/tool/get_query_info.py 43.47% 13 Missing ⚠️
...p_service/saved_query/tool/get_saved_query_info.py 43.47% 13 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40346      +/-   ##
==========================================
- Coverage   64.15%   64.11%   -0.04%     
==========================================
  Files        2592     2600       +8     
  Lines      138861   139243     +382     
  Branches    32208    32237      +29     
==========================================
+ Hits        89085    89278     +193     
- Misses      48244    48430     +186     
- Partials     1532     1535       +3     
Flag Coverage Δ
hive 39.40% <67.47%> (+0.07%) ⬆️
mysql 58.82% <67.47%> (-0.04%) ⬇️
postgres 58.90% <67.47%> (-0.04%) ⬇️
presto 41.06% <67.47%> (+0.06%) ⬆️
python 60.45% <67.47%> (-0.05%) ⬇️
sqlite 58.54% <67.47%> (-0.04%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@bito-code-review bito-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #bd7d73

Actionable Suggestions - 3
  • superset/mcp_service/saved_query/tool/get_saved_query_info.py - 1
  • superset/mcp_service/query/tool/list_queries.py - 1
  • tests/unit_tests/mcp_service/query/tool/test_query_tools.py - 1
Additional Suggestions - 1
  • tests/unit_tests/mcp_service/saved_query/tool/test_saved_query_tools.py - 1
    • Incomplete pagination assertions · Line 258-273
      Test `test_list_saved_queries_pagination_info` validates pagination metadata but is missing an assertion for the `count` field, which is part of the SavedQueryList schema (line 122 in schemas.py).
      Code suggestion
      --- a/tests/unit_tests/mcp_service/saved_query/tool/test_saved_query_tools.py
      +++ b/tests/unit_tests/mcp_service/saved_query/tool/test_saved_query_tools.py
       @@ -269,6 +269,7 @@ async def test_list_saved_queries_pagination_info(mock_list, mcp_server):
                )
                data = json.loads(result.content[0].text)
                assert data["total_count"] == 25
      +        assert data["count"] == 3
                assert data["page_size"] == 3
                assert data["has_next"] is True
                assert data["has_previous"] is False
Review Details
  • Files reviewed - 17 · Commit Range: 5de1fb7..ecec0a0
    • superset/mcp_service/app.py
    • superset/mcp_service/query/__init__.py
    • superset/mcp_service/query/schemas.py
    • superset/mcp_service/query/tool/__init__.py
    • superset/mcp_service/query/tool/get_query_info.py
    • superset/mcp_service/query/tool/list_queries.py
    • superset/mcp_service/saved_query/__init__.py
    • superset/mcp_service/saved_query/schemas.py
    • superset/mcp_service/saved_query/tool/__init__.py
    • superset/mcp_service/saved_query/tool/get_saved_query_info.py
    • superset/mcp_service/saved_query/tool/list_saved_queries.py
    • tests/unit_tests/mcp_service/query/__init__.py
    • tests/unit_tests/mcp_service/query/tool/__init__.py
    • tests/unit_tests/mcp_service/query/tool/test_query_tools.py
    • tests/unit_tests/mcp_service/saved_query/__init__.py
    • tests/unit_tests/mcp_service/saved_query/tool/__init__.py
    • tests/unit_tests/mcp_service/saved_query/tool/test_saved_query_tools.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo


return result

except Exception as e:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent error handling

Inconsistent exception handling between tools in the same module: list_saved_queries raises exceptions (line 159) while get_saved_query_info returns SavedQueryError responses (line 125). This inconsistency can confuse callers expecting uniform behavior.

Code Review Run #bd7d73


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Comment on lines +146 to +156
except Exception as e:
await ctx.error(
"Query listing failed: page=%s, page_size=%s, error=%s, error_type=%s"
% (
request.page,
request.page_size,
str(e),
type(e).__name__,
)
)
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWE-390: Redundant except-Raise

At line 146, a bare except Exception is caught but then immediately re-raised at line 156, which defeats the purpose of the catch block. The error logging is redundant since the GlobalErrorHandlerMiddleware will log the exception anyway. Compare with get_query_info.py (lines 118-121) which returns a QueryError response instead of re-raising, providing a consistent API contract that callers can handle. If re-raising is intended, remove the catch block entirely. (See also: CWE-390)

Code Review Run #bd7d73


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

"""order_column not in SORTABLE_QUERY_COLUMNS must be rejected."""
request = ListQueriesRequest(page=1, page_size=10, order_column="tab_name")
async with Client(mcp_server) as client:
with pytest.raises(Exception, match="Invalid order_column"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broad exception catch in test

Replace broad Exception catch with ValueError to match what the implementation raises at mcp_core.py:205. This prevents masking unrelated errors.

Code Review Run #bd7d73


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants