mcp: auto-route read_data_product to the data product's catalog cluster #36619

Merged
bobbyiliev merged 2 commits into MaterializeInc:main from bobbyiliev:mcp-fix-read-data-product-cluster-parameter
May 25, 2026
Conversation

@bobbyiliev
Contributor

@bobbyiliev bobbyiliev requested a review from a team as a code owner May 19, 2026 13:53
Contributor

@ggevay ggevay left a comment

LGTM, just minor comments.

Comment thread src/environmentd/src/http/mcp.rs Outdated
//
// TODO: Remove this extra round-trip once catalog errors get specific
// SQL error codes (see TODO in src/adapter/src/error.rs `fn code()`),
// then we can translate the query error directly and drop the lookup.
Contributor

nit: The TODO went a bit stale. Solving the error stuff now wouldn't be enough to remove this extra query.

Contributor Author

Good catch, removed it!

Comment thread src/environmentd/src/http/mcp.rs Outdated

// Override beats catalog; catalog beats session default. The override
// path is unchanged from before, so explicit callers are not affected
// by the auto-routing change.
Contributor

nit: This is more a PR comment than a code comment. I'd just remove this from the code.

Contributor Author

Yep agreed, trimmed it down to just the precedence rule.
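
For readers skimming the thread, the precedence rule (override beats catalog, catalog beats session default) can be sketched in a few lines. This is only an illustration: the function name `pick_target_cluster` and the use of owned `String`s are assumptions, not the code in mcp.rs.

```rust
/// Picks the cluster a read should run on (illustrative helper, not from mcp.rs).
/// A `None` result means "leave the session's default cluster in place".
fn pick_target_cluster(
    cluster_override: Option<String>,
    catalog_cluster: Option<String>,
) -> Option<String> {
    // `Option::or` returns the first `Some`, so an explicit override wins
    // over the catalog value, and two `None`s fall through to the session default.
    cluster_override.or(catalog_cluster)
}
```

With an override of "serving" and a catalog value of "compute", the sketch returns "serving"; with no override it returns "compute"; with neither it returns `None`.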

"SELECT cluster FROM mz_internal.mz_mcp_data_products \
WHERE object_name = {} \
ORDER BY cluster NULLS LAST \
LIMIT 1",
Contributor

I'm wondering whether the

ORDER BY cluster NULLS LAST \
LIMIT 1

is needed. Could this even return multiple rows without the LIMIT 1? I would expect not, because object_name seems fully qualified. We could maybe soft_assert_or_log! that it has exactly 1 result row.

Contributor Author

So looking at the view, it joins mz_indexes and the cluster column is COALESCE(c_idx.name, c_obj.name). An MV indexed on more than one cluster would surface once per index/cluster pair, so a soft_assert for exactly 1 row would fire on that case. Does that match what you'd expect or am I missing something?

Contributor

Indeed, ok!
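
To make the outcome of this exchange concrete, here is a rough model of what `ORDER BY cluster NULLS LAST LIMIT 1` does when one object name yields several rows (one per index/cluster pair). The function name is hypothetical, the rows are plain in-memory values rather than query results, and Rust's byte-wise string ordering is assumed to stand in for the database's collation.

```rust
/// Models `ORDER BY cluster NULLS LAST LIMIT 1` over the rows for one object name.
/// Outer `None`: no row matched. Inner `None`: the chosen row had a NULL cluster.
fn first_cluster_nulls_last(mut rows: Vec<Option<String>>) -> Option<Option<String>> {
    // Sort key `(is_none, value)`: `false < true`, so non-NULL rows come first,
    // and ties among them are broken by cluster name in ascending order.
    rows.sort_by(|a, b| (a.is_none(), a).cmp(&(b.is_none(), b)));
    // `LIMIT 1`: keep only the first row after sorting.
    rows.into_iter().next()
}
```

Under these assumptions, an MV that surfaces on both "serving" and "compute" resolves to "compute" every time, which is why the query stays deterministic where an exactly-one-row assertion would fire.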

@bobbyiliev bobbyiliev force-pushed the mcp-fix-read-data-product-cluster-parameter branch from 5f2b464 to 5a5a3ce Compare May 24, 2026 19:26
@bobbyiliev bobbyiliev requested review from Copilot and ggevay and removed request for Copilot and ggevay May 24, 2026 19:30
@bobbyiliev bobbyiliev requested a review from jubrad May 24, 2026 19:31
@bobbyiliev bobbyiliev merged commit 84591a7 into MaterializeInc:main May 25, 2026
126 checks passed
@def-
Contributor

def- commented May 26, 2026

QA LLM review noted some issues, which I think can be considered acceptable? Since I'm not sure, posting them here in case it's helpful:

MEDIUM — Auto-routing requires USAGE on the data product's home cluster, silently breaking valid RBAC patterns

Location: src/environmentd/src/http/mcp.rs:1034-1041

The new code unconditionally routes a default-cluster read to whatever cluster
is recorded in mz_mcp_data_products.cluster:

let target_cluster = cluster_override.or(catalog_cluster);
...
let read_query = build_read_query(&safe_name, limit, target_cluster);

The catalog view mz_mcp_data_products filters rows by SELECT privilege on the
data product (mz_show_my_object_privileges) but does not filter by USAGE
on the cluster — mz_clusters is unrestricted, and the view simply joins to
mz_clusters to surface the cluster name. So a role may see a data product
whose home cluster it has no USAGE privilege on.

In Materialize this is a normal, supported pattern:

  • "compute" cluster hosts an expensive MV's dataflow,
  • "serving" cluster (smaller, owned by the agent team) reads MVs from persist,
  • agent role has SELECT on the MV and USAGE on the serving cluster only.

Before this commit, read_data_product ran on the session-default cluster (the
serving cluster): slower, but correct. After the commit, the bare read instead
emits BEGIN READ ONLY; SET CLUSTER = '<home-cluster>'; SELECT * FROM ...; COMMIT;
The SET CLUSTER itself succeeds (Materialize doesn't check USAGE at
SET time, only emits a notice if the cluster is unknown), but the SELECT
fails with a permission-denied error from generate_cluster_usage_privileges
in src/sql/src/rbac.rs:1797, because non-constant SELECT requires USAGE on
the active cluster.

So a call that previously succeeded now fails outright. The documented
workaround, passing cluster: "<serving-cluster>", requires the agent to know
which cluster it has USAGE on, which is exactly the information the MCP tool
was designed to hide. The MCP catalog view does not expose this information.

The PR description (DEX-27) frames this as fixing a docs/behavior mismatch,
which is true, but the docs-aligned behavior is now stricter than the old
behavior, and there is no graceful fallback when the assumption the docs
implicitly made (agent role with USAGE on every relevant cluster) does not hold.

Suggested fixes (in order of preference):

  1. Filter mz_mcp_data_products.cluster to only show clusters the role has
    USAGE on (e.g. join against mz_internal.mz_show_my_cluster_privileges).
    This both fixes the auto-routing failure and lets the agent see which
    override values are actually valid.
  2. In read_data_product, look up the role's USAGE on catalog_cluster
    before issuing the SET CLUSTER, and skip the auto-routing if absent
    (fall back to the session default, matching old behavior). This preserves
    the suboptimal-but-working path for restricted roles.
  3. At minimum, expand the error path: if the auto-routed SELECT fails with
    a cluster-USAGE error, retry on the session default and surface a notice
    to the agent so the workflow keeps working.
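
As a rough illustration of suggested fix 2, the cluster choice could take the role's USAGE into account before any SET CLUSTER is issued. Everything here is hypothetical: `resolve_target_cluster` is an invented name, and the `has_usage` callback stands in for whatever privilege lookup the server would actually perform.

```rust
/// Hypothetical resolver for suggested fix 2. `has_usage` is a placeholder for
/// a real privilege check on the role; it is not an existing Materialize API.
fn resolve_target_cluster<F>(
    cluster_override: Option<String>,
    catalog_cluster: Option<String>,
    has_usage: F,
) -> Option<String>
where
    F: Fn(&str) -> bool,
{
    // An explicit override is passed through untouched, as before the change.
    if cluster_override.is_some() {
        return cluster_override;
    }
    // Auto-route only when the role can use the catalog cluster; otherwise
    // return `None` so the read stays on the session default.
    catalog_cluster.filter(|name| has_usage(name.as_str()))
}
```

In the compute/serving scenario above, a role with USAGE only on "serving" would get `None` for a data product homed on "compute", so the read would fall back to the session default instead of failing.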

LOW — Auto-routing fast-path almost never fires

build_read_query keeps a "single bare SELECT" branch for the None case,
justified in the doc comment as avoiding three extra round trips on the hot
path of session-default reads. After this commit, target_cluster is None
only when both cluster_override is absent and catalog_cluster is NULL.
Looking at the view definition (src/catalog/src/builtin/mz_internal.rs:5287),
cluster is COALESCE(c_idx.name, c_obj.name):

  • For MVs, c_obj.name is the home cluster — always set.
  • For views in the result set, the WHERE clause requires i.id IS NOT NULL,
    so c_idx.name is always set.

So in practice the fast-path is unreachable for legitimate data products.
This is not a correctness issue, but the comment ("hot path of session-default
reads") is misleading post-change and the fast path is effectively dead code.
Worth either deleting the fast path or rewording the comment.
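
For context, the two branches under discussion might look roughly like the sketch below. It is not the real `build_read_query`: the name `build_read_query_sketch`, the `LIMIT` clause, the `u32` limit type, and the quote-doubling are all assumptions. Only the overall statement shape follows the transaction quoted earlier in this comment.

```rust
/// Hypothetical sketch; the real `build_read_query` in mcp.rs may differ.
/// `safe_name` is assumed to be an already-quoted, fully qualified object name.
fn build_read_query_sketch(safe_name: &str, limit: u32, cluster: Option<&str>) -> String {
    let select = format!("SELECT * FROM {safe_name} LIMIT {limit}");
    match cluster {
        // No target cluster: a single bare statement on the session default.
        None => format!("{select};"),
        // Target cluster: pin it inside a read-only transaction.
        Some(name) => {
            // Double any single quote so the name stays inside the string literal.
            let escaped = name.replace('\'', "''");
            format!("BEGIN READ ONLY; SET CLUSTER = '{escaped}'; {select}; COMMIT;")
        }
    }
}
```

If the catalog cluster is effectively never NULL, only the `Some` arm runs for bare reads, which is the sense in which the `None` arm is close to dead code.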
