catalog: convert mz_materialized_views into a view over the catalog #35819

Open
teskje wants to merge 3 commits into MaterializeInc:main from teskje:mz_materialized_views-view

@teskje teskje commented Apr 1, 2026

This PR converts the mz_materialized_views builtin table into a materialized view over the catalog.

It includes some pre-work to make this possible:

  • Introduces a constant builtin view mz_builtin_materialized_views that exposes... builtin materialized views. The mz_materialized_views query joins against that view to augment the catalog contents (which only include user MVs) with all the builtin MVs.
  • Introduces two new internal SQL functions:
    • parse_catalog_create_sql, to extract information from the MV create_sql
    • redact_sql, to derive redacted_create_sql from the MV create_sql
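
As a rough illustration of the shape described above, the following Python sketch models a constant set of builtin MVs being combined with the user MVs found in the catalog. It is not the actual view definition: the PR says the query joins against mz_builtin_materialized_views, and this sketch simplifies that to a plain union. All names and row contents here (`BUILTIN_MVS`, `materialized_views`, the `id` and `name` fields, ids such as `s1` and `u1`) are hypothetical.

```python
# Hypothetical stand-in for the constant builtin view: the rows are fixed
# up front and never written to at runtime.
BUILTIN_MVS = (
    {"id": "s1", "name": "builtin_mv_a"},
    {"id": "s2", "name": "builtin_mv_b"},
)


def materialized_views(catalog_mvs):
    """Combine user MVs (from the catalog) with the constant builtin MVs."""
    rows = list(catalog_mvs) + list(BUILTIN_MVS)
    return sorted(rows, key=lambda r: r["id"])


# The catalog only knows about user MVs; builtin ones come from the constant.
user_mvs = [{"id": "u1", "name": "mv"}]
all_mvs = materialized_views(user_mvs)
print([r["id"] for r in all_mvs])
```

Because the builtin rows are a constant rather than state that is updated during bootstrap, nothing in this model needs to be written when a process starts, which is the property the description relies on for concurrent envds.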

Alternatives to having mz_builtin_materialized_views as a constant builtin view:

  • Make it a builtin table that gets updated during bootstrap. We want to move away from builtin tables as they complicate having concurrent envds, so this seems like a step in the wrong direction. Builtin views pose no issues for concurrent envds.
  • Make it a builtin table function. This doesn't work because SQL funcs are defined in mz-expr, while builtins are defined in mz-catalog, and the dependency relationship is the wrong way round. We could introduce a way to add table functions that get evaluated during planning time, but afaict that infrastructure doesn't exist yet.
  • Insert builtins into the catalog so we can simply treat them like user MVs. This might be nice but is also a lot of work, and might also complicate 0dt upgrades by adding more state to the catalog that will need to be migrated.

TODO

Motivation

Part of SQL-118

github-actions bot commented Apr 1, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

This commit adds the first constant builtin view exposing builtin
objects to the catalog. Specifically, `mz_builtin_materialized_views`
reports what builtin materialized views exist. It is required to define
`mz_materialized_views` as a view over the catalog, since the catalog
does not contain builtin objects.

Further builtin views exposing other builtin object types will be added
as needed to support converting more builtin tables.
teskje force-pushed the mz_materialized_views-view branch 2 times, most recently from f27e0e4 to 0ae57d9 on April 1, 2026 13:28
teskje marked this pull request as ready for review April 1, 2026 13:52
teskje requested review from a team as code owners April 1, 2026 13:52
teskje requested a review from ggevay April 1, 2026 13:52
Comment on lines +367 to +368
// TODO: This function isn't parsing JSONB and therefore shouldn't live in the `jsonb` module.
// Consider moving all the `parse_catalog_*` functions into their own module.
Contributor Author

Didn't want to include this in this PR because of the noise. I'll make a separate code movement PR.

Contributor

Ok


use mz_repr::{Datum, InputDatumType, OutputDatumType, ReprColumnType, RowArena, SqlColumnType};

use crate::scalar::func::RedactSql;
Contributor Author

I put this in mz_expr::scalar::func so it's next to pretty_sql, which is the only similar function we have, afaict. But now it's the only unary func that doesn't live in mz_expr::scalar::func::impls. Should I move it?

Contributor

I'd say it's ok. We can move it later if needed.

Comment on lines -1816 to -1828

new_mz_tables_gid = c.sql_query(
"SELECT id FROM mz_tables WHERE name = 'mz_tables'",
service="mz_new",
)[0][0]
new_mv_gid = c.sql_query(
"SELECT id FROM mz_materialized_views WHERE name = 'mv'",
service="mz_new",
)[0][0]
assert new_mz_tables_gid == mz_tables_gid
assert new_mv_gid == mv_gid
# mz_internal.mz_storage_shards won't update until this instance becomes the leader

Contributor Author

Had to remove this part because mz_materialized_views is now a builtin MV and becomes unreadable in read-only mode after a replacement migration. I think this is fine, the checks here didn't add much anyway.

Contributor

Ok

Comment on lines +97 to +100
.with_key(vec![0])
.with_key(vec![2])
.with_key(vec![4])
.with_key(vec![6])
Contributor Author

The optimizer decides these keys based on the concrete values in the VALUES list. It determines that the name, definition, and create_sql values are keys because they are unique... which isn't wrong!

Contributor

Yeah, these are somewhat surprising keys, but I'd say it's ok. (At some point, we'll hopefully introduce more rigorous testing for the keys of builtins.)
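
To make the key inference discussed in this thread concrete, here is a minimal Python sketch of the idea: for a constant list of rows, any column whose values are pairwise distinct can be reported as a key. This is only an illustration of the reasoning, not the optimizer's actual algorithm; the function name `unique_columns` and the sample rows are hypothetical.

```python
def unique_columns(rows):
    """Return indices of columns whose values are pairwise distinct across rows."""
    if not rows:
        return []
    width = len(rows[0])
    return [
        col
        for col in range(width)
        if len({row[col] for row in rows}) == len(rows)
    ]


# Hypothetical VALUES list with columns (id, schema, name).
rows = [
    ("s1", "mz_catalog", "mv_a"),
    ("s2", "mz_catalog", "mv_b"),
]
print(unique_columns(rows))  # [0, 2]
```

The schema column repeats, so it is not a key, while id and name happen to be unique in this data. Keys found this way hold for the current constant contents only, which is why they can look surprising.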

teskje commented Apr 1, 2026

Green nightly (except that sqlsmith stumbles over redact_sql): https://buildkite.com/materialize/nightly/builds/15939#_

This commit adds two new internal sqlfuncs to support converting
`mz_materialized_views` into a view over the catalog.

* `parse_catalog_create_sql` parses a create_sql string and returns
  information extracted from it as JSON
* `redact_sql` redacts a given SQL statement
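
For readers unfamiliar with SQL redaction, the sketch below shows the general idea in Python: literal values in a statement are replaced by a placeholder so the statement's structure can be shared without its data. It is a simplified illustration and not the `redact_sql` implementation from this PR. The regex approach, the function name `redact_string_literals`, and the `'<REDACTED>'` placeholder are assumptions; a real implementation would presumably work on the parsed statement and cover more than string literals.

```python
import re

# Matches a single-quoted SQL string literal, treating '' as an escaped quote.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def redact_string_literals(sql):
    """Replace every single-quoted string literal with a placeholder (assumed format)."""
    return _STRING_LITERAL.sub("'<REDACTED>'", sql)


sql = "CREATE MATERIALIZED VIEW mv AS SELECT 'secret' AS s, 'it''s' AS t"
redacted = redact_string_literals(sql)
print(redacted)
```

Identifiers and keywords pass through unchanged; only the quoted values are replaced.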
teskje force-pushed the mz_materialized_views-view branch from 0ae57d9 to ddfb2e6 on April 2, 2026 08:24
@ggevay ggevay left a comment

LGTM!
