
adapter: Refactor how we store query plans in the catalog #35834

Merged
ggevay merged 3 commits into MaterializeInc:main from ggevay:implications-optimizer-preparation
Apr 17, 2026

Conversation

Contributor

@ggevay ggevay commented Apr 2, 2026

This prepares for moving catalog operations that involve query plans, such as CREATE MATERIALIZED VIEW, into the catalog implications framework.

This PR resolves a long-standing oddity in our catalog: query plans and other metainfo used to live in a separate part of the catalog (CatalogPlans) instead of inside the catalog items. That split existed only for historical reasons; this PR moves the plans and metainfo into the catalog items.

I recommend reviewing commit by commit.

The 1st commit is just some renaming to avoid confusion between locally optimized plans (after only optimize_mir_local, without view inlining) and fully (globally) optimized plans (that is, after view inlining, with optimize_dataflow).

The 2nd commit is the main thing, see commit msg for details.

(Note that even after this PR, we still add the plans in the side-effect closure of the catalog transaction that adds the catalog items, e.g., in create_materialized_view_finish. #35837 will be a follow-up PR, which will instead add the plans to the catalog::Ops that comprise the catalog transactions.)

Nightly: https://buildkite.com/materialize/nightly/builds/15973, but there is a staggering amount of unrelated redness, so it's hard to read.

@ggevay ggevay added the A-ADAPTER Topics related to the ADAPTER layer label Apr 2, 2026
Contributor

github-actions Bot commented Apr 2, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@ggevay ggevay force-pushed the implications-optimizer-preparation branch 8 times, most recently from ebb0a8f to b55299a Compare April 9, 2026 11:37
/// reason we end up with two identical notices being dropped by the same
/// call, the result will contain only one instance of that notice.
#[mz_ore::instrument(level = "trace")]
pub fn drop_plans_and_metainfos(
Contributor Author

@ggevay ggevay Apr 9, 2026

(This is getting replaced by drop_optimizer_notices. Plans don't need separate dropping anymore.)

@ggevay ggevay force-pushed the implications-optimizer-preparation branch 2 times, most recently from d2db776 to 374b9f8 Compare April 9, 2026 12:03
@ggevay ggevay marked this pull request as ready for review April 9, 2026 12:03
@ggevay ggevay requested review from a team as code owners April 9, 2026 12:03
@ggevay ggevay requested a review from mtabebe April 9, 2026 12:03
Contributor

@mtabebe mtabebe left a comment

This change makes sense to me, a few thoughts on testing:

  • Is there a test that covers the behaviour of a drop with cascade?
  • Is there any test for a drop after a restart or a migration? (I care less about this, but I'm just curious.)

Comment thread src/adapter/src/catalog/apply.rs Outdated
CatalogItem::Index(idx) => idx.optimized_plan = Some(Arc::new(plan)),
CatalogItem::MaterializedView(mv) => mv.optimized_plan = Some(Arc::new(plan)),
CatalogItem::ContinualTask(ct) => ct.optimized_plan = Some(Arc::new(plan)),
other => panic!("set_optimized_plan called on {:?}", other.typ()),
Contributor

Would it be useful to also panic with the id?

Contributor

(Same comment for other places with the panic)

Contributor Author

@ggevay ggevay Apr 10, 2026

Yes, thx, changed the 3 places
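
For readers following this thread, here is a minimal sketch of what a setter that panics with both the item type and the id could look like. The types below (ItemId, Plan, the trimmed-down CatalogItem) are hypothetical stand-ins rather than the real catalog definitions, and the exact panic message in the PR may differ.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the real catalog types; only the shape matters here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ItemId(pub u64);

#[derive(Debug)]
pub struct Plan(pub String);

#[derive(Debug, Default)]
pub struct Index {
    pub optimized_plan: Option<Arc<Plan>>,
}

#[derive(Debug, Default)]
pub struct MaterializedView {
    pub optimized_plan: Option<Arc<Plan>>,
}

#[derive(Debug)]
pub enum CatalogItem {
    Index(Index),
    MaterializedView(MaterializedView),
    Table,
}

impl CatalogItem {
    /// Human-readable item type, used in the panic message.
    pub fn typ(&self) -> &'static str {
        match self {
            CatalogItem::Index(_) => "index",
            CatalogItem::MaterializedView(_) => "materialized view",
            CatalogItem::Table => "table",
        }
    }
}

/// Stores `plan` on the item. Panics, naming the item type and the id,
/// if the item kind cannot carry a plan.
pub fn set_optimized_plan(id: ItemId, item: &mut CatalogItem, plan: Plan) {
    match item {
        CatalogItem::Index(idx) => idx.optimized_plan = Some(Arc::new(plan)),
        CatalogItem::MaterializedView(mv) => mv.optimized_plan = Some(Arc::new(plan)),
        other => panic!("set_optimized_plan called on {} with id {:?}", other.typ(), id),
    }
}
```

Including the id makes a panic report point at the offending catalog entry directly, instead of only at the kind of item involved.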

/// Set the optimized plan for the item identified by `id`.
///
/// # Panics
/// If the item is not an `Index`, `MaterializedView`, or
Contributor

I'm curious if there is a way to enforce this outside of the match? Maybe it isn't necessary since you have follow up work planned?

Contributor Author

Well, I don't really see a way to enforce it outside at the moment. But yeah, some time later I'd like to make these plan fields non-optional, and just populate them already when we create the catalog item (mentioned also here), and then we'd just delete these setters.

Contributor

Yeah that makes sense :)


// Clean up plans and optimizer notices for items that
// were retracted but not replaced (i.e., truly dropped).
let dropped_entries: Vec<CatalogEntry> = retractions
Contributor

Does this make dropping happen atomically with the catalog operation now?

Contributor Author

Yes, in the sense that it now happens inside the transact_inner call in transact, whereas before this PR it happened after the tx.commit that follows the transact_inner call.
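
To illustrate the "retracted but not replaced" idea from the snippet under discussion, here is a rough sketch under simplifying assumptions: ids are plain integers, notices are plain strings, and the helper names (truly_dropped, drop_notices_for) are made up for this example. It is not the code in apply_updates.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical item identifier; the real catalog uses its own id type.
pub type ItemId = u64;

/// Returns the ids that were retracted but not re-added within one batch of
/// updates, i.e. items that were truly dropped rather than replaced.
/// Each id is reported at most once, in retraction order.
pub fn truly_dropped(retractions: &[ItemId], additions: &[ItemId]) -> Vec<ItemId> {
    let added: BTreeSet<ItemId> = additions.iter().copied().collect();
    let mut seen = BTreeSet::new();
    retractions
        .iter()
        .copied()
        .filter(|id| !added.contains(id) && seen.insert(*id))
        .collect()
}

/// Removes the notices registered for the dropped ids and returns them.
/// Returning a set means identical notices dropped by the same call
/// appear only once in the result.
pub fn drop_notices_for(
    notices_by_dep_id: &mut BTreeMap<ItemId, BTreeSet<String>>,
    dropped: &[ItemId],
) -> BTreeSet<String> {
    let mut removed = BTreeSet::new();
    for id in dropped {
        if let Some(notices) = notices_by_dep_id.remove(id) {
            removed.extend(notices);
        }
    }
    removed
}
```

Because both steps run on the same in-memory state within one call, a caller cannot observe the item gone while its notices are still registered, which is the sense of "atomically" discussed above.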

@ggevay ggevay force-pushed the implications-optimizer-preparation branch 2 times, most recently from f846ee4 to d22addd Compare April 10, 2026 11:49
ggevay and others added 2 commits April 17, 2026 14:03
Move optimized_plan, physical_plan, and dataflow_metainfo from
the CatalogPlans side-car maps onto the actual catalog objects
(Index, MaterializedView, ContinualTask), eliminating the
CatalogPlans struct entirely.

Key changes:

- Add transient plan fields (Option, #[serde(skip)]) to Index,
  MaterializedView, and ContinualTask. These are recomputed
  during bootstrap and must be skipped to avoid Testdrive
  catalog consistency mismatches.
- Add accessor methods on CatalogItem and item_mut() on
  CatalogEntry.
- Move notices_by_dep_id and all plan setter/getter/drop
  methods into CatalogState, improving encapsulation and
  snapshot consistency.
- Automate optimizer notice cleanup: apply_updates now collects
  truly-dropped entries after each timestamp group, cleans up
  notices_by_dep_id, and generates notice retraction builtin
  table updates inline. This eliminates the manual cleanup
  calls from Catalog::transact.
- Introduce pack_optimizer_notice_updates (unresolved form for
  apply_updates); pack_optimizer_notices now delegates to it.
- Update all construction sites to initialize plan fields as
  None; update apply_replacement to copy plan fields.

Co-authored-by: Junie <junie@jetbrains.com>
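
As a rough illustration of the change this commit message describes, the sketch below contrasts a side-car layout with plans stored on the item itself. All names here (SideCarCatalog, InlineCatalog, the simplified MaterializedView and Plan) are assumptions made for the example; the real structs carry more fields and, per the commit message, mark the plan fields with #[serde(skip)].

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Hypothetical item identifier.
pub type ItemId = u64;

/// Hypothetical plan type; a label stands in for the real plan.
#[derive(Debug, PartialEq)]
pub struct Plan(pub &'static str);

/// "Before" shape: plans live in a map next to the items, keyed by id.
#[derive(Default)]
pub struct SideCarCatalog {
    pub items: BTreeMap<ItemId, String>,
    pub optimized_plans: BTreeMap<ItemId, Arc<Plan>>,
}

impl SideCarCatalog {
    /// Dropping needs two steps: the item and, separately, its plan.
    pub fn drop_item(&mut self, id: ItemId) {
        self.items.remove(&id);
        self.optimized_plans.remove(&id);
    }
}

/// "After" shape: the plan is a transient, optional field on the item.
pub struct MaterializedView {
    pub create_sql: String,
    pub optimized_plan: Option<Arc<Plan>>,
}

#[derive(Default)]
pub struct InlineCatalog {
    pub items: BTreeMap<ItemId, MaterializedView>,
}

impl InlineCatalog {
    /// Accessor that looks the plan up through the item.
    pub fn optimized_plan(&self, id: ItemId) -> Option<&Arc<Plan>> {
        self.items.get(&id).and_then(|mv| mv.optimized_plan.as_ref())
    }

    /// Dropping the item drops its plan with it; no separate cleanup.
    pub fn drop_item(&mut self, id: ItemId) {
        self.items.remove(&id);
    }
}
```

With the inline layout, the plan's lifetime is tied to the item's, so there is no second map that can fall out of sync with the set of items.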
@ggevay ggevay force-pushed the implications-optimizer-preparation branch from d22addd to 5b2aabd Compare April 17, 2026 12:50
Contributor Author

ggevay commented Apr 17, 2026

TFTR! I've added some tests, see last commit.

@ggevay ggevay enabled auto-merge (squash) April 17, 2026 12:51
@ggevay ggevay force-pushed the implications-optimizer-preparation branch from 5b2aabd to d65857f Compare April 17, 2026 12:52
@ggevay ggevay merged commit a632912 into MaterializeInc:main Apr 17, 2026
121 checks passed
ggevay added a commit to ggevay/materialize that referenced this pull request Apr 19, 2026
Since MaterializeInc#35834 the `optimized_plan`, `physical_plan`, and
`dataflow_metainfo` fields live on the `CatalogItem` itself rather
than in the separate `CatalogPlans` side table. However,
`parse_item_inner` hardcoded `None` for those fields when
reconstructing a `CatalogItem` from `create_sql`. A `RENAME` on a
materialized view (or index / continual task) goes through the
retract+add path in `apply_item_update`, which calls
`deserialize_item` → `parse_item` → `parse_item_inner` with the
previous `CatalogItem` as `previous_item`. The plan fields would be
silently dropped, so a subsequent `EXPLAIN OPTIMIZED PLAN FOR
MATERIALIZED VIEW ...` would fail with "cannot find dataflow
metainformation for materialized view ... in catalog".

Fix: carry the three plan fields from `previous_item` through to the
newly-reconstructed item. This is done as a post-match stamp to avoid
duplicating the logic into each of the three per-variant construction
sites and to keep the change local.

Also add a regression test to `test/sqllogictest/rename.slt` that
exercises `ALTER MATERIALIZED VIEW ... RENAME TO ...` and
`ALTER INDEX ... RENAME TO ...` followed by `EXPLAIN OPTIMIZED /
PHYSICAL PLAN`.

Co-authored-by: Junie <junie@jetbrains.com>
ggevay added a commit that referenced this pull request Apr 21, 2026
Fixes MaterializeInc/database-issues#11316

Since #35834 the `optimized_plan`, `physical_plan`, and
`dataflow_metainfo` fields live on the `CatalogItem` itself rather than
in the separate `CatalogPlans` side table. However, `parse_item_inner`
hardcoded `None` for those fields when reconstructing a `CatalogItem`
from `create_sql`. A `RENAME` on a materialized view (or index /
continual task) goes through the retract+add path in
`apply_item_update`, which calls `deserialize_item` → `parse_item` →
`parse_item_inner` with the previous `CatalogItem` as `previous_item`.
The plan fields would be silently dropped, so a subsequent `EXPLAIN
OPTIMIZED PLAN FOR MATERIALIZED VIEW ...` would fail with "cannot find
dataflow metainformation for materialized view ... in catalog".

Fix: carry the three plan fields from `previous_item` through to the
newly-reconstructed item. This is done as a post-match stamp to avoid
duplicating the logic into each of the three per-variant construction
sites and to keep the change local.

(Note: I'm still planning to make the plans inside catalog items
non-optional, as discussed
[here](https://materializeinc.slack.com/archives/C08A62E0751/p1774627385747809?thread_ts=1774623679.171309&cid=C08A62E0751),
so this is a temporary fix.)

Co-authored-by: Junie <junie@jetbrains.com>
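
The "post-match stamp" described in this commit message can be sketched roughly as follows. This is a simplified model, not the actual parse_item_inner: the PlanFields grouping, the string-based kind argument, and the parse_item signature are all assumptions made to keep the example self-contained.

```rust
use std::sync::Arc;

/// Hypothetical grouping of the three transient plan fields.
#[derive(Debug, Clone, Default)]
pub struct PlanFields {
    pub optimized_plan: Option<Arc<String>>,
    pub physical_plan: Option<Arc<String>>,
    pub dataflow_metainfo: Option<Arc<String>>,
}

#[derive(Debug, Clone)]
pub enum CatalogItem {
    Index { name: String, plans: PlanFields },
    MaterializedView { name: String, plans: PlanFields },
    Table { name: String },
}

impl CatalogItem {
    /// Plan fields, for item kinds that carry them.
    pub fn plans(&self) -> Option<&PlanFields> {
        match self {
            CatalogItem::Index { plans, .. } | CatalogItem::MaterializedView { plans, .. } => {
                Some(plans)
            }
            CatalogItem::Table { .. } => None,
        }
    }

    fn plans_mut(&mut self) -> Option<&mut PlanFields> {
        match self {
            CatalogItem::Index { plans, .. } | CatalogItem::MaterializedView { plans, .. } => {
                Some(plans)
            }
            CatalogItem::Table { .. } => None,
        }
    }
}

/// Stand-in for reconstructing an item from its create_sql.
/// Each per-variant construction site starts with empty plan fields.
pub fn parse_item(kind: &str, name: &str, previous_item: Option<&CatalogItem>) -> CatalogItem {
    let mut item = match kind {
        "index" => CatalogItem::Index { name: name.to_string(), plans: PlanFields::default() },
        "materialized view" => {
            CatalogItem::MaterializedView { name: name.to_string(), plans: PlanFields::default() }
        }
        _ => CatalogItem::Table { name: name.to_string() },
    };
    // Post-match stamp: carry the plan fields over from the previous item in
    // one place, instead of repeating this in every construction site above.
    if let (Some(new_plans), Some(old_plans)) =
        (item.plans_mut(), previous_item.and_then(|prev| prev.plans()))
    {
        *new_plans = old_plans.clone();
    }
    item
}
```

In this model a rename corresponds to calling parse_item with the new name and the old item as previous_item; the reconstructed item then keeps its plans, so a later plan lookup still succeeds.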

Labels

A-ADAPTER Topics related to the ADAPTER layer

2 participants