[adapter] Batch-allocate user IDs to reduce persist writes during DDL #35310
mtabebe merged 1 commit into MaterializeInc:main
Conversation
```diff
 let (connection_id, connection_gid) = match self.allocate_user_id().await {
     Ok(item_id) => item_id,
-    Err(err) => return ctx.retire(Err(err.into())),
+    Err(err) => return ctx.retire(Err(err)),
```
I got hit with a lint issue about this, so I'm tidying it up.
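The dropped `.into()` is the kind of thing clippy's `useless_conversion` lint flags: converting a value into the type it already has. A minimal sketch of the situation, using a hypothetical `AdapterError` type (not the PR's actual error types):

```rust
// Hypothetical error type standing in for the PR's adapter error.
#[derive(Debug, PartialEq)]
struct AdapterError(String);

fn allocate() -> Result<u64, AdapterError> {
    Err(AdapterError("oracle unavailable".to_string()))
}

fn handle() -> Result<u64, AdapterError> {
    match allocate() {
        Ok(id) => Ok(id),
        // `Err(err.into())` would also compile here, but converting
        // AdapterError into AdapterError is a no-op, which is exactly
        // what `useless_conversion` complains about.
        Err(err) => Err(err),
    }
}
```

The conversion is only needed when the two `Result` error types differ; once they match, the plain `Err(err)` is the idiomatic form.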
Nightly triggered: https://buildkite.com/materialize/nightly/builds/15452
bugbot run
PR Summary (Medium Risk) Overview: Wires the pool through DDL call sites and updates tests. Replaces direct

Written by Cursor Bugbot for commit 6556509.
The only remaining CI failure is Zero downtime 3 (upsert-sources), where we fail because we can't reach ReadyToPromote due to consensus contention under what appears to be load (50 concurrent upsert sources). I think this is unrelated to the batch ID allocation changes, so I'd like to get this reviewed.
aljoscha left a comment
Very nice! Could we add some perf numbers to the PR, e.g. an Experiments section? Could be as simple as doing a DDL on an empty catalog before and after, a couple of times, and recording the diff. Say before it takes 50ms and now it takes 40ms, or something like that.
```diff
@@ -0,0 +1,125 @@
+# Copyright Materialize, Inc. and contributors. All rights reserved.
```
This test is fine, but I don't know if it adds much beyond the test coverage we already have. In case of doubt, leave it in, though!
Sure, it was roughly a 10% saving: from 160ms to 145ms.
Problem:

Every DDL statement (CREATE TABLE, CREATE VIEW, etc.) performs a persist write + timestamp oracle call to allocate a single user ID. This is one extra round trip to the timestamp oracle, in addition to the necessary catalog operation for the DDL.

Solution:

- Add an `IdPool` struct with `allocate()`, `allocate_many()`, and `refill()` methods to pre-allocate user IDs in a contiguous range
- Add a `USER_ID_POOL_BATCH_SIZE` dyncfg (default 512) to control the batch size
- Add a `user_id_pool` field to `Coordinator`, initialized as empty
- Add Coordinator wrapper methods `allocate_user_id()` and `allocate_user_ids(count)` that draw from the pool and auto-refill from the catalog when exhausted
- Replace all call sites that used the two-line pattern of `get_catalog_write_ts()` + `catalog().allocate_user_id(ts)` with the new pooled methods
- This eliminates the timestamp oracle call and the catalog operation

Testing:

- New `IdPool` unit tests covering empty pool, single/batch allocation, refill, mixed allocation, and the invalid-range panic
- New SLT regression test (`test/sqllogictest/id_pool.slt`) exercising tables, views, materialized views, indexes, types, secrets, and drop/recreate cycles
- New MZCompose restart test to ensure IDs remain unique across a restart
- Update all_checks to handle larger IDs (as restart tests may burn IDs)
- Update the secret ID check to use the actual allocated ID (as restarts burn IDs)

Perf:

On a local machine, a 10% saving: from 160ms to 145ms on an empty catalog.
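The pool described above can be sketched as a simple `[next, end)` range. The method names `allocate`, `allocate_many`, and `refill` come from the PR description, but the field layout and signatures here are assumptions, not Materialize's actual code:

```rust
// Hedged sketch of the IdPool idea: IDs are handed out from a
// pre-allocated contiguous range, so most allocations touch neither the
// catalog nor the timestamp oracle.
struct IdPool {
    next: u64, // next unallocated ID
    end: u64,  // exclusive upper bound of the pre-allocated range
}

impl IdPool {
    /// The Coordinator starts with an empty pool.
    fn empty() -> Self {
        IdPool { next: 0, end: 0 }
    }

    /// Refill with a contiguous range obtained from one catalog write.
    fn refill(&mut self, start: u64, end: u64) {
        // Mirrors the "invalid range panic" covered by the unit tests.
        assert!(start < end, "invalid range: [{start}, {end})");
        self.next = start;
        self.end = end;
    }

    /// Allocate one ID, or None if exhausted (the caller then refills).
    fn allocate(&mut self) -> Option<u64> {
        self.allocate_many(1).map(|r| r.start)
    }

    /// Allocate `count` consecutive IDs, or None if not enough remain.
    fn allocate_many(&mut self, count: u64) -> Option<std::ops::Range<u64>> {
        if self.end - self.next < count {
            return None;
        }
        let start = self.next;
        self.next += count;
        Some(start..start + count)
    }
}
```

With a batch size of 512, one persist write per refill amortizes over up to 512 DDL statements.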
…ring DDL (MaterializeInc#35310)"

This reverts commit 7cdb382, due to failures in: MaterializeInc/database-issues#11232
…MaterializeInc#35310)

Closes SQL-115
…#35460) [adapter] Batch-allocate user IDs to reduce persist writes during DDL (#35310)

Sub Problem/Solution:

In the preflight phase of a zero-downtime upgrade, get_next_ids() is used to determine whether there has been ongoing DDL. Since the IdPool batch-allocates IDs, the counter can advance far ahead of the actually-created items. New objects created from the pool then had IDs below the captured baseline and were missed by DDL detection, so the standby could fail to see new tables/views. DDL detection now uses the IDs in the catalog (transaction get_items / get_cluster_replicas) instead of the allocator counter, and computes the next user item ID and next user replica ID from the max existing IDs.

Testing (in addition to the tests from #35310):

- New cargo test: mz-catalog test_persist_ddl_detection_with_batch_allocated_ids
- New MZCompose test: workflow_ddl_detection_with_id_pool
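The detection bug described above can be sketched with plain integers (hypothetical helper names; the real code works on catalog transactions):

```rust
// With batch allocation, the allocator counter runs ahead of the items
// that actually exist, so a baseline captured from the counter misses new
// items whose IDs come from the pre-allocated pool.

/// Buggy baseline: the allocator counter (already advanced by the pool).
fn next_id_from_counter(counter: u64) -> u64 {
    counter
}

/// Fixed baseline: derived from the max item ID actually in the catalog.
fn next_id_from_items(items: &[u64]) -> u64 {
    items.iter().copied().max().map_or(0, |m| m + 1)
}

/// "Has DDL happened since the baseline was captured?"
fn ddl_since(baseline_next_id: u64, items_now: &[u64]) -> bool {
    items_now.iter().any(|&id| id >= baseline_next_id)
}
```

If the pool pre-allocated IDs up to 512 while only items 0 and 1 existed at baseline, a newly created item with ID 2 falls below the counter baseline of 512 and goes undetected, whereas a baseline derived from the max existing ID (here 2) catches it.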