coord: seal all persisted tables in a single operation for efficiency #7607

danhhz · 2021-07-29T18:06:52Z

The multi-seal ability is newly exposed by persist in the preceding
commit. It would be unacceptably slow to recompute a new
MultiStreamHandle for every call to `advance_local_inputs`, so instead
store it on Catalog and recompute it when it changes (tables are added
or removed). Then, since we now have a nicely maintained MultiStreamHandle
available, use it in end transaction instead of constructing a one-off
one there.
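
As a rough illustration of the caching idea described above (not the code in this PR), the sketch below keeps a combined handle next to the set of tables and rebuilds it only when that set changes. `StreamId`, `MultiHandle`, `TableCache`, and all of their methods are hypothetical stand-ins, not the real persist or Catalog APIs.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a persisted stream's id.
type StreamId = u64;

/// Hypothetical stand-in for a handle covering a fixed set of streams.
#[derive(Debug, Clone, PartialEq)]
struct MultiHandle {
    streams: Vec<StreamId>,
}

impl MultiHandle {
    /// Builds a handle over the given streams. In this sketch it is the
    /// "expensive" step we want to avoid repeating on every seal.
    fn new(streams: &BTreeSet<StreamId>) -> Self {
        MultiHandle {
            streams: streams.iter().copied().collect(),
        }
    }

    /// Pretends to seal every covered stream up to `_upper` in a single
    /// operation and returns how many streams that operation covered.
    fn seal(&self, _upper: u64) -> usize {
        self.streams.len()
    }
}

/// Hypothetical cache: the handle is recomputed only when the set of
/// tables changes, never on the hot sealing path.
struct TableCache {
    tables: BTreeSet<StreamId>,
    handle: MultiHandle,
    rebuilds: usize,
}

impl TableCache {
    fn new() -> Self {
        let tables = BTreeSet::new();
        let handle = MultiHandle::new(&tables);
        TableCache {
            tables,
            handle,
            rebuilds: 0,
        }
    }

    /// Adds a table; rebuilds the handle only if the set actually changed.
    fn add_table(&mut self, id: StreamId) {
        if self.tables.insert(id) {
            self.rebuild();
        }
    }

    /// Removes a table; rebuilds the handle only if the set actually changed.
    fn remove_table(&mut self, id: StreamId) {
        if self.tables.remove(&id) {
            self.rebuild();
        }
    }

    fn rebuild(&mut self) {
        self.handle = MultiHandle::new(&self.tables);
        self.rebuilds += 1;
    }

    /// Hot path: returns the cached handle without recomputing anything.
    fn handle(&self) -> &MultiHandle {
        &self.handle
    }
}
```

In this sketch, calling `handle().seal(ts)` repeatedly never triggers a rebuild; only `add_table` and `remove_table` do, and only when they actually change the set of tables.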

ruchirK · 2021-07-29T18:17:35Z

src/persist/src/indexed/runtime.rs

        let updates = vec![
            (c1s1.stream_id(), data[..1].to_vec()),
            (c1s2.stream_id(), data[1..].to_vec()),
        ];
-        block_on(|res| atomic.write_atomic(updates, res))?;
+        block_on(|res| multi.write_atomic(updates, res))?;
        assert_eq!(c1s1_read.snapshot()?.read_to_end(), data[..1].to_vec());
        assert_eq!(c1s2_read.snapshot()?.read_to_end(), data[1..].to_vec());

        // Cannot write to streams not specified during construction.
        let (c1s3, _) = client1.create_or_load("3")?;


might be worth adding a seal test here as well?

ruchirK

LGTM!

danhhz · 2021-07-29T18:19:13Z

Anecdotally, I ran this locally with mz_metrics persisted and sealing now seems to be sufficiently fast to keep it from hosing the process (which was not true before this PR).

danhhz

Came up with a much more obvious way of keeping the cache invariant, which I've pushed. I feel a lot better about the maintainability of this now.

danhhz · 2021-07-29T19:02:30Z

This deserves nemesis coverage, but that'll be a followup.

danhhz · 2021-07-29T21:25:27Z

TFTRs!

danhhz requested review from maddyblue and ruchirK July 29, 2021 18:06

ruchirK reviewed Jul 29, 2021

ruchirK approved these changes Jul 29, 2021

danhhz commented Jul 29, 2021

maddyblue approved these changes Jul 29, 2021

danhhz added 2 commits July 29, 2021 14:24

persist: support closing multiple streams for efficiency

9592238

This deserves nemesis coverage, but that'll be a followup.

danhhz force-pushed the persist_seal_multi branch from fbaf0b6 to fdb2078 Compare July 29, 2021 21:25

danhhz enabled auto-merge July 29, 2021 21:25

danhhz merged commit 95afbaa into MaterializeInc:main Jul 29, 2021

danhhz deleted the persist_seal_multi branch July 29, 2021 23:43

benesch mentioned this pull request Aug 9, 2021

release: v0.9.0 required reviews #7736

Closed

materialize-bot mentioned this pull request Aug 9, 2021

release: v0.9.0-rc1 required reviews #7739

Closed

29 tasks
