Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coord: seal all persisted tables in a single operation for efficiency #7607

Merged
merged 2 commits into from
Jul 29, 2021

Conversation

danhhz
Copy link
Contributor

@danhhz danhhz commented Jul 29, 2021

The multi-seal ability is newly exposed by persist in the preceding
commit. It would be unacceptably performance to recompute a new
MultiStreamHandle for every call to advance_local_inputs so instead
store it on Catalog and recompute it when it changes (tables are added
removed). Then, since we now have a nicely maintained MultiStreamHandle
available, use it in end transaction instead of constructing a one-off
one there.

@danhhz danhhz requested review from maddyblue and ruchirK July 29, 2021 18:06
let updates = vec![
(c1s1.stream_id(), data[..1].to_vec()),
(c1s2.stream_id(), data[1..].to_vec()),
];
block_on(|res| atomic.write_atomic(updates, res))?;
block_on(|res| multi.write_atomic(updates, res))?;
assert_eq!(c1s1_read.snapshot()?.read_to_end(), data[..1].to_vec());
assert_eq!(c1s2_read.snapshot()?.read_to_end(), data[1..].to_vec());

// Cannot write to streams not specified during construction.
let (c1s3, _) = client1.create_or_load("3")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth adding a seal test here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@ruchirK ruchirK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@danhhz
Copy link
Contributor Author

danhhz commented Jul 29, 2021

Anecdotally, I ran this locally with mz_metrics persisted and sealing now seems to be sufficiently fast to keep it from hosing the process (which is not true before this PR).

Copy link
Contributor Author

@danhhz danhhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Came up with a much more obvious way of keeping the cache invariant, which I've pushed. I feel a lot better about the maintainability of this now

let updates = vec![
(c1s1.stream_id(), data[..1].to_vec()),
(c1s2.stream_id(), data[1..].to_vec()),
];
block_on(|res| atomic.write_atomic(updates, res))?;
block_on(|res| multi.write_atomic(updates, res))?;
assert_eq!(c1s1_read.snapshot()?.read_to_end(), data[..1].to_vec());
assert_eq!(c1s2_read.snapshot()?.read_to_end(), data[1..].to_vec());

// Cannot write to streams not specified during construction.
let (c1s3, _) = client1.create_or_load("3")?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

This deserves nemesis coverage, but that'll be a followup.
The multi-seal ability is newly exposed by persist in the preceding
commit. It would be unacceptably performance to recompute a new
MultiStreamHandle for every call to `advance_local_inputs` so instead
store it on Catalog and recompute it when it changes (tables are added
removed). Then, since we now have a nicely maintained MultiStreamHandle
available, use it in end transaction instead of constructing a one-off
one there.
@danhhz
Copy link
Contributor Author

danhhz commented Jul 29, 2021

TFTRs!

@danhhz danhhz enabled auto-merge July 29, 2021 21:25
@danhhz danhhz merged commit 95afbaa into MaterializeInc:main Jul 29, 2021
@danhhz danhhz deleted the persist_seal_multi branch July 29, 2021 23:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants