persist: remove production usages of futures_executor::block_on #12301
Conversation
Manually verified that stressing the concurrency test works with a tokio::sync::Mutex in MemConsensus.
Nice! I had one nit and one comment.
-            || format!("data-generator-{}", idx),
-            move || {
+        let data_generator_handle =
+            mz_ore::task::spawn(|| format!("data-generator-{}", idx), async move {
I think Ruchir was using a blocking task on purpose here, to ensure that the data generator got its own CPU thread that it can use as it wants. Not sure that reasoning is sound, though, and it's definitely nicer when we can use async code in the block.
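For context, a minimal sketch of the two spawning styles under discussion, using plain tokio APIs (the mz_ore::task wrappers add task naming on top of these):

```rust
use std::time::Duration;

#[tokio::main]
async fn main() {
    // Async task: scheduled cooperatively on the runtime's worker threads;
    // it only yields control back to the scheduler at .await points.
    let async_task = tokio::spawn(async {
        tokio::time::sleep(Duration::from_millis(10)).await;
    });

    // Blocking task: runs on tokio's dedicated blocking thread pool, so it
    // can hog a CPU without starving the async worker threads.
    let blocking_task = tokio::task::spawn_blocking(|| {
        std::thread::sleep(Duration::from_millis(10));
    });

    async_task.await.unwrap();
    blocking_task.await.unwrap();
}
```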
Hmm, then maybe it should get its own Runtime or even just a Thread? The old impl was busy-waiting, which also has its downsides.
I thought about this more and neither of those seem to be the "tokio way" (which I'm gradually learning to just accept and embrace). I think the problem we might hit here in practice is that the data generation call doesn't have any yield points. So, probably just that bit needs to be moved into a "blocking" call
Meaning, move everything after this one async call into a spawn_blocking? The whole generator code only had this one await, on the barrier. Side note: I don't know how useful having the barriers for synchronization is here.
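For readers following along, this is the kind of start-line synchronization being discussed (a sketch, not the benchmark's actual code): each generator awaits the barrier once so all tasks begin generating at the same time.

```rust
use std::sync::Arc;
use tokio::sync::Barrier;

async fn run_generators(num_tasks: usize) {
    let barrier = Arc::new(Barrier::new(num_tasks));
    let mut handles = Vec::with_capacity(num_tasks);
    for idx in 0..num_tasks {
        let barrier = Arc::clone(&barrier);
        handles.push(tokio::spawn(async move {
            barrier.wait().await; // the single await point mentioned above
            println!("data-generator-{} starting", idx);
        }));
    }
    for handle in handles {
        handle.await.unwrap();
    }
}
```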
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It only had one await call, but it should have had two :). One for the barrier, and one for the sleep that I've now added in this PR (this was the busy-waiting "bug" I mentioned above).
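A sketch of that fix (names hypothetical): await a timer until the next batch deadline instead of spinning in a loop until it passes, so the worker thread is returned to the scheduler in the meantime.

```rust
use tokio::time::{sleep_until, Instant};

async fn pace_next_batch(next_batch_at: Instant) {
    // Before: while Instant::now() < next_batch_at {} // burned a CPU core
    sleep_until(next_batch_at).await; // yields the worker thread instead
}
```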
Went ahead and pushed just the data generation call into a spawn_blocking.
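A sketch of the resulting shape (expensive_data_generation and consume are hypothetical stand-ins for the real benchmark code): only the CPU-bound generation call moves to the blocking pool, while the surrounding task, including its awaits, stays async.

```rust
use tokio::task::spawn_blocking;

async fn generate_one_batch() {
    let batch = spawn_blocking(|| expensive_data_generation())
        .await
        .expect("data generation panicked");
    consume(batch).await;
}

fn expensive_data_generation() -> Vec<u8> {
    // Pure CPU work with no yield points, as discussed above.
    (0..1024u32).map(|i| (i % 256) as u8).collect()
}

async fn consume(_batch: Vec<u8>) {
    // Stand-in for handing the batch to the rest of the benchmark.
}
```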
@@ -343,17 +347,13 @@ where
pub struct ReadHandle<K, V, T, D>
where
    T: Timestamp + Lattice + Codec64,
    // TODO: Only the T bound should exist, the rest are a temporary artifact of
🙌
src/persist-client/src/read.rs
{
    pub(crate) reader_id: ReaderId,
    pub(crate) machine: Machine<K, V, T, D>,
    pub(crate) blob: Arc<dyn BlobMulti + Send + Sync>,

    pub(crate) since: Antichain<T>,
    pub(crate) politely_expired: bool,
Maybe explicitly_expired? I see what you mean, but I don't know if others will.
Keep in mind: this is a nit from a non-native English speaker... 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
@aljoscha did you want to take a look at the open_loop changes I made in response to your comment, or is this ready to go?
I did take a look, and the changes seem good!
This can cause deadlocks when used with tokio Runtimes. Touches #12231
It was being used in the open loop benchmark because the data generator task busy-waited. Remove the busy waiting and the block_on. Touches #12231
TFTR!
Motivation
This can cause deadlocks when used with tokio Runtimes.
Touches MaterializeInc/database-issues#3543
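A minimal sketch of the hazard (not code from this repo): futures_executor::block_on parks the current thread, so calling it from a runtime worker can leave no thread free to drive the very future it is waiting on.

```rust
// On a current-thread runtime this deadlocks outright: block_on parks the
// only worker, so the spawned task that would complete the channel never runs.
#[tokio::main(flavor = "current_thread")]
async fn main() {
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    tokio::spawn(async move {
        let _ = tx.send(());
    });
    // Never returns: the sender task above cannot be scheduled while the
    // worker thread is parked inside this block_on.
    futures_executor::block_on(async { rx.await.unwrap() });
}
```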
Testing
Manually verified that stressing the concurrency test works with a tokio::sync::Mutex in MemConsensus.
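For reference, the relevant property of tokio::sync::Mutex (a sketch with a hypothetical stand-in for MemConsensus's state): unlike std::sync::Mutex, it can be locked and held across .await points without blocking a worker thread or needing block_on.

```rust
use std::sync::Arc;
use tokio::sync::Mutex;

#[derive(Default, Clone)]
struct MemConsensus {
    log: Arc<Mutex<Vec<String>>>,
}

impl MemConsensus {
    async fn append(&self, entry: String) {
        // .lock() suspends the task instead of blocking the thread, so
        // concurrent callers queue up cooperatively under stress.
        let mut log = self.log.lock().await;
        log.push(entry);
    }
}
```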
Release notes
This PR includes the following user-facing behavior changes: