Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjustable plotting concurrency #2095

Merged
merged 2 commits into from
Oct 12, 2023
Merged

Conversation

nazar-pc
Copy link
Member

This is the first of two steps to improve plotting performance. While encoding one sector at a time in parallel is the most performance way, we can still do async I/O in the meantime. This PR allows to download second sector while we're encoding the first one, cutting total plotting time in case more than one farm is used. Generally the same should be used for single-farm setup as well, but it is a bit more difficult and will be implemented later.

The defaults are optimal, but still exposed to users via CLI in case they want to tinker with it.

Code contributor checklist:

…-components`, reorder generics on `plot_sector`
Copy link
Member

@shamil-gadelshin shamil-gadelshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple of questions related to the semaphores and deadlocks.

pub sector_metadata_output: &'a mut Vec<u8>,
/// Semaphore for part of the plotting when farmer downloads new sector, allows to limit memory
/// usage of the plotting process, permit will be held until the end of the plotting process,
/// so must always be larger than `encoding_semaphore` or else plotting may cause deadlock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind explaining the call sequence leading to deadlock when d_sem capacity is less than e_sem capacity? At first glance, it should work because we acquire semaphores in the same order every time and for a limited amount of time (d_sem first then e_sem). What am I missing?

Anyway, if it's indeed the case that the deadlock is possible due to the configuration issues - we should introduce a check and fail much earlier than during the plotting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think you're right, it'll reduce performance, but there should be no deadlock, will update comments.

/// the plotting process, must always be larger than `encoding_semaphore` or else plotting may
/// deadlock
#[arg(long, default_value_t = 2)]
sector_downloading_concurrency: usize,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about introducing either NonZero type here or a validation function for these parameters? Setting those parameters to zero leads to guaranteed stalling of the plotting.

Also, what will happen if we set a very large number here? Should we introduce an upper bound as well here?

Copy link
Member Author

@nazar-pc nazar-pc Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-zero makes sense. Very large number is still bound by the number of farms and will simply result in slower plotting with higher memory usage.

…defaults that optimize plotting performance with multiple farms, remove unnecessary and unused semaphore
@nazar-pc nazar-pc force-pushed the adjustable-plotting-concurrency branch from 863fbb1 to fe22c4a Compare October 12, 2023 13:14
@nazar-pc nazar-pc merged commit 14b552d into main Oct 12, 2023
9 of 10 checks passed
@nazar-pc nazar-pc deleted the adjustable-plotting-concurrency branch October 12, 2023 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants