Global phase staggering #514

Open
wants to merge 1 commit into
base: development
from

Conversation

beavis9k

Added global_stagger_phase_* options to limit the total number of plotting jobs running in a given phase. I did limited testing on my farm, where I wanted to cap the number of jobs in phase 1 because I have more disk I/O capacity than my CPU cores can make use of. Testing was done on the main branch, not on this development rebase.
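
As a rough illustration of the idea described above (not the code in this PR), a global phase limit can be thought of as a count of running jobs in one phase compared against a cap. The function name `may_start_new_job` and its parameters are hypothetical and do not correspond to plotman's actual option names or API.

```python
from typing import Iterable


def may_start_new_job(
    job_phases: Iterable[int],
    limited_phase: int,
    max_jobs_in_phase: int,
) -> bool:
    """Return True if fewer than max_jobs_in_phase jobs are in limited_phase.

    job_phases holds the current major phase of every running job,
    across all tmpdirs (hypothetical input, not plotman's data model).
    """
    jobs_in_phase = sum(1 for phase in job_phases if phase == limited_phase)
    return jobs_in_phase < max_jobs_in_phase


# Made-up example: two jobs in phase 1, one in phase 3, one in phase 4.
running = [1, 1, 3, 4]
print(may_start_new_job(running, limited_phase=1, max_jobs_in_phase=4))
print(may_start_new_job(running, limited_phase=1, max_jobs_in_phase=2))
```

With a cap of 4 the check passes, since only two jobs are in phase 1; with a cap of 2 it fails until one of those jobs moves on to a later phase.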

@rightpeter

rightpeter commented May 18, 2021

I was also thinking about a phase-based limit. Good to see a PR has already been created 👍

Edit: pulled the code and it worked as I expected.

@ShocWave

Nice, I hope this gets merged.

@jneuhauser

I've been using this for a few days without problems...

You can install the latest development version with this PR applied by running:
pip install --force-reinstall git+https://github.com/ericaltendorf/plotman@refs/pull/514/merge

@altendky
Collaborator

What is the general intent of more phase based staggering? To deal with the multiple threads that can be configured in phase 1?

@beavis9k
Author

> What is the general intent of more phase based staggering? To deal with the multiple threads that can be configured in phase 1?

Exactly. Current staggering based on tmpdirs only helps manage disk I/O.

@ShocWave

Yes. Phase 1 is so CPU-intensive that older systems with a weak CPU and limited RAM slow to the point where the Ubuntu GUI becomes unresponsive.

@roxleopardo

Thank you! That's a good feature that I needed. I installed this version and it seems to be working well.

@altendky
Collaborator

altendky commented Jun 8, 2021

For limiting global thread usage I intend to implement a global thread limit. It seems a lot more direct, easier to configure, and easier to understand.

@beavis9k
Author

beavis9k commented Jun 8, 2021

> For limiting global thread usage I intend to implement a global thread limit. It seems a lot more direct, easier to configure, and easier to understand.

It makes sense to make it phase-based since the number of threads used changes from phase to phase. (As I understand it, phase 1 uses the specified number of threads most aggressively.) A global thread limit may make sense for other resource management situations though.

@roxleopardo

> > For limiting global thread usage I intend to implement a global thread limit. It seems a lot more direct, easier to configure, and easier to understand.
>
> It makes sense to make it phase-based since the number of threads used changes from phase to phase. (As I understand it, phase 1 uses the specified number of threads most aggressively.) A global thread limit may make sense for other resource management situations though.

I agree. I have 12 threads, so with phase 1 limited to 4 plots I can run 4 plots in phase 1 (using 8 threads) and 4 more plots in phase 2+ (using 4 more threads). This gives me good control over CPU usage. I could also run only one plot in phase 1 and 10 in phase 2+ if I wanted. I think this is a good feature.
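
The arithmetic in this comment can be checked with a small sketch. It assumes 2 threads per plot in phase 1 and 1 thread per plot in phase 2+, which is what the quoted figures (8 threads for 4 plots, 4 threads for 4 plots) imply; the function and constant names are made up for illustration.

```python
TOTAL_THREADS = 12

# Assumed per-plot thread usage, inferred from the figures in the comment.
THREADS_PER_PLOT_PHASE1 = 2  # 8 threads / 4 plots
THREADS_PER_PLOT_LATER = 1   # 4 threads / 4 plots


def threads_in_use(plots_in_phase1: int, plots_in_later_phases: int) -> int:
    """Estimate total CPU threads used for a given mix of plots."""
    return (
        plots_in_phase1 * THREADS_PER_PLOT_PHASE1
        + plots_in_later_phases * THREADS_PER_PLOT_LATER
    )


# Both mixes mentioned in the comment land on the 12 available threads.
print(threads_in_use(4, 4))   # 4*2 + 4*1 = 12
print(threads_in_use(1, 10))  # 1*2 + 10*1 = 12
```

Under these assumptions, both the 4 + 4 split and the 1 + 10 split use exactly the 12 available threads.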

@michaelc95

Love this idea. Hope it makes it in.

@altendky
Collaborator

The global thread limit would note what phase each plot is in and the number of threads specified for that particular process, and use these to calculate the total presently in use. It would compare that total, plus the number of threads a newly started plot would use, against the configured limit. This way you don't have to back-calculate. You just say total_thread_limit: 37 and move on, letting plotman do the math and account for the thread usage of manually started plots, any changes you make to the per-process thread setting, and so on.
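
A minimal sketch of the check described above, not an implementation from plotman. The `Plot` class, `threads_used`, and `can_start_new_plot` are hypothetical names, and the rule that only phase 1 uses all of a process's configured threads while later phases use one is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plot:
    phase: int    # current major phase of the plotting process
    threads: int  # thread count that process was started with


def threads_used(plot: Plot) -> int:
    # ASSUMPTION: phase 1 uses every configured thread; later phases use one.
    return plot.threads if plot.phase == 1 else 1


def can_start_new_plot(
    plots: List[Plot], new_plot_threads: int, total_thread_limit: int
) -> bool:
    """Compare current usage plus the new plot's threads against the limit."""
    current = sum(threads_used(p) for p in plots)
    # A new plot starts in phase 1, so it counts with its full thread setting.
    return current + new_plot_threads <= total_thread_limit


# Made-up state: two plots in phase 1 and one in phase 3, 4 threads each.
running = [Plot(phase=1, threads=4), Plot(phase=1, threads=4), Plot(phase=3, threads=4)]
print(can_start_new_plot(running, new_plot_threads=4, total_thread_limit=37))
print(can_start_new_plot(running, new_plot_threads=4, total_thread_limit=12))
```

Here the running plots account for 4 + 4 + 1 = 9 threads, so a new 4-thread plot fits under a limit of 37 but not under a limit of 12.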

@altendky
Collaborator

Is there a use case we can discuss where the global phase stagger would be used for something other than limiting threads? Or some way in which plotman can't apply the thread limit itself based on a user configured maximum number of threads?

@ericaltendorf
Owner

Drive-by comment: I do think we should consider this suggestion seriously, but I think we should have a high bar for adding more scheduling config options. My intuition is we already have too many config options in total, probably not the right config options, and it makes it difficult and confusing for people to set up. I suspect in some cases it's even causing people to think they need more config options because they can't figure out how to use the ones offered.

My preference would be to collect feedback on the scheduling config system, hear what issues people are having with scheduling, and design a new config system that's hopefully simpler than the one we have but also meets the needs expressed by the community. What do folks think?


8 participants