
Controller should schedule tasks of multiple clients at the same time #1434

Closed
kolomanschaft opened this issue Feb 23, 2012 · 4 comments

@kolomanschaft

Consider the following scenario. A simulation is split into 10 independent (parallel) tasks. The simulation has a number of parameters. Four engines are up and connected to a single controller. Several clients submit the same simulation, each with different parameter values, to the controller via load-balanced views.

What seems to happen now is that the controller schedules the 10 tasks of the first client on its engines. The 10 tasks of the second client only get started after all tasks of the previous client are finished. Let's assume all tasks take exactly the same time T to compute. That means that during the first 8 tasks of the first client all engines are at full capacity, but during the remaining 2 tasks, 2 of the engines are idle. That leads to an overall computation time of 6T. If the tasks of all connected clients shared a task pool, the computation time would have been 5T.
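The arithmetic above can be checked with a short sketch (the function names here are illustrative, not part of IPython; times are in units of T):

```python
import math

def sequential_makespan(tasks_per_client, clients, engines):
    # Each client's batch runs to completion before the next starts:
    # a batch of N tasks on E engines takes ceil(N / E) rounds of length T.
    return sum(math.ceil(tasks_per_client / engines) for _ in range(clients))

def shared_pool_makespan(tasks_per_client, clients, engines):
    # All tasks go into one pool, so no engine idles until the pool drains.
    return math.ceil(tasks_per_client * clients / engines)

print(sequential_makespan(10, 2, 4))   # 6 (two batches of ceil(10/4) = 3T each)
print(shared_pool_makespan(10, 2, 4))  # 5 (ceil(20/4) rounds)
```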

@tburnett

I think this is dealt with by setting the high-water mark, TaskScheduler.hwm, in the ipcontroller_config profile. I encountered this issue with compute-intensive tasks of varying length, and wound up setting it to 2, instead of the default 0, so there would be no latency. (My numbers are 1728 tasks and 96 engines.) However, your case would suggest a value of 1.
It seems odd that the default for load balancing effectively disables it, but Min explained that he was concerned with latency.
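For reference, the setting mentioned above lives in the controller's profile configuration, e.g. `ipcontroller_config.py` (a sketch; the exact profile path depends on your setup):

```python
# ipcontroller_config.py
c = get_config()

# High-water mark: how many unfinished tasks the scheduler may buffer
# per engine. 0 means unlimited (tasks are greedily assigned up front);
# 1 means each engine gets at most one task until it reports back.
c.TaskScheduler.hwm = 1
```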

@minrk
Member

minrk commented Feb 23, 2012

For this reason, the HWM default in master is now 1, rather than 0. The behavior of HWM=1 is more obvious, which makes more sense as a default.

@kolomanschaft
Author

I was not aware of the HWM setting, but HWM=1 would definitely be what I would expect as default behavior. So with HWM set to 1, does it make any difference whether tasks are submitted from a single client or several?

@minrk
Member

minrk commented Mar 6, 2012

No, the scheduler does not make any decisions based on who submitted each task. All tasks are equal, and ZeroMQ fair-queues requests on the incoming socket, so if two clients submit a bunch of tasks very quickly at the same time, they will be interleaved. But this requires that they really be submitted at the same time.
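The fair-queuing behavior can be pictured with a toy round-robin merge (pure Python, not the actual ZeroMQ implementation — just an illustration of the interleaving):

```python
from itertools import chain, zip_longest

def fair_queue(*queues):
    # Round-robin across sources, analogous to ZeroMQ's fair-queuing on an
    # incoming socket: take one message from each sender per cycle,
    # skipping senders that have run out.
    sentinel = object()
    merged = chain.from_iterable(zip_longest(*queues, fillvalue=sentinel))
    return [msg for msg in merged if msg is not sentinel]

client_a = ["a1", "a2", "a3"]
client_b = ["b1", "b2", "b3"]
print(fair_queue(client_a, client_b))  # ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```

If one client submits well before the other, its messages are already queued, so in practice the interleaving only happens when submissions genuinely overlap.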
