Support multi-instance tasks on concurrency enabled pools #109

Open
martinjoshua opened this issue Apr 28, 2021 · 8 comments
Labels
feature request

Comments

@martinjoshua

martinjoshua commented Apr 28, 2021

It's possible that I am not configuring my tasks correctly, because this seems like a feature that would already exist, but I am having trouble getting RequiredSlots and multi-instance tasks to work together.

When I create a batch job, it contains thousands of tasks. Roughly 75% of the tasks are small enough that a single node could run 4 of them concurrently. The remaining 25% are tasks that require many cores with MPI.

I had MPI working, but now need to implement the RequiredSlots feature in order to optimize our node usage.

I've refactored the small tasks to use RequiredSlots, and that seems to be working now. However, the MPI task that requires multiple nodes no longer executes.

The documentation states that the requiredSlots property must be set to 1 for a multi-instance task. I have tried setting it to 1, with the multi-instance settings requiring 2 nodes, but the task never executes. I've also tried setting requiredSlots to the pool's TaskSlotsPerNode value (just to check), but of course received a validation exception stating that the property value was invalid.

Are there any known issues with running a workflow that consists of tasks that consume just 1 slot (with each node having 4 slots), and other tasks that require multiple nodes with MPI?
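For readers following along, the setup described above could be written as request bodies roughly like the following. This is only a sketch: the property names (`taskSlotsPerNode`, `requiredSlots`, `multiInstanceSettings`, `numberOfInstances`, `coordinationCommandLine`) are the Batch REST API names, but the IDs and command lines are made-up placeholders, and other fields a real pool request needs are left out.

```python
import json

# Pool with 4 task slots per node, as described in the issue.
# "pool-mixed" is a placeholder ID; other required pool fields are omitted.
pool = {
    "id": "pool-mixed",
    "taskSlotsPerNode": 4,
}

# A small task that occupies one of the four slots on a node.
small_task = {
    "id": "small-001",             # placeholder ID
    "commandLine": "./small_job",  # placeholder command
    "requiredSlots": 1,
}

# The MPI task: requiredSlots is 1 (as the docs require for multi-instance
# tasks) and the multi-instance settings ask for 2 nodes.
mpi_task = {
    "id": "mpi-001",                          # placeholder ID
    "commandLine": "mpirun -np 2 ./mpi_job",  # placeholder command
    "requiredSlots": 1,
    "multiInstanceSettings": {
        "numberOfInstances": 2,
        "coordinationCommandLine": "echo coordinating",  # placeholder
    },
}

print(json.dumps(mpi_task, indent=2))
```

On a pool like `pool`, the small task runs as expected, while the MPI task is the one that never starts.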

@alfpark
Contributor

alfpark commented Apr 28, 2021

Currently, concurrency on the pool must be disabled in order to execute multi-instance tasks, i.e., taskSlotsPerNode must be set to 1. Please examine using multiple pools to accomplish your goal.
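The restriction stated here can be turned into a small client-side check before tasks are submitted. The sketch below is an illustration only: `multi_instance_blocker` is a hypothetical helper, not part of any Batch SDK, and it encodes just the one rule from this comment rather than the service's full validation.

```python
def multi_instance_blocker(task_slots_per_node, task):
    """Return a reason string if the task cannot run on the pool, else None.

    Encodes only the rule stated in this thread: multi-instance tasks need
    a pool with taskSlotsPerNode == 1. This is not the service's validation.
    """
    is_multi_instance = "multiInstanceSettings" in task
    if is_multi_instance and task_slots_per_node != 1:
        return (
            f"task {task['id']} is multi-instance but the pool has "
            f"taskSlotsPerNode={task_slots_per_node}"
        )
    return None


# Placeholder task bodies using the REST property names.
mpi_task = {
    "id": "mpi-001",
    "requiredSlots": 1,
    "multiInstanceSettings": {"numberOfInstances": 2},
}
small_task = {"id": "small-001", "requiredSlots": 1}

print(multi_instance_blocker(4, mpi_task))    # a reason string
print(multi_instance_blocker(1, mpi_task))    # None
print(multi_instance_blocker(4, small_task))  # None
```

A check like this would surface the problem at submission time instead of leaving the task waiting indefinitely.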

alfpark added the "by design" label Apr 28, 2021
@martinjoshua
Author

martinjoshua commented Apr 28, 2021

Thank you for passing along that URL. I must have missed it because I was searching for "required slots" documentation since I already had MPI working.

I would suggest that this documentation be updated, as it implies you can make both work: https://docs.microsoft.com/en-us/rest/api/batchservice/task/add (near the requiredSlots property, "For multi-instance Tasks, this must be 1.")

With regard to running concurrent tasks and multi-instance tasks on the same pool, I hope that this is something the Azure Batch team will consider. Using multiple pools is not a viable solution, since in most cases there is a workflow built into the job itself with a very specific task dependency hierarchy. By adding another pool, we'd lose the ability to have task dependencies.
I know that task scheduling would be a little more difficult to implement, since an MPI task would likely require the entire node, but I don't see why this would be an issue as long as the end user sets the task scheduling policy to ComputeNodeFillType.Pack. This would help keep whole nodes open for multi-instance tasks.
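To illustrate the packing argument, here is a toy model of the two fill types. It is a sketch under simplifying assumptions (first-fit for "pack", most-free-slots for "spread", 4 nodes with 4 slots each), and the function names are hypothetical. It does not describe how the Batch scheduler actually works.

```python
def pack(num_nodes, slots_per_node, slot_requests):
    """Toy 'pack': place each request on the first node with enough free slots.

    Returns the number of free slots left on each node. Requests that fit
    nowhere are skipped.
    """
    free = [slots_per_node] * num_nodes
    for need in slot_requests:
        for i in range(num_nodes):
            if free[i] >= need:
                free[i] -= need
                break
    return free


def spread(num_nodes, slots_per_node, slot_requests):
    """Toy 'spread': place each request on the node with the most free slots."""
    free = [slots_per_node] * num_nodes
    for need in slot_requests:
        i = max(range(num_nodes), key=lambda n: free[n])
        if free[i] >= need:
            free[i] -= need
    return free


def idle_nodes(free, slots_per_node):
    """Count nodes with every slot free, i.e. usable by a whole-node MPI task."""
    return sum(1 for f in free if f == slots_per_node)


small_tasks = [1] * 6  # six small tasks, one slot each

packed = pack(4, 4, small_tasks)
spreaded = spread(4, 4, small_tasks)
print(packed, idle_nodes(packed, 4))      # [0, 2, 4, 4] 2
print(spreaded, idle_nodes(spreaded, 4))  # [2, 2, 3, 3] 0
```

In this model, packing six small tasks leaves two nodes completely free for a 2-node multi-instance task, while spreading the same tasks leaves none.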

alfpark changed the title from "RequiredSlots and MPI?" to "Support multi-instance tasks on concurrency enabled pools" Apr 28, 2021
alfpark added the "feature request" label and removed the "by design" label Apr 28, 2021
@martinjoshua
Author

martinjoshua commented Sep 7, 2021

Hello! Just curious if there are any updates on this feature request? This feature would definitely cut costs by allowing us to be more efficient with VM resources.

@martinjoshua
Author

I am just curious, has there been any progress toward resolving this issue? Any updates would be appreciated! Thanks!

@martinjoshua
Author

Hello, I am sorry to bother, but is this something that ever made it to the Batch backlog?

@prkannap

prkannap commented Mar 1, 2022

The above item is still in Backlog. Do you have a work item created yet?

@martinjoshua
Author

Thank you for checking. I do not have a work item. What do I need to do to create one?

@martinjoshua
Author

martinjoshua commented Aug 9, 2022

Hello @prkannap, just curious, has there been any movement on this item?

3 participants