Autoscale formula metrics for number of task slots #118

Open
okofish opened this issue Oct 26, 2021 · 1 comment
Labels
feature request, known issue

Comments

okofish commented Oct 26, 2021

Feature Request Description

My Azure Batch-based application uses a constraint-based scheduling convention wherein task slots represent vCPUs. If I have a pool using a 4-vCPU VM size then I set "Task slots per node" to 4, and when I submit a task that needs access to 2 vCPUs I set the task's "Required slots" to 2.

I have been unable to get this scheme to work reliably with autoscaling, because the autoscale formula language is unaware of tasks' task slot requirements. Information on task slots per node is available in the $TaskSlotsPerNode variable, but there does not appear to be any information on the slot requirements of existing tasks. This means that any autoscaling formula implicitly bakes in the assumption that every task requires exactly one slot.
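To illustrate the gap, here is a minimal Python sketch (not an autoscale formula) that compares a node estimate based on task counts with one based on slot demand. The function names and sample numbers are hypothetical, and the slot-based figure is only a lower bound, because a task cannot be split across nodes.

```python
import math


def nodes_by_task_count(pending_tasks: int, slots_per_node: int) -> int:
    """Estimate a task-count formula effectively makes: one slot per task."""
    return math.ceil(pending_tasks / slots_per_node)


def nodes_by_slot_demand(required_slots: list[int], slots_per_node: int) -> int:
    """Lower bound on nodes when each task's slot requirement is counted."""
    return math.ceil(sum(required_slots) / slots_per_node)


# Four pending tasks, each needing 2 of the 4 slots on a node.
required = [2, 2, 2, 2]
print(nodes_by_task_count(len(required), 4))  # 1
print(nodes_by_slot_demand(required, 4))      # 2
```

With these numbers the task-count estimate provisions half the nodes the workload actually needs.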

Describe Preferred Solution

I think the ideal solution is to introduce a task-slot-wise version of each task metric variable. This might look like:

| Existing task-wise metric | New task-slot-wise metric |
| --- | --- |
| $ActiveTasks | $ActiveTaskSlots |
| $RunningTasks | $RunningTaskSlots |
| $PendingTasks | $PendingTaskSlots |
| $SucceededTasks | $SucceededTaskSlots |
| $FailedTasks | $FailedTaskSlots |
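As a rough illustration of what the proposed variables would report, the Python sketch below tallies both views over a made-up task list. The `Task` class and `task_and_slot_counts` helper are hypothetical, and the slot-wise names do not exist in Batch today. It assumes "pending" means active plus running, as `$PendingTasks` is defined.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Task:
    state: str           # "active", "running", "succeeded", or "failed"
    required_slots: int  # the task's "Required slots" setting


def task_and_slot_counts(tasks):
    """Return (task-wise counts, slot-wise counts) keyed by state."""
    task_counts, slot_counts = Counter(), Counter()
    for task in tasks:
        task_counts[task.state] += 1
        slot_counts[task.state] += task.required_slots
    # Pending is taken to be active + running in both views.
    for counts in (task_counts, slot_counts):
        counts["pending"] = counts["active"] + counts["running"]
    return dict(task_counts), dict(slot_counts)


tasks = [Task("active", 2), Task("active", 4), Task("running", 1), Task("succeeded", 2)]
by_task, by_slot = task_and_slot_counts(tasks)
print(by_task["pending"], by_slot["pending"])  # 3 7
```

Here the task-wise pending count is 3 while the slot-wise count is 7, which is the figure a formula would need in order to size a pool of 4-slot nodes.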

Describe Alternatives Considered

The current task-wise metric variables could be changed to reflect task slots instead of whole tasks. This would be a breaking change.

Additional Context

The proposed addition of task-slot-wise metrics would be analogous to the addition of the TaskSlotCounts object to the response of the Job_GetTaskCounts operation, made in API version 2020-09-01.12.0.
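Since slot counts are already exposed through that API, one possible stopgap is to compute a target node count outside the formula language. The Python sketch below is only an assumption-laden outline: `SlotCounts` is a hypothetical stand-in whose `active` and `running` fields are assumed to mirror TaskSlotCounts, and fetching the counts and resizing the pool are not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class SlotCounts:
    """Hypothetical stand-in for the slot counts reported per job."""
    active: int   # slots required by tasks waiting to be scheduled
    running: int  # slots occupied by tasks currently executing


def target_nodes(counts: SlotCounts, slots_per_node: int,
                 min_nodes: int = 0, max_nodes: int = 10) -> int:
    """Clamp the slot-based node estimate to the allowed pool size."""
    demand = counts.active + counts.running
    wanted = math.ceil(demand / slots_per_node)
    return max(min_nodes, min(max_nodes, wanted))


print(target_nodes(SlotCounts(active=9, running=4), slots_per_node=4))               # 4
print(target_nodes(SlotCounts(active=9, running=4), slots_per_node=4, max_nodes=3))  # 3
```

This gives up the benefits of service-side autoscaling, which is why native slot-wise variables would still be preferable.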


MichaCo commented Jun 21, 2022

Hi @alfpark
Does "known issue" mean this will not get fixed? It sounds more like a feature request to me, and a very valuable one!

I have another use case where I run into the exact same problem.
I have a job whose tasks depend on each other. One task per job is more compute-intensive than the others, so I usually set it to the maximum slots per node so that it runs on one VM with all of its resources available.

The dependent tasks that follow are 10x as numerous, but they run faster and need only one CPU, so I set slots to 1 for those tasks.

Now it's impossible to calculate how many nodes are actually needed with autoscaling...

So, my question would be, will this feature be added soon, to use task slot variables within the formula instead of tasks?
Or is there another solution or workaround for those use cases I should consider instead of auto scaling?

Thanks,
Michael

alfpark added the feature request label Nov 28, 2023

3 participants