Current documentation about jobs scaling is incomplete and the jobs scaler does not behave as one would intuitively expect.
The algorithm for spawning new jobs is not described in detail in the docs. I would expect that:
1. When a job pulls a new task from the queue, it should dequeue the task as soon as it starts.
2. The value of the metric exposed by the scaler should be the length of the queue (the number of tasks that have not yet been pulled by any job).
3. After each pollingInterval, KEDA spawns new jobs, and their number is equal to the value of the metric (length of the queue) minus the number of jobs that don't have running or completed status.
4. At the same time, the number of new jobs is capped by the inequality: number of new jobs to spawn + number of jobs without running or completed status <= maxReplicaCount.
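The expectation above can be sketched as a small calculation. This is only an illustration of the proposed rule, not KEDA's actual implementation; the function name `jobs_to_spawn` and its parameters are hypothetical, and "pending" here means jobs without running or completed status.

```python
def jobs_to_spawn(queue_length: int, pending_jobs: int, max_replica_count: int) -> int:
    """Number of new jobs to create in one polling interval (proposed rule only)."""
    # Queued tasks minus jobs that exist but have not started running yet.
    wanted = queue_length - pending_jobs
    # Cap: new jobs + pending jobs must not exceed maxReplicaCount.
    headroom = max_replica_count - pending_jobs
    # Never return a negative number of jobs.
    return max(0, min(wanted, headroom))


print(jobs_to_spawn(10, 3, 100))  # 7
print(jobs_to_spawn(10, 3, 5))    # 2
print(jobs_to_spawn(2, 5, 100))   # 0
```

In the second call the cap is the limiting factor: with 3 pending jobs and a maximum of 5, only 2 more may be created even though 7 tasks are not yet covered.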
Point 3 was also proposed in kedacore/keda#525 (comment), but it still has not been resolved. If there are already jobs in pending status, KEDA does not take them into account and, during the next polling, again spawns a job for every task in the queue. This can result in a huge number of jobs, for example when the pending jobs cannot be scheduled or are waiting for new nodes to be provisioned.
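To make the difference concrete, here is a toy model of the situation described above, assuming that no job ever gets scheduled, so the queue length never drops. It is not KEDA code: the `simulate` function and the `count_pending` flag are hypothetical, and maxReplicaCount is left out to keep the sketch short.

```python
from typing import List


def simulate(polls: int, queue_length: int, count_pending: bool) -> List[int]:
    """Return the number of pending jobs after each poll, assuming none ever start."""
    pending = 0
    history = []
    for _ in range(polls):
        if count_pending:
            # Proposed behavior: only cover tasks that have no pending job yet.
            new_jobs = max(0, queue_length - pending)
        else:
            # Reported behavior: one new job per queued task on every poll.
            new_jobs = queue_length
        pending += new_jobs
        history.append(pending)
    return history


print(simulate(4, 5, count_pending=False))  # [5, 10, 15, 20]
print(simulate(4, 5, count_pending=True))   # [5, 5, 5, 5]
```

With five queued tasks, ignoring pending jobs adds five more jobs on every poll, while counting them keeps the total at five.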
Furthermore, it is not clear what cooldownPeriod, minReplicaCount, and threshold in the Prometheus trigger (and similar parameters in other triggers) mean in the context of jobs scaling.
We are working on setting better expectations around Jobs in v2.0 and think this is good input for our docs.
As far as I know this works how we are aiming for it, right @zroubalik?
@hmoravec Are you up for contributing improved docs on this front?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
stalebot added the stale label on Oct 13, 2021.