Stop scaling out new resources if the function isn't successfully processing #39

jeffhollan · 2019-03-14T21:30:18Z

There are two scenarios we have today where we stop scaling new instances even as the queue length increases:

No available partitions. If you have an event hub with 5 partitions whose backlog of events keeps growing, we will scale to 5 instances, and then to 6. The 6th one gets scaled out but can't get a lock on a partition, so it never actually consumes any events. We never scale to a 7th because we see that the 6th isn't executing, so we stop scaling.
A function is misconfigured and is never actually able to start consuming. Rather than scaling out a broken app, we stop scaling if we see no executions are happening.

In our service we do that by looking at the execution and billing metrics for the instance; if no billing or execution metrics are being emitted, we stop scaling out.

We'd need a similar pattern in Kore so that if the event consumer isn't actually processing new messages, we don't keep scaling. This applies to both the "no available partition" scenario and the "my app isn't even working" scenario. We could also solve each of these in 2 separate ways (e.g. maybe expose partition info to the scaler?).
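To make the guard concrete, here is a minimal sketch of the "don't scale past an idle instance" check described above. It is not the actual service or Kore code: `InstanceStats`, `should_scale_out`, and `target_per_instance` are hypothetical names, and the execution count stands in for whatever execution or billing signal is available.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceStats:
    """Hypothetical per-instance sample; not a real Kore type."""
    name: str
    executions: int  # executions observed during the last sampling window


def should_scale_out(instances: List[InstanceStats],
                     queue_length: int,
                     target_per_instance: int) -> bool:
    """Return True only if the backlog is too large AND every instance is working.

    If any existing instance reports zero executions (no partition lock, or a
    misconfigured app), adding another instance is assumed not to help.
    """
    backlog_exceeds_capacity = queue_length > len(instances) * target_per_instance
    all_instances_working = all(i.executions > 0 for i in instances)
    return backlog_exceeds_capacity and all_instances_working


# Event hub with 5 partitions: 5 busy instances, backlog still growing.
busy = [InstanceStats(f"pod-{n}", executions=40) for n in range(5)]
# The 6th instance never gets a partition lock, so it never executes.
with_idle = busy + [InstanceStats("pod-5", executions=0)]
```

With these inputs, `should_scale_out(busy, 1000, 100)` allows the 6th instance, while `should_scale_out(with_idle, 1000, 100)` blocks the 7th. A single misconfigured instance with zero executions blocks scale-out the same way.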

markusthoemmes · 2019-03-15T07:35:18Z

Would that be solvable via the more advanced metrics that were mentioned on the last call? We could use a consumptionRate metric to detect this without having to leave the metrics context of the source for autoscaling (queueLength and consumptionRate would be two metrics generated by the same source).

That would also nicely dovetail into using these metrics for scale > 1.
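As a rough illustration of the consumptionRate idea, the sketch below derives a rate from two samples taken at the source and checks whether the last scale-out actually raised it. Everything here is an assumption for illustration: `SourceSample`, `consumed_total`, `consumption_rate`, `scale_out_was_effective`, and the `min_gain` threshold are hypothetical and not part of any existing scaler API.

```python
from dataclasses import dataclass


@dataclass
class SourceSample:
    """Hypothetical snapshot of metrics reported by the event source."""
    timestamp: float     # seconds
    queue_length: int    # messages currently waiting
    consumed_total: int  # cumulative count of messages consumed so far


def consumption_rate(prev: SourceSample, curr: SourceSample) -> float:
    """Messages consumed per second between two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return 0.0
    return (curr.consumed_total - prev.consumed_total) / elapsed


def scale_out_was_effective(rate_before: float,
                            rate_after: float,
                            min_gain: float = 0.0) -> bool:
    """True if the rate rose by more than min_gain after adding an instance.

    A rate that stays flat covers both cases from the issue: an extra
    instance without a partition, and an app that never consumes at all.
    """
    return rate_after - rate_before > min_gain


# 50 messages consumed in 10 seconds while the queue still grows.
s0 = SourceSample(timestamp=0.0, queue_length=100, consumed_total=0)
s1 = SourceSample(timestamp=10.0, queue_length=150, consumed_total=50)
rate = consumption_rate(s0, s1)  # 5.0 messages per second
```

The autoscaler would keep adding instances only while `scale_out_was_effective` holds for the previous step. In practice `min_gain` would need tuning to tolerate noise in the measured rate.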

gabrielSoudry · 2021-07-23T15:23:55Z

Any news? The first case, "No available partition", is common.


stale · 2021-11-21T12:30:50Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale · 2021-11-28T22:03:58Z

This issue has been automatically closed due to inactivity.

jeffhollan added the release-niceto-have label Mar 14, 2019

jeffhollan added the needs-discussion label May 3, 2019

preflightsiren pushed a commit to preflightsiren/keda that referenced this issue Nov 7, 2021

adding auth spec and auth concepts (kedacore#39)

13ef694

* adding auth spec and auth concepts

stale bot added the stale label (All issues that are marked as stale due to inactivity) Nov 21, 2021

stale bot closed this as completed Nov 28, 2021
