Stop scaling out new resources if the function isn't successfully processing #39

Closed
jeffhollan opened this issue Mar 14, 2019 · 4 comments
Labels
needs-discussion release-niceto-have stale All issues that are marked as stale due to inactivity

Comments

@jeffhollan
Member

There are two scenarios we have today where we stop scaling new instances even as the queue length increases:

  1. No available partitions. If you have an event hub with 5 partitions whose event backlog keeps growing, we will scale to 5 instances, and then to 6. The 6th instance gets scaled out but can't get a lock on a partition, so it never actually consumes any events. We never scale to a 7th because we see that the 6th isn't executing, so we stop scaling.
  2. A function is misconfigured and is never able to start consuming. Rather than scaling out a broken app, we stop scaling if we see that no executions are happening.

In our service we do that by looking at the execution and billing metrics for the instance; if no billing or execution metrics are being emitted, we stop scaling out further.
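A minimal sketch of that guard, to make the rule concrete. The names `InstanceMetrics` and `should_scale_out` are hypothetical and this is not the service's actual code; it only assumes that each instance reports execution and billing counters for the last sampling window.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class InstanceMetrics:
    """Hypothetical per-instance counters for the last sampling window."""
    executions: int
    billed_units: int


def should_scale_out(queue_growing: bool, instances: Sequence[InstanceMetrics]) -> bool:
    """Allow another instance only if every existing instance is doing work."""
    if not queue_growing:
        return False
    # An instance emitting neither execution nor billing metrics is idle:
    # either it could not lock a partition or the app is broken.
    return all(m.executions > 0 or m.billed_units > 0 for m in instances)
```

With 5 busy instances this allows a 6th; once the 6th sits idle, the check returns `False` and no 7th is requested.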

We'd need a similar pattern in Kore so that if the event consumer isn't actually processing new messages, we don't keep scaling. This applies to both the "no available partition" scenario and the "my app isn't even working" scenario. We could also solve each of these in 2 separate ways (e.g. maybe expose partition info to the scaler?)
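One way the "expose partition info" idea could look, as a sketch only: the function `desired_replicas` and its parameters are assumptions for illustration, not an existing scaler API. It caps the replica count derived from queue length at the number of partitions.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int, partition_count: int) -> int:
    """Replica count from queue length, capped at the partition count (hypothetical helper)."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_length / target_per_replica)
    # A replica beyond the partition count could never obtain a partition lock.
    return min(wanted, partition_count)
```

This only addresses the first scenario; a misconfigured app would still need a check on whether anything is being consumed.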

@markusthoemmes
Contributor

Would that be solvable via the more advanced metrics that were mentioned on the last call? We could use a consumptionRate metric to detect this without having to leave the metrics context of the source for autoscaling (queueLength and consumptionRate would be two metrics generated by the same source).

That would also nicely dovetail into using these metrics for scale > 1.
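A rough sketch of how the two metrics might be combined, under stated assumptions: `SourceSample`, `next_replica_count`, and the `min_gain` threshold are hypothetical, and the source is assumed to report queueLength and an aggregate consumptionRate. The idea is to scale out only while the previous scale-out visibly raised the consumption rate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceSample:
    """Hypothetical snapshot of the two source metrics."""
    queue_length: int
    consumption_rate: float  # messages per second across all replicas


def next_replica_count(current: int, before_last_scale_out: Optional[SourceSample],
                       latest: SourceSample, max_replicas: int,
                       min_gain: float = 0.05) -> int:
    """Add one replica only while extra replicas keep raising consumption."""
    if latest.queue_length == 0 or current >= max_replicas:
        return current
    if current == 0:
        return 1  # scale from zero on queue length alone
    if latest.consumption_rate <= 0:
        return current  # nothing is being consumed: likely a broken app
    if (before_last_scale_out is not None and
            latest.consumption_rate < before_last_scale_out.consumption_rate * (1 + min_gain)):
        return current  # last replica added no throughput, e.g. no free partition
    return current + 1
```

The 5% default for `min_gain` is arbitrary; a real scaler would need to tune it against noise in the rate measurement.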

@gabrielSoudry

gabrielSoudry commented Jul 23, 2021

Any news? The first case, "No available partitions", is common.

preflightsiren pushed a commit to preflightsiren/keda that referenced this issue Nov 7, 2021
* adding auth spec and auth concepts
@stale

stale bot commented Nov 21, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 21, 2021
@stale

stale bot commented Nov 28, 2021

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Nov 28, 2021

3 participants