Provide metric for tasks queue when using limit-active-tasks placement strategy #5057
Comments
Rather than implementing a full queue (something like a concurrency-safe FIFO guarantee or any kind of priority, which does not currently exist, for the record!), I suspect it would be enough for you to emit a metric whenever the
event happens, which seems to be around this block of code. I can imagine this being a strong enough heuristic to say "my workers are getting busy". Then you could autoscale depending on how often this event has occurred in the last hour (or whatever granularity/tuning makes sense). Frankly, I'm not sharp when it comes to k8s autoscaling, so I'd need a sanity check on this assertion. Am I making sense, or am I off-base?
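To make the "how often has this happened in the last hour" heuristic concrete, here is a minimal sketch of a sliding-window counter feeding a scale-up decision. Nothing here is Concourse or Kubernetes API: `SlidingWindowCounter`, `desired_workers`, the threshold, and the one-worker step are all hypothetical names and values chosen for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events recorded within the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, now):
        # Call this whenever a task could not be placed because workers are busy.
        # Assumes `now` never goes backwards between calls.
        self._timestamps.append(now)

    def count(self, now):
        # Drop events that have aged out of the window, then count the rest.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


def desired_workers(current, busy_events, threshold, max_workers):
    """Add one worker when the window holds at least `threshold` busy events."""
    if busy_events >= threshold:
        return min(current + 1, max_workers)
    return current


if __name__ == "__main__":
    counter = SlidingWindowCounter(window_seconds=3600)
    for t in (0, 1800, 3500):
        counter.record(t)
    busy = counter.count(now=3600)
    print(busy, desired_workers(current=3, busy_events=busy, threshold=2, max_workers=5))
```

In practice the windowing would more likely live in the metrics backend than in the emitter; the sketch only shows the arithmetic the autoscaling rule would rely on.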
Oh, I only looked at the linked proposed implementation and assumed the final one was more or less the same, and that exposing the metric had simply been forgotten. In general it should be possible to build a custom metric based on the logs. What do you think about not having a queue but a counter?
Hey @pivotal-jamie-klassen, do you have any feedback about the counter idea?
@tenjaa seems fair to me. Especially if you experiment with using that metric in your own environment! I guess you'd probably emit a counter per worker.
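As a rough illustration of "a counter per worker", the sketch below keeps one monotonically increasing count per worker name and renders it in the Prometheus text exposition format. The class, the metric name `tasks_waiting_total`, and the `worker` label are assumptions made up for this example; they are not existing Concourse metrics.

```python
from collections import Counter


class PerWorkerBusyCounter:
    """One monotonically increasing counter per worker (hypothetical sketch)."""

    def __init__(self):
        self._counts = Counter()

    def increment(self, worker_name):
        # Call when a task had to wait because this worker was at its limit.
        self._counts[worker_name] += 1

    def total(self):
        return sum(self._counts.values())

    def render(self, metric_name="tasks_waiting_total"):
        # `metric_name` is a placeholder, not a real Concourse metric.
        lines = []
        for worker, value in sorted(self._counts.items()):
            # Escape characters that are special inside a label value.
            label = (
                worker.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            lines.append(f'{metric_name}{{worker="{label}"}} {value}')
        return "\n".join(lines)


if __name__ == "__main__":
    counters = PerWorkerBusyCounter()
    counters.increment("worker-a")
    counters.increment("worker-a")
    counters.increment("worker-b")
    print(counters.render())
```

Keeping the counts per worker lets the consumer decide whether to alert on a single saturated worker or on the sum across the pool.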
What challenge are you facing?
We switched to the limit-active-tasks placement strategy, and so far it has solved a lot of our problems. We now want to improve further by scaling our workers depending on the size of the task queue.
We are running our Concourse in a Kubernetes environment.
What would make this better?
It would improve the scaling of workers.
Are you interested in implementing this yourself?
Sure :)
We already saw that the first proposed implementation had this metric exported: #4612