edge: Make scale-from-zero even more reliable #522

Closed
mumoshu opened this issue May 5, 2021 · 0 comments · Fixed by #524

Comments

mumoshu (Collaborator) commented May 5, 2021

Extracted from #447 (comment)

We've added support for scale-to/from-zero scenarios to the controller recently.

The biggest known issue so far is that scaling from zero does not work well with PercentageRunnersBusy.

You can combine PercentageRunnersBusy with webhook-based autoscaling just to handle scaling from zero. However, webhook-based autoscaling can't be used in an environment that disallows exposing anything to the internet, or to a load balancer reachable from the internet.

A possible solution would be to allow two or more HRA.Spec.Metrics[] items, so that the controller can use the largest desired replica count computed across the Metrics[] items.

For example, with both PercentageRunnersBusy and TotalInProgressAndQueuedWorkflowRuns configured, scaling from zero would use the number computed from TotalInProgressAndQueuedWorkflowRuns, as it can safely become greater than 1 even if you had only offline runners (#465).
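To make the idea concrete, here is a minimal Go sketch of "take the largest desired replica count across metrics". It is not the controller's actual code: the `metricInput` type, the function names, and the 0.75 threshold and 2x scale-up factor are all hypothetical, and min/max replica clamping is left out. It only illustrates why a percentage-based metric alone stays at zero while a run-count metric can lift it.

```go
package main

import "fmt"

// metricInput is a hypothetical snapshot of what a controller might observe.
type metricInput struct {
	currentReplicas         int // runners that currently exist
	busyRunners             int // runners currently executing a job
	queuedAndInProgressRuns int // workflow runs that are queued or in progress
}

// metricFunc returns the replica count suggested by a single metric.
type metricFunc func(in metricInput) int

// byPercentageRunnersBusy is a simplified stand-in for a percentage-based
// metric. The threshold (0.75) and scale-up factor (2) are assumptions.
func byPercentageRunnersBusy(in metricInput) int {
	if in.currentReplicas == 0 {
		// With no runners there is no busy fraction to compute,
		// so this metric alone cannot suggest scaling up from zero.
		return 0
	}
	fraction := float64(in.busyRunners) / float64(in.currentReplicas)
	if fraction >= 0.75 {
		return in.currentReplicas * 2
	}
	return in.currentReplicas
}

// byTotalRuns is a simplified stand-in for a run-count-based metric:
// one runner per queued or in-progress workflow run.
func byTotalRuns(in metricInput) int {
	return in.queuedAndInProgressRuns
}

// desiredReplicas evaluates every metric and keeps the largest suggestion.
func desiredReplicas(in metricInput, metrics ...metricFunc) int {
	desired := 0
	for _, m := range metrics {
		if n := m(in); n > desired {
			desired = n
		}
	}
	return desired
}

func main() {
	// Zero runners exist, but three workflow runs are waiting.
	in := metricInput{currentReplicas: 0, busyRunners: 0, queuedAndInProgressRuns: 3}

	fmt.Println(desiredReplicas(in, byPercentageRunnersBusy))              // prints 0
	fmt.Println(desiredReplicas(in, byPercentageRunnersBusy, byTotalRuns)) // prints 3
}
```

Once runners exist, the percentage-based metric can take over again whenever it suggests the larger number, so the secondary metric mostly matters for the from-zero case.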

The only downside is that this would be harder to use correctly, especially when you need to scale organizational runners, and maybe enterprise runners, to/from zero. More concretely, TotalInProgressAndQueuedWorkflowRuns forces you to list the names of all repositories that use the organization runners in `HRA.Spec.Metrics[].RepositoryNames`.

Another potential solution would be to let PercentageRunnersBusy somehow detect pending check-runs that require a runner from the targeted RunnerDeployment.

We usually correlate check-runs received by the webhook-based autoscaler to an HRA and RunnerDeployment by the check-runs' names. Doing something similar without the webhook would theoretically work.

mumoshu added a commit that referenced this issue May 5, 2021
`PercentageRunnersBusy`, in combination with a secondary `TotalInProgressAndQueuedWorkflowRuns` metric, enables scale-from-zero for PercentageRunnersBusy.

Please see the new `Autoscaling to/from 0` section in the updated documentation about how it works.

Resolves #522