edge: Make scale-from-zero even more reliable #522

mumoshu · 2021-05-05T02:09:23Z

We've added support for scale-to/from-zero scenarios to the controller recently.

The biggest known issue so far is that it doesn't scale-from-zero well with PercentageRunnersBusy.

You can combine webhook-based scaling with it solely to handle scaling from zero. However, webhook-based autoscaling can't be used in an environment that disallows exposing anything to the internet, or to a load balancer reachable from the internet.

A possible solution would be to allow two or more HRA.Spec.Metrics[] items, so that the controller can choose the largest desired replica count among the Metrics[] items.

For example, with both PercentageRunnersBusy and TotalInProgressAndQueuedWorkflowRuns configured, scaling from zero would use the number from TotalInProgressAndQueuedWorkflowRuns, as it can safely become greater than 1 even if you had only offline runners (#465).
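A minimal sketch of the "pick the largest desired replica count" idea, not the controller's actual code. The `metricResult` type, the `largestDesired` function, and the min/max clamping are hypothetical names and behavior introduced only for illustration. The sketch also assumes that PercentageRunnersBusy suggests 0 replicas when no runners exist yet.

```go
package main

import "fmt"

// metricResult is a hypothetical type for this sketch: the replica count
// suggested by one entry of HRA.Spec.Metrics[].
type metricResult struct {
	metricType string
	desired    int
}

// largestDesired returns the largest replica count suggested by any metric,
// clamped to the [minReplicas, maxReplicas] range (clamping is an assumption).
func largestDesired(results []metricResult, minReplicas, maxReplicas int) int {
	desired := minReplicas
	for _, r := range results {
		if r.desired > desired {
			desired = r.desired
		}
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}

func main() {
	// Scale-from-zero case: no runners exist, so PercentageRunnersBusy is
	// assumed to suggest 0, while two workflow runs are queued.
	results := []metricResult{
		{metricType: "PercentageRunnersBusy", desired: 0},
		{metricType: "TotalInProgressAndQueuedWorkflowRuns", desired: 2},
	}
	fmt.Println(largestDesired(results, 0, 5)) // prints 2
}
```

Taking the maximum means that a metric which cannot see pending work from zero runners never blocks a scale-up requested by another metric.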

The only downside is that this is harder to use correctly, especially when you need to scale organizational runners, and maybe enterprise runners, to/from zero. More concretely, TotalInProgressAndQueuedWorkflowRuns forces you to list every repository that uses the organization runners in `HRA.Spec.Metrics[].RepositoryNames`.

Another potential solution would be to let PercentageRunnersBusy somehow detect pending check-runs that require a runner from the targeted RunnerDeployment.

We usually correlate check-runs received by the webhook-based autoscaler to an HRA and RunnerDeployment by the check-runs' names. Doing something similar without the webhook would theoretically work.


`PercentageRunnersBusy`, in combination with a secondary `TotalInProgressAndQueuedWorkflowRuns` metric, enables scale-from-zero for PercentageRunnersBusy. Please see the new `Autoscaling to/from 0` section in the updated documentation about how it works. Resolves #522

mumoshu added a commit that referenced this issue May 5, 2021

Enable scaling from zero with PercentageRunnersBusy combined with Tot…

2fddf85

…alInProgressAndQueuedWorkflowRuns Fixes #522

This was referenced May 5, 2021

edge: Enable scaling from zero with PercentageRunnersBusy #524

Merged

feat: Allow Percentage runner busy minimum 1 #449

Closed

mumoshu closed this as completed in #524 May 5, 2021
