docs: fix typos and add PR link #379

Merged
merged 5 commits into actions:master on Mar 11, 2021

Conversation

callum-tait-pbx (Contributor)

minor changes to docs I noticed

callum-tait-pbx (Contributor Author) commented Mar 8, 2021

@mumoshu I think the following benefit listed in the README.md for the PercentageRunnersBusy schema is wrong:

Allows for multiple controllers to be deployed as each controller deployed is responsible for scaling their own runner pods on a per namespace basis. [#223](https://github.com/summerwind/actions-runner-controller/pull/223)

It came from the snippet below, taken from #223:

@ahmad-hamade this autoscaling scheme will scale the runners that your runner-deployment creates on your behalf, or the runners you've created Runner resources for. If you are running 3 runners all in separate EKS clusters, I am assuming you are also running 3 separate controllers. If that is the case, each horizontal runner autoscaler that you have defined in each cluster will be responsible for scaling that cluster's runners, which is a benefit over the previous implementation, where a single horizontal runner autoscaler is scaling across all runners, since it cannot distinguish pods running in separate clusters.

I don't quite get what the author is getting at. Are you able to figure it out so the benefit in the README can be updated to be correct? I'm struggling to grasp how this is a benefit unique to the PercentageRunnersBusy schema.

mumoshu (Collaborator) commented Mar 9, 2021

@callum-tait-pbx Before #355, TotalNumberOfQueuedAndInProgressWorkflowRuns was unable to distinguish runners from different namespaces. So it was probably theoretically correct to say that PercentageRunnersBusy was more appropriate for multiple actions-runner-controller deployments across namespaces.

However,

Allows for multiple controllers to be deployed as each controller deployed is responsible for scaling their own runner pods on a per namespace basis.

It wasn't working like that until #380.

mumoshu (Collaborator) commented Mar 9, 2021

I've reread the doc and I would say that this part of the doc:

  1. Allows for multiple controllers to be deployed as each controller deployed is responsible for scaling their own runner pods on a per namespace basis. #223

looks obsolete and can be removed now.

After #355 and #380, both scaling metrics (strategies) should work on either a per-namespace or a per-cluster basis without any problem.

callum-tait-pbx (Contributor Author)

@mumoshu that is probably worded better

3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [Github labels](#runner-labels).

**Drawbacks of this metric**
1. Repositories must be named within the scaling metric; maintaining a list of repositories may not be viable in larger environments or self-serve environments.
2. May not scale quickly enough for some users' needs. This metric is pull-based, so the queue depth is polled as configured by the sync period; as a result, scaling performance is bound by this sync period, meaning there is a lag to scaling activity.
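For reference, an HRA using this metric should resemble the following minimal sketch; the field names follow the project's README of this era, while the resource names, repository, and replica counts are illustrative placeholders:

```yaml
# Illustrative sketch only; names and values are placeholders.
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-deployment-autoscaler
spec:
  scaleTargetRef:
    name: example-runner-deployment   # the RunnerDeployment to scale
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
    # Every repository to watch must be listed explicitly,
    # which is the maintenance burden described in drawback 1.
    repositoryNames:
    - example-org/example-repo
```

The queue depth is only re-read on each controller sync, which is why scaling lags behind the actual workflow queue (drawback 2).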
Collaborator


Great explanation 💡


**Drawbacks of this metric**
1. May not scale quickly enough for some users' needs, as we are scaling up and down based on indicative information rather than a direct count of the workflow queue depth. This metric is pull-based, so the number of busy runners is polled as configured by the sync period; as a result, scaling performance is bound by this sync period, meaning there is a lag to scaling activity.
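As a companion to the sketch above, a PercentageRunnersBusy configuration should look roughly like this; the thresholds and factors are illustrative example values, not recommendations:

```yaml
# Illustrative sketch only; thresholds and factors are example values.
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-deployment-autoscaler
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: PercentageRunnersBusy
    scaleUpThreshold: '0.75'    # scale up once more than 75% of runners are busy
    scaleDownThreshold: '0.3'   # scale down once fewer than 30% are busy
    scaleUpFactor: '1.4'        # multiply the replica count by 1.4 when scaling up
    scaleDownFactor: '0.7'      # multiply the replica count by 0.7 when scaling down
```

No repository list is needed here, but the busy-runner ratio is still only re-evaluated once per sync period, hence the same lag.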
Collaborator


Probably you could try combining the two. Webhook-based scaling will quickly add (possibly more than necessary) runners on demand, and PercentageRunnersBusy will periodically "correct" the number of runners depending on runner statuses.
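A hedged sketch of this combination, assuming the scaleUpTriggers fields documented for webhook-driven scaling around this time (the event filter, amount, and duration are illustrative):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-deployment-autoscaler
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 1
  maxReplicas: 10
  # Push path: a webhook for a queued check_run adds a runner almost immediately.
  scaleUpTriggers:
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
    amount: 1
    duration: "5m"    # the extra capacity expires after 5 minutes
  # Pull path: PercentageRunnersBusy periodically "corrects" the replica count
  # based on how many runners are actually busy.
  metrics:
  - type: PercentageRunnersBusy
    scaleUpThreshold: '0.75'
    scaleDownThreshold: '0.3'
    scaleUpFactor: '1.4'
    scaleDownFactor: '0.7'
```

The webhook path reacts within seconds, while the metric path smooths out any over-provisioning on the next sync.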

Contributor Author


Does the webhook scaling work in conjunction with the regular metrics? I saw you removed the need to define a metric when using webhooks; can you still define a metric, and will it complement the webhook scaling?

mumoshu (Collaborator) commented Mar 12, 2021

Does the webhook scaling work in conjunction with the regular metrics?

Yes

I saw you removed the need to define a metric when using webhooks

Yes. Actually, it only works in conjunction with the regular metrics, both before and after I removed the need to explicitly define metrics. It wasn't working for the repository runners + TotalNumberOfQueuedAndInProgressWorkflowRuns combination, so I fixed it in #381. Other combinations should have worked and will keep working.

can you still define a metric, and will it complement the webhook scaling?

Exactly! In other words, the webhook-based scaling works in combination with either TotalNumberOfQueuedAndInProgressWorkflowRuns or PercentageRunnersBusy. When you omit HRA.Spec.Metrics it defaults to TotalNumberOfQueuedAndInProgressWorkflowRuns. So you can say that it is still working in conjunction with the regular metrics.

Collaborator


I've changed my mind and will soon change the default Metrics[] when ScaleUpTriggers isn't empty, so that ScaleUpTriggers can actually be used standalone. But the general idea described above should still stand.
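Once that change lands, a webhook-only HRA should look roughly like the sketch below; with no metrics[] defined, scaling would be driven purely by webhook deliveries and would not poll the GitHub API (the names and event filter are illustrative):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-webhook-only-autoscaler
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 1
  maxReplicas: 10
  # No metrics[] here: after the change described above, the controller
  # no longer falls back to TotalNumberOfQueuedAndInProgressWorkflowRuns.
  scaleUpTriggers:
  - githubEvent:
      checkRun:
        types: ["created"]
        status: "queued"
    amount: 1
    duration: "5m"
```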

mumoshu (Collaborator) left a comment

LGTM. Thank you so much for your continuous effort @callum-tait-pbx ☺️

@mumoshu mumoshu merged commit a6270b4 into actions:master Mar 11, 2021
mumoshu added a commit that referenced this pull request Mar 14, 2021
… enabled (#391)

Relates to #379 (comment)
Relates to #377 (comment)

When you define HRA.Spec.ScaleUpTriggers[] but not HRA.Spec.Metrics[], the HRA controller will now enable ScaleUpTriggers alone instead of automatically enabling TotalNumberOfQueuedAndInProgressWorkflowRuns. This allows you to use ScaleUpTriggers on their own, so that the autoscaling is done without calling the GitHub API at all, which should greatly decrease the chance of GitHub API calls getting rate-limited.