Not able to scale Organization RunnerDeployment with "workflow_job" #951
Comments
@sigurdfalk Hey! My eyes were caught by this part of your log Could you also share your workflow definition YAML file? |
@mumoshu I think "self-hosted" is added by default to all self-hosted runners by GitHub. I added "self-hosted" to labels in my In our workflow, we use: Thank you so much for your help and all the great work you are doing with this project! We really love it ❤️ I think we can close this issue now |
Hello ! I
Following @sigurdfalk's comment, adding the I think currently there is a difference between the labels the controller knows about (the explicitly specified labels) and the labels GitHub receives (explicit plus implicit: self-hosted, OS, infrastructure). The same issue arises when specifying the OS or the infrastructure, even though the job does get assigned to the same node that should be autoscaled. edit: the webhook controller checks that the labels requested by the workflow match the explicitly specified labels, and that check fails when the workflow requests implicit labels. The issue will not arise if only |
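To make the mismatch described above concrete, here is a minimal sketch of a strict label check. It is not the controller's actual code: the function name `explicit_labels_match` and the sample labels are hypothetical, and the only assumption taken from the thread is that the webhook compares the labels requested by the job against the labels explicitly declared on the runner.

```python
def explicit_labels_match(requested, explicit):
    """Return True only if every label the job requests was explicitly declared.

    Hypothetical helper: implicit labels such as "self-hosted" are NOT
    known to this check, which is the situation described in this thread.
    """
    declared = set(explicit)
    return all(label in declared for label in requested)


# Labels declared explicitly on the runner (illustrative values).
explicit = ["staging"]

# A job that only asks for the explicit label matches.
print(explicit_labels_match(["staging"], explicit))

# A job that also asks for the implicit "self-hosted" label does not match,
# so no scale-up would be triggered under this strict check.
print(explicit_labels_match(["self-hosted", "staging"], explicit))
```

Under this sketch, the job could still be picked up by an existing runner on the GitHub side, because GitHub knows about the implicit labels, while the autoscaling decision sees no match.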
I pushed a fix for the |
@sigurdfalk @clement-loiselet-talend Hey! Thanks a lot for your reports. Ignoring Do you, by any chance, know the full list of implicit labels other than |
https://docs.github.com/en/actions/hosting-your-own-runners/using-self-hosted-runners-in-a-workflow#using-default-labels-to-route-jobs I think this is the list of implicit labels that get applied by GitHub Actions, dependent on the hardware |
@toast-gear Hey! Thanks. I think your list is for runner labels. I was more interested in Perhaps no |
The list on the e.g. :
workflow.yaml
name: Docker image CI
jobs:
  lint:
    runs-on: [self-hosted, docker-in-docker, linux]
RunnerDeployment.yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: github-action-runner
spec:
  template:
    spec:
      # explicit labels
      labels:
        - docker-in-docker
In this example, the created runner will have the labels [ I don't think we should ignore these labels on the webhook controller, as it risks upscaling a deployment the workflow cannot be deployed on. From what I saw we can't easily extract it from the runner, so I guess we could add a warning in the webhook to ease debugging if someone were to try to use these implicit labels without declaring them on the Runner. |
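The trade-off discussed above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `match_job_to_runner` and both label sets are hypothetical names, `self-hosted` is assumed to be present on every self-hosted runner, and the OS and architecture labels are the default ones listed in the GitHub documentation. The sketch skips `self-hosted`, still refuses to match on an undeclared OS or architecture label, and collects a warning to help with debugging.

```python
# Assumption: GitHub adds this label to every self-hosted runner.
ALWAYS_PRESENT = {"self-hosted"}

# Default OS and architecture labels from the GitHub docs, lowercased.
HARDWARE_LABELS = {"linux", "windows", "macos", "x64", "arm", "arm64"}


def match_job_to_runner(requested, explicit):
    """Return (matched, warnings) for a job's requested labels.

    "self-hosted" is skipped; any other label must be explicitly declared.
    An undeclared OS/architecture label blocks the match and adds a warning.
    """
    declared = set(explicit)
    matched = True
    warnings = []
    for label in requested:
        if label in ALWAYS_PRESENT or label in declared:
            continue
        matched = False
        if label.lower() in HARDWARE_LABELS:
            warnings.append(
                f"label {label!r} is implicit on the runner but is not "
                "declared on the RunnerDeployment"
            )
    return matched, warnings


# The example from this comment: "linux" is requested but not declared.
matched, warnings = match_job_to_runner(
    ["self-hosted", "docker-in-docker", "linux"], ["docker-in-docker"]
)
print(matched)
for message in warnings:
    print(message)
```

Declaring `linux` explicitly on the RunnerDeployment, or dropping it from `runs-on`, would make the same call return a match with no warnings.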
@clement-loiselet-talend Hey! Sorry for the delay. It took more time than I had thought to fully understand this but now- Gotcha. So, here're my thoughts:
Also, thank you for your #953! Thanks to your detailed response it turns out we'd better not merge that, and instead we should enhance logging. |
Sorry for the back and forth, but I'm convinced that we should merge #953. Thanks again for your contribution @clement-loiselet-talend 🙇 I've left a comment with some more context in the PR, fyi. |
- apiVersion: actions.summerwind.dev/v1alpha1
  kind: RunnerDeployment
  metadata:
    name: my-runner
  spec:
    replicas: 0
    template:
      spec:
        serviceAccountName: runner-sa
        organization: SuperOrg
        labels:
          - self-hosted
          - staging
        env:
          - name: GOOGLE_PROJECT_ID
            value: {{ quote .Values.googleProjectID }}
        resources:
          limits:
            memory: 2Gi
          requests:
            cpu: 50m
            memory: 500Mi
---
- apiVersion: actions.summerwind.dev/v1alpha1
  kind: HorizontalRunnerAutoscaler
  metadata:
    name: my-runner
  spec:
    minReplicas: 0
    maxReplicas: 10
    scaleTargetRef:
      kind: RunnerDeployment
      name: my-runner
    scaleDownDelaySecondsAfterScaleOut: 1800
    scaleUpTriggers:
      - amount: 1
        duration: 30m
        githubEvent:
          workflowJob:
            action: queued
---
name: Frontend
run-name: Frontend
on:
  - push
jobs:
  build-frontend-image:
    name: Build and push a frontend image
    runs-on: ["staging"]
    steps:
      - run: echo hello |
Describe the bug
Organization RunnerDeployment is not able to scale with the workflow_job event.
Manifests look like below:
Seeing the following in logs:
The amount of replicas stays at 2 no matter how many queued jobs we have requesting these runners.
Checks
Expected behavior
Amount of runners should scale up when we have queued workflows
Environment (please complete the following information):