-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements in the documentation for permissions and replication #175
Conversation
@droidpl This seems like a bug! As similar as how HPA and K8s Deployment works, we should be able to omit RunnerDeployment's replicas so that reapplying manifests doesn't roll back the number of replicas altered by the autoscaler. |
@droidpl Would you mind sharing me logs/error message/etc if possible for the replicas error? |
@mumoshu Here the log from the controller:
When I added back the replica item it started working |
I've reverted the changes to the replica doc as this has been identified as a bug in #175 (comment). Do you have an issue to link to that @mumoshu? |
Just to chip in I have also seen this, I can post the log later if you like but it's the same log entry. My YAML is a derivative of the below. If you omit the replicas key from the RunnerDeployment you get the above error. apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
name: test-runner
namespace: default
spec:
replicas: 1
template:
spec:
group: test
organization: droidpl-demorg
dockerdWithinRunnerContainer: true|false
scaleDownDelaySecondsAfterScaleUp: 1m
labels:
- aws
- linux
imagePullPolicy: Always
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
name: test-autoscaler
spec:
scaleTargetRef:
name: test-runner
minReplicas: 1
maxReplicas: 2
metrics:
- type: TotalNumberOfQueuedAndInProgressWorkflowRuns
repositoryNames:
- aws-deployment |
@droidpl it would be helpful if you post your |
Steps to Reproduce
Feel free to rip out the AWS bits Runner Yaml apiVersion: v1
kind: ServiceAccount
metadata:
name: github-actions-service-account
namespace: actions-runner-system
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::$AWS_ACCOUNT_ID:role/eksctl-sre-eks-test-addon-iamserviceaccount-Role1-1244USHBUKLEY
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
name: sre-github-runners-runner-deployment
namespace: actions-runner-system
spec:
template:
spec:
organization: photobox
serviceAccountName: github-actions-service-account
env:
- name: AWS_ROLE_ARN
value: arn:aws:iam::$AWS_ACCOUNT_ID:role/eksctl-sre-eks-test-addon-iamserviceaccount-Role1-1244USHBUKLEY
- name: AWS_WEB_IDENTITY_TOKEN_FILE
value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
volumeMounts:
- mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
name: aws-iam-token
readOnly: true
volumes:
- name: aws-iam-token
projected:
defaultMode: 420
sources:
- serviceAccountToken:
audience: sts.amazonaws.com
expirationSeconds: 86400
path: token
securityContext:
fsGroup: 27
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
name: sre-github-runners-autoscaler
namespace: actions-runner-system
spec:
scaleDownDelaySecondsAfterScaleUp: 10
scaleTargetRef:
name: sre-github-runners-runner-deployment
minReplicas: 1
maxReplicas: 10
metrics:
- type: TotalNumberOfQueuedAndInProgressWorkflowRuns
repositoryNames:
- callum-test-repo
- callum-test-repo-2 Error 2020-11-12T23:11:13.195Z ERROR controller-runtime.controller Reconciler error {"controller": "runnerdeployment", "request": "sre-actions-runner-system/sre-github-runners-runner-deployment", "error": "RunnerReplicaSet.actions.summerwind.dev \"sre-github-runners-runner-deployment-nch6w\" is invalid: spec.replicas: Invalid value: \"null\": spec.replicas in body must be of type integer: \"null\""}
github.com/go-logr/zapr.(*zapLogger).Error
/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88 |
Related #180 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the all the info and this fix @droidpl @callum-tait-pbx!
I'll revisit #64 soon
Thanks so much folks 😄 |
@callum-tait-pbx You can see my files in #174 |
@summerwind I've improved a couple of things in the documentation based on the testing I've done with the repo.
Added thereplica: <num>
to horizontal autoscaling groups as this seems to be a requirement. When I didn't put the replica it was failing to start the runners