Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements in the documentation for permissions and replication #175

Merged
merged 2 commits into from
Nov 13, 2020
Merged

Improvements in the documentation for permissions and replication #175

merged 2 commits into from
Nov 13, 2020

Conversation

droidpl
Copy link
Contributor

@droidpl droidpl commented Nov 11, 2020

@summerwind I've improved a couple of things in the documentation based on the testing I've done with the repo.

  • Added the permissions explicitly just in case people don't go and create the GitHub apps with the link but they go manually.
  • Added the replica: <num> to horizontal autoscaling groups as this seems to be a requirement. When I didn't put the replica it was failing to start the runners

@mumoshu
Copy link
Collaborator

mumoshu commented Nov 12, 2020

Added the replica: to horizontal autoscaling groups as this seems to be a requirement. When I didn't put the replica it was failing to start the runners

@droidpl This seems like a bug! As similar as how HPA and K8s Deployment works, we should be able to omit RunnerDeployment's replicas so that reapplying manifests doesn't roll back the number of replicas altered by the autoscaler.

@mumoshu
Copy link
Collaborator

mumoshu commented Nov 12, 2020

@droidpl Would you mind sharing me logs/error message/etc if possible for the replicas error?

@droidpl
Copy link
Contributor Author

droidpl commented Nov 12, 2020

@mumoshu Here the log from the controller:

2020-11-11T13:03:04.493Z        ERROR   controller-runtime.controller   Reconciler error        {"controller": "runnerdeployment", "request": "default/refinitiv-sdp-runner", "error": "RunnerReplicaSet.actions.summerwind.dev \"refinitiv-sdp-runner-m47mf\" is invalid: spec.replicas: Invalid value: \"null\": spec.replicas in body must be of type integer: \"null\""}
github.com/go-logr/zapr.(*zapLogger).Error
        /go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
        /go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
        /go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
        /go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88

When I added back the replica item it started working

@droidpl
Copy link
Contributor Author

droidpl commented Nov 12, 2020

I've reverted the changes to the replica doc as this has been identified as a bug in #175 (comment). Do you have an issue to link to that @mumoshu?

@callum-tait-pbx
Copy link
Contributor

callum-tait-pbx commented Nov 12, 2020

Just to chip in I have also seen this, I can post the log later if you like but it's the same log entry. My YAML is a derivative of the below. If you omit the replicas key from the RunnerDeployment you get the above error.

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: test-runner
  namespace: default
spec:
  replicas: 1
  template:
    spec:
      group: test
      organization: droidpl-demorg
      dockerdWithinRunnerContainer: true|false
      scaleDownDelaySecondsAfterScaleUp: 1m
      labels:
        - aws
        - linux
      imagePullPolicy: Always
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: test-autoscaler
spec:
  scaleTargetRef:
    name: test-runner
  minReplicas: 1
  maxReplicas: 2
  metrics:
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - aws-deployment

@callum-tait-pbx
Copy link
Contributor

callum-tait-pbx commented Nov 12, 2020

#64 looks like this issue has been posted before so the bug has been around a while if it is a bug? @mumoshu

@callum-tait-pbx
Copy link
Contributor

@droidpl it would be helpful if you post your runner.yaml

@callum-tait-pbx
Copy link
Contributor

callum-tait-pbx commented Nov 12, 2020

Steps to Reproduce

  1. Clone v0.11.2
  2. Deploy either the PAT or GitHub App config
  3. Build the manifest kustomize build ./default > ./manifest.yaml
  4. Deploy manifest kubectl apply -f ./manifest.yaml
  5. Deploy runners kubectl apply -f ./runners.yaml
  6. Check manager logs kubectl logs controller-manager-777fb76ddb-7zwzk -c manager

Feel free to rip out the AWS bits

Runner Yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: github-actions-service-account
  namespace: actions-runner-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::$AWS_ACCOUNT_ID:role/eksctl-sre-eks-test-addon-iamserviceaccount-Role1-1244USHBUKLEY  
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: sre-github-runners-runner-deployment
  namespace: actions-runner-system
spec:
  template:
    spec:
      organization: photobox
      serviceAccountName: github-actions-service-account
      env:
      - name: AWS_ROLE_ARN
        value: arn:aws:iam::$AWS_ACCOUNT_ID:role/eksctl-sre-eks-test-addon-iamserviceaccount-Role1-1244USHBUKLEY
      - name: AWS_WEB_IDENTITY_TOKEN_FILE
        value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      volumeMounts:
      - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
        name: aws-iam-token 
        readOnly: true
      volumes:
      - name: aws-iam-token
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
      securityContext:
        fsGroup: 27
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: sre-github-runners-autoscaler
  namespace: actions-runner-system
spec:
  scaleDownDelaySecondsAfterScaleUp: 10
  scaleTargetRef:
    name: sre-github-runners-runner-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
    repositoryNames:
    - callum-test-repo
    - callum-test-repo-2

Error

2020-11-12T23:11:13.195Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "runnerdeployment", "request": "sre-actions-runner-system/sre-github-runners-runner-deployment", "error": "RunnerReplicaSet.actions.summerwind.dev \"sre-github-runners-runner-deployment-nch6w\" is invalid: spec.replicas: Invalid value: \"null\": spec.replicas in body must be of type integer: \"null\""}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88

@mumoshu
Copy link
Collaborator

mumoshu commented Nov 13, 2020

Related #180

Copy link
Collaborator

@mumoshu mumoshu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the all the info and this fix @droidpl @callum-tait-pbx!

I'll revisit #64 soon

@mumoshu mumoshu merged commit 1f82032 into actions:master Nov 13, 2020
@droidpl droidpl deleted the patch-1 branch November 13, 2020 14:07
@droidpl
Copy link
Contributor Author

droidpl commented Nov 13, 2020

Thanks so much folks 😄

@droidpl
Copy link
Contributor Author

droidpl commented Nov 13, 2020

@callum-tait-pbx You can see my files in #174

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants