Multiple runner operators on different GKE clusters #454

Open
kasey-weirich opened this issue Jul 6, 2022 · 1 comment
Assignees: davidkarlsen
Labels: documentation (Improvements or additions to documentation), enhancement (New feature or request)

Comments


kasey-weirich commented Jul 6, 2022

We have multiple environments set up for development/staging work and I am trying to migrate our runner operator and runner pool to a new GKE cluster. Currently our development ecosystem (on GKE) is working as expected with the runner operator scheduling pods as new jobs come up.

This is all configured on the same GH org.

I am trying to migrate to a new cluster using the same GitHub App, supplied as a K8s secret in the new cluster (assuming the GitHub App can be reused). I have installed the runner operator via Helm and supplied the GitHub App secret in the values file. The operator installs with no issues observed in the logs.
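For context, the secret was supplied roughly like the sketch below. The key names (appId, installationId, privateKey) are assumptions for illustration, not necessarily what the chart expects, so check the chart's values.yaml for the names it actually uses:

```yaml
# Hypothetical sketch of the GitHub App secret reused on the new cluster.
# Key names are assumptions; verify against the Helm chart's values.yaml.
apiVersion: v1
kind: Secret
metadata:
  name: github-app-secret
  namespace: runner-operator
type: Opaque
stringData:
  appId: "123456"            # the GitHub App's ID
  installationId: "7654321"  # the App's installation ID for the org
  privateKey: |              # the App's PEM private key
    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----
```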

When I install the runner pool on the new cluster it shows ReconcileSuccess; however, the Current size is always 0.

I have tried:

  • updating the CRD
  • setting minRunners to 5
  • starting over on a freshly built cluster

Nothing seems to work; I see the same thing each time (ReconcileSuccess, Current size: 0).

The runner operator logs give no indication of why I am not seeing any runners; everything appears to be working:

2022-07-06T14:58:57.622Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":8080"}
2022-07-06T14:58:57.622Z	INFO	setup	starting manager
I0706 14:58:57.622640       1 leaderelection.go:248] attempting to acquire leader lease runner-operator/4ef9cd91.tietoevry.com...
2022-07-06T14:58:57.622Z	INFO	starting metrics server	{"path": "/metrics"}
I0706 14:59:13.859538       1 leaderelection.go:258] successfully acquired lease runner-operator/4ef9cd91.tietoevry.com
2022-07-06T14:59:13.859Z	DEBUG	events	Normal	{"object": {"kind":"ConfigMap","namespace":"runner-operator","name":"4ef9cd91.tietoevry.com","uid":"33346577-5c1c-4d78-82b4-79d1e191147b","apiVersion":"v1","resourceVersion":"14561"}, "reason": "LeaderElection", "message": "github-actions-runner-operator-fd84696f-2l2x2_2f916b5b-f9e3-4982-85f8-501181d79b2d became leader"}
2022-07-06T14:59:13.859Z	INFO	controller.githubactionrunner	Starting EventSource	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2022-07-06T14:59:13.859Z	INFO	controller.githubactionrunner	Starting EventSource	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2022-07-06T14:59:13.859Z	INFO	controller.githubactionrunner	Starting EventSource	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2022-07-06T14:59:13.859Z	INFO	controller.githubactionrunner	Starting Controller	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner"}
2022-07-06T14:59:13.961Z	INFO	controller.githubactionrunner	Starting workers	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "worker count": 1}
2022-07-06T14:59:19.414Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T14:59:19.760Z	INFO	controllers.GithubActionRunner	Registration secret not found, creating	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T14:59:19.976Z	INFO	controllers.GithubActionRunner	Pods and runner API not in sync, returning early	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T14:59:20.136Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T14:59:20.325Z	INFO	controllers.GithubActionRunner	Pods and runner API not in sync, returning early	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T14:59:50.136Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T14:59:50.316Z	INFO	controllers.GithubActionRunner	Pods and runner API not in sync, returning early	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T15:00:20.367Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T15:00:20.585Z	INFO	controllers.GithubActionRunner	Pods and runner API not in sync, returning early	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T15:00:50.604Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "runner-operator/runner-pool"}
2022-07-06T15:00:50.863Z	INFO	controllers.GithubActionRunner	Pods and runner API not in sync, returning early	{"githubactionrunner": "runner-operator/runner-pool"}

Thank you for any pointers you can provide.

kasey-weirich (Author) commented:

I believe I finally figured this out: in my runner spec YAML file, I changed the name of the runner pool (previously named runner-pool) and am now seeing runner pods in my new cluster.

apiVersion: garo.tietoevry.com/v1alpha1
kind: GithubActionRunner
metadata:
  name: runner-pool-test-01
  namespace: runner-operator
spec:
  minRunners: 2
  maxRunners: 20
  organization: myOrgo
  reconciliationPeriod: 30s
  podTemplateSpec:
    metadata:
      annotations:
        "prometheus.io/scrape": "true"
        "prometheus.io/port": "3903"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchExpressions:
                    - key: garo.tietoevry.com/pool
                      operator: In
                      values:
                        - runner-pool-test-01

Even when the operator runs on a completely different cluster, reusing the same runner pool name in the runner spec results in zero runner pods being created.
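In other words, each cluster's pool needs a unique metadata.name when registering against the same GitHub org. A minimal sketch of what that looks like (the names and org here are illustrative):

```yaml
# Cluster A (e.g. dev) — pool name unique to this cluster
apiVersion: garo.tietoevry.com/v1alpha1
kind: GithubActionRunner
metadata:
  name: runner-pool-dev
  namespace: runner-operator
spec:
  organization: myOrgo
  minRunners: 2
  maxRunners: 20
---
# Cluster B (e.g. staging) — a different name so the two pools
# don't collide when registering against the same GitHub org
apiVersion: garo.tietoevry.com/v1alpha1
kind: GithubActionRunner
metadata:
  name: runner-pool-staging
  namespace: runner-operator
spec:
  organization: myOrgo
  minRunners: 2
  maxRunners: 20
```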

Would it be useful to update the README to mention running the operator/runner pool across multiple clusters/lifecycles/environments?

@davidkarlsen davidkarlsen added documentation Improvements or additions to documentation enhancement New feature or request labels Aug 30, 2022
@davidkarlsen davidkarlsen self-assigned this Sep 5, 2023