Skip to content

Commit

Permalink
Merge branch 'master' into dnsConfig
Browse files Browse the repository at this point in the history
* master: (30 commits)
  chore(deps): update azure/setup-helm action to v2.1 (actions#1328) [skip ci]
  Add more usages of RUNNER_VERSION to Renovate config. (actions#1313)
  chore: fix typo (actions#1316) [skip ci]
  chore: bump chart to latest (actions#1319)
  Fix release workflow
  Prevent runners from stuck in Terminating when pod disappeared without standard termination process (actions#1318)
  ci: pin go version to the known working version (actions#1303)
  chore: bump chart to latest (actions#1300)
  Fix runner pod to be cleaned up earlier regardless of the sync period (actions#1299)
  Make the hard-coded runner startup timeout to avoid race on token expiration longer (actions#1296)
  docs: highlight why persistent are not ideal
  fix(deps): update module sigs.k8s.io/controller-runtime to v0.11.2
  chore(deps): update dependency actions/runner to v2.289.2
  chore(deps): update helm/chart-releaser-action action to v1.4.0 (actions#1287)
  refactor(runner/entrypoint): check for externalstmp (actions#1277)
  docs: redundant words
  docs: wording
  docs: add autoscaling also causes problems
  chore: new line for consistency
  docs: use the right font
  ...
  • Loading branch information
billimek committed Apr 12, 2022
2 parents ac450f6 + 960a704 commit 711a846
Show file tree
Hide file tree
Showing 28 changed files with 391 additions and 67 deletions.
21 changes: 19 additions & 2 deletions .github/renovate.json5
Original file line number Diff line number Diff line change
Expand Up @@ -14,10 +14,27 @@
// use https://github.com/actions/runner/releases
"fileMatch": [
".github/workflows/runners.yml"
],
],
"matchStrings": ["RUNNER_VERSION: +(?<currentValue>.*?)\\n"],
"depNameTemplate": "actions/runner",
"datasourceTemplate": "github-releases"
},
{
"fileMatch": [
"runner/Makefile"
],
"matchStrings": ["RUNNER_VERSION \\?= +(?<currentValue>.*?)\\n"],
"depNameTemplate": "actions/runner",
"datasourceTemplate": "github-releases"
},
{
"fileMatch": [
"runner/Dockerfile",
"runner/Dockerfile.dindrunner"
],
"matchStrings": ["RUNNER_VERSION=+(?<currentValue>.*?)\\n"],
"depNameTemplate": "actions/runner",
"datasourceTemplate": "github-releases"
}
]
}
}
2 changes: 1 addition & 1 deletion .github/workflows/on-push-lint-charts.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ jobs:
fetch-depth: 0

- name: Set up Helm
uses: azure/setup-helm@v2.0
uses: azure/setup-helm@v2.1
with:
version: ${{ env.HELM_VERSION }}

Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/on-push-master-publish-chart.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ jobs:
fetch-depth: 0

- name: Set up Helm
uses: azure/setup-helm@v2.0
uses: azure/setup-helm@v2.1
with:
version: ${{ env.HELM_VERSION }}

Expand Down Expand Up @@ -114,7 +114,7 @@ jobs:
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Run chart-releaser
uses: helm/chart-releaser-action@v1.3.0
uses: helm/chart-releaser-action@v1.4.0
env:
CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"

2 changes: 1 addition & 1 deletion .github/workflows/release.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ jobs:

- uses: actions/setup-go@v3
with:
go-version: '^1.17.7'
go-version: '1.17.7'

- name: Install tools
run: |
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/runners.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ on:
- '!**.md'

env:
RUNNER_VERSION: 2.289.1
RUNNER_VERSION: 2.289.2
DOCKER_VERSION: 20.10.12
DOCKERHUB_USERNAME: summerwind

Expand Down
3 changes: 2 additions & 1 deletion .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,8 @@ jobs:
uses: actions/checkout@v3
- uses: actions/setup-go@v3
with:
go-version: '^1.17.7'
go-version: '1.17.7'
check-latest: false
- run: go version
- uses: actions/cache@v3
with:
Expand Down
48 changes: 26 additions & 22 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ ToC:
- [Enterprise Runners](#enterprise-runners)
- [RunnerDeployments](#runnerdeployments)
- [RunnerSets](#runnersets)
- [Persistent Runners](#persistent-runners)
- [Autoscaling](#autoscaling)
- [Anti-Flapping Configuration](#anti-flapping-configuration)
- [Pull Driven Scaling](#pull-driven-scaling)
Expand All @@ -32,7 +33,6 @@ ToC:
- [Runner Groups](#runner-groups)
- [Runner Entrypoint Features](#runner-entrypoint-features)
- [Using IRSA (IAM Roles for Service Accounts) in EKS](#using-irsa-iam-roles-for-service-accounts-in-eks)
- [Persistent Runners](#persistent-runners)
- [Software Installed in the Runner Image](#software-installed-in-the-runner-image)
- [Using without cert-manager](#using-without-cert-manager)
- [Troubleshooting](#troubleshooting)
Expand Down Expand Up @@ -226,14 +226,16 @@ By default the controller will look for runners in all namespaces, the watch nam

This feature is configured via the controller's `--watch-namespace` flag. When a namespace is provided via this flag, the controller will only monitor runners in that namespace.

If you plan on installing all instances of the controller stack into a single namespace you will need to make the names of the resources unique to each stack. In the case of Helm this can be done by giving each install a unique release name, or via the `fullnameOverride` properties.
You can deploy multiple controllers either in a single shared namespace, or in a unique namespace per controller.

Alternatively, you can install each controller stack into its own unique namespace (relative to other controller stacks in the cluster), avoiding the need to uniquely prefix resources.
If you plan on installing all instances of the controller stack into a single namespace there are a few things you need to do for this to work.

When you go to the route of sharing the namespace while giving each a unique Helm release name, you must also ensure the following values are configured correctly:
1. All resources per stack must have a unique, in the case of Helm this can be done by giving each install a unique release name, or via the `fullnameOverride` properties.
2. `authSecret.name` needs be unique per stack when each stack is tied to runners in different GitHub organizations and repositories AND you want your GitHub credentials to narrowly scoped.
3. `leaderElectionId` needs to be unique per stack. If this is not unique to the stack the controller tries to race onto the leader election lock resulting in only one stack working concurrently. Your controller will be stuck with a log message something like this `attempting to acquire leader lease arc-controllers/actions-runner-controller...`
4. The MutatingWebhookConfiguration in each stack must include a namespace selector for that stacks corresponding runner namespace, this is already configured in the helm chart.

- `authSecret.name` needs be unique per stack when each stack is tied to runners in different GitHub organizations and repositories AND you want your GitHub credentials to narrowly scoped.
- `leaderElectionId` needs to be unique per stack. If this is not unique to the stack the controller tries to race onto the leader election lock and resulting in only one stack working concurrently.
Alternatively, you can install each controller stack into a unique namespace (relative to other controller stacks in the cluster). Implementing ARC this way avoids the first, second and third pitfalls (you still need to set the corresponding namespace selector for each stacks mutating webhook)

## Usage

Expand Down Expand Up @@ -365,6 +367,8 @@ example-runnerdeploy2475ht2qbr mumoshu/actions-runner-controller-ci Running

> This feature requires controller version => [v0.20.0](https://github.com/actions-runner-controller/actions-runner-controller/releases/tag/v0.20.0)
_Ensure you see the limitations before using this kind!!!!!_

For scenarios where you require the advantages of a `StatefulSet`, for example persistent storage, ARC implements a runner based on Kubernete's StatefulSets, the RunnerSet.

A basic `RunnerSet` would look like this:
Expand Down Expand Up @@ -448,6 +452,21 @@ Under the hood, `RunnerSet` relies on Kubernetes's `StatefulSet` and Mutating We
**Limitations**

* For autoscaling the `RunnerSet` kind only supports pull driven scaling or the `workflow_job` event for webhook driven scaling.
* Whilst `RunnerSets` support all runner modes as well as autoscaling, currently PVs are **NOT** automatically cleaned up as they are still bound to their respective PVCs when a runner is deleted by the controller. This has **major** implications when using `RunnerSets` in the standard runner mode, `ephemeral: true`, see [persistent runners](#persistent-runners) for more details. As a result of this, using the default ephemeral configuration or implementing autoscaling for your `RunnerSets`, you will get a build up of PVCs and PVs without some sort of custom solution for cleaning up.

### Persistent Runners

Every runner managed by ARC is "ephemeral" by default. The life of an ephemeral runner managed by ARC looks like this- ARC creates a runner pod for the runner. As it's an ephemeral runner, the `--ephemeral` flag is passed to the `actions/runner` agent that runs within the `runner` container of the runner pod.

`--ephemeral` is an `actions/runner` feature that instructs the runner to stop and de-register itself after the first job run.

Once the ephemeral runner has completed running a workflow job, it stops with a status code of 0, hence the runner pod is marked as completed, removed by ARC.

As it's removed after a workflow job run, the runner pod is never reused across multiple GitHub Actions workflow jobs, providing you a clean environment per each workflow job.

Although not generally recommended, it's possible to disable passing `--ephemeral` flag by explicitly setting `ephemeral: false` in the `RunnerDeployment` or `RunnerSet` spec. When disabled, your runner becomes "persistent". A persistent runner does not stop after workflow job ends, and in this mode `actions/runner` is known to clean only runner's work dir after each job. Whilst this can seem helpful it creates a non-deterministic environment which is not ideal for a CI/CD environment. Between runs your actions cache, docker images stored in the `dind` and layer cache, globally installed packages etc are retained across multiple workflow job runs which can cause issues which are hard to debug and inconsistent.

Persistent runners are available as an option for some edge cases however they are not preferred as they can create challenges around providing a deterministic and secure environment.

### Autoscaling

Expand Down Expand Up @@ -666,7 +685,7 @@ The primary benefit of autoscaling on Webhook compared to the pull driven scalin

> You can learn the implementation details in [#282](https://github.com/actions-runner-controller/actions-runner-controller/pull/282)
To enable this feature, you firstly need to install the webhook server, currently, only our Helm chart has the ability install it:
To enable this feature, you first need to install the GitHub webhook server. To install via our Helm chart,
_[see the values documentation for all configuration options](https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/README.md)_

```console
Expand Down Expand Up @@ -1309,21 +1328,6 @@ spec:
securityContext:
fsGroup: 1000
```

### Persistent Runners

Every runner managed by ARC is "ephemeral" by default. The life of an ephemeral runner managed by ARC looks like this- ARC creates a runner pod for the runner. As it's an ephemeral runner, the `--ephemeral` flag is passed to the `actions/runner` agent that runs within the `runner` container of the runner pod.

`--ephemeral` is an `actions/runner` feature that instructs the runner to stop and de-register itself after the first job run.

Once the ephemeral runner has completed running a workflow job, it stops with a status code of 0, hence the runner pod is marked as completed, removed by ARC.

As it's removed after a workflow job run, the runner pod is never reused across multiple GitHub Actions workflow jobs, providing you a clean environment per each workflow job.

Although not recommended, it's possible to disable passing `--ephemeral` flag by explicitly setting `ephemeral: false` in the `RunnerDeployment` or `RunnerSet` spec. When disabled, your runner becomes "persistent". A persistent runner does not stop after workflow job ends, and in this mode `actions/runner` is known to clean only runner's work dir after each job. That means your runner's environment, including various actions cache, docker images stored in the `dind` and layer cache, is retained across multiple workflow job runs.

Persistent runners are available as an option for some edge cases however they are not preferred as they can create challenges around providing a deterministic and secure environment.

### Software Installed in the Runner Image

**Cloud Tooling**<br />
Expand Down
3 changes: 3 additions & 0 deletions api/v1alpha1/runner_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,9 @@ func (rs *RunnerSpec) ValidateRepository() error {

// RunnerStatus defines the observed state of Runner
type RunnerStatus struct {
// Turns true only if the runner pod is ready.
// +optional
Ready bool `json:"ready"`
// +optional
Registration RunnerStatusRegistration `json:"registration"`
// +optional
Expand Down
4 changes: 2 additions & 2 deletions charts/actions-runner-controller/Chart.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,10 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.17.0
version: 0.17.3

# Used as the default manager tag value when no tag property is provided in the values.yaml
appVersion: 0.22.0
appVersion: 0.22.3

home: https://github.com/actions-runner-controller/actions-runner-controller

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5124,6 +5124,9 @@ spec:
type: string
phase:
type: string
ready:
description: Turns true only if the runner pod is ready.
type: boolean
reason:
type: string
registration:
Expand Down
2 changes: 1 addition & 1 deletion charts/actions-runner-controller/docs/UPGRADING.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ Due to the above you can't just do a `helm upgrade` to release the latest versio

## Steps

1. Upgrade CRDs
1. Upgrade CRDs, this isn't optional, the CRDs you are using must be those that correspond with the version of the controller you are installing

```shell
# REMEMBER TO UPDATE THE CHART_VERSION TO RELEVANT CHART VERISON!!!!
Expand Down
35 changes: 35 additions & 0 deletions charts/actions-runner-controller/templates/webhook_configs.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,11 @@ metadata:
webhooks:
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ quote .Values.admissionWebHooks.caBundle }}
Expand All @@ -35,6 +40,11 @@ webhooks:
sideEffects: None
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ .Values.admissionWebHooks.caBundle }}
Expand All @@ -58,6 +68,11 @@ webhooks:
sideEffects: None
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ .Values.admissionWebHooks.caBundle }}
Expand All @@ -81,6 +96,11 @@ webhooks:
sideEffects: None
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ .Values.admissionWebHooks.caBundle }}
Expand Down Expand Up @@ -117,6 +137,11 @@ metadata:
webhooks:
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ .Values.admissionWebHooks.caBundle }}
Expand All @@ -140,6 +165,11 @@ webhooks:
sideEffects: None
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ .Values.admissionWebHooks.caBundle }}
Expand All @@ -163,6 +193,11 @@ webhooks:
sideEffects: None
- admissionReviewVersions:
- v1beta1
{{- if .Values.scope.singleNamespace }}
namespaceSelector:
matchLabels:
name: {{ default .Release.Namespace .Values.scope.watchNamespace }}
{{- end }}
clientConfig:
{{- if .Values.admissionWebHooks.caBundle }}
caBundle: {{ .Values.admissionWebHooks.caBundle }}
Expand Down
3 changes: 3 additions & 0 deletions config/crd/bases/actions.summerwind.dev_runners.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5126,6 +5126,9 @@ spec:
type: string
phase:
type: string
ready:
description: Turns true only if the runner pod is ready.
type: boolean
reason:
type: string
registration:
Expand Down
23 changes: 23 additions & 0 deletions config/default/gh-webhook-server-auth-proxy-patch.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# This patch injects an HTTP proxy sidecar container that performs RBAC
# authorization against the Kubernetes API using SubjectAccessReviews.
apiVersion: apps/v1
kind: Deployment
metadata:
name: github-webhook-server
spec:
template:
spec:
containers:
- name: kube-rbac-proxy
image: quay.io/brancz/kube-rbac-proxy:v0.10.0
args:
- '--secure-listen-address=0.0.0.0:8443'
- '--upstream=http://127.0.0.1:8080/'
- '--logtostderr=true'
- '--v=10'
ports:
- containerPort: 8443
name: https
- name: github-webhook-server
args:
- '--metrics-addr=127.0.0.1:8080'
25 changes: 16 additions & 9 deletions config/default/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -20,19 +20,22 @@ bases:
- ../webhook
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'. 'WEBHOOK' components are required.
- ../certmanager
# [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.
# [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.
#- ../prometheus
# [GH_WEBHOOK_SERVER] To enable the GitHub webhook server, uncomment all sections with 'GH_WEBHOOK_SERVER'.
#- ../github-webhook-server

patchesStrategicMerge:
# Protect the /metrics endpoint by putting it behind auth.
# Only one of manager_auth_proxy_patch.yaml and
# manager_prometheus_metrics_patch.yaml should be enabled.
# Protect the /metrics endpoint by putting it behind auth.
# Only one of manager_auth_proxy_patch.yaml and
# manager_prometheus_metrics_patch.yaml should be enabled.
- manager_auth_proxy_patch.yaml
# If you want your controller-manager to expose the /metrics
# endpoint w/o any authn/z, uncomment the following line and
# comment manager_auth_proxy_patch.yaml.
# Only one of manager_auth_proxy_patch.yaml and
# manager_prometheus_metrics_patch.yaml should be enabled.

# If you want your controller-manager to expose the /metrics
# endpoint w/o any authn/z, uncomment the following line and
# comment manager_auth_proxy_patch.yaml.
# Only one of manager_auth_proxy_patch.yaml and
# manager_prometheus_metrics_patch.yaml should be enabled.
#- manager_prometheus_metrics_patch.yaml

# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in crd/kustomization.yaml
Expand All @@ -43,6 +46,10 @@ patchesStrategicMerge:
# 'CERTMANAGER' needs to be enabled to use ca injection
- webhookcainjection_patch.yaml

# [GH_WEBHOOK_SERVER] To enable the GitHub webhook server, uncomment all sections with 'GH_WEBHOOK_SERVER'.
# Protect the GitHub webhook server metrics endpoint by putting it behind auth.
# - gh-webhook-server-auth-proxy-patch.yaml

# the following config is for teaching kustomize how to do var substitution
vars:
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER' prefix.
Expand Down
1 change: 0 additions & 1 deletion config/default/manager_auth_proxy_patch.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -23,4 +23,3 @@ spec:
args:
- "--metrics-addr=127.0.0.1:8080"
- "--enable-leader-election"
- "--sync-period=10m"
Loading

0 comments on commit 711a846

Please sign in to comment.