Document how to add Windows self-hosted runners #1608

Merged 6 commits on Aug 23, 2022
113 changes: 113 additions & 0 deletions README.md
@@ -41,6 +41,7 @@ ToC:
- [Using IRSA (IAM Roles for Service Accounts) in EKS](#using-irsa-iam-roles-for-service-accounts-in-eks)
- [Software Installed in the Runner Image](#software-installed-in-the-runner-image)
- [Using without cert-manager](#using-without-cert-manager)
- [Windows Runners](#setting-up-windows-runners)
- [Multitenancy](#multitenancy)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
@@ -1764,6 +1765,118 @@ $ helm --upgrade install actions-runner-controller/actions-runner-controller \
admissionWebHooks.caBundle=${CA_BUNDLE}
```

### Setting up Windows Runners

The two main steps in enabling Windows self-hosted runners are:

- Using the `nodeSelector` property to restrict the `cert-manager` and `actions-runner-controller` pods to Linux nodes
- Deploying a `RunnerDeployment` that uses a Windows-based image

For the first step, set the `nodeSelector.kubernetes.io/os` property to `linux` in both the `cert-manager` and the `actions-runner-controller` deployments so that their pods are only scheduled on Linux nodes. You can do this as follows:

```yaml
nodeSelector:
  kubernetes.io/os: linux
```

`cert-manager` consists of four applications: the main application, the `webhook`, the `cainjector`, and the `startupapicheck`. In the parameters or values file you use for the deployment, add the `nodeSelector` property four times, once for each application.

For the `actions-runner-controller`, the `nodeSelector` is only needed for the main deployment, so it only has to be set once.
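
As a rough sketch, a `cert-manager` Helm values file that applies the selector to all four applications could look like the following. It assumes the official `cert-manager` chart, where the main application reads the top-level `nodeSelector` and the others read `webhook.nodeSelector`, `cainjector.nodeSelector`, and `startupapicheck.nodeSelector`. The file name is hypothetical, and the key names should be checked against the chart version you deploy.

```yaml
# values-cert-manager.yaml (hypothetical file name)
# Main cert-manager application
nodeSelector:
  kubernetes.io/os: linux
# Webhook
webhook:
  nodeSelector:
    kubernetes.io/os: linux
# CA injector
cainjector:
  nodeSelector:
    kubernetes.io/os: linux
# Startup API check job
startupapicheck:
  nodeSelector:
    kubernetes.io/os: linux
```

For the `actions-runner-controller` chart, a values file containing only the top-level `nodeSelector` entry should be enough, assuming your chart version exposes that key.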

Once this is set up, you will need to deploy two different `RunnerDeployment`s, one for Windows and one for Linux.
The Linux deployment can use either the default image or a custom one. There is no default Windows image, so for Windows deployments you will have to build your own.

Below is an example of the YAML used to create the deployment for each operating system, along with a Dockerfile for the Windows deployment.

<details><summary>Windows</summary>
<p>

#### RunnerDeployment

```yaml
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: k8s-runners-windows
  namespace: actions-runner-system
spec:
  template:
    spec:
      image: <repo>/<image>:<windows-tag>
      dockerdWithinRunnerContainer: true
      nodeSelector:
        kubernetes.io/os: windows
        kubernetes.io/arch: amd64
      repository: <owner>/<repo>
      labels:
        - windows
        - X64
        - devops-managed
```

#### Dockerfile

> Note that you need to patch the Dockerfile below if you need graceful termination.
> See https://github.com/actions-runner-controller/actions-runner-controller/pull/1608/files#r917319574 for more information.

```Dockerfile
FROM mcr.microsoft.com/windows/servercore:ltsc2019

WORKDIR /actions-runner

# Run subsequent RUN instructions in PowerShell, stop on the first error, and hide progress bars
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop';$ProgressPreference='silentlyContinue';"]

# Download the GitHub Actions runner release
RUN Invoke-WebRequest -Uri https://github.com/actions/runner/releases/download/v2.292.0/actions-runner-win-x64-2.292.0.zip -OutFile actions-runner-win-x64-2.292.0.zip

# Verify the SHA-256 checksum of the downloaded archive
RUN if((Get-FileHash -Path actions-runner-win-x64-2.292.0.zip -Algorithm SHA256).Hash.ToUpper() -ne 'f27dae1413263e43f7416d719e0baf338c8d80a366fed849ecf5fffcec1e941f'.ToUpper()){ throw 'Computed checksum did not match' }

# Extract the runner into the working directory
RUN Add-Type -AssemblyName System.IO.Compression.FileSystem ; [System.IO.Compression.ZipFile]::ExtractToDirectory('actions-runner-win-x64-2.292.0.zip', $PWD)

# Install PowerShell (pwsh) and add it to the PATH
RUN Invoke-WebRequest -Uri 'https://aka.ms/install-powershell.ps1' -OutFile install-powershell.ps1; ./install-powershell.ps1 -AddToPath

# Install Chocolatey
RUN powershell Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

# Install Git with its Unix tools on the PATH
RUN powershell choco install git.install --params "'/GitAndUnixToolsOnPath'" -y

# Let later choco commands run without confirmation prompts
RUN powershell choco feature enable -n allowGlobalConfirmation

# Register the runner as ephemeral using the RUNNER_* environment variables, then start it
CMD [ "pwsh", "-c", "./config.cmd --name $env:RUNNER_NAME --url https://github.com/$env:RUNNER_REPO --token $env:RUNNER_TOKEN --labels $env:RUNNER_LABELS --unattended --replace --ephemeral; ./run.cmd"]
```

Review thread on the `CMD` line above:

Collaborator:

@ian-flores Hey! Thanks for submitting this awesome pull request 🙏

Although I'm not an expert in managing Windows machines as K8s nodes, I tried my best to review this and everything looked generally good.

My only concern is with this line: how does it handle termination signals sent by Kubernetes?

On Linux, PID 1 of each container gets SIGTERM on pod termination, and usually PID 1 should gracefully stop all its child processes. We implement this for ARC with dumb-init, as you can see at:
https://github.com/actions-runner-controller/actions-runner-controller/blob/8b619e7c6fa9b8b07ff184e7882cec4108d0a52e/runner/actions-runner.dockerfile#L131-L132

Is PowerShell supposed to handle it, or do you need your own way?

Reply:

Hello @mumoshu

We are also quite new to Windows, but I'll try to answer here. As far as I know, there are no termination signals on Windows; termination is handled with the CTRL_SHUTDOWN_EVENT. In the Kubernetes documentation I found this:

> terminationGracePeriodSeconds - this is not fully implemented in Docker on Windows, see the GitHub issue. The behavior today is that the ENTRYPOINT process is sent CTRL_SHUTDOWN_EVENT, then Windows waits 5 seconds by default, and finally shuts down all processes using the normal Windows shutdown behavior. The 5 second default is actually in the Windows registry inside the container, so it can be overridden when the container is built.

I also found this interesting article explaining the Windows shutdown process:

https://technoresult.com/windows-shutdown-process-behind-the-scean/

It also says that at some point all running processes are shut down with a grace period of 5 seconds using the same signal, and that this timeout is configurable as well.

As you know, on Linux you can handle everything much more easily, but Windows is different. We could play with shutdown timeouts and implement some logic for handling the CTRL_SHUTDOWN_EVENT signal in our entrypoint, but I don't think it's worth it because we would have to send more signals to processes, which the Windows shutdown process already does.

Let me know what you think

Collaborator:

@amaldonadomat Good to know and totally understood. Thank you so much for sharing your knowledge and thoughts!

</p>
</details>


<details><summary>Linux</summary>
<p>

#### RunnerDeployment

```yaml
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: k8s-runners-linux
  namespace: actions-runner-system
spec:
  template:
    spec:
      image: <repo>/<image>:<linux-tag>
      nodeSelector:
        kubernetes.io/os: linux
        kubernetes.io/arch: amd64
      repository: <owner>/<repo>
      labels:
        - linux
        - X64
        - devops-managed
```
</p>
</details>

After both `RunnerDeployment`s are up and running, you can proceed to deploy a `HorizontalRunnerAutoscaler` for each deployment.
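
As an illustration, an autoscaler for the Windows deployment might look like the sketch below. The resource name, replica bounds, and thresholds are hypothetical placeholders rather than recommendations. The only value taken from the example above is `scaleTargetRef.name`, which has to match the `RunnerDeployment` name (`k8s-runners-windows`). The Linux deployment would get an equivalent resource that points at `k8s-runners-linux`.

```yaml
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: k8s-runners-windows-autoscaler  # hypothetical name
  namespace: actions-runner-system
spec:
  scaleTargetRef:
    kind: RunnerDeployment
    name: k8s-runners-windows           # must match the RunnerDeployment above
  minReplicas: 1                        # placeholder bounds, tune for your workload
  maxReplicas: 5
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: '0.75'          # placeholder thresholds and factors
      scaleDownThreshold: '0.25'
      scaleUpFactor: '2'
      scaleDownFactor: '0.5'
```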

### Multitenancy

> This feature requires controller version >= [v0.26.0](https://github.com/actions-runner-controller/actions-runner-controller/releases/tag/v0.26.0)