Actions-Runner-Controller support for Gitea Actions #29567

omniproc · 2024-03-03T21:29:52Z

Feature Description

The Gitea Actions release was a great first step. But currently it's missing many features of a more mature solution based on K8s runners rather than single nodes. While it's possible to have runners on K8s, this currently requires DinD, which has its own whole set of problems, security issues (privileged exec required as of today) and feature limitations (can't use DinD to start another container to build a container image (DinDinD)). I know workarounds exist with buildx, but those are just that: workarounds.

I think the next step could be something like what actions-runner-controller is doing for GitHub Actions. Basically an operator that is deployed on K8s and registers as a runner. Every job it starts then runs in its own pod rather than in the runner itself. The runner coordinates the pods.

Related docs:

Screenshots

No response

ChristopherHX · 2024-03-04T16:28:04Z

k8s hooks are technically already usable with Gitea Actions ("technically" meaning there is no documentation; the docker compose examples use dind + docker hooks). See this third-party runner adapter: https://gitea.com/gitea/awesome-gitea/pulls/149

Actions-Runner-Controller would require emulation of a bigger set of the internal GitHub Actions API.

I actually find it interesting to reverse engineer that product too, but I have never dealt with k8s myself.

act_runner with its act backend doesn't support container hooks or k8s for the time being.

omniproc · 2024-03-05T05:49:21Z

Interesting. I wasn't aware you could change the runner implementation just like that. Def will look into it. However, given what you said about DinD still being a requirement, I don't think it will change much (we already have our runners on K8s with DinD using an adapted version of gitea/act-runner for k8s, but as mentioned, this comes with many headaches).

The goal IMHO would be to be able to start workflows on k8s directly. Possible implementations:

1. Every job is its own pod. Challenge: data sharing between jobs would require PVs and complicated mount/unmount logic to support the more common RWO PVs. I'm aware that currently GitHub's approach to data sharing between jobs is "yo dawg, just upload it to our artifact store", but in on-prem scenarios that's not what you normally want, so some sort of common local cache between jobs is a relevant feature that I, at least, would be very interested in.
2. Every workflow is a pod. Jobs start as containers. Benefit: all containers can easily access the same data using e.g. an emptyDir volume. Challenge: pods are immutable, so:
- either all jobs (==containers) need to be present when the pod starts, requiring some kind of wait logic when we need job dependencies, which probably comes with its own set of problems.
- or ephemeral containers could possibly be used to add containers (==jobs) to a pod at runtime when a dependent job is ready. However, ephemeral containers come with a set of limitations and are meant for a different use case, so I'm not sure that would be a good fit.

Option one (every job is its own pod) seems like the most promising option in my opinion.

ChristopherHX · 2024-03-05T10:22:50Z

However given what you said about DinD still being a requirement I don't think it will change much

I meant that I haven't created any k8s mode examples or actually tried it yet. Sorry for the confusion.

The docker container hooks only allow dind on k8s, while the k8s hooks should use the Kubernetes API for container management. I still need to look into getting a test setup running.

I can imagine

(controller) actions_runner is started with maxparallel 100 (yes it's possible to use any value >= 1)
(job controller) a worker script (spawned when a job request is received) forwards stdin and the network to the adapter to spawn the actions/runner
(actual job) k8s hooks spawn a job container using k8s apis

Well, not using act_runner has limitations when you try to use Gitea Actions extensions (features not present in GitHub Actions).

I think option 1 is more likely to happen than option 2. Job scheduling is based on jobs, not on workflows.

ChristopherHX · 2024-03-05T15:55:11Z

The k8s hooks work for me using these files on minikube (arm64):

actions-runner-k8s-gitea-sample-files.zip

Missing usage of secrets, need to learn kubernetes
No autoscaling
No persistence of runner credentials

With clever sharing of the runner credentials volume, you could start a lot of replicas for more parallel runners

This works without dind

Test workflow

on: push
jobs:
  _:
    runs-on: k8s # <-- Used runner label
    container: ubuntu:latest # <-- Required, maybe the Gitea Actions adapter could insert a default
    steps:
    # Git is needed for actions/checkout to work for Gitea, rest api is not compatible
    - run: apt update && apt install -y git
    - uses: https://github.com/actions/checkout@v3 # <-- Almost the only Gitea extension supported
    - run: ls -la
    - run: ls -la .github/workflows

The runner-pod-workflow is the job container pod, running directly via k8s.

omniproc · 2024-03-05T18:30:56Z

Looks promising. I'll give it a shot and share my findings.

omniproc · 2024-03-12T16:11:25Z

Okay, so... there seem to be some issues with the current setup. Let me share my findings:

You asked how to provide secrets in K8s. It's as simple as this:

- name: GITEA_RUNNER_REGISTRATION_TOKEN
  valueFrom:
    secretKeyRef:
      name: secret-name
      key: secret_key

and creating your secret with (take care: K8s is case sensitive, and Secret names may not contain underscores):

apiVersion: v1
kind: Secret
metadata:
  name: secret-name
type: Opaque
stringData:
  secret_key: "s3cr3t"

You shouldn't start pods in K8s directly but rather wrap them in a higher-level resource such as a Deployment, so they benefit from the (deployment) controller logic when the pod is updated or needs self-healing. I did that, so the result looks something like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: runner
  name: runner
spec:
  replicas: 1
  # strategy is a field of the Deployment spec, not of the pod spec
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: runner
  template:
    metadata:
      labels:
        app: runner
    spec:
      restartPolicy: Always
      serviceAccountName: ci-builder
      #securityContext:
      #  runAsNonRoot: true
      #  runAsUser: 1000
      #  runAsGroup: 1000
      #  seccompProfile:
      #    type: RuntimeDefault
      volumes:
        - name: workspace
          emptyDir:
            sizeLimit: 5Gi
      containers:
      - name: runner
        image: ghcr.io/christopherhx/gitea-actions-runner:v0.0.11
        #securityContext:
        #  readOnlyRootFilesystem: true
        #  allowPrivilegeEscalation: false
        #  capabilities:
        #    drop:
        #      - ALL
        volumeMounts:
          - mountPath: /home/runner/_work
            name: workspace
        env:
          - name: ACTIONS_RUNNER_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
            value: "true"
          - name: ACTIONS_RUNNER_CONTAINER_HOOKS
            value: /home/runner/k8s/index.js
          - name: GITEA_INSTANCE_URL
            value: https://foo.bar
          - name: GITEA_RUNNER_REGISTRATION_TOKEN
            valueFrom:
              secretKeyRef:
                name: gitea
                key: token
          - name: GITEA_RUNNER_LABELS
            value: k8s
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: 1000m
            memory: 8Gi

A few changes I made here:

For the volume: if no persistence across new pods started by the runner is needed, a volume of type emptyDir can act as a temporary volume to share data between the containers of a pod and to write data to a well-known location.
I added a resources section to follow best practice. The numbers probably need to be adapted to something that makes more sense.
I added securityContext but had to disable it for now for troubleshooting, since it can't currently work as needed because of some issues with the current runner setup:
- The Dockerfile switches to the runner user by name (USER runner). K8s doesn't like that if runAsNonRoot is specified but no runAsUser is given in the security context and the image uses a "non-numeric" user. I'd opt for USER 1000 in the Dockerfile instead, which should make this easier in the future.
- allowPrivilegeEscalation: false can't currently be used because start.sh uses sudo to set up the folder layout: sudo chown -R runner:docker /home/runner/_work and sudo chown -R runner:docker /data. I think a better approach would be to just create those folders within the mounted emptyDir volume. The running user should already have all the permissions needed to create the folders there, so no sudo would be needed, but I'm not sure what those folders are currently used for and how hardcoded those paths are.
- readOnlyRootFilesystem will probably also cause issues in the future when paths other than the mounted volume are used. Again, I think the easiest way to allow for maximum container security in k8s would be to not use the root fs at all and do everything on the mounted volume.

So, those are simply improvement suggestions for the future. For now, as you can see, I've been trying to keep it as simple as possible, but I still run into an issue. The runner starts and registers, but when using the job you provided I run into the following error returned by the job:


[WORKER 2024-03-12 15:59:08Z INFO HostContext] Well known directory 'Bin': '/home/runner/bin'
[WORKER 2024-03-12 15:59:08Z INFO HostContext] Well known directory 'Root': '/home/runner'
[WORKER 2024-03-12 15:59:08Z INFO HostContext] Well known config file 'Credentials': '/home/runner/.credentials'
[WORKER 2024-03-12 15:59:08Z INFO HostContext] Well known directory 'Bin': '/home/runner/bin'
[WORKER 2024-03-12 15:59:08Z INFO HostContext] Well known directory 'Root': '/home/runner'
[WORKER 2024-03-12 15:59:08Z INFO HostContext] Well known config file 'Runner': '/home/runner/.runner'
[WORKER 2024-03-12 15:59:08Z INFO Worker] Version: 2.314.0
[WORKER 2024-03-12 15:59:08Z INFO Worker] Commit: bc79e859d7b66e8018716bc94160656f6c6948fc
[WORKER 2024-03-12 15:59:08Z INFO Worker] Culture: 
[WORKER 2024-03-12 15:59:08Z INFO Worker] UI Culture: 
[WORKER 2024-03-12 15:59:08Z INFO Worker] Waiting to receive the job message from the channel.
[WORKER 2024-03-12 15:59:08Z INFO ProcessChannel] Receiving message of length 6322, with hash '30564f1b4d3e28c3d9cc39d17eca1132cc026a2abeb6ab1be6736d80cf019ea9'
[WORKER 2024-03-12 15:59:08Z INFO Worker] Message received.
Newtonsoft.Json.JsonReaderException: Invalid character after parsing property name. Expected ':' but got:  . Path 'ContextData.github.d[20].v.d[5].v.d[14].v.d[11].v', line 1, position 6322.
   at Newtonsoft.Json.JsonTextReader.ParseProperty()
   at Newtonsoft.Json.JsonTextReader.ParseObject()
   at Newtonsoft.Json.Linq.JContainer.ReadContentFrom(JsonReader r, JsonLoadSettings settings)
   at Newtonsoft.Json.Linq.JContainer.ReadTokenFrom(JsonReader reader, JsonLoadSettings options)
   at Newtonsoft.Json.Linq.JObject.Load(JsonReader reader, JsonLoadSettings settings)
   at Newtonsoft.Json.Linq.JObject.Load(JsonReader reader)
   at GitHub.DistributedTask.Pipelines.ContextData.PipelineContextDataJsonConverter.ReadJson(JsonReader reader, Type objectType, Object existingValue, JsonSerializer serializer)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.DeserializeConvertable(JsonConverter converter, JsonReader reader, Type objectType, Object existingValue)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.PopulateDictionary(IDictionary dictionary, JsonReader reader, JsonDictionaryContract contract, JsonProperty containerProperty, String id)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateObject(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateValueInternal(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.SetPropertyValue(JsonProperty property, JsonConverter propertyConverter, JsonContainerContract containerContract, JsonProperty containerProperty, JsonReader reader, Object target)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.PopulateObject(Object newObject, JsonReader reader, JsonObjectContract contract, JsonProperty member, String id)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateObject(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateValueInternal(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)
   at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)
   at Newtonsoft.Json.JsonSerializer.Deserialize(JsonReader reader, Type objectType)
   at Newtonsoft.Json.JsonConvert.DeserializeObject(String value, Type type, JsonSerializerSettings settings)
   at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value, JsonSerializerSettings settings)
   at GitHub.Runner.Sdk.StringUtil.ConvertFromJson[T](String value)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
[WORKER 2024-03-12 15:59:09Z ERR  Worker] Newtonsoft.Json.JsonReaderException: Invalid character after parsing property name. Expected ':' but got:  . Path 'ContextData.github.d[20].v.d[5].v.d[14].v.d[11].v', line 1, position 6322.
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.JsonTextReader.ParseProperty()
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.JsonTextReader.ParseObject()
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Linq.JContainer.ReadContentFrom(JsonReader r, JsonLoadSettings settings)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Linq.JContainer.ReadTokenFrom(JsonReader reader, JsonLoadSettings options)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Linq.JObject.Load(JsonReader reader, JsonLoadSettings settings)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Linq.JObject.Load(JsonReader reader)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at GitHub.DistributedTask.Pipelines.ContextData.PipelineContextDataJsonConverter.ReadJson(JsonReader reader, Type objectType, Object existingValue, JsonSerializer serializer)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.DeserializeConvertable(JsonConverter converter, JsonReader reader, Type objectType, Object existingValue)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.PopulateDictionary(IDictionary dictionary, JsonReader reader, JsonDictionaryContract contract, JsonProperty containerProperty, String id)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateObject(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateValueInternal(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.SetPropertyValue(JsonProperty property, JsonConverter propertyConverter, JsonContainerContract containerContract, JsonProperty containerProperty, JsonReader reader, Object target)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.PopulateObject(Object newObject, JsonReader reader, JsonObjectContract contract, JsonProperty member, String id)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateObject(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateValueInternal(JsonReader reader, Type objectType, JsonContract contract, JsonProperty member, JsonContainerContract containerContract, JsonProperty containerMember, Object existingValue)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.JsonSerializer.Deserialize(JsonReader reader, Type objectType)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.JsonConvert.DeserializeObject(String value, Type type, JsonSerializerSettings settings)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value, JsonSerializerSettings settings)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at GitHub.Runner.Sdk.StringUtil.ConvertFromJson[T](String value)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
[WORKER 2024-03-12 15:59:09Z ERR  Worker]    at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)

##[Error]failed to execute worker exitcode: 1

/Edit so the root cause seems to be somewhere here: https://github.com/actions/runner/blob/v2.314.0/src/Runner.Worker/Program.cs#L20

In addition, I found that providing a runner config by mounting one and setting the CONFIG_FILE env var doesn't seem to work: you'll get an Error: unknown flag: --config if you try. Root cause seems to be this.

ChristopherHX · 2024-03-12T23:47:07Z

I haven't seen this kind of error before (at least not for a year).

Receiving message of length 6322, with hash '30564f1b4d3e28c3d9cc39d17eca1132cc026a2abeb6ab1be6736d80cf019ea9'
[WORKER 2024-03-12 15:59:08Z INFO Worker] Message received.
Newtonsoft.Json.JsonReaderException: Invalid character after parsing property name. Expected ':' but got: . Path 'ContextData.github.d[20].v.d[5].v.d[14].v.d[11].v', line 1, position 6322.

Sounds like the message inside the container got truncated before it reached the actions/runner.

Based on the error, the beginning was sent to the actions/runner successfully.

Maybe some data specific to your test setup causes this (even parts not in the repo are stored in the message).

I would need to add more debug logging to diagnose this.

omniproc · 2024-03-13T14:12:57Z

If you add the logging, I can reproduce the issue if you like. My guess is that it may be proxy related, but I can't tell from the error logs.

ChristopherHX · 2024-03-15T10:19:36Z

@omniproc you made changes in the deployment file that are not compatible with the actions/runner k8s container hooks, and I have no idea if using a deployment is possible.
Actions-Runner-Controller might use helm charts + the Kubernetes API, not sure how they do that.

Unable to attach or mount volumes: unmounted volumes=[work], unattached volumes=[], failed to process volumes=[work]: error processing PVC default/runner-785778b969-v88f8-work: failed to fetch PVC from API server: persistentvolumeclaims "runner-785778b969-v88f8-work" not found

The workspace cannot be an emptyDir volume. As in my example files, it is required to be a persistentvolumeclaim.

You can technically change the name of the pvc via the ACTIONS_RUNNER_CLAIM_NAME env, but I don't know how to get the dynamically generated name of a volume. See https://github.com/actions/runner-container-hooks/blob/main/packages/k8s/README.md. If the name doesn't match, it will error out.
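One way to sidestep the dynamically generated name could be a pre-created claim with a fixed name that the runner pod both mounts and announces via ACTIONS_RUNNER_CLAIM_NAME. The following is only a sketch under assumptions: the claim name runner-work and the 5Gi size are placeholders that do not come from the example files, it assumes a single runner replica and a cluster that can provision ReadWriteOnce volumes, and it has not been tested with this runner image or with a Deployment.

```yaml
# Hypothetical claim; name and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: runner-work
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
# Fragment of the runner pod spec (not a complete manifest):
# mount the claim as the work directory and pass its name to the k8s hooks.
volumes:
  - name: workspace
    persistentVolumeClaim:
      claimName: runner-work
containers:
  - name: runner
    volumeMounts:
      - mountPath: /home/runner/_work
        name: workspace
    env:
      - name: ACTIONS_RUNNER_CLAIM_NAME
        value: runner-work   # must match the claim name above
```

Judging by the error message above, the hooks otherwise look for a claim named after the runner pod with a -work suffix. With a ReadWriteOnce claim, the job pod would also have to run on the same node as the runner pod.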

allowPrivilegeEscalation: false can't currently be used because start.sh makes use of sudo to create the folder layout: sudo chown -R runner:docker /home/runner/_work and sudo chown -R runner:docker /data. I think a better approach would be to just create those folders within the mounted EmptyDir volume. The running user should already have all permissions there to create the folders so no sudo would be needed but I'm not sure what those folders are currently used for and how hardcoded those paths are.

This made mkdir /data fail, and you get an error about a .runner file.

It would require an emptyDir mount:

          - mountPath: /data
            name: data

Maybe if I create that dir in the Dockerfile it would work without that, as long as your fs is read-write.

The nightly doesn't have sudo in the start.sh file anymore, but that can still certainly break existing non-k8s setups as of now.
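For completeness, the mount snippet above also needs a matching entry under volumes in the pod spec. A minimal sketch, assuming the container is called runner as in the deployment posted earlier and that nothing under /data has to survive a pod restart:

```yaml
# Fragment of the runner pod spec (not a complete manifest).
volumes:
  - name: data          # same volume name as in the mount snippet
    emptyDir: {}
containers:
  - name: runner        # container name assumed from the earlier deployment
    volumeMounts:
      - mountPath: /data
        name: data
```

Since an emptyDir is deleted together with the pod, anything the runner writes to /data is lost when the pod is replaced.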

If you add the logging I can reproduce the issue if you like. My guess is that's it's maybe proxy related. But can't tell from the error logs.

I found a mistake in the python wrapper file: probably due to resource constraints on RAM, os.read read less than expected and shortened the message.

I also added some asserts on the return values of the pipe communication, and the env ACTIONS_RUNNER_WORKER_DEBUG prints the job message from the python side.

Please try that nightly image:
https://github.com/ChristopherHX/gitea-actions-runner/pkgs/container/gitea-actions-runner/190660665?tag=nightly
Important: switch to the os/arch tab and copy the full tag + sha variant, I had problems with old cached nightly images.

It should get you to the point where, because you omitted the persistentvolumeclaims of my example, kubernetes cannot start the job pod (also make sure to create an emptyDir mount at /data/).

omniproc added the type/proposal label (the new feature has not been accepted yet but needs to be discussed first) on Mar 3, 2024

sillyguodong added the topic/gitea-actions label (related to the actions of Gitea) on Mar 4, 2024
