An error occurred: Runner not found #3857

Open
@theihor

Description

Describe the bug

Hello. I maintain the Linux Kernel BPF CI.

We use self-hosted runners on various hardware. Recently they started failing with the following vague error: "An error occurred: Runner not found". It affects all the architectures we use (x86_64, aarch64 and s390x). Each failure causes the runner systemd service to restart, and the frequent restarts quickly exhaust the GH App's tokens.

Sometimes some runners are able to pick up a job when the tokens reset, but most fail with this error.
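
To illustrate the restart problem: one way to keep a failing runner from burning through tokens is to space out restarts with capped exponential backoff. This is a minimal sketch, not what our services currently do; the function name `restart_delay` and the default values (5 s base, 600 s cap) are hypothetical and chosen only for illustration.

```python
import random


def restart_delay(attempt: int, base_s: float = 5.0, cap_s: float = 600.0,
                  jitter: float = 0.0) -> float:
    """Return the number of seconds to wait before restart number `attempt`.

    attempt -- 0 for the first restart, 1 for the second, and so on
    base_s  -- delay before the first restart (hypothetical default)
    cap_s   -- upper bound on the un-jittered delay (hypothetical default)
    jitter  -- fraction of the delay to add as random spread (0 disables it)
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so the power never grows beyond what a float can hold.
    exponent = min(attempt, 32)
    delay = min(cap_s, base_s * (2 ** exponent))
    # Optional random spread so many hosts do not retry at the same instant.
    return delay + random.uniform(0.0, jitter * delay)


if __name__ == "__main__":
    for n in range(8):
        print(n, restart_delay(n))
```

With the defaults, the delay doubles from 5 s on the first restart until it reaches the 600 s cap, so a runner that keeps failing requests a new token at most once every ten minutes.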

As far as I can tell, the "Runner not found" message indicates a RunnerNotFoundException caught at https://github.com/actions/runner/blob/main/src/Runner.Listener/Program.cs#L148-L153

The only place it is thrown from is https://github.com/actions/runner/blob/main/src/Sdk/WebApi/WebApi/BrokerHttpClient.cs#L118-L119, which means that the GitHub broker endpoint returns this error.

However, nothing indicates what the error means exactly. The runners are ephemeral with no reuse, and "replace" themselves on each run.

There were no recent changes to the self-hosted infra, the runner version, or the workflows.

From systemd logs on the hosts, these errors first appeared some time in April, but in the last few days they have become much more frequent, to the point of blocking the testing pipeline.
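
A rough sketch of how the frequency can be tallied from journal output in the default syslog-style format (the same format as the log excerpt in this report). The helper name `count_errors_by_day` is hypothetical, and the sketch assumes each line starts with a month and a day, as in the excerpt.

```python
from collections import Counter

# Exact message printed by the runner, as seen in the journal.
MARKER = "An error occurred: Runner not found"


def count_errors_by_day(lines):
    """Count lines containing MARKER, keyed by the leading 'Mon DD' stamp."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        parts = line.split()
        # A well-formed line starts with month and day, e.g. "May 13".
        if len(parts) >= 2:
            counts[" ".join(parts[:2])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "May 13 00:07:24 bpf-ci-runner-s390x-01 docker[5099]: 2025-05-13 00:07:24Z: Listening for Jobs",
        "May 13 00:07:25 bpf-ci-runner-s390x-01 docker[5099]: An error occurred: Runner not found",
    ]
    for day, n in count_errors_by_day(sample).items():
        print(day, n)
```

The input can be the saved output of journalctl -u for the runner unit on each host, read line by line.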

To Reproduce
N/A

Expected behavior

Runners should pick up and run jobs normally.

Runner Version and Platform

v2.323.0 on Linux x86_64, aarch64 and s390x

What's not working?

Runners fail to pick up jobs after successful authentication.

Job Log Output

May 13 00:07:18 bpf-ci-runner-s390x-01 systemd[1]: Started actions-runner-kernel-patches@01.service - Ephemeral GitHub Actions Runner Container for kernel-patches - 01.
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: Runner reusage is disabled
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: Obtaining the token of the runner
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: Ephemeral option is enabled
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: Disable auto update option is enabled
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: Configuring
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: --------------------------------------------------------------------------------
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |        ____ _ _   _   _       _          _        _   _                      |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |       / ___(_) |_| | | |_   _| |__      / \   ___| |_(_) ___  _ __  ___      |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |      | |  _| | __| |_| | | | | '_ \    / _ \ / __| __| |/ _ \| '_ \/ __|     |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |      | |_| | | |_|  _  | |_| | |_) |  / ___ \ (__| |_| | (_) | | | \__ \     |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |       \____|_|\__|_| |_|\__,_|_.__/  /_/   \_\___|\__|_|\___/|_| |_|___/     |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |                                                                              |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |                       Self-hosted runner registration                        |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: |                                                                              |
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: --------------------------------------------------------------------------------
May 13 00:07:19 bpf-ci-runner-s390x-01 docker[5099]: # Authentication
May 13 00:07:21 bpf-ci-runner-s390x-01 docker[5099]: √ Connected to GitHub
May 13 00:07:21 bpf-ci-runner-s390x-01 docker[5099]: # Runner Registration
May 13 00:07:22 bpf-ci-runner-s390x-01 docker[5099]: A runner exists with the same name
May 13 00:07:22 bpf-ci-runner-s390x-01 docker[5099]: √ Successfully replaced the runner
May 13 00:07:22 bpf-ci-runner-s390x-01 docker[5099]: √ Runner connection is good
May 13 00:07:22 bpf-ci-runner-s390x-01 docker[5099]: # Runner settings
May 13 00:07:22 bpf-ci-runner-s390x-01 docker[5099]: √ Settings Saved.
May 13 00:07:24 bpf-ci-runner-s390x-01 docker[5099]: √ Connected to GitHub
May 13 00:07:24 bpf-ci-runner-s390x-01 docker[5099]: Current runner version: '2.323.0'
May 13 00:07:24 bpf-ci-runner-s390x-01 docker[5099]: 2025-05-13 00:07:24Z: Listening for Jobs
May 13 00:07:25 bpf-ci-runner-s390x-01 docker[5099]: An error occurred: Runner not found
May 13 00:07:25 bpf-ci-runner-s390x-01 systemd[1]: actions-runner-kernel-patches@01.service: Main process exited, code=exited, status=1/FAILURE
May 13 00:07:25 bpf-ci-runner-s390x-01 systemd[1]: actions-runner-kernel-patches@01.service: Failed with result 'exit-code'.

Runner and Worker's Diagnostic Logs

Didn't have a chance to capture these. Will add a comment when I do.
