Self update failed due to /externals mount point #252

Closed
e36-cicd opened this issue Jan 11, 2021 · 2 comments · Fixed by #253

@e36-cicd

Hi,

We ran into this issue: whenever the runner tries to handle a job, it detects that it needs to auto-update and then crashes at the end. After adding a sleep 3600 at the end, we were able to catch it.

Crash logs:

actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:25:24.297624393Z 2021-01-11 17:25:24Z: Listening for Jobs
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:21.656887962Z Runner update in progress, do not shutdown runner.
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:21.761093347Z Downloading 2.275.1 runner
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:28.167734819Z Waiting for current job finish running.
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:28.253782165Z Generate and execute update script.
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:28.337678733Z Runner will exit shortly for update, should back online within 10 seconds.
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:31.420450852Z renamed '/runner/_diag/SelfUpdate-20210111-172728.log' -> '/runner/_diag/SelfUpdate-20210111-172728.log.failed'
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:31.421289389Z Runner listener exited with error code 3
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:31.421339241Z Runner listener exit because of updating, re-launch runner in 5 seconds.
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.427363934Z Starting Runner listener with startup type: service
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.428878354Z Started listener process
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438268920Z events.js:187
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438294986Z       throw er; // Unhandled 'error' event
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438299706Z       ^
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438303203Z
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438306351Z Error: spawn /runner/bin/Runner.Listener ENOENT
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438310107Z     at Process.ChildProcess._handle.onexit (internal/child_process.js:264:19)
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438313798Z     at onErrorNT (internal/child_process.js:456:16)
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438317296Z     at processTicksAndRejections (internal/process/task_queues.js:80:21)
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438320791Z Emitted 'error' event on ChildProcess instance at:
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438324025Z     at Process.ChildProcess._handle.onexit (internal/child_process.js:270:12)
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438326635Z     at onErrorNT (internal/child_process.js:456:16)
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438329041Z     at processTicksAndRejections (internal/process/task_queues.js:80:21) {
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438341082Z   errno: 'ENOENT',
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438344220Z   code: 'ENOENT',
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438346528Z   syscall: 'spawn /runner/bin/Runner.Listener',
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438348864Z   path: '/runner/bin/Runner.Listener',
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438351150Z   spawnargs: [ 'run', '--startuptype', 'service' ]
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438353627Z }
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438355820Z + trap - TERM INT
actions-runner-worker-fsjpf-kv9j4 runner 2021-01-11T17:27:36.438358144Z + wait 75

Self update failed log:

removed '/runner/bin/runsvc.sh.bak'
removed '/runner/bin/RunnerService.js.bak'
removed directory '/runner/bin'
[2021-01-11 17:32:47-5904] move /runner/externals /runner/externals.2.274.2
mv: cannot move '/runner/externals' to '/runner/externals.2.274.2': Device or resource busy
[2021-01-11 17:32:47-5923] Can't move /runner/externals to /runner/externals.2.274.2

So it seems the /runner/externals mount caused the update to fail: it's an emptyDir mount, so it can't be moved. Any ideas on how to fix this?

Thanks!
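
For anyone debugging a similar failure, here is a minimal sketch (not part of the runner or of this project) of how one might check whether a path is a mount point before trying to move it. The helper name is_mount_point and its optional mounts-file argument are hypothetical; by default it reads the Linux mount table at /proc/self/mounts. Renaming a directory that is in use as a mount point fails with EBUSY, which is consistent with the "Device or resource busy" message in the log above.

```shell
#!/bin/sh
# is_mount_point PATH [MOUNTS_FILE]   (hypothetical helper, for illustration only)
# Returns 0 if PATH appears as a mount point in the mount table, 1 otherwise.
# MOUNTS_FILE defaults to /proc/self/mounts (Linux only).
is_mount_point() {
  target="$1"
  mounts_file="${2:-/proc/self/mounts}"
  # The second whitespace-separated field of each line is the mount point.
  awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}
```

Inside the runner container, `is_mount_point /runner/externals && echo "mounted, mv will fail"` would then flag the directory the self-update script could not move.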

@mumoshu
Collaborator

mumoshu commented Jan 12, 2021

@e36-cicd Thanks for reporting! This hurts. Probably the update process has been changed since we added /runner/externals to your containers for compatibility with GH's managed runners?

The only way I could come up with is to make /runner an emptyDir volume shared by the runner container and the dind container. But that would definitely require our docker image to be updated as well: we can't bake scripts like entrypoint.sh and config.sh directly under /runner, because the emptyDir would hide them.

Here's the part of our Dockerfile that instructs docker to bake those under /runner:

https://github.com/summerwind/actions-runner-controller/blob/524cf1b3793aeb8ea6280b2c1cd0a9cd55090173/runner/Dockerfile#L65-L68

Probably we need to update it to extract the files under e.g. /runnertmp during docker build, and modify our entrypoint.sh to move every file to /runner on startup, similar to what we already do for /externals:

https://github.com/summerwind/actions-runner-controller/blob/524cf1b3793aeb8ea6280b2c1cd0a9cd55090173/runner/entrypoint.sh#L47-L51
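
The startup step proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that was eventually merged: the function name populate_runner_dir is hypothetical, and it copies rather than moves so that the staged files in the image stay intact across container restarts.

```shell
#!/bin/sh
# populate_runner_dir STAGING_DIR RUNNER_DIR   (hypothetical helper)
# Copies everything staged at image build time (e.g. under /runnertmp)
# into the runner directory (e.g. an emptyDir mounted at /runner).
populate_runner_dir() {
  staging_dir="$1"
  runner_dir="$2"
  mkdir -p "$runner_dir" || return 1
  # "DIR/." copies the contents of DIR, dotfiles included,
  # instead of creating DIR itself inside the destination.
  cp -a "$staging_dir/." "$runner_dir/"
}
```

An entrypoint script could call `populate_runner_dir /runnertmp /runner` before starting the listener, so that /runner/bin and the scripts exist even though the emptyDir starts out empty.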

@mumoshu
Collaborator

mumoshu commented Jan 24, 2021

Linking with #264 and #268

mumoshu added a commit that referenced this issue Jan 24, 2021
* Fix self-update failures due to /runner/externals mount

Fixes #252

* Tested Self-update Fixes (#269)

Adding fixes to #253 as confirmed and tested in #264 (comment) by @jolestar, @achedeuzot and @hfuss 🙇 🍻

Co-authored-by: Hayden Fuss <wifu1234@gmail.com>