Ephemeral runner gets stuck in Successful state #3527

Closed
4 tasks done
katarzynainit opened this issue May 16, 2024 · 3 comments · Fixed by #3528
Labels
bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers

Comments

@katarzynainit

Controller Version

0.9.0

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

1. ARC is installed on GKE, with the controller and the runner scale sets in the same namespace.
2. In most cases everything works as expected.
3. Occasionally, at random, we observe the following behavior:
- The ephemeral runner set is patched, e.g. from 0 desired replicas to 1.
- An ephemeral runner is created, but its status immediately changes to Succeeded and nothing else happens; the workload stays "stuck" waiting for a runner.

Describe the bug

The controller logs show the message "Found the runner with the same name". It looks like the controller reconciles the same EphemeralRunner twice at almost the same time, and the second run "removes" the runner and leaves it hung.

In the end the runner is never created, and the ephemeral runner moves to the Succeeded state and stays stuck there until the workflow is cancelled.

We started to observe this behavior when we moved to a faster cluster.
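
To make the suspected sequence concrete, here is a toy Python model of it. This is only a sketch of the hypothesis described above, not ARC's actual code: `FakeActionsService`, `reconcile`, and the `cached_is_registered` flag are hypothetical names invented for illustration, and the conflict handling is an assumption based on the "Found the runner with the same name" log line.

```python
from dataclasses import dataclass, field


@dataclass
class FakeActionsService:
    """Hypothetical stand-in for the backend that tracks registered runners."""
    registered: set = field(default_factory=set)

    def register(self, name: str) -> None:
        if name in self.registered:
            raise ValueError("runner with the same name already exists")
        self.registered.add(name)

    def remove(self, name: str) -> None:
        self.registered.discard(name)


def reconcile(service: FakeActionsService, runner: dict, cached_is_registered: bool) -> str:
    """One toy reconcile pass.

    cached_is_registered models what this pass *believes* about the runner,
    which may be stale if a previous pass finished only moments ago.
    """
    if cached_is_registered:
        return "noop"
    try:
        service.register(runner["name"])
        return "registered"
    except ValueError:
        # Assumed conflict path: the existing registration is removed and
        # the resource ends up in Succeeded without a runner ever starting.
        service.remove(runner["name"])
        runner["phase"] = "Succeeded"
        return "conflict"


if __name__ == "__main__":
    service = FakeActionsService()
    runner = {"name": "runner-abc", "phase": "Pending"}
    # First pass registers the runner.
    print(reconcile(service, runner, cached_is_registered=False))
    # Second pass runs with a stale view and still thinks nothing is registered.
    print(reconcile(service, runner, cached_is_registered=False))
    print(runner["phase"], service.registered)
```

In this model the second pass only causes harm because it acts on a stale view. If it saw the registration made by the first pass, it would return "noop". That would be consistent with the problem showing up only on a faster cluster, although this is a guess.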

Describe the expected behavior

The controller should always create a runner when an ephemeral runner is created.

Additional Context

N/A

Controller Logs

https://gist.github.com/katarzynainit/ceccccde10d5454aa104d0f5a98f9b0d

Runner Pod Logs

N/A
@katarzynainit katarzynainit added bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers labels May 16, 2024
Contributor

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.

kr-sabre added a commit to SabreOSS/actions-runner-controller that referenced this issue May 17, 2024
kr-sabre added a commit to SabreOSS/actions-runner-controller that referenced this issue May 23, 2024
@nikola-jokic
Member

Hey @katarzynainit,

Can you please share the controller values.yaml file so I can try to reproduce this issue?

@katarzynainit
Author

Hi, we are using a forked arc-controller. The code changes only skip creation of the controller and listener service accounts (SA) and RBAC, based on three flags.
The code that processes ephemeral runners is unchanged from 0.9.0.

https://gist.github.com/katarzynainit/d9e6ed4d3c6b95e929d73e2b1e8f7cc1 (flags for internal changes are marked in the values)

We started to observe this issue on a faster cluster. We did not see it before with the same configuration on a different, slower cluster.

It also happens only from time to time, so it might be difficult to observe.

kr-sabre added a commit to SabreOSS/actions-runner-controller that referenced this issue May 29, 2024
kr-sabre added a commit to SabreOSS/actions-runner-controller that referenced this issue Jun 3, 2024
kr-sabre added a commit to SabreOSS/actions-runner-controller that referenced this issue Jun 3, 2024