Self-hosted runners disappeared #756
Comments
@BrightRan Thank you for creating this ticket. I'd like to add that I'm using v2.273.5 of the runner on a plain Amazon Linux 2 EC2 instance. I didn't experience any issues yesterday, so perhaps it was an intermittent issue on GitHub's or Amazon's end. |
I have seen this issue as well, using the same version as @chingc. Today I received an error from my build actions for a project; it seems the runner vanished from the project and left me with no runners registered. On my server, the runner itself showed that it was still registered but was getting an unknown disconnect error from GitHub. All it would do was loop between restarting the runner service and reporting an unknown disconnect error. I have since wiped the old runner, re-downloaded it, and registered it to the server again, as I needed the build pipeline running, but I am not sure when the problem first occurred on my system. |
This still seems to be an issue. We had 5 runners; 4 of them were offline and 1 was idle. I disabled Actions to fix some syntax, left it alone for 2+ weeks, and came back to see that only the idle one remained. The other 4 look to have been deleted. Re-enabling Actions didn't bring them back either. I still have the directories for the other 4 runners, but trying to start them throws this error |
Hi @jonnikim, If the runner does not get any tasks for 30 days, it is cleaned up on the service side. That might be the reason why you needed to re-configure your runner. @brandan-schmitz, @chingc, does this help? |
I am experiencing a similar issue, when attempting to run the actions-runner (runc.cmd) on my machine I get the following error I have no idea where to get this token. Is there any way to reconfigure without being dependent on tokens that disappeared from the repo? |
Hi @mhl-itm-bhg, You can just remove a file named |
Hi everyone, Since this seems to be resolved, I am going to close this issue. If you experience this issue again, you can create a new issue or write a comment here, and we will re-open it 😄 |
How can we make it so that the runner doesn't get deleted? |
I just experienced this issue. Is there any update on how to prevent this? |
The docs now state:
@shishodiyas, @whutchinson98 you can't. One way you can automate this is to use the API to fetch the registration token and register your runner again from a shell script. |
I have a shell script for the same, but can you elaborate on the API use? |
Of course. These docs describe how to use the API to fetch a registration token, for example: https://docs.github.com/en/rest/actions/self-hosted-runners#create-a-registration-token-for-a-repository. You can create a small script that fetches the registration token; then, once you start configuring your runner, you may want to add flags like : |
@nikola-jokic From an automation point of view, this is some pretty anti-user design. Why would you auto-terminate an integration that has been down for 14 days? It's not costing GitHub anything that a runner one of us is hosting has gone idle. Some of us do projects as hobbies, we take breaks from them, we have lives. Is it really that much to ask that an automated build works again after a Raspberry Pi got accidentally unplugged for two weeks? I actually spend more time maintaining self-hosted runners than I build with them. |
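A minimal sketch of the re-registration flow described above, assuming Python and only the standard library. The endpoint path comes from the linked documentation; the helper names are hypothetical. The flags in the original comment were not preserved, so `--name`, `--unattended`, and `--replace` here are assumptions about what `config.sh` accepts and should be checked against `./config.sh --help` for your runner version.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def registration_token_request(owner, repo, api_token):
    """Build the POST request for the repository registration-token endpoint."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/actions/runners/registration-token"
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {api_token}",
        },
    )


def fetch_registration_token(owner, repo, api_token):
    """Call the API and return the short-lived registration token."""
    request = registration_token_request(owner, repo, api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token"]


def config_command(owner, repo, registration_token, name):
    """Build the config.sh argument list; run it from the runner directory."""
    return [
        "./config.sh",
        "--url", f"https://github.com/{owner}/{repo}",
        "--token", registration_token,
        "--name", name,   # assumed flag: runner name shown in the settings page
        "--unattended",   # assumed flag: skip interactive prompts
        "--replace",      # assumed flag: replace a runner with the same name
    ]
```

The command list can be passed to `subprocess.run(..., cwd=runner_dir, check=True)`; the API token needs permission to administer the repository's runners.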
Hi @shukriadams, For most enterprises, this is expected and wanted. We understand you’re not most enterprises. If you want to discuss it more:
|
A self-hosted runner is automatically removed from GitHub Enterprise Cloud if it has not connected to GitHub Actions for more than 14 days. An ephemeral self-hosted runner is automatically removed from GitHub Enterprise Cloud if it has not connected to GitHub Actions for more than 1 day. |
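The two windows quoted above can be turned into a simple deadline calculation. This is only a sketch, assuming you record the last successful connection time yourself; the function names are hypothetical, and several commenters in this thread report removals well before the documented window.

```python
from datetime import datetime, timedelta, timezone

# Windows quoted in the documentation excerpt above.
RETENTION = {
    False: timedelta(days=14),  # regular self-hosted runner
    True: timedelta(days=1),    # ephemeral self-hosted runner
}


def removal_deadline(last_connected, ephemeral=False):
    """Earliest moment the documented policy would remove the runner."""
    return last_connected + RETENTION[ephemeral]


def time_left(last_connected, now, ephemeral=False):
    """Remaining time before removal; zero if the deadline has passed."""
    remaining = removal_deadline(last_connected, ephemeral) - now
    return max(remaining, timedelta(0))


last_seen = datetime(2024, 2, 1, tzinfo=timezone.utc)
print(removal_deadline(last_seen))  # 2024-02-15 00:00:00+00:00
```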
😂😂😂😂😂😂😂😂
…On Thu, 1 Feb 2024 at 19:24, Shukri Adams ***@***.***> wrote:
I found a good workaround.
1. Delete the Github self-hosted agent from your local system.
2. Disable Github self-hosted runner integration from the settings
page on your project.
3. Install Jenkins on your own infrastructure.
4. Create a Jenkins job that builds your project.
Hope that helps.
|
It seems the original error was not fully addressed in this issue 😓
I am seeing this same thing on our self-hosted runners, and the only fixes I have found are related to disabling IPv6. Is there another solution for this, or at least a workaround? |
Just give us a setting to disable the automatic removal! It's completely ridiculous that I have to manually add a self hosted runner whenever I have to deploy an update (usually once per month). It takes me more time to go through the whole process of adding the runner again, than the time it takes to actually run the process. It didn't use to be this way. An automation tool that requires manual labour to use is not much of an automation tool. |
This just bit me too. Can we please have a setting for this, or at least a warning of some kind? This is not a good user experience. Why is the deletion not recorded in the audit logs? |
The best I could come up with was to write an automation that adds the runner again every 14 days. |
Adding my voice to the dissatisfaction here. There are absolutely no docs on how to reset a runner once github has unilaterally purged it. If github insists on this design paradigm for what are supposed to be persistent self-hosted runners, I would like to request
|
Adding support for this issue here as well! We need a setting; runners can't just be deleted because they are turned off. We don't pay for our EC2 runners to be on all the time if we are only using them once a month, and manually adding them back every time is ridiculous! |
@github fucked me over and deleted the aarch64-macos runner's configuration after it was down for a brief period of time[0] so I will have to set it up from scratch again. For now, we remove aarch64-macos so our CI at least passes once again. [0] actions/runner#756 Signed-off-by: Stephen Gutekanst <stephen@hexops.com>
Thought I'd add that we just had multiple self-hosted runners disappear from GitHub organization configuration. Nobody else has access to GH config or the runners and I know for sure that I didn't remove them. 2 of the missing runners were running jobs 3 days ago. Strangely.. one runner is still present. No clue why just this one. I submitted a ticket, hopefully they can pull from a backup. Access to some of these runners can be difficult, so just adding again would be a hassle. |
Very confused why this happened to my self-hosted runners. Ours are used multiple times a day yet I've had it happen twice now that they were removed for seemingly no reason. Our runners are setup as services and checking |
I'm not sure if you can get to this or comment on it, but my ticket: https://support.github.com/ticket/enterprise/122857/2997363 Seems like it's not just me. I have created scripts to automate installation of runners and I'm now keeping the configuration stored in version control (except secrets of course) to make it easy to reinstall. |
2024 and I face the same issue.
Can you share your automation configuration? |
I'm using a workflow I call 'doorstop'. I have to manually update it with new runners but that's so far not been an issue. Example:
|
It would be good if the GitHub owner could receive an alert via email about the runner approaching 14 days rather than deleting it totally.
|
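On the alerting idea: until something like that exists, a scheduled check of your own can at least notice when a runner has gone offline or disappeared. The sketch below assumes Python and the response shape of the list-runners endpoint (`GET /repos/{owner}/{repo}/actions/runners`), which returns a `runners` array with `name` and `status` fields. The runner names and the sample response are hypothetical.

```python
def runner_report(expected_names, api_response):
    """Compare the runners we expect against a list-runners API response."""
    status_by_name = {r["name"]: r["status"] for r in api_response.get("runners", [])}
    missing = sorted(n for n in expected_names if n not in status_by_name)
    offline = sorted(n for n in expected_names if status_by_name.get(n) == "offline")
    return {"missing": missing, "offline": offline}


# Hypothetical response in the shape of the list-runners endpoint.
sample_response = {
    "total_count": 2,
    "runners": [
        {"id": 1, "name": "build-linux", "status": "online", "busy": False},
        {"id": 2, "name": "build-windows", "status": "offline", "busy": False},
    ],
}

report = runner_report(["build-linux", "build-windows", "build-macos"], sample_response)
print(report)  # {'missing': ['build-macos'], 'offline': ['build-windows']}
```

Run from cron or a scheduled job on a machine you control, a non-empty `missing` or `offline` list is the cue to send yourself a notification or re-register the runner.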
Absolutely brilliant. Thanks man |
I am also encountering a problem where my self-hosted runner gets removed, long before a 2-week unused expiration. I have test scripts on WSL in a Windows 11 VM. I had another odd WSL error that caused my runner to crash. A few days later, when I noticed the action was not working and restarted the runner, I found that the runner object no longer existed on GitHub. As this issue is closed, is there an open one on this topic, or any known reliable workarounds? |
I just recently had a runner disappear after being offline for 6 days.... maybe if enough people chime in here it'll be reopened. |
@BrendenWalker would you happen to be on a VPN? I was asked that question & yes: I was. I'm trying again on a non-VPN segment to see if that helps. |
I realised the runner gets removed even before the two-week mark if the runner is not idle or active. Since your runner crashed, I'd say it was removed before 14 days because it wasn't active. A workaround that worked for me (as suggested by @BrendenWalker) is to set up a workflow that runs every two days or once a week, depending on your needs. The workflow should perform a very minimal task like echo or whoami. This gives GitHub the impression that the runner is still active. |
Sadly, my workaround didn't save me from the last one. It was gone after 6 days, so my workflow didn't have a chance to start up the VM and run an action. I've also taken to semi-automating installation on Windows via PowerShell. If this keeps happening I'll probably deploy Ansible or some other fully automated means. |
This latest failure is an Azure VM.. no VPN that I know of, however it does not have a public IP address and icmp traffic to the internet doesn't work so no ping. The config.cmd --check functionality reports failure when it can't ping some servers even though it's already verified HTTPS access to the same servers. AFAIK https access is all that runners require, which would explain why my action runners work fine.. as long as they're not booted out of GitHub configuration. |
@BrendenWalker and @alvieridev there's certainly a possibility that I have an unstable network, even without the VPN. With your experience with self-hosted runners, what do you think of this (admittedly hacky) idea:
|
A bit blunt, but sometimes that's necessary. I haven't had that particular issue (yet). In my case I'm running as a Windows service (so far; we have *nix runners in GitLab but haven't migrated those projects yet), and they can be set up to restart automatically. That is IF they stop cleanly and notify the Windows SCM that they stopped ;-) |
fwiw, on the list of "possible solutions, but won't work for me".... is this scheduled keep-alive task. TIL scheduled tasks only work on the main branch, which is undesired when contributing upstream via a fork. :/
Perhaps this might help someone that's ok with main branch workflow edits. |
I just had GH support give me this gem:
Had to refer them to the GitHub documentation which contradicts that:
|
I wonder whether the runner is registered with a particular IP address. If that address is never reused when reconnecting (as can happen with VMs), GitHub might remove the runner after the 14 days despite the runner being used within that window. |
Interesting theory. However, I would expect it to show offline whenever the IP address changed. That's not been the case so far in my experience. The last one was removed 6 days after running a job successfully. I think there is a bug in the 'cleanup' logic and it's simply not functioning like it should. They just need to open source all of GitHub and I'll fix the dang thing ;-) |
Well, that's false. I've had self-hosted runners go missing long before 14 days of inactivity. See screen snip, above; it last processed a job based on a commit action on 10/2, then when I tried to restart it on 10/8, the runner was gone from my GitHub account and I had to set up a new one.
Now that's an interesting hypothesis.
I'm not an enterprise customer. |
Hey everyone! This just in on my ticket:
that might.. just might confirm that we're not imagining things ;-) |
If this is true and it was an error that caused them to delete after 14 days of Idle, and they were meant to only delete after 14 days Offline, this is starting to approach a sane policy. Deleting after 14 days of inactivity (or 14 days of existence and 1 hour of inactivity) is baffling. |
We've had all our self-hosted runners deleted. For anyone encountering the same issue, it looks like it was a bug on Github's end that deleted runners it thought were dormant when in fact they were active but in an idle state. They couldn't restore them, so we had to reconfigure these runners from scratch. The reply we've got from support if it can help anyone:
|
Associated GitHub Community topic: https://github.community/t/disappearing-self-hosted-runners/137669
The customer has added some self-hosted runners for his repository, but the runners would completely disappear as if he never added any.
When he refreshes, the runners would come back. Some would be Offline but would go back to being Idle after another refresh. Other times when he refreshes the runners disappear again.
When the customer logs into the runner machines to check their status, he can see a lot of connection retries.