K8s mode without PV #160
Conversation
Is there any work happening on this one? @DenisPalnitsky @nikola-jokic
@anlesk I tested it on my case and it worked. Let me rebase this PR, give it another test, and mark it as ready for review if all is good.
I ran some tests and overall the approach works. However, when I try to run a fairly heavy workflow with 60 jobs and 6 containers in each job, I get the error below in roughly 50% of cases.
The memory consumption is ~1 GB for both successful and failed runners, and it's less than the pod limit. I could not figure out the root cause; I'm not a JS expert, so maybe I'm missing something obvious. I would appreciate a review and help from the community.
Would any GitHub Actions runner maintainers be able to share an estimated timeline for addressing this issue?
We are also interested in getting away from a shared volume between the runner and workflow pods. Would be great if this PR got reviewed and merged! @nikola-jokic any chance this can move forward?
Hey everyone, this is definitely one of the things we will prioritize soon. This approach is interesting, but we want to try a few more approaches before we land on the final solution. I'll keep you all updated.
@nikola-jokic Thanks for the update, and thanks for keeping us informed. Just wanted to mention that this is currently impacting our evaluation of GitHub Actions, so we're looking forward to seeing how the final direction shapes up.
@nikola-jokic Is there any update on this? Is this approach valid? I was trying to do something similar but wasn't sure whether the volume needs to be kept in sync or whether the initial copy during the prepare step is enough.
Hey @quoctruong, we will work on this soon, but there are a few things we need to worry about:
Using a sidecar would eliminate the copy, which would be great. But we need to test this approach and make sure we don't break anything. |
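For readers unfamiliar with the sidecar idea, here is a rough sketch of what such a layout might look like, using @kubernetes/client-node types. This is purely illustrative and not a committed design; the container names, images, and paths are placeholders. The point is that the runner tooling would ship in a sidecar sharing an emptyDir with the job container, so files never have to cross pod boundaries.

```typescript
import * as k8s from '@kubernetes/client-node'

// Illustrative only: a workflow pod where the hook's externals are provided by
// a sidecar through an in-pod emptyDir instead of a cluster-wide PV.
const workflowPod: k8s.V1Pod = {
  metadata: { name: 'workflow-pod' }, // placeholder name
  spec: {
    volumes: [{ name: 'externals', emptyDir: {} }],
    containers: [
      {
        // Sidecar that seeds the shared volume with the runner externals.
        name: 'externals-sidecar',
        image: 'ghcr.io/example/runner-externals:latest', // placeholder image
        command: ['sh', '-c', 'cp -r /externals/. /__e/ && sleep infinity'],
        volumeMounts: [{ name: 'externals', mountPath: '/__e' }]
      },
      {
        // The job container sees the same files at /__e.
        name: 'job',
        image: 'node:20',
        volumeMounts: [{ name: 'externals', mountPath: '/__e' }]
      }
    ]
  }
}
```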
Thanks for the update, @nikola-jokic. We are extremely interested in getting away from shared volumes. We've tried multiple RWX storage options and none perform well with large repos.
Thanks @nikola-jokic. Out of curiosity, how do you plan to solve the issue where the runner container is started before it even knows about the workflow container? Since it's not possible to add a container to a running pod, does this mean that even with a sidecar, all the containers have to be in the pod spec before the pod is started? So most likely this has to be done in the https://github.com/actions/runner code base and not in the hook?
The goal of this PR is to remove the dependency on a PersistentVolume (PV) from Kubernetes mode.
The approach
Currently, a PV is used to share files between the Runner and Workflow pods. Those files include scripts that should be executed on the runner and required software such as Node.
My assumption is that communication is one-way, from Runner to Workflow: the Runner pod "sends" files to the Workflow pod, and the Workflow pod does not need to share any files back with the Runner.
Based on that assumption, we can copy the required files directly to the Workflow pod without provisioning common storage. We can use the cpToPod function for direct file copying; see the sketch below.
This will address the issue described here.
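For illustration, here is a minimal sketch of what the direct copy could look like. It assumes the Cp helper from @kubernetes/client-node; the namespace, pod name, container name, and paths are placeholders, not the hook's actual values.

```typescript
import { Cp, KubeConfig } from '@kubernetes/client-node'

// Copy the runner's externals (node binaries, hook scripts) straight into the
// workflow pod instead of mounting them from a shared PersistentVolume.
async function copyExternalsToWorkflowPod(): Promise<void> {
  const kc = new KubeConfig()
  kc.loadFromDefault()
  const cp = new Cp(kc)

  // All of these values are placeholders; in the real hook they would come
  // from the prepare_job step and the runner's environment.
  await cp.cpToPod(
    'arc-runners',   // namespace
    'workflow-pod',  // workflow pod name
    'job',           // container name
    'externals',     // source directory, relative to the cwd below
    '/__e',          // target path inside the workflow pod
    '/home/runner'   // cwd on the runner pod
  )
}
```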
Feedback Request
This PR is a POC that was tested on workflows that run in a container, and it worked fine. I wonder if there are any flaws that would make this approach unviable.
Implementation Notes
I had to rework the cpToPod function to wait for the operation to finish.
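For reference, a rough sketch of that idea, assuming the Exec API from @kubernetes/client-node and the tar npm package. The helper name cpDirToPod and the status handling are illustrative, not the PR's exact implementation.

```typescript
import * as fs from 'fs'
import * as os from 'os'
import * as path from 'path'
import * as tar from 'tar'
import { Exec, KubeConfig, V1Status } from '@kubernetes/client-node'

// Hypothetical helper: copy a local directory into a running container and
// resolve only after the remote `tar x` process has reported its exit status.
export async function cpDirToPod(
  kc: KubeConfig,
  namespace: string,
  podName: string,
  containerName: string,
  srcDir: string,
  tgtDir: string
): Promise<void> {
  const exec = new Exec(kc)

  // Pack the source directory into a temporary tar archive.
  const tmpTar = path.join(os.tmpdir(), `cp-${Date.now()}.tar`)
  await tar.c({ file: tmpTar, cwd: srcDir }, ['.'])
  const readStream = fs.createReadStream(tmpTar)

  try {
    await new Promise<void>((resolve, reject) => {
      exec
        .exec(
          namespace,
          podName,
          containerName,
          ['tar', 'xf', '-', '-C', tgtDir],
          process.stdout,
          process.stderr,
          readStream, // streamed to the remote tar via stdin
          false,
          (status: V1Status) => {
            // The status callback fires once the remote command exits, which
            // is the signal that the copy actually finished.
            if (status.status === 'Success') {
              resolve()
            } else {
              reject(new Error(status.message ?? 'copy to pod failed'))
            }
          }
        )
        .catch(reject) // connection-level errors
    })
  } finally {
    fs.unlinkSync(tmpTar)
  }
}
```

A production version would also want to handle the websocket closing without a status frame, so the promise cannot hang.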