container job workflow pod fails to initialize - HttpError: HTTP request failed #3493
Hello! Thank you for filing an issue. The maintainers will triage your issue shortly. In the meantime, please take a look at the troubleshooting guide for bug reports. If this is a feature request, please review our contribution guidelines.
I was able to spin up a <runner_pod_name>-workflow pod by adding this to the values.yaml pod spec.
I got the solution from this comment. I don't understand why this fixed my issue, since the pod already had this service account definition in its spec on the cluster.
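The exact snippet isn't reproduced in the thread, but based on the surrounding discussion the addition was along these lines in the scale set's values.yaml; the service account name and image here are assumptions for illustration, not the reporter's actual values:

```yaml
# values.yaml (gha-runner-scale-set chart) -- hypothetical sketch
template:
  spec:
    # Explicitly naming the service account, even though the controller
    # normally sets one, was what unblocked the workflow pod here.
    serviceAccountName: <runner_service_account>
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]
```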
Hey @sofiegonzalez, can you please show the AutoscalingRunnerSet?
Hey @nikola-jokic, sorry for the late response. This is what the AutoscalingRunnerSet looked like previously, with the service account annotation: https://gist.github.com/sofiegonzalez/a9a8e447924294d060533ea472f6557e
No worries @sofiegonzalez! I'm glad that you resolved the problem, but I don't understand why having the serviceAccount field specified would cause this. Can you please try to install the new scale set without the serviceAccount field? A fresh install, not an upgrade. If it works, then I might know what the problem is. I cannot reproduce this issue, so I'm trying my best to understand it from the description.
What do you mean by "old service account" and "older kubernetes service"? I will try a fresh install without the serviceAccount field and update here, but I'm not going to do a fresh install of the gha-runner-scale-set-controller chart unless you think I need to.
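A fresh install of the runner scale set chart (not the controller chart) along the lines discussed might look like the following; the release name and namespace are assumptions, not taken from the thread:

```shell
# Remove the existing scale set release entirely (assumed release name/namespace)
helm uninstall arc-runner-set --namespace arc-runners

# Fresh install of the same chart, using a values.yaml that omits the
# serviceAccount field so the chart manages the service account itself
helm install arc-runner-set \
  --namespace arc-runners \
  -f values.yaml \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
```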
Hey @nikola-jokic, I just did a fresh install. Here is the values.yaml I used: https://gist.github.com/sofiegonzalez/bc12dd21217bdbba392c481b644527eb
This time the workflow pod was able to initialize and run my personal container. I really don't understand what changed; before this I had done both upgrades and fresh installs while trying to get the workflow pod to start up.
I think I have an idea what the problem was. When doing upgrades, removing additional resources can sometimes take a long time. This problem is fixed by this PR. When you did the upgrade, the resource was probably not completely removed, so after the upgrade the role associated with that service account was likely in a bad state, causing no tokens to be mounted on the pod and therefore leaving it without permissions. That is why I asked you to do a fresh install.
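One way to check for this kind of leftover RBAC state before reinstalling is to list the relevant objects and confirm the service account token is actually mounted; this is a sketch, and the namespace and pod name placeholders are assumptions:

```shell
# Look for stale roles, role bindings, and service accounts left behind
# by an interrupted upgrade (assumed namespace)
kubectl get role,rolebinding,serviceaccount --namespace arc-runners

# Confirm the service account token is actually mounted in the runner pod;
# a pod missing its token mount will fail Kubernetes API calls
kubectl describe pod <runner_pod_name> --namespace arc-runners | grep -A2 Mounts
```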
That makes sense, thanks for the clarification!
No worries! Let's close this issue now; we can re-open it if you find that something else is a problem, especially since it works with the fresh install and the PR I linked is already merged.
Checks
Controller Version
latest
Deployment Method
Helm
To Reproduce
Describe the bug
Hi, my main issue is that CI fails when I try to start a container job in containerMode: kubernetes, with the error Error: HttpError: HTTP request failed. This is blocking us from making progress.
I have followed the GitHub Actions scale sets video on YouTube and tried to recreate the same configuration. The main difference is that I am using a PVC I created through a manifest, which I apply with Terraform. I am also using a docker image we built from a public docker repo; it is pullable without authentication.
Right as the container job starts, the pod dies and fails to initialize. I can see the PVC was bound correctly. I am not sure what the Error: HttpError: HTTP request failed error means or what it refers to.
Describe the expected behavior
The container job should start up and create a <pod_name>-workflow pod to run the container.
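For context, a containerMode: kubernetes configuration like the one described usually pairs a work-volume claim in values.yaml with a pre-provisioned PVC or StorageClass. The storage class name and size below are assumptions for illustration, not the reporter's actual manifest:

```yaml
# values.yaml (gha-runner-scale-set chart) -- hypothetical sketch
containerMode:
  type: "kubernetes"
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    # Assumed storage class; must match what the cluster provides
    storageClassName: "dynamic-blob-storage"
    resources:
      requests:
        storage: 1Gi
```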
Additional Context
Controller Logs
Runner Pod Logs