setup-r / setup-pandoc stuck on macOS (hitting 6h timeout) #435
Comments
This is likely a network failure that never times out, probably from Homebrew. I don't think there is anything we can do; just cancel these jobs if you see one taking too long. |
Thanks @jimhester, closing this. |
I'm experiencing this quite often (on website daily builds). Shame there's no way to tag something to retry if there is a timeout. |
What about adding a manual timeout for that particular step in the GH action? |
These do not look like network errors to me. They seem to hang after calling |
I added some debugging to the |
@gaborcsardi, I have triggered a number of builds using setup-r/-pandoc from
Not sure how much more informative this is, but the installer extra logs might help. For reference, here is the log of an equivalent successful run: https://github.com/riccardoporreca/rmdgallery/runs/4434997287?check_suite_focus=true#step:4:89. I just noticed the failing job above does not log anything like "installer: Package name is R 4.2.0 for macOS". Hope this helps. I hit another failure in https://github.com/riccardoporreca/rmdgallery/runs/4434994121?check_suite_focus=true#step:4:89, which should be completely unrelated
|
Thanks, that is useful! It seems that the R installer can be stuck as well, so I should add the logging to that as well. My theory is that the issue is the concurrent installs interfering. (Pretty much a guess at this point.) More soon. Btw. the disk eject failure could also be a concurrency issue, but in general we should ignore disk ejection failures, the attached disks do not bother anyone. |
For the record, I added logging for the R installer as well in the |
@gaborcsardi, I have done another battery of runs (same setup as above), and here are the logs of three timeout failures
Hope having these few more examples, including the additional logging, helps. |
@riccardoporreca Thanks, very helpful indeed! One of the installer processes clearly hangs. I still cannot reproduce this locally, but nevertheless I made the macOS installations sequential now; hopefully this will help. |
Maybe this was causing the hangs, see #435.
My impression is that with I'll remove the detach from the gfortran volume, in case that causes it. |
I reran it a bunch of times, and it seems to work better, so fingers crossed: |
I would encourage everyone to switch to |
Hi. I got a timeout yesterday, it seems, with v2. First one I've seen with this version. |
Can you try using
It is a very long shot... |
Sure, but does GHA really care if there is a mix of action versions as long as they don't collide? |
For a second I was wondering whether GHA checks them out into the same directory. But apparently not, because the I would still use |
Anyway, we can add timeouts and retries if we still see errors with a fully |
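As an illustration of the "retries" half of that idea, here is a minimal Python sketch of a generic retry helper. It is not the action's code: `retry`, `flaky_install`, and all parameter names are hypothetical, and a real fix would wrap whatever installer call actually hangs.

```python
import time


def retry(operation, attempts=3, delay_s=0.0, retry_on=(Exception,)):
    """Call operation() up to `attempts` times and return the first successful result.

    Only exceptions listed in `retry_on` trigger another attempt; anything
    else propagates immediately.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay_s)  # brief pause before the next attempt
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Hypothetical stand-in for an installer that times out twice, then succeeds.
calls = {"count": 0}


def flaky_install():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("installer did not respond")
    return "installed"


result = retry(flaky_install, attempts=3, retry_on=(TimeoutError,))
```

Restricting `retry_on` to timeout-like errors keeps genuine failures (for example a missing download) from being silently retried.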
(I'm not sure if it's relevant, but I missed updating |
I have just seen it twice with all actions on In both cases the gfortran installation froze:
|
…` examples
* Closes #26.
* Latest setup got via `usethis::use_github_action("check-standard", save_as = "ci.yaml")`, using development `usethis` version (to support relying on the default branch, now `v2-branch`).
* `covr` installed via pak, so it is cached alongside the other dependencies. (Not done as part of `setup-r-dependencies` for all jobs, since it is only used in conditional steps.)
* Drop fixing macOS symlinks (fixed in `v2`), but keep the 5 minutes timeout for setup-r (#24), since r-lib/actions#435 isn't fully addressed.
Just a note that this is still an issue. Hopefully switching to rig will solve it. |
I haven't seen this for a while, maybe because of the macOS updates on GHA. Has anyone else seen this recently? |
I am going to close this now. Hopefully it is not coming back. |
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue and include a link to this issue |
Describe the bug
Over the last weeks, I have often seen setup-r (and setup-pandoc, just today) getting stuck and eventually hitting the default 6h timeout.
This has happened on macOS-latest, in different repos and for various R versions.
From the log it is not very clear where and why the execution got stuck (especially for setup-r), and the issue is transient in that a re-run usually goes through.
To Reproduce
Examples of failures, with relevant logs including timing for convenience and for the record:
Expected behavior
It would be great if setup-r and setup-pandoc explicitly captured the issue causing the action to get stuck, or at least enforced a meaningful timeout internally for individual operations, perhaps hinting at re-running the workflow in the timeout error message.
Using the action in a workflow, one can of course set a timeout for the action step, see e.g. riccardoporreca/rmdgallery@65b87fe, but I believe this is best handled in the action itself, so that users do not each have to work around it.
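To make the request concrete, here is a minimal Python sketch of a per-operation timeout that fails with a message hinting at a re-run. It only illustrates the idea and is not how the action is implemented: `run_with_timeout` and `OperationTimeout` are hypothetical names, and the sleeping child process is a stand-in for a hanging installer.

```python
import subprocess
import sys


class OperationTimeout(RuntimeError):
    """Raised when a single operation exceeds its time budget."""


def run_with_timeout(cmd, timeout_s):
    """Run cmd; if it exceeds timeout_s seconds, kill it and raise a descriptive error."""
    try:
        # subprocess.run kills the child and waits for it when the timeout expires.
        return subprocess.run(
            cmd, check=True, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired as exc:
        raise OperationTimeout(
            f"{cmd[0]} did not finish within {timeout_s}s; "
            "this is usually transient, try re-running the workflow"
        ) from exc


if __name__ == "__main__":
    # Stand-in for a hanging installer: sleeps far longer than the budget.
    try:
        run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1)
    except OperationTimeout as err:
        print(err)
```

With something like this, a stuck operation fails within its own budget instead of holding the runner until the job-level timeout.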
Additional context
Given how common the issue is (I see it every week in cron runs), it is not great to keep a runner busy for 6 hours if we can prevent it.