Max concurrent jobs with sccache #248
Comments
Update: setting number of concurrent jobs to
I'm not sure I would necessarily close this issue? The server shouldn't deadlock even if you have more processes than CPUs.
Can you reliably reproduce this? If so, getting a log out of the sccache server would help diagnose what's happening here. I agree that this should not fail this way. You can get logs by first ensuring that no sccache server is running (
Found at least one cause for this error that is pretty easy to reproduce.
It is possible that there are other problems that cause the same error, but that is at least a candidate.
Ah yes, that's #204
I have observed a deadlock as well, which gives this same error. Haven't been able to reproduce it with logs enabled yet though.
I believe the deadlock may be alexcrichton/tokio-process#42, so hopefully updating to tokio-process 0.2.5 should fix it. Included it in #304 since
This should be fixed since we merged #304. If it reoccurs please let me know!
ROCm changes seem to cause sccache failures, see e.g. #35734 and mozilla/sccache#248.
ROCm changes seem to cause sccache failures, see e.g. #35734 and mozilla/sccache#248. Differential Revision: [D20849176](https://our.internmc.facebook.com/intern/diff/D20849176)
ROCm changes seem to cause sccache failures, see e.g. #35734 and mozilla/sccache#248. Differential Revision: [D20848317](https://our.internmc.facebook.com/intern/diff/D20848317)
ROCm changes seem to cause sccache failures, see e.g. pytorch#35734 and mozilla/sccache#248. ghstack-source-id: faf0d04f95a9fbf7bd3a690fd4e660ff4ab66b88 Pull Request resolved: pytorch#35979
ROCm changes seem to cause sccache failures, see e.g. #35734 and mozilla/sccache#248. Differential Revision: [D20849254](https://our.internmc.facebook.com/intern/diff/D20849254)
When using yf225@336584a to compile CUDA source files with
make -j4
on a 4-core machine, sccache seems to give the following error:
while
make -j2
seems to work fine on the same 4-core machine. Interestingly, on a 16-core machine,
make -j8
works fine. This issue doesn’t happen when building C++ source files. Curious what might be causing this issue? Thanks!
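Both working configurations reported here (make -j2 on 4 cores, make -j8 on 16 cores) use half of the available cores. As a stopgap only, and assuming that ratio is what matters (nothing in this thread confirms it), a job count could be derived roughly like this. The helper names `conservative_jobs` and `make_command` are hypothetical and not part of sccache or make:

```python
import os


def conservative_jobs(cores=None):
    """Return a make -j value of half the core count, never less than 1.

    Assumption: the half-the-cores ratio is taken from the two working
    cases in this report (-j2 on 4 cores, -j8 on 16 cores).
    """
    if cores is None:
        # os.cpu_count() may return None on some platforms; fall back to 1.
        cores = os.cpu_count() or 1
    return max(1, cores // 2)


def make_command(cores=None):
    """Build the argument list for a make invocation with the capped job count."""
    return ["make", f"-j{conservative_jobs(cores)}"]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(make_command()))
```

This only sidesteps the failure; it does not address why the sccache server stops responding under higher parallelism.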