This issue was moved to a discussion.
GPU detection hangs when starting BOINC #5183
Comments
If you run But the client is supposed to keep going even if the GPU detection subprocess fails. |
Try |
Just so I'm not exiting prematurely, how long should this take, maximum? |
“Not that long”, probably… (I don’t know the real answer) But if the detection hangs, the client will also hang because it waits forever for the detection to complete. A timeout here would clearly be a good idea. |
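For illustration, a timeout around a child process could look roughly like the sketch below. It assumes a POSIX system; run_with_timeout and the RUN_* return codes are hypothetical names, not BOINC code, and the elapsed time is only approximated by counting 10 ms sleeps.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define RUN_TIMED_OUT (-2)  /* hypothetical return codes for this sketch */
#define RUN_FAILED    (-1)

/* Hypothetical helper (not BOINC code): run argv[0] as a child process and
 * give up after roughly timeout_ms milliseconds instead of waiting forever. */
static int run_with_timeout(char *const argv[], int timeout_ms) {
    pid_t pid = fork();
    if (pid < 0) return RUN_FAILED;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                       /* exec failed in the child */
    }
    const struct timespec step = {0, 10 * 1000 * 1000};  /* 10 ms */
    for (int waited_ms = 0; ; waited_ms += 10) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);  /* poll, never block */
        if (r == pid) {
            return WIFEXITED(status) ? WEXITSTATUS(status) : RUN_FAILED;
        }
        if (r < 0 && errno != EINTR) return RUN_FAILED;
        if (waited_ms >= timeout_ms) break;
        nanosleep(&step, NULL);
    }
    kill(pid, SIGKILL);                   /* child overran its budget */
    waitpid(pid, NULL, 0);                /* reap it to avoid a zombie */
    return RUN_TIMED_OUT;
}
```

A caller could treat RUN_TIMED_OUT as "no usable GPUs found" and carry on. One caveat: a child stuck in an uninterruptible driver call may not die on SIGKILL right away, in which case the final waitpid would itself block.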
OK, it's just hanging then. Both running as boinc user and as root user. I never get that coproc_info.xml file that @davidpanderson was looking for. I have added the boinc user to the video group, but maybe it's a permissions issue separate from that? Not sure. |
Not sure what else to suggest. You could try the forum; maybe somebody there has some ideas. During detection, warning messages are stored for reporting back to the client on completion. It would be useful if those also got written to the detector process’s Also, in case anybody’s wondering: |
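As a rough sketch of that idea: each warning is echoed to a stream immediately and also kept for the later report, so something is visible even if detection never finishes. struct warning_log and add_warning are hypothetical names for illustration; BOINC's actual data structures differ.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WARNINGS    16
#define MAX_WARNING_LEN 256

/* Hypothetical container for stored warnings (not BOINC's real type). */
struct warning_log {
    char messages[MAX_WARNINGS][MAX_WARNING_LEN];
    int count;
};

/* Write the warning to err right away, then store it for later reporting.
 * Returns 0 if stored, -1 if the log is full (the message is still echoed). */
static int add_warning(struct warning_log *log, FILE *err, const char *msg) {
    fprintf(err, "gpu_detect warning: %s\n", msg);
    fflush(err);  /* flush now so the line survives a later hang */
    if (log->count >= MAX_WARNINGS) return -1;
    snprintf(log->messages[log->count], MAX_WARNING_LEN, "%s", msg);
    log->count++;
    return 0;
}
```

In a detector process, err would simply be stderr. The explicit flush matters when stderr is redirected to a file, because the stream may then be buffered.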
Do you have these packages installed?
- boinc-client-nvidia-cuda
- boinc-client-opencl
- mesa-opencl-icd
- ocl-icd-libopencl1
?
--
Best regards,
Vitalii Koshura
|
boinc --detect_gpus shouldn't hang even if expected libraries are missing. I'll implement Brian's idea of writing warnings to stderr; that may shed some light. |
And/or some tracepoints to track progress through the code. It’s not as though there’s any need to keep the noise down in that file… |
I added the stderr writes in a 'dpa_gpu_detect' branch. |
BTW it should take << 1 sec to complete |
@AenBleidd So I'm on Arch, which is probably not officially supported, but I have the equivalents of those libraries installed according to the Arch Wiki. @davidpanderson I can try to clone that branch, build, and report back |
OK, so with that enabled (plus a few "test" fprintfs from me) here's what I get:
So I guess the error is happening sometime after the
Update: it's something in the opencl detection. I'll keep digging.
Update: It's a problem with this line of code:
ciErrNum = (*p_clGetPlatformIDs)(MAX_OPENCL_PLATFORMS, platforms, &num_platforms);
Does this mean I need to get in touch with the OpenCL people? ChatGPT tells me this is calling the |
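The way a hang like this gets pinned to one call can be sketched as follows: flush a trace line immediately before and after calling through the function pointer, so a missing "after" line identifies the culprit. The sketch_* typedefs are stand-ins that mirror the OpenCL signature of clGetPlatformIDs so the example builds without CL headers. count_platforms_traced and fake_two_platforms are hypothetical and are not BOINC code.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for cl_int, cl_uint and cl_platform_id (assumed for this sketch). */
typedef int32_t  sketch_cl_int;
typedef uint32_t sketch_cl_uint;
typedef void    *sketch_cl_platform_id;
typedef sketch_cl_int (*get_platform_ids_fn)(sketch_cl_uint,
                                             sketch_cl_platform_id *,
                                             sketch_cl_uint *);

#define SKETCH_MAX_PLATFORMS 8

/* Call fn with trace lines on either side. Returns the number of platforms
 * written to the local array, or -1 if fn reports an error (nonzero). */
static int count_platforms_traced(get_platform_ids_fn fn, FILE *trace) {
    sketch_cl_platform_id platforms[SKETCH_MAX_PLATFORMS];
    sketch_cl_uint num_platforms = 0;

    fprintf(trace, "before clGetPlatformIDs\n");
    fflush(trace);  /* must reach the file before a possible hang */
    sketch_cl_int err = (*fn)(SKETCH_MAX_PLATFORMS, platforms, &num_platforms);
    fprintf(trace, "after clGetPlatformIDs: err=%d num_platforms=%u\n",
            (int)err, (unsigned)num_platforms);
    fflush(trace);

    if (err != 0) return -1;
    /* The call may report more platforms than the array could hold. */
    if (num_platforms > SKETCH_MAX_PLATFORMS) num_platforms = SKETCH_MAX_PLATFORMS;
    return (int)num_platforms;
}

/* Fake entry point used only to exercise the sketch without an OpenCL loader. */
static sketch_cl_int fake_two_platforms(sketch_cl_uint n,
                                        sketch_cl_platform_id *p,
                                        sketch_cl_uint *out) {
    (void)n; (void)p;
    *out = 2;
    return 0;
}
```

In the real detector the pointer would come from the dynamically loaded OpenCL library. If the call blocks inside a vendor driver, only the "before" line appears in the trace.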
OK well by following the instructions in this thread (uninstalling some intel stuff) I was able to get it to work. I guess the problem is with OneAPI. Sorry about all this, but thank you for the help! |
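For anyone hitting something similar, one way to see which OpenCL vendor drivers are registered is to list the .icd files the loader reads. The sketch below assumes the conventional Linux location /etc/OpenCL/vendors, which may differ on some setups; list_icd_files and has_icd_suffix are hypothetical helper names.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if name ends in ".icd" and has a non-empty stem, else 0. */
static int has_icd_suffix(const char *name) {
    size_t n = strlen(name);
    return n > 4 && strcmp(name + n - 4, ".icd") == 0;
}

/* Print each *.icd file found in dir_path to out.
 * Returns the number found, or -1 if the directory cannot be opened. */
static int list_icd_files(const char *dir_path, FILE *out) {
    DIR *dir = opendir(dir_path);
    if (dir == NULL) return -1;
    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (has_icd_suffix(entry->d_name)) {
            fprintf(out, "%s/%s\n", dir_path, entry->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

Calling list_icd_files("/etc/OpenCL/vendors", stdout) would show one line per registered vendor file. Temporarily moving a suspect file aside and rerunning boinc --detect_gpus is a less drastic test than uninstalling the package.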
@etdr, thank you for testing this and sharing the solution. |
Describe the bug
I am inferring the GPU detection is the problem because of how the logs are split relative to when I shut the service down. (See screenshots.)
I might be missing something really obvious, or it might have to do with the systemd unit, but wanted to make sure no one knew why this was happening here.
Steps To Reproduce
- boinc package from Arch repository
- boinc-client.service unit
Expected behavior
Even if GPU detection fails, I expect it to come back with a yes or a no, not to hang.
Screenshots
System Information
Additional context
I have two GPUs in this machine, one AMD and one NVIDIA, both pretty recent. I have the most up-to-date libraries for opencl installed for both of them.
Nothing shows up in the stderrgpudetect file.