Update on hzdr-hemera requires update to hemera profile #2860
Comments
Likely, also we have to pass the project with |
The email did not announce a special account/contingent for us. So I would not think so.
I've just run into the same problem, also on the GPU partition on hemera. I could not figure it out. It seems to be unrelated to the |
Btw setting |
As they turned off hyperthreading, we basically have to reduce the number of CPU "cores" we request per node and GPU by a factor of two.
This of course makes sense. However, (upon a rather brief look) I did not see where we are accounting for hyperthreading: the original gpu.tpl for Hemera has 6 cores per GPU, which is 24 per node with 4 GPUs, exactly the same as the number of physical cores.
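To make the arithmetic in the two comments above concrete, here is a minimal sketch. The helper name `cores_per_gpu` and the figure of 48 logical CPUs (two hardware threads per core) are assumptions for illustration only; the thread itself states just 6 cores per GPU, 4 GPUs per node and 24 physical cores.

```python
def cores_per_gpu(logical_cpus_per_node, gpus_per_node, threads_per_core=1):
    """Return the number of physical cores available per GPU on one node.

    Hypothetical helper, not part of any template or Slurm itself.
    """
    physical_cores = logical_cpus_per_node // threads_per_core
    return physical_cores // gpus_per_node


# Figures from the thread: 24 physical cores, 4 GPUs, hyperthreading off.
print(cores_per_gpu(24, 4, threads_per_core=1))

# Assumed hyperthreading-on case: 48 logical CPUs, 2 threads per core.
# The physical share per GPU stays the same; only the logical count doubles.
print(cores_per_gpu(48, 4, threads_per_core=2))
```

If the template already requests the physical share (6 per GPU), halving it again would leave cores unused, which seems to be the point of the objection above.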
Just halve the The exact details are how Slurm handles You can also verify whether it does the right placement; it could be that we now accidentally stay on just one package, etc. On Davide we also request with the combination |
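One possible way to check the placement from inside a running task is sketched below. It is Linux-only, relies on `os.sched_getaffinity` and the sysfs topology files, and the function names are hypothetical; it is not the tooling used in this project.

```python
import os
from collections import defaultdict


def cpus_by_package(allowed_cpus, package_of):
    """Group CPU ids by the package (socket) reported by package_of(cpu)."""
    groups = defaultdict(list)
    for cpu in sorted(allowed_cpus):
        groups[package_of(cpu)].append(cpu)
    return dict(groups)


def sysfs_package_of(cpu):
    """Read the package id of a CPU from Linux sysfs."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/physical_package_id"
    with open(path) as handle:
        return int(handle.read().strip())


def report_current_process():
    """Return the CPUs this process may run on, grouped by package."""
    allowed = os.sched_getaffinity(0)  # 0 means the calling process
    return cpus_by_package(allowed, sysfs_package_of)
```

Calling `report_current_process()` once per task (for example from a small script started with the job) would show whether all tasks ended up on a single package.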
@psychocoderHPC @steindev can this be closed? (since #2862 was able to be closed)
After the update on hemera I resubmitted a simulation which ran on hemera before the update. Now I get the error
My job cfg file looks like:
Is this related to the deactivation of hyperthreading and how we request cores/assign tasks?
Btw, the same happens for a 4-GPU job I am trying to run.