Update ngpus_per_node for a SCREAM/E3SM GPU run on Derecho #4687
I assume that you have tested that this does what you want for both cesm and e3sm on derecho - what about other systems? Should you test there as well? Any way you can think of to move this to XML instead of doing it in python?
I thought GPU_TYPE and GPU_OFFLOAD were part of the machine config, so if they are defined for derecho, they will be used. We really want fewer of these model-specific hooks in the python.
Thanks @jedwards4b and @rljacob for your comments. I see your concerns and I agree that these model- or machine-specific options are not appropriate for the python workflow. Hmm, if we remove
If this approach is feasible, I just need to make sure that the values set in the XML files can later be used in the CMake file to set the GPU flags, GPU node type, etc. accordingly. Is that possible?
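To make the XML-driven idea concrete, here is a minimal sketch of how a per-model GPU count could be read from a machine description instead of being hard-coded in the python workflow. Everything in it is an assumption for illustration: the element and attribute names (`gpu_settings`, `ngpus_per_node`, `model`), the machine name, the numeric values, and the helper `ngpus_per_node_for` are hypothetical and are not the actual CIME schema or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical machine description. The layout, names, and numbers are
# placeholders for illustration only, not the real CIME machine config.
MACHINE_XML = """
<machine name="example-machine">
  <gpu_settings>
    <ngpus_per_node model="cesm">4</ngpus_per_node>
    <ngpus_per_node model="e3sm">2</ngpus_per_node>
  </gpu_settings>
</machine>
"""


def ngpus_per_node_for(xml_text, model, default=0):
    """Return the GPU count listed for `model`, or `default` if none is listed."""
    root = ET.fromstring(xml_text)
    for entry in root.iter("ngpus_per_node"):
        if entry.get("model") == model:
            return int(entry.text)
    return default


if __name__ == "__main__":
    for model in ("cesm", "e3sm", "other"):
        print(model, ngpus_per_node_for(MACHINE_XML, model))
```

With a lookup like this, the python side stays model-agnostic: it only forwards whatever the XML declares, and the build system can consume the resulting value.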
Yes, I think that this is a good approach.
Thanks @jedwards4b. @rljacob, what do you think about this approach? I want to make sure that you are also comfortable with the new method before I make the changes. Thanks.
Yes, this is fine. But will you still need to make mods to the SCREAM CMake files?
That is correct. I need to add the CMake file for the Derecho machine in SCREAM anyway.
Will be addressed by a different approach in a separate PR.
In order to build and run E3SM/SCREAM on Derecho's A100 GPU, we need to set the ngpus_per_node correctly for E3SM/SCREAM. E3SM/SCREAM does not use the GPU_TYPE and GPU_OFFLOAD options.