Performance tuning for NVIDIA Grace-Hopper for the Gordon Bell runs #987
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##           master     #987   +/-   ##
=======================================
  Coverage   40.91%   40.91%
=======================================
  Files          70       70
  Lines       20270    20270
  Branches     2520     2520
=======================================
  Hits         8293     8293
  Misses      10439    10439
  Partials     1538     1538
It seems the maxregisters flag should be guarded by the device type (I'm sure some devices don't have that many registers).
@ntselepidis is this supposed to be merged or just for fun?
My understanding is that the change to
@ntselepidis what is the status of this? Seems like we need a CMake guard for specific architectures or something.
User description
This PR adds changes that reduce GPU register usage and improve performance.
Tested with NVHPC nightly on a single Santis node with 4 Grace-Hoppers on the 3D_IGR_TaylorGreenVortex_nvidia case.
PR Type
Enhancement
Description
Add GPU register limit optimization for NVIDIA Grace-Hopper
Include GPU memory management directive for Jacobian arrays
File Walkthrough
CMakeLists.txt: Set GPU register count limit
- File: CMakeLists.txt
- -gpu=maxregcount:165 compiler flag to limit GPU register usage
m_igr.fpp: Add GPU memory management directive
- File: src/simulation/m_igr.fpp
- $:GPU_DECLARE directive for Jacobian arrays (jac, jac_rhs, jac_old)
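The review comments ask for the register limit to be guarded by device type. A minimal sketch of such a guard is shown below; it is not the change made in this PR. The cache variable GPU_TARGET_CC and its value "cc90" (taken here to mean Hopper) are hypothetical names introduced for illustration, and the project's actual CMakeLists.txt may select the target architecture differently. Only the -gpu=maxregcount:165 flag itself comes from this PR.

```cmake
# Hypothetical cache variable naming the GPU target; not taken from this PR.
set(GPU_TARGET_CC "" CACHE STRING "GPU compute capability to tune for, e.g. cc90")

# Apply the register cap only for the NVHPC Fortran compiler and only when the
# (assumed) target is Hopper, so other devices keep the compiler default.
if (CMAKE_Fortran_COMPILER_ID STREQUAL "NVHPC" AND GPU_TARGET_CC STREQUAL "cc90")
    # Generator expression restricts the flag to Fortran sources.
    add_compile_options($<$<COMPILE_LANGUAGE:Fortran>:-gpu=maxregcount:165>)
endif()
```

With this sketch, configuring with -DGPU_TARGET_CC=cc90 would enable the cap, and any other value would leave register allocation to the compiler.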