HYPRE+HIP Runtime Error #2910
Comments
@wcdawn, just to clarify -- you are using |
@tzanio correct. MFEM v4.4. I just did a pull today. |
looks suspicious? |
@jandrej I get the same message when passing
|
Hello @wcdawn, Are you able to get a backtrace by running with Also, we have tested with hypre version 2.23, do you encounter the same crashes with that version as well? |
@pazner I rebuilt with HYPRE v2.23.0 and get the same error. I also rebuilt MFEM with
|
It looks like this stack trace and crash happen on a separate thread, not on the main thread. Can you try switching to the main thread and getting the stack trace there? I suspect that this new thread that crashes is created during some call to the HIP/ROCm runtime but it is best to confirm that and find out what call exactly causes this. |
@v-dobrev Here is the output of
The backtrace from thread 1 looks interesting.
I'm starting to suspect that it could be a problem with the HIP/ROCm runtime as well. |
I just noticed that you did not set the GPU arch in your hypre config command -- try adding |
Thanks for catching that. It doesn't seem to have changed anything and the backtrace looks the same.
|
Another suggestion/question: were you able to run older MFEM versions on this machine, e.g. right after #2750 was merged? Also, just to confirm, if you build MFEM with HIP and HYPRE without HIP, does this work? |
I did Building MFEM with HIP and HYPRE without HIP does work. Additionally, building both without HIP works. Is there a HYPRE example to use to test this? It seems like it could be something in HYPRE itself or maybe something in the MFEM/HYPRE interface. |
Hi @wcdawn, were you able to figure out what the problem is? @liruipeng, we suspect that the above issue (see the backtrace here: #2910 (comment)) may be in hypre. What would be a good way for @wcdawn to test this in hypre itself without mfem? |
@v-dobrev Unfortunately not. I think it could be something with HYPRE. I'm not sure if it has been tested with this particular GPU. |
cc: @noelchalmers |
and @pbauman |
Hi folks. There is certainly an issue with HYPRE at present that I will try to address when I can. The issue is that the Navi gaming cards (gfx1030 indicates an RDNA2 card, so something like a 6900XT) run with a warp/wavefront size of 32. Currently, HYPRE on AMD GPUs is set up for a warp/wavefront size of 64. I'll post a note here when we update HYPRE to support wavefront size 32 on AMD GPUs. |
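To illustrate why a hard-coded wavefront size matters, here is a minimal host-side C++ sketch that models a shuffle-down sum reduction inside one wavefront. It is not HYPRE's code and does not use the HIP runtime. The function name `wavefront_reduce_sum` and its parameters are hypothetical, and the out-of-range rule (a lane reading past the end of the wavefront gets its own value back) is an assumption modeled on the usual shuffle-down convention. The crash reported in this thread may well come from a different mechanism; the sketch only shows how code written for 64 lanes can silently misbehave on 32-lane hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of one wavefront doing a shuffle-down sum reduction.
// lanes:         one value per hardware lane; lanes.size() is the real
//                wavefront width (assumed 32 on an RDNA2 card).
// assumed_width: the lane count the code was written for (a power of two).
// Returns the value lane 0 holds after the reduction.
int wavefront_reduce_sum(std::vector<int> lanes, std::size_t assumed_width)
{
    const std::size_t hw_width = lanes.size();
    assert(hw_width > 0 && assumed_width > 0);

    for (std::size_t offset = assumed_width / 2; offset > 0; offset /= 2)
    {
        // All lanes read the previous step's values simultaneously.
        const std::vector<int> before = lanes;
        for (std::size_t i = 0; i < hw_width; ++i)
        {
            // Assumed shuffle-down rule: an out-of-range source lane
            // yields the calling lane's own value.
            const std::size_t src = i + offset;
            lanes[i] = before[i] + (src < hw_width ? before[src] : before[i]);
        }
    }
    return lanes[0];
}
```

With 32 lanes each holding 1, calling `wavefront_reduce_sum(ones, 32)` gives the correct sum of 32, while `wavefront_reduce_sum(ones, 64)` gives 64: the extra first step with offset 32 finds no partner lane, so every lane adds its own value to itself before the real reduction starts.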
@pbauman, thank you for looking into this issue. |
I'm trying to compile with HIP & HYPRE. Compiling with just HIP works fine, but I'd like to use HypreBoomerAMG. I get the following runtime error when running ex1p. Any help would be much appreciated.
I'm using the master branch of MFEM & HYPRE v2.24.0.
HYPRE config
MFEM config