fix out of host memory during HDF5 dumping #2690
Conversation
     */
    if(rc != cuplaErrorMemoryAllocation)
        CUDA_CHECK(rc)
    /* cupla is not supporting the function cudaHostAlloc to create mapped memory.
Can you please open a cupla & alpaka issue and link it here in the comment? Even if it might currently be a wontfix, it needs to be documented as an issue.
I can open a feature request in alpaka.
This bug is not important for cupla because I, as the user, ignored the error cupla provided.
I opened the alpaka feature request: https://github.com/ComputationalRadiationPhysics/alpaka/issues/612
thx, please link it in both comments in-code :)
Please link https://github.com/ComputationalRadiationPhysics/alpaka/issues/296 as the original alpaka issue
typo: cupla 0.1.0 does not support ...
@@ -69,7 +73,21 @@ class MappedBufferIntern : public DeviceBuffer<TYPE, DIM>

        if (pointer && ownPointer)
        {
            CUDA_CHECK(cudaFreeHost(pointer));
#if( PMACC_CUDA_ENABLED == 1 )
            /* cupla is not supporting the function cudaHostAlloc to create mapped memory.
link same issues here, please
I tested your bug fix on the two simulations mentioned above. They no longer crash. Great work. 👍
@psychocoderHPC if this is also fixing #2504, then this also affects the CUDA backend. Can you confirm?
Maybe I mentioned it in the PR description. I started a test yesterday to see if it solves the issue, but missed checking the result today :-) I will check it later this evening or next week.
[update] The job has now been running for 25h and passed over 120k steps. It normally always died after 35k steps. I will wait until next week because the full job runs for 60h.
#2504 is solved by this PR.
Ok, so I tag this as a CUDA bug now as well.
I'll merge this now, but I think the alpaka issue should have been linked as annotated. If you'd like to add this, please do so in a follow-up PR.
Can't merge now.
Please add:
- link to alpaka issue on mapped mem (2x)
- `#ifdef` before `#undef` guard

Please amend the changes, so we have one commit in the end.
             * @todo this is a workaround plese fix me. We need to investigate if
             * it is possible to have mapped/unified memory in alpaka.
             */
#   undef cudaFreeHost
I would guard this with an #ifdef
I will skip this because it is already guarded by PMACC_CUDA_ENABLED. cupla always defines cudaFreeHost; if not, then something is broken anyway. Also, if I guard this code part, I also need to guard the re-introduction of `#define cudaFreeHost(...) cuplaFreeHost(__VA_ARGS__)` for the case where it was not defined before.
All this would only become unreadable without increasing the code quality.
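For context, this is roughly what the unguarded workaround discussed here looks like; a minimal sketch assembled from the snippets quoted in this thread, not the exact PMacc code:

```cpp
#if( PMACC_CUDA_ENABLED == 1 )
    /* The mapped memory was allocated with the native CUDA runtime
     * (cudaHostAlloc), so the native cudaFreeHost has to be called.
     * cupla renames cudaFreeHost to cuplaFreeHost via a macro, therefore
     * the renaming is removed, the native call is made, and the renaming
     * is restored afterwards.
     */
#   undef cudaFreeHost
    CUDA_CHECK(cudaFreeHost(pointer));
#   define cudaFreeHost(...) cuplaFreeHost(__VA_ARGS__)
#endif
```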
The reason why one would guard a `#define` or `#undef` is to make it more flexible for other hacks/workarounds that might (un)set it. You can keep it as is under the mentioned assumption, but guarding would make it more flexible.
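For comparison, a sketch of what the suggested guarded variant could look like; `PMACC_HAD_CUPLA_FREEHOST` is a hypothetical helper flag, not part of PMacc or cupla:

```cpp
#ifdef cudaFreeHost
    /* the cupla renaming is in place: remember that and remove it */
#   define PMACC_HAD_CUPLA_FREEHOST 1
#   undef cudaFreeHost
#endif
    CUDA_CHECK(cudaFreeHost(pointer));
#ifdef PMACC_HAD_CUPLA_FREEHOST
    /* re-introduce the renaming only if it was removed above */
#   define cudaFreeHost(...) cuplaFreeHost(__VA_ARGS__)
#   undef PMACC_HAD_CUPLA_FREEHOST
#endif
```

This illustrates the trade-off discussed above: the guard adds flexibility for other workarounds that might (un)define the macro, at the cost of the extra bookkeeping the author wanted to avoid.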
             * @todo this is a workaround plese fix me. We need to investigate if
             * it is possible to have mapped/unified memory in alpaka.
             */
#   undef cudaFreeHost
I would guard this with an #ifdef cudaFreeHost
    if(rc != cuplaErrorMemoryAllocation)
        CUDA_CHECK(rc)
    /* cupla is not supporting the function cudaHostAlloc to create mapped memory.
     * Therefore we need to call the native CUDA function cudaFreeHost to free the memory.
need -> have
free the memory
     * Therefore we need to call the native CUDA function cudaFreeHost to free the memory.
     * Due to the renaming of cuda functions with cupla via macros we need to remove
     * the renaming to get access to the native cuda function.
     * @todo this is a workaround plese fix me. We need to investigate if
please
     * Therefore we need to call the native CUDA function cudaFreeHost to free the memory.
     * Due to the renaming of cuda functions with cupla via macros we need to remove
     * the renaming to get access to the native cuda function.
     * @todo this is a workaround plese fix me. We need to investigate if
same typos as above :)
For HDF5 checkpoints and normal dumps we use CUDA mapped memory to transfer the particles to the host memory. Because mapped memory is not supported by cupla and alpaka, we fall back to native CUDA functions. To free mapped memory, the function `cudaFreeHost` must be called. This function is also used to free normal host memory allocated with `cudaMallocHost`, which is supported by cupla. The cupla internal macros rename `cudaFreeHost` to `cuplaFreeHost` and fall back to alpaka functionality to free the memory. In the case where we allocated the memory without the knowledge of cupla, cupla throws an error and `cudaFreeHost` will never be called. The result in PIConGPU is that with each particle dump the host memory footprint grows until we run out of memory.
- fix broken memory freeing in `MappedBufferIntern`:
  - fix that the wrong function is used to allocate mapped memory
  - add workaround to free mapped memory
  - add fallback if a CPU accelerator is used
force-pushed from 7d29093 to 174a03a
I linked the alpaka issues and fixed the typos/spelling. The guards make no sense, see #2690 (comment). I will keep this PR as it is, including the note: This PR is well tested and I would like to avoid breaking it again with untested alpaka features.
For HDF5 checkpoints and normal dumps we use CUDA mapped memory to transfer the particles to the host memory. Because mapped memory is not supported by cupla and alpaka, we fall back to native CUDA functions. To free mapped memory, the function `cudaFreeHost` must be called. This function is also used to free normal host memory allocated with `cudaMallocHost`, which is supported by cupla. The cupla internal macros rename `cudaFreeHost` to `cuplaFreeHost` and fall back to alpaka functionality to free the memory. In the case where we allocated the memory without the knowledge of cupla, cupla throws an error and `cudaFreeHost` will never be called. The result in PIConGPU is that with each particle dump the host memory footprint grows until we run out of memory.

Note: `MappedBufferIntern` has not been used for a long time, therefore the wrong allocation call was not seen by any user of PMacc.

This bugfix ~~should~~ solves: #2504

Tests

CC-ing: @PrometheusPi @steindev @HighIander @BeyondEspresso
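As a rough illustration of the allocation side of the fix, a sketch only: it assumes the PMacc macros `PMACC_CUDA_ENABLED` and `CUDA_CHECK` from the diff above, and `allocMappedHostMemory` is a hypothetical helper, not the actual `MappedBufferIntern` code.

```cpp
#include <cstddef>
#include <cuda_runtime.h>

template<typename TYPE>
TYPE* allocMappedHostMemory(std::size_t numElements)
{
    TYPE* ptr = nullptr;
#if( PMACC_CUDA_ENABLED == 1 )
    /* cupla/alpaka do not provide mapped (zero-copy) host memory, so the
     * native CUDA runtime call is used directly.
     */
    CUDA_CHECK(cudaHostAlloc(
        reinterpret_cast<void**>(&ptr),
        numElements * sizeof(TYPE),
        cudaHostAllocMapped));
#else
    /* CPU accelerator fallback: ordinary host memory */
    ptr = new TYPE[numElements];
#endif
    return ptr;
}
```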