-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use std::aligned_alloc for allocations #6341
Conversation
1093348
to
4b5c2a6
Compare
b507200
to
5854e75
Compare
5854e75
to
4fbee6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me 👍
https://github.com/zippylab should be able to comment some more on the issue found with using |
We had been seeing crashes of the XGC application on the Sunspot system (Aurora test system) coming from this line in HostSpace.cpp (in Kokkos::HostSpace::impl_deallocate):
in a part of XGC that is copying particle data from device to host memory (using Cabana::SoA and Cabana::AoSoA). After rebuilding Kokkos, Cabana, and XGC with the changes in this PR, that failure mode of XGC went away, and it ran past the point where the SEGV was happening before in the same test case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about RawMemoryAllocationFailure::AllocationMechanism
?
Should we not deprecate enumerators that are not supported?
core/src/impl/Kokkos_HostSpace.cpp
Outdated
size_t size_alignment_multiple = | ||
(arg_alloc_size + alignment - 1) / alignment * alignment; | ||
#ifdef KOKKOS_COMPILER_MSVC | ||
ptr = _aligned_malloc(size_alignment_multiple, alignment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does _aligned_malloc
also require the size argument to be a multiple of the alignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find information about that but preferred doing it anyway for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does MSVC not support aligned_alloc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MSVC indeed does not support aligned_alloc
as noted on CppReference.com
It would be nice to be able to use operator new(size_alignment_multiple, std::align_val_t(alignment))
instead, avoiding the conditional, but then we'd have to store the alignment to pass to the corresponding operator delete
Sure, see 8c8c9c3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the print stuff works?
core/src/impl/Kokkos_HostSpace.cpp
Outdated
size_t size_alignment_multiple = | ||
(arg_alloc_size + alignment - 1) / alignment * alignment; | ||
#ifdef KOKKOS_COMPILER_MSVC | ||
ptr = _aligned_malloc(size_alignment_multiple, alignment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does MSVC not support aligned_alloc?
@@ -102,11 +102,13 @@ void Experimental::RawMemoryAllocationFailure::print_error_message( | |||
o << " (The allocation mechanism was "; | |||
switch (m_mechanism) { | |||
case AllocationMechanism::StdMalloc: o << "standard malloc()."; break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work if KOKKOS_ENABLE_DEPRECATED_CODE_4 is off? Isn't the entire AllocationMechanism thing removed if DEPRECATED_CODE_4 is not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are looking at the wrong place, the relevant is core/src/impl/Kokkos_Error.hpp
, not core/src/Kokkos_HostSpace.hpp
. Both files contain AllocationMechanism
enums but they are different. Note that we test with KOKKOS_DEPRECATED_CODE_4=OFF
in our CI, see
Lines 56 to 92 in 1a3ea28
stage('CUDA-11.7-NVHPC') { | |
agent { | |
dockerfile { | |
filename 'Dockerfile.nvhpc' | |
dir 'scripts/docker' | |
additionalBuildArgs '--build-arg BASE=nvcr.io/nvidia/nvhpc:22.9-devel-cuda11.7-ubuntu20.04' | |
label 'nvidia-docker && large_images' | |
args '-v /tmp/ccache.kokkos:/tmp/ccache --env NVIDIA_VISIBLE_DEVICES=$NVIDIA_VISIBLE_DEVICES' | |
} | |
} | |
environment { | |
OMP_NUM_THREADS = 8 | |
// Nested OpenMP does not work for this configuration, | |
// so disabling it | |
OMP_MAX_ACTIVE_LEVELS = 1 | |
OMP_PLACES = 'threads' | |
OMP_PROC_BIND = 'spread' | |
NVHPC_CUDA_HOME = '/opt/nvidia/hpc_sdk/Linux_x86_64/22.9/cuda/11.7' | |
} | |
steps { | |
sh '''rm -rf build && mkdir -p build && cd build && \ | |
/opt/cmake/bin/cmake \ | |
-DCMAKE_BUILD_TYPE=RelWithDebInfo \ | |
-DCMAKE_CXX_COMPILER=nvc++ \ | |
-DCMAKE_CXX_STANDARD=17 \ | |
-DCMAKE_CXX_FLAGS="--diag_suppress=implicit_return_from_non_void_function,no_device_stack" \ | |
-DKokkos_ARCH_NATIVE=ON \ | |
-DKokkos_ENABLE_COMPILER_WARNINGS=ON \ | |
-DKokkos_ENABLE_DEPRECATED_CODE_4=OFF \ | |
-DKokkos_ENABLE_TESTS=ON \ | |
-DKokkos_ENABLE_CUDA=ON \ | |
-DKokkos_ENABLE_CUDA_LAMBDA=ON \ | |
-DKokkos_ENABLE_OPENMP=ON \ | |
.. && \ | |
make -j8 && ctest --verbose''' | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that seems somewhat unfortunate naming but ok.
Note that we get a couple of warnings (if we enable KOKKOS_DEPRECATED_CODE_4 and deprecation warnings; here with warnings as errors)
but they all originate from "*.cc" files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about HBWSpace
?
|
Add a note in the PR description that you intentionally did not update HBWSpace. |
Here you go! |
…ses in switch statement
@masterleinad @dalg24 @crtrott Unfortunately, this does not work for all platforms: #6367 |
Related to #5011. It seems that we don't need all these different allocation mechanisms anymore and can use
std::aligned_alloc
(https://en.cppreference.com/w/cpp/memory/c/aligned_alloc) instead. Also, there were some problems withINTEL_MM_ALLOC
reported that don't appear anymore when usingstd::malloc
instead.Note that
MSVC
is unable to handle aligned allocations of any kind and provides_aligned_malloc
(to be freed with_aligned_free
) instead.This pull request deprecates all other allocation variants for
HostSpace
to only usealigned_malloc
.HBWSpace
is untested and broken, see #4889. I would address it in a different pull request (maybe that one).