Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide Kokkos::realloc overload taking Kokkos::view_alloc() return type #5035

Merged
merged 23 commits into from
Jul 11, 2022

Conversation

masterleinad
Copy link
Contributor

@masterleinad masterleinad commented May 18, 2022

Supersedes #5027. This would allow passing a label or an execution space or other (possibly multiple) View constructor properties to Kokkos::realloc.
We decided in #5027 to not allow passing a label at all which is implemented here.

dalg24
dalg24 previously requested changes May 19, 2022
Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have somewhere an up-to-date overload set for resize, realloc, and create_mirror[_view]?

core/src/Kokkos_CopyViews.hpp Outdated Show resolved Hide resolved
core/src/Kokkos_CopyViews.hpp Show resolved Hide resolved
core/src/Kokkos_CopyViews.hpp Show resolved Hide resolved
@masterleinad
Copy link
Contributor Author

Do we have somewhere an up-to-date overload set for resize, realloc, and create_mirror[_view]?

The Wiki should be up-to-date, see https://github.com/kokkos/kokkos/wiki/Kokkos::resize, https://github.com/kokkos/kokkos/wiki/Kokkos::realloc and https://github.com/kokkos/kokkos/wiki/Kokkos::create_mirror.

resize and realloc at the moment can take a single View constructor property (which currently excludes labels and execution spaces) and create_mirror[_view] can take WithoutInitializing_t.

@masterleinad masterleinad requested a review from dalg24 May 19, 2022 12:57
@dalg24
Copy link
Member

dalg24 commented May 19, 2022

OK so none of them seem to be taking Kokkos::view_alloc() return type.

@masterleinad
Copy link
Contributor Author

masterleinad commented May 19, 2022

OK, so let's see what possible View constructor properties are and what we want:

realloc resize create_mirror[_view][_and_copy]
labels disallow disallow disallow
pointers disallow disallow disallow
execution space allow allow allow
memory space disallow disallow allow
allow_padding allow allow disallow
WithoutInitializing allow allow allow

In my opinion, it makes sense to have an overload for the return type of view_alloc for all of them, at the very least, internally and we can then go from here. realloc can serve as a prototype for this.

@masterleinad
Copy link
Contributor Author

I decided to implement the behavior for passing an execution space already. This needs #5040 to pass the tests and #4826 will remove some more fences that we currently need to allow in the test.

@cz4rs cz4rs self-requested a review June 1, 2022 14:24
@masterleinad
Copy link
Contributor Author

Retest this please.

@masterleinad
Copy link
Contributor Author

Seems to work with Sacado.

@masterleinad
Copy link
Contributor Author

cuda.debug_pin_um_to_host failing for CUDA-11.6-NVHPC looks unrelated/spurious:

7: [ RUN      ] cuda.debug_pin_um_to_host
7: Time CudaSpace: 0.006380 CudaUVMSpace_1: 0.015492 CudaUVMSpace_2: 0.016266 CudaPinnedHostSpace: 0.249478 CudaUVMSpace_Pinned: 0.016704
7: /var/jenkins/workspace/Kokkos/core/unit_test/cuda/TestCuda_DebugPinUVMSpace.cpp:128: Failure
7: Value of: passed
7:   Actual: false
7: Expected: true
7: [  FAILED  ] cuda.debug_pin_um_to_host (443 ms)

@crtrott
Copy link
Member

crtrott commented Jul 8, 2022

Retest this please.

"not include a pointer!");
static_assert(!alloc_prop_input::has_memory_space,
"The view constructor arguments passed to Kokkos::realloc must "
"not include a memory space instance!");

const std::string label = v.label();

v = drview_type(); // Deallocate first, if the only view to allocation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a flaw in this PR, but the global Kokkos::fence() implied by a deallocation does unfortunately undermine some of the goal of #4793.

Mentioning as much for the back-link over there as anything else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same applies for View as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very happy to discuss that again. I'm in favor in restricting to execution spaces that can actually access the memory space.

@@ -925,6 +961,19 @@ class DualView : public ViewTraits<DataType, Arg1Type, Arg2Type, Arg3Type> {
modified_flags(1) = modified_flags(0) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a behavior change in this PR that I can tell, but could @stanmoore1 or someone else familiar with DualView semantics check that in the case where size is the same and no reallocation happens, and we do initialize only on the device instance, that the resulting modified_flags values are sensible? It seems like a subsequent synchronization call should recognize that the re-initialized data on the device side is fresher than the stale data on the host side.

outer_view2 = inner_view;
},
[&](BeginFenceEvent event) {
if (event.descriptor().find("HostSpace fence") != std::string::npos)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can DynRankView exhibit the same "Debug Only Check for Execution Error" that the tests for DualView and ScatterView exclude?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test for View has that exclusion, so I think we need it here, unless it's superfluous there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is only necessary for View since it includes an additional case.

masterleinad and others added 2 commits July 8, 2022 18:26
@masterleinad
Copy link
Contributor Author

Retest this please.

@dalg24
Copy link
Member

dalg24 commented Jul 11, 2022

Failure (OpenMP timeout) is unrelated.

@dalg24 dalg24 merged commit d0b31c6 into kokkos:develop Jul 11, 2022
@dalg24
Copy link
Member

dalg24 commented Jul 12, 2022

I messed up here

4: [ RUN      ] openmptarget.realloc_exec_space
4: Test failure: encountered unwanted events
4:   Found fence event!
4: BeginFence { "SharedAllocationRecord<Kokkos::Experimental::OpenMPTargetSpace, void>::SharedAllocationRecord(): fence after copying header from HostSpace", 0,6}
4: EndFence { 6}
4: BeginFence { "SharedAllocationRecord<Kokkos::Experimental::OpenMPTargetSpace, void>::SharedAllocationRecord(): fence after copying header from HostSpace", 67108864,7}
4: EndFence { 7}
4: BeginFence { "SharedAllocationRecord<Kokkos::Experimental::OpenMPTargetSpace, void>::SharedAllocationRecord(): fence after copying header from HostSpace", 0,8}
4: EndFence { 8}
4: BeginFence { "SharedAllocationRecord<Kokkos::Experimental::OpenMPTargetSpace, void>::SharedAllocationRecord(): fence after copying header from HostSpace", 67108864,9}
4: EndFence { 9}
4: /var/jenkins/workspace/Kokkos/core/unit_test/TestWithoutInitializing.hpp:132: Failure
4: Value of: success
4:   Actual: false
4: Expected: true
4: [  FAILED  ] openmptarget.realloc_exec_space (0 ms)

CI has been failing ever since the merge

@masterleinad masterleinad deleted the disallow_realloc_label branch July 12, 2022 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocks Promotion Overview issue for release-blocking bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants