Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move allocation profiling to the allocate/deallocate calls #3084

Merged

Conversation

DavidPoliakoff
Copy link
Contributor

Addresses #3064 . Pushing to catch the inevitable preprocessor branches leading to undefined arguments, will mark ready to review when that's fixed

#endif

m_space.deallocate(SharedAllocationRecord<void, void>::m_alloc_ptr,
m_space.deallocate(RecordBase::m_alloc_ptr->m_label,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this supposed to work? Doesn't the thing above actually deep_copy the label back first?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still can't work. you need to deep copy the header back to the host before accessing the label

#endif

m_space.deallocate(SharedAllocationRecord<void, void>::m_alloc_ptr,
m_space.deallocate(RecordBase::m_alloc_ptr->m_label,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to deep copy the header back first in order to get the label?

@DavidPoliakoff
Copy link
Contributor Author

Quick note: it turns out that this exposes events from new Kokkos internal allocations to the user. Once I've fixed this and marked it ready for review, the reviewer should check to make sure the names we're using for those are acceptable

@DavidPoliakoff
Copy link
Contributor Author

It turns out that Views go through this code path:

https://github.com/kokkos/kokkos/blob/master/core/src/impl/Kokkos_MemorySpace.hpp#L69

Which allocates the requested size plus a header. This breaks all of our Profiling, each header is now 128 bytes (sizeof(SharedAllocationHeader)) bigger than they used to be, as far as the Profiling system is concerned

@crtrott
Copy link
Member

crtrott commented Jun 11, 2020

would it make this a bit simpler if we just removed the ENABLE_PROFILING thing in this PR? (Or assume here that it is on, and then do a follow on which removes it everywhere else and adds the "I don't have dlopen" option)?

@DavidPoliakoff
Copy link
Contributor Author

It would make it a little less ugly to remove ENABLE_PROFILING, but I'd prefer to separate that. When I do, I'll make this less ugly. I just don't want to do something as big as that in a disconnected PR.

@DavidPoliakoff DavidPoliakoff marked this pull request as ready for review June 11, 2020 17:12
Copy link
Member

@crtrott crtrott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to avoid unnecessary deep copies if no profile library is loaded.

core/src/Cuda/Kokkos_CudaSpace.cpp Outdated Show resolved Hide resolved
core/src/Cuda/Kokkos_CudaSpace.cpp Outdated Show resolved Hide resolved
core/src/HIP/Kokkos_HIP_Space.cpp Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Jun 11, 2020

Codecov Report

Merging #3084 into develop will increase coverage by 0.0%.
The diff coverage is 100.0%.

Impacted file tree graph

@@           Coverage Diff           @@
##           develop   #3084   +/-   ##
=======================================
  Coverage     85.6%   85.7%           
=======================================
  Files          122     122           
  Lines        10391   10398    +7     
=======================================
+ Hits          8905    8917   +12     
+ Misses        1486    1481    -5     
Flag Coverage Δ
#clang 76.2% <100.0%> (+<0.1%) ⬆️
#gcc 86.4% <100.0%> (+<0.1%) ⬆️
Impacted Files Coverage Δ
core/src/Kokkos_HostSpace.hpp 100.0% <ø> (ø)
core/src/impl/Kokkos_HostSpace.cpp 73.0% <100.0%> (+2.0%) ⬆️
core/src/impl/Kokkos_MemorySpace.hpp 70.8% <100.0%> (+1.2%) ⬆️
core/src/impl/Kokkos_Serial.cpp 91.6% <100.0%> (ø)
core/src/impl/Kokkos_TaskQueueCommon.hpp 85.8% <0.0%> (-0.9%) ⬇️
core/src/Kokkos_MemoryPool.hpp 88.0% <0.0%> (-0.5%) ⬇️
core/src/impl/Kokkos_ViewMapping.hpp 97.5% <0.0%> (-0.1%) ⬇️
core/src/Kokkos_View.hpp 93.3% <0.0%> (ø)
core/src/Kokkos_Layout.hpp 100.0% <0.0%> (ø)
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 605ab03...8779701. Read the comment docs.

Copy link
Member

@crtrott crtrott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the single argument allocate function works. Since it calls a 2 argument version which doesn't exist?

@@ -201,6 +201,19 @@ CudaHostPinnedSpace::CudaHostPinnedSpace() {}
// <editor-fold desc="allocate()"> {{{1

void *CudaSpace::allocate(const size_t arg_alloc_size) const {
return allocate("[unlabeled]", arg_alloc_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this call doesn't work doesn't? I mean there doesn't seem to be an 2 argument allocate overload. Maybe arg_logical_size should just be defaulted to the arg_alloc_size thing. Or we should just report out physical allocation size instead of logical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the three-arg does have a default argument, it defaults to 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok the default is not in the cpp file I guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants