#4517: DynamicView: Fix deallocation extent #4533
This change fixes the performance issues that were observed in EMPIRE.
Was the code previously just deallocating some huge number of garbage entries that it shouldn't have been?
Thanks @lifflander for finding this! We have verified that this fixes the EMPIRE issues. @ndellingwood, when this merges into Kokkos, can you snapshot it into Trilinos? This should get the EMPIRE sync of Trilinos going again!
Yes, they are nullptr, so it was just wasting time. Before I changed the code, the deallocator also went over all the entries, but it tested for nullptr before actually calling the deallocator.
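To make the cost difference described above concrete, here is a minimal sketch. It is not the Kokkos code: `ChunkTable`, `allocate_chunks`, `destroy_scan_all`, and `destroy_allocated_only` are hypothetical names, and the sketch assumes the allocated chunks always form a prefix of the slot array. It only contrasts visiting every slot up to the maximum with visiting the slots that were actually allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a chunk table (not a Kokkos type): each slot
// is either a live chunk pointer or nullptr.
struct ChunkTable {
  explicit ChunkTable(std::size_t max_chunks) : slots(max_chunks, nullptr) {}
  std::vector<int*> slots;    // one slot per possible chunk
  std::size_t allocated = 0;  // number of live chunks (assumed to be a prefix)
};

// Allocate the first n chunks of the table.
inline void allocate_chunks(ChunkTable& t, std::size_t n) {
  assert(n <= t.slots.size());
  for (std::size_t i = 0; i < n; ++i) t.slots[i] = new int[4];
  t.allocated = n;
}

// Visits every slot up to the maximum and skips the nullptr ones.
// Returns how many slots were visited.
inline std::size_t destroy_scan_all(ChunkTable& t) {
  std::size_t visited = 0;
  for (std::size_t i = 0; i < t.slots.size(); ++i) {
    ++visited;
    if (t.slots[i] != nullptr) {
      delete[] t.slots[i];
      t.slots[i] = nullptr;
    }
  }
  t.allocated = 0;
  return visited;
}

// Visits only the allocated prefix. Returns how many slots were visited.
inline std::size_t destroy_allocated_only(ChunkTable& t) {
  std::size_t visited = 0;
  for (std::size_t i = 0; i < t.allocated; ++i) {
    ++visited;
    delete[] t.slots[i];
    t.slots[i] = nullptr;
  }
  t.allocated = 0;
  return visited;
}
```

With a large maximum and only a few live chunks, the first loop touches every slot while the second touches only the live ones, which is the kind of wasted work the comment above refers to.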
Looks like this makes the unit tests segfault.
@@ -173,7 +173,8 @@ struct ChunkedArrayManager {
  void execute() {
    // Destroy the array of chunk pointers.
    // Two entries beyond the max chunks are allocation counters.
    for (unsigned i = 0; i < m_chunk_max; i++) {
    auto const len = *reinterpret_cast<size_t*>(m_chunks + m_chunk_max + 1);
This feels so sketchy. Why is it not something like
- auto const len = *reinterpret_cast<size_t*>(m_chunks + m_chunk_max + 1);
+ size_t const len = *(m_chunks + m_chunk_max + 1);
Also, what was wrong with unsigned?
m_chunks is a pointer type that is abused to store a numerical value.
kokkos/containers/src/Kokkos_DynamicView.hpp
Lines 356 to 360 in a1d045d
size_t size() const noexcept {
  size_t extent_0 =
      *reinterpret_cast<const size_t*>(m_chunks_host + m_chunk_max + 1);
  return extent_0;
}
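The idea of keeping a numerical value in a slot of a pointer array can be sketched as follows. This is an illustration under assumptions, not the DynamicView implementation: `make_table`, `store_counter`, and `load_counter` are hypothetical helpers, the layout (pointer slots followed by two counter slots) is taken only from the comment in the diff, and the sketch copies bytes with `std::memcpy` instead of the `reinterpret_cast` dereference used in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// The trick only works if an integer of pointer width fits in one slot.
static_assert(sizeof(std::uintptr_t) == sizeof(double*),
              "counter must fit exactly in one pointer slot");

// Hypothetical layout: max_chunks pointer slots followed by two extra
// slots that hold counters instead of addresses.
inline std::vector<double*> make_table(std::size_t max_chunks) {
  return std::vector<double*>(max_chunks + 2, nullptr);
}

// Write an integer into a pointer-typed slot by copying its bytes.
inline void store_counter(std::vector<double*>& table, std::size_t slot,
                          std::uintptr_t value) {
  assert(slot < table.size());
  std::memcpy(&table[slot], &value, sizeof(value));
}

// Read the integer back out of the pointer-typed slot.
inline std::uintptr_t load_counter(const std::vector<double*>& table,
                                   std::size_t slot) {
  assert(slot < table.size());
  std::uintptr_t value = 0;
  std::memcpy(&value, &table[slot], sizeof(value));
  return value;
}
```

Because both counters live at fixed offsets past the last pointer slot, reading the wrong offset silently yields a different counter, which is why the choice between `m_chunk_max` and `m_chunk_max + 1` matters in this review.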
Something's clearly not right/working here
b4d15c4 to e5efb4c
I accidentally picked the wrong entry for deallocation:
e5efb4c to 223781e
The CI failure from Jenkins is spurious relative to this PR. It probably represents an independent issue to be investigated, though.
@@ -173,7 +173,8 @@ struct ChunkedArrayManager {
  void execute() {
    // Destroy the array of chunk pointers.
    // Two entries beyond the max chunks are allocation counters.
    for (unsigned i = 0; i < m_chunk_max; i++) {
    size_t const len = *reinterpret_cast<size_t*>(m_chunks + m_chunk_max);
- size_t const len = *reinterpret_cast<size_t*>(m_chunks + m_chunk_max);
+ uintptr_t const len = *reinterpret_cast<uintptr_t*>(m_chunks + m_chunk_max);
just to be conforming with access in other locations. Also, I would very much prefer if the allocation counters are stored in separate variables (inside ChunkedArrayManager). The need to reinterpret_cast here feels pretty sketchy and is confusing regarding readability.
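One possible shape of the "separate variables" suggestion is sketched below. It is a host-only illustration under assumptions, not a proposed Kokkos patch: `ChunkManagerSketch` and all of its members are hypothetical, and it ignores any host/device constraints that may have motivated storing the counters next to the chunk pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical manager (not a Kokkos class) that keeps its counters in
// ordinary members, so no cast is needed to read them.
class ChunkManagerSketch {
 public:
  ChunkManagerSketch(std::size_t chunk_max, std::size_t chunk_size)
      : m_chunks(chunk_max), m_chunk_size(chunk_size) {
    assert(chunk_size > 0);
  }

  // Resize so that n elements fit, allocating or releasing whole chunks.
  void resize(std::size_t n) {
    const std::size_t needed = (n + m_chunk_size - 1) / m_chunk_size;
    assert(needed <= m_chunks.size());
    // Grow: allocate the chunks that are missing.
    for (std::size_t i = m_allocated; i < needed; ++i)
      m_chunks[i].reset(new double[m_chunk_size]());
    // Shrink: release the chunks that are no longer needed.
    for (std::size_t i = needed; i < m_allocated; ++i) m_chunks[i].reset();
    m_allocated = needed;
    m_extent = n;
  }

  std::size_t size() const noexcept { return m_extent; }
  std::size_t allocated_chunks() const noexcept { return m_allocated; }
  std::size_t max_chunks() const noexcept { return m_chunks.size(); }

 private:
  std::vector<std::unique_ptr<double[]>> m_chunks;  // pointer slots only
  std::size_t m_chunk_size;
  std::size_t m_allocated = 0;  // number of live chunks
  std::size_t m_extent = 0;     // logical number of elements
};
```

Here the loops in `resize` are bounded by `m_allocated` and `needed` directly, so there is no offset arithmetic to get wrong when reading a counter.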
OK, I've made the change to uintptr_t.
I agree that allocating them separately would increase code readability, but this PR is not making this worse (i.e., DynamicView has been this way since the beginning). Given how critical this is for EMPIRE right now, I would like to move this forward as is and then push for a refactoring later to improve code readability.
223781e to 86f3193
86f3193 to cff6e2b
Retest this please
…extent kokkos#4517: DynamicView: Fix deallocation extent (cherry picked from commit 55aeec4)
Cherry-picked to 3.5 with #4538
Fixes #4517