Improve error message in view memory access violations #4950
core/src/impl/Kokkos_ViewMapping.hpp
Outdated
char const unmanaged[] = "**UNMANAGED**";
char const unavailable[] = "**UNAVAILABLE**";
Map const&) {
char err[1024] = "";
What was the UB? .c_str() should be stable unless the source string is modified/destroyed.
Also, I prefer strncat over strcat, because if the source is too long, it gets truncated instead of UB.
What was the UB? .c_str() should be stable unless the source string is modified/destroyed.
Exactly
Also, I prefer strncat over strcat, because if the source is too long, it gets truncated instead of UB.
Ok
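To illustrate the strncat point above, here is a minimal host-side sketch of a bounded append. The helper name `append_truncated` is hypothetical and is not part of Kokkos or of this PR; it only shows how the count passed to strncat has to account for what is already in the buffer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the PR): append `src` to the NUL-terminated
// string already stored in `dst`, truncating `src` if it does not fit.
// `dst_size` is the total capacity of `dst`, including the terminator.
inline void append_truncated(char* dst, std::size_t dst_size, char const* src) {
  std::size_t const used = std::strlen(dst);
  if (used + 1 >= dst_size) return;  // buffer already full, nothing to append
  // strncat copies at most `count` characters and then always writes a
  // terminating NUL, so `count` must leave one byte free for it.
  std::strncat(dst, src, dst_size - used - 1);
}
```

With an 8-byte buffer, appending "abc" and then "defghij" leaves "abcdefg" in the buffer: the second source is cut off rather than written past the end, which is what a plain strcat would do.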
This is a bit iffy: you essentially increase the stack frame size for any bounds checking by 1 kB, which may now crash codes in Debug mode that didn't crash before.
They would need to increase the stack size for the GPU to make it work again, if they were close before.
Consider
#include <Kokkos_Core.hpp>

template <class Space>
struct Foo {
  Kokkos::View<int, Space> m_v {"v"};
  Kokkos::View<float, Space> m_w {"w"};
  Kokkos::View<double*, Space> m_u {reinterpret_cast<double*>(0xDEADBEEF), 1};
  KOKKOS_FUNCTION void operator()(int) const {
    ++m_v();
    ++m_w();
    ++m_u(0);
  }
};

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    Kokkos::parallel_for(1, Foo<Kokkos::DefaultExecutionSpace>());      // OK
    Kokkos::parallel_for(1, Foo<Kokkos::DefaultHostExecutionSpace>());  // memory access violation occurs
  }
  Kokkos::finalize();
}
Passing --resource-usage to NVCC yields
ptxas info : 32 bytes gmem, 32768 bytes cmem[3]
ptxas info : Compiling entry function '_ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForI3FooINS_6SerialEENS_11RangePolicyIJNS_4CudaEEEES7_EEEEvT_' for 'sm_70'
ptxas info : Function properties for _ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForI3FooINS_6SerialEENS_11RangePolicyIJNS_4CudaEEEES7_EEEEvT_
1040 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 24 registers, 456 bytes cmem[0]
ptxas info : Compiling entry function '_ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForI3FooINS_4CudaEENS_11RangePolicyIJS4_EEES4_EEEEvT_' for 'sm_70'
ptxas info : Function properties for _ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForI3FooINS_4CudaEENS_11RangePolicyIJS4_EEES4_EEEEvT_
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 18 registers, 456 bytes cmem[0]
ptxas info : Compiling entry function '_ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForINS0_16ViewValueFunctorINS_6DeviceINS_4CudaENS_9CudaSpaceEEEfLb1EEENS_11RangePolicyIJS5_NS_9IndexTypeIlEEEEES5_EEEEvT_' for 'sm_70'
ptxas info : Function properties for _ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForINS0_16ViewValueFunctorINS_6DeviceINS_4CudaENS_9CudaSpaceEEEfLb1EEENS_11RangePolicyIJS5_NS_9IndexTypeIlEEEEES5_EEEEvT_
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 12 registers, 472 bytes cmem[0]
ptxas info : Compiling entry function '_ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForINS0_16ViewValueFunctorINS_6DeviceINS_4CudaENS_9CudaSpaceEEEiLb1EEENS_11RangePolicyIJS5_NS_9IndexTypeIlEEEEES5_EEEEvT_' for 'sm_70'
ptxas info : Function properties for _ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForINS0_16ViewValueFunctorINS_6DeviceINS_4CudaENS_9CudaSpaceEEEiLb1EEENS_11RangePolicyIJS5_NS_9IndexTypeIlEEEEES5_EEEEvT_
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 12 registers, 472 bytes cmem[0]
ptxas info : Compiling entry function '_ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForINS0_16ViewValueFunctorINS_6DeviceINS_4CudaENS_9CudaSpaceEEEjLb1EEENS_11RangePolicyIJS5_NS_9IndexTypeIlEEEEES5_EEEEvT_' for 'sm_70'
ptxas info : Function properties for _ZN6Kokkos4Impl33cuda_parallel_launch_local_memoryINS0_11ParallelForINS0_16ViewValueFunctorINS_6DeviceINS_4CudaENS_9CudaSpaceEEEjLb1EEENS_11RangePolicyIJS5_NS_9IndexTypeIlEEEEES5_EEEEvT_
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 12 registers, 472 bytes cmem[0]
The stack frame size is only affected when there are illegal memory accesses in the code.
That said, I agree 1 kilobyte is excessive. I will reduce it.
Retest this please.
Retest this please... @rgayatri23 there is an insane amount of
being printed to the point the log gets really big. We need to handle these. Would you please look into it?
I will look into it more deeply, but if that message is printed then correctness will definitely fail, as the kernel won't be executed.
template <std::size_t... Is>
KOKKOS_FUNCTION decltype(auto) bad_access(std::index_sequence<Is...>) const {
  return v((Is * 0)...);
Why is Is multiplied by zero?
Accessing v(0, 0, 0, ...)
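A small standalone sketch of why that works: each `Is * 0` evaluates to 0, so the pack expansion produces exactly one zero index per dimension. The names `IndexRecorder`, `zero_indices_impl`, and `zero_indices` are hypothetical stand-ins and do not appear in the PR; the recorder simply returns the indices it was called with instead of touching memory.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a rank-N view: it only records the indices
// passed to operator() so the expansion can be inspected.
template <std::size_t N>
struct IndexRecorder {
  template <class... Idx>
  std::array<std::size_t, N> operator()(Idx... idx) const {
    static_assert(sizeof...(Idx) == N, "one index per dimension");
    return {{static_cast<std::size_t>(idx)...}};
  }
};

// (Is * 0)... expands to 0 * 0, 1 * 0, ..., (N - 1) * 0, i.e. N zeros.
template <std::size_t N, std::size_t... Is>
std::array<std::size_t, N> zero_indices_impl(IndexRecorder<N> v,
                                             std::index_sequence<Is...>) {
  return v((Is * 0)...);
}

template <std::size_t N>
std::array<std::size_t, N> zero_indices() {
  return zero_indices_impl(IndexRecorder<N>{}, std::make_index_sequence<N>{});
}
```

Calling `zero_indices<3>()` in this sketch yields an array of three zeros, mirroring an access to v(0, 0, 0) on a rank-3 view.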
template <class View, class LblOrPtr>
auto make_view(LblOrPtr x) {
  return make_view_impl<View>(std::move(x),
                              std::make_index_sequence<View::rank>());
I like this.
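For readers unfamiliar with the pattern, here is a self-contained sketch of how such a rank-driven factory could work. The body of `make_view_impl` is not shown in the snippet, so the version below is an assumption: it supplies an extent of 1 for every dimension. `FakeView` is a hypothetical stand-in for a view type, and only its `rank` member mirrors the original.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a view type; it stores a label and one extent
// per dimension.
template <std::size_t Rank>
struct FakeView {
  static constexpr std::size_t rank = Rank;
  std::string label;
  std::array<std::size_t, Rank> extents;

  template <class... Ext>
  FakeView(std::string lbl, Ext... ext)
      : label(std::move(lbl)), extents{{static_cast<std::size_t>(ext)...}} {
    static_assert(sizeof...(Ext) == Rank, "one extent per dimension");
  }
};

// Assumed implementation (not taken from the PR): (Is * 0 + 1)... expands
// to one extent of 1 per dimension.
template <class View, class Arg, std::size_t... Is>
View make_view_impl(Arg x, std::index_sequence<Is...>) {
  return View(std::move(x), (Is * 0 + 1)...);
}

// Same shape as the snippet above: the rank of the view type decides how
// many arguments the constructor receives.
template <class View, class Arg>
View make_view(Arg x) {
  return make_view_impl<View>(std::move(x),
                              std::make_index_sequence<View::rank>());
}
```

The appeal of the design is that one factory covers every rank: `make_view<FakeView<3>>(std::string("v"))` builds a rank-3 object without the caller spelling out the extents.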
Include view label in error message when possible