Use implicit id counter for shared memory allocation #85
Thanks for this workaround. I will test it next week.
There is no need to add the id to the CPU alloc structs. The use of an id for shared memory can be fully hidden in the GPU specialization.
I tested your current branch with
I might have an idea how to fix this and will test it in the next few days.
btw: I tested the original solution of Filip Roséen with g++4.9 on the host and always get the same ID.
    static constexpr std::size_t value = N;
};

#ifdef BOOST_COMP_MSVC
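This must be #if, otherwise the first code path is always used: Boost.Predef always defines BOOST_COMP_MSVC (as zero when the compiler is not detected), so #ifdef is always true.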
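A minimal sketch of the difference, assuming a stand-in macro `DEMO_COMP_MSVC` (hypothetical, used here instead of the real Boost.Predef macro so the snippet has no dependencies) that is always defined and expands to 0 when the compiler is not detected:

```cpp
// Hypothetical stand-in for a Boost.Predef-style detection macro:
// it is always defined, and its value is 0 when the compiler is not detected.
#define DEMO_COMP_MSVC 0

// Tests only whether the macro is defined, so the first branch is always taken.
constexpr int path_with_ifdef()
{
#ifdef DEMO_COMP_MSVC
    return 1; // taken even though the value is 0
#else
    return 2;
#endif
}

// Tests the value of the macro, so the second branch is taken when it is 0.
constexpr int path_with_if()
{
#if DEMO_COMP_MSVC
    return 1;
#else
    return 2; // taken because the macro evaluates to 0
#endif
}

static_assert(path_with_ifdef() == 1, "#ifdef ignores the macro value");
static_assert(path_with_if() == 2, "#if evaluates the macro value");
```

With `#ifdef`, the MSVC-specific path would be compiled on every compiler, which is presumably not what the diff intends.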
My mistake...
I pushed a small update to your pull request that fixes some small mistakes: BenjaminW3#1. Shared memory id generation on is working but not with CUDA 7.0.
Thanks, moving the
Yes, if it is not a default parameter at this place, no new ids are generated.
//-----------------------------------------------------------------------------
template<
    std::size_t N,
    class = char[noexcept(adl_flag(flag<N>{})) ? +1u : -1u]>
The current problem with NVCC is that it does not support noexcept in the way we need it here. NVCC implements this.
The second pass in line 92 also does not work with nvcc :-(
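For readers unfamiliar with the construct, here is a simplified sketch of how a `noexcept` expression inside a defaulted template parameter can select an overload at compile time. It is not the code from this pull request: the two `adl_flag` declarations and the `is_set` helpers are hypothetical, and the array size uses signed `1 : -1` so that the false case is a negative array size and therefore a substitution failure.

```cpp
#include <cstddef>

template<std::size_t N>
struct flag {};

// Hypothetical probes, declared only (never called at run time):
// just the overload for flag<1> is marked noexcept.
void adl_flag(flag<0>);
void adl_flag(flag<1>) noexcept;

// Viable only when adl_flag(flag<N>{}) is noexcept. Otherwise the defaulted
// parameter names an array of negative size, substitution fails, and this
// overload is silently dropped (SFINAE).
template<
    std::size_t N,
    class = char[noexcept(adl_flag(flag<N>{})) ? 1 : -1]>
constexpr bool is_set(int) { return true; }

// Fallback overload; it is a worse match for an int argument, so it is
// chosen only when the overload above has been dropped.
template<std::size_t N>
constexpr bool is_set(long) { return false; }

static_assert(!is_set<0>(0), "flag<0> has no noexcept overload");
static_assert(is_set<1>(0), "flag<1> has a noexcept overload");
```

A compiler that does not evaluate `noexcept` in this position as a constant expression cannot pick between the two overloads, which appears to be the limitation described above.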
I tried a lot of different things but did not get it working with nvcc (neither 7.0 nor 7.5). Therefore I will close this for now. However, clang, MSVC and gcc did work as expected.
@BenjaminW3 btw, do you know how the discussion ended? ;)
This branch implements a compile-time counter for the shared memory allocation that compiles on all platforms.
It is based on the code from Filip Roséen.
At least in C++11 and C++14 this code is valid. The C++ Standards Committee is currently debating whether this should be legal in C++17. See here.
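To show why each allocation needs its own id, here is a rough sketch that swaps in a different technique: the non-standard `__COUNTER__` preprocessor extension (available in GCC, Clang and MSVC) instead of the template-based counter from this branch. The names `alloc_var` and `ALLOC_VAR` are hypothetical and not part of the library.

```cpp
#include <cstddef>

// Hypothetical allocator: every distinct (T, Id) pair instantiates its own
// function, and therefore owns its own static storage.
template<typename T, std::size_t Id>
T& alloc_var()
{
    static T storage{};
    return storage;
}

// Each expansion of the macro uses a fresh id, because __COUNTER__ increases
// by one every time the preprocessor expands it.
#define ALLOC_VAR(T) alloc_var<T, __COUNTER__>()

// Two call sites with the same type still get separate variables.
inline bool distinct_call_sites_get_distinct_storage()
{
    int& a = ALLOC_VAR(int);
    int& b = ALLOC_VAR(int);
    return &a != &b;
}

// Without a changing id, both calls refer to the same variable.
inline bool same_id_shares_storage()
{
    int& a = alloc_var<int, 1000>();
    int& b = alloc_var<int, 1000>();
    return &a == &b;
}
```

The implicit counter in this branch aims for the same effect without a macro at the call site, so that the id does not have to be passed explicitly.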