Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use implicit id counter for shared memory allocation #85

Closed
wants to merge 1 commit into from

Conversation

BenjaminW3
Copy link
Member

This branch implements a compile time counter for the shared memory allocation that is compiling on all platforms.
It is based on the code from Filip Roséen.
At least in C++11 and C++14 this code is valid. The C++ Standards Committee is currently debating on wheter this should be legal or not in C++17. See here.

  • Suppress the high number of warnings. (-Wundefined-internal / -Wnon-template-friend)
  • Increase the maximum counter value from 64 to something reasonable. However, this could exceed the maximum template instantiation depth.
  • Add the solution to the Stackoverflow thread.

@BenjaminW3 BenjaminW3 force-pushed the fix-shared-alloc branch 4 times, most recently from 5f3a43a to 78d286c Compare October 29, 2015 16:38
@BenjaminW3 BenjaminW3 changed the title Fix shared alloc DO NOT MERGE Fix shared memory allocation with implicit id counter Oct 29, 2015
@psychocoderHPC
Copy link
Member

Thanks for this workaround I will test it next week.

@psychocoderHPC
Copy link
Member

There is no need to add the id to the CPU alloc structs. The use of an id for shared memory can fully hidden in the GPU specialization.

@psychocoderHPC
Copy link
Member

I tested you current branch with cuda7.0 and two shared memory variables. Both are equal in the kernel which means the current implementation is not working.

@BenjaminW3
Copy link
Member Author

I might have an Idea how to fix this and will test it in the next days.

@psychocoderHPC
Copy link
Member

btw: I tested the original solution of Filip Roséen with g++4.9 on the host and always get the some ID.

@psychocoderHPC
Copy link
Member

OK the original id generation on CPU is not running because of a problem that always the path for BOOST_COMP_MSVC is used also in linux with gcc.
update: Found the problem #ifdef instead od #if is used

static constexpr std::size_t value = N;
};

#ifdef BOOST_COMP_MSVC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This must be #if else always the first code path is used

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake...

@psychocoderHPC
Copy link
Member

I pushed a small update to your pull request that fix some small mistakes. BenjaminW3#1

Shared memory id generation on is working but not with cuda 7.0

@BenjaminW3
Copy link
Member Author

Thanks, moving the uniqueId() call to the default template parameter was the same idea I had.

@psychocoderHPC
Copy link
Member

Yes if it is not a default parameter at this place no new ids are generated.

//-----------------------------------------------------------------------------
template<
std::size_t N,
class = char[noexcept(adl_flag(flag<N>{})) ? +1u : -1u]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current problem with NVCC is that it not supports noexcept on this way we need it here. NVCC implements this.
The second pass in line 92 is also not working with nvcc :-(

@BenjaminW3 BenjaminW3 closed this Nov 11, 2015
@BenjaminW3 BenjaminW3 reopened this Nov 11, 2015
@BenjaminW3 BenjaminW3 changed the title Fix shared memory allocation with implicit id counter Use implicit id counter for shared memory allocation Nov 11, 2015
@BenjaminW3 BenjaminW3 force-pushed the fix-shared-alloc branch 9 times, most recently from b6553e6 to e3ae088 Compare January 30, 2016 06:41
@BenjaminW3
Copy link
Member Author

I tried a lot of different things but did not get it working with nvcc (neither 7.0 nor 7.5). Therefore I will close this for now. However, clang, MSVC and gcc did work as expected.

@ax3l
Copy link
Member

ax3l commented Aug 1, 2017

The C++ Standards Committee is currently debating on wheter this should be legal or not in C++17.

@BenjaminW3 btw, do you know how the discussion ended? ;)

@BenjaminW3
Copy link
Member Author

@ax3l You can have a look at the C++ core issue here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants