Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::is_trivially_copy_constructible requirement for kernel parameters still too strong #551

Closed
pauleonix opened this issue Oct 13, 2023 · 10 comments
Labels

Comments

@pauleonix
Copy link

pauleonix commented Oct 13, 2023

Even the most basic "fancy iterator" from Thrust thrust::constant_iterator thrust::counting_iterator doesn't fulfill the requirement making algorithms written using cuda-api-wrappers for launching kernels less flexible.

constant_iterator constructor ```cuda __host__ __device__ constant_iterator(constant_iterator const &rhs) : super_t(rhs.base()), m_value(rhs.m_value) {} ```
    __host__ __device__
    counting_iterator(counting_iterator const &rhs):super_t(rhs.base()){}

See Compiler Explorer Compiler Explorer.

@eyalroz
Copy link
Owner

eyalroz commented Oct 13, 2023

To justify your suggestion, please disprove my logic for choosing this trait:

An argument of a kernel does not get "copy-constructed" in GPU global memory; it is copied as raw bytes. Thus, if it cannot be copy-constructed by trivial copying - it cannot be assumed to be a valid constructed object of the relevant type.

@pauleonix pauleonix changed the title std::is_trivially_copy_cosntructible requirement for kernel parameters still too strong std::is_trivially_copy_constructible requirement for kernel parameters still too strong Oct 13, 2023
@pauleonix
Copy link
Author

pauleonix commented Oct 13, 2023

Well, in theory constant_iterator<T> counting_iterator<T> is just a T with some iterator-specific accessor functions and therefore trivially copyable if T is. But std::is_trivially_copy_constructible still wont agree to that in practice.

In practice I can't update my research group's codebases written by @codecircuit using cuda-api-wrappers 0.4.x to use >=0.5. And in practice I expect a C++ library for CUDA programming to be composable with Thrust iterators. I don't expect the Thrust team to change their working public API for this static_assert. But if you disagree with that expectation, I can try.

Maybe a preprocessor switch to turn off that static_assert is the right middle ground?

@eyalroz
Copy link
Owner

eyalroz commented Oct 13, 2023

I can't go on what a type should be "in theory". If you write:

struct foo {
    int x, y;
    foo(const foo& f) : x(f.x), y(f.y) { }
}

I can't accept your struct, since the compiler doesn't "know" that this is equivalent to trivial copy construction. And that copy ctor could, in principle, say, flip x and y if we're on the device side and not flip them on the host side.

...

but - now that you mention it, I did a bit of DDG'ing, and I'm wondering whether std::is_trivially_copyable is the trait I need. Does the const_iterator<T> you're interested in satisfy that?

@pauleonix
Copy link
Author

pauleonix commented Oct 13, 2023

I just chose thrust::const_iterator, because I thought it were the most basic (now that I read/understood more of the source, thrust::counting_iterator is the most basic one made up just of a T in terms of state while constant_iterator needs an additional counter to make distances/ranges work). In fact all Thrust fancy iterators I tested seem to be neither trivially copyable nor trivially copy-constructible. They all seem to encompass an internal counting_iterator so the first question is why that one isn't trivially copyable.

It is a bit hard to decipher the source due to multiple layers of CRTP (counting_iterator -> counting_iterator_base -> iterator_adaptor -> iterator_adaptor_base -> iterator_facade I think), so I'm not sure why it isn't trivially copyable. There is even a

#if THRUST_CPP_DIALECT >= 2011
    counting_iterator & operator=(const counting_iterator &) = default;
#endif

What I know is that all these iterator types are supposed to be safe in regards to passing them as kernel parameters.

@eyalroz
Copy link
Owner

eyalroz commented Oct 13, 2023

What I know is that all these iterator types are supposed to be safe in regards to passing them as kernel parameters.

Let's say I'm willing to accept it on faith that these types are safe for passing. So, shall I hard-code somehow for all the thrust iterator types you tell me about? That doesn't sound right (if it's at all doable). But if I don't hard-code, and instead just remove the restriction - who's to say users won't pass a struct foo from my example?

"I know it should be safe" is not a criterion I can work with... how about you open a bug report against thrust (or the unified CCCL) about this, and post a link here? I'm willing to have a discussion about this with the developers there. Maybe there's an option I'm failing to consider.

@pauleonix
Copy link
Author

NVIDIA/cccl#562

@eyalroz
Copy link
Owner

eyalroz commented Feb 9, 2024

@pauleonix : Do you think there'll be any traction from the NVIDIA side?

@eyalroz
Copy link
Owner

eyalroz commented Feb 9, 2024

I don't see what's the big deal in just fixing the iterators on their side.

@pauleonix
Copy link
Author

What I know is that they are working on ranges, so I expect some kind of iota_view etc. in the not-too-far future. If this will fix this particular issue or still use the existing iterators or a similar mechanism under the hood, I don't know.

@eyalroz
Copy link
Owner

eyalroz commented Apr 19, 2024

So, we can close this. But - note I am considering strengthening the requirements further: #642 .

@eyalroz eyalroz closed this as completed Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants