Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider dropping device_t::create_event and device_t::create_stream #277

Closed
eyalroz opened this issue Nov 20, 2021 · 9 comments
Closed
Labels

Comments

@eyalroz
Copy link
Owner

eyalroz commented Nov 20, 2021

I should consider dropping device_t::create_event() and device_t::create_stream(), because

  1. device.create_event() and device.create_stream() are not necessary to have: Library users should use event::create() and stream::create() - to get event_t's and stream_t's respectively.
  2. The device wrapper is a fat class already.
  3. The device doesn't do the creation, the CUDA driver does (with device support), so the abstraction doesn't even properly fit.
  4. The C-style runtime API don't have something like that anyway.

Opinions?

@codecircuit
Copy link
Contributor

codecircuit commented Nov 21, 2021

Providing a CUDA host API without an implicit default device is one of the significant features of the cuda-api-wrappers. Code which is developed without the assumption of an implicit device is more readable and can more easily be extended to a multi-device code. Currently, the cuda-api-wrappers are more than a CUDA host API, which does error checking only, which could easily be replaced with an error checking macro. They provide a modern C++ RAII interface without implicit C-style assumptions. I think, we should keep the functions.

Issue #142 is related to that discussion.

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 21, 2021

@codecircuit : What I'm suggestiong will in no way remove this aspect of the API wrappers. It's just that right now we have two ways of saying the same thing:

  • event_t cuda::event::create(device_t device)
  • event_t cuda::device_t::create_event()

(I'm glossing over some extra options here.) None of these involve an implicit default device or current device. And while some of the wrapper classes are RAII, device_t is not a RAII class. And the same thing goes for streams.

So, either I misunderstood your comments or you've misunderstood my question...

@codecircuit
Copy link
Contributor

@eyalroz thank you for the clarification. A similar redundancy is currently also part of the STL with std::size(). You can either obtain the size of a vector with the function std::size(v) or the member function of the vector v.size(). Contrary to our case here, std::size also works for other containers like c-arrays, which do not have a member function size().

My opinion is not strong on this, but I would keep both functions cuda::event::create(device_t device) and cuda::device_t::create_event().

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 22, 2021

@codecircuit So, that's an interesting analogy. In standard C++, many features remain as they were originally implemented, while a new approach is taken by the language standard committee - because they have to maintain backwards compatibility (even ABI compatibility). So with functions like std::size() or std::empty() and such, they have to keep both variants. Also, one of the features Bjarne Stroustrup has wanted to introduce for 25 years already is uniform function call syntax, which would make it so that, essentially, if you define a function foo(A& x) then my_a.foo() simply calls foo(my_a), without class A having a foo() method. Or more fundamentally, removes the boundary between methods and standalone functions. If we had something like that, then my_device.event::create() would mean event::create(my_device).

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 24, 2021

@codecircuit : Actually, this would go further. By the same logic, I would also kill device_t::memory().allocate(), and just keep cuda::memory::device::allocate(device_t). And - instead of having device_t::some_p2p_function(device_t), we would would only have device::some_p2p_function(device_t, device_t). Still thumbing this up?

@codecircuit
Copy link
Contributor

We should think about the pros and cons of keeping or remove the corresponding member functions:

keeping member functions removing member functions
more satisfied user base less satisfied user base
shorter user code longer user code
better readability when chaining less readability when nesting
increased maintenance time decreased maintenance time
more code redundancy less code redundancy

By keeping them, users which are used to call member functions have their desired interface as well as users which are used to free function calls have their desired interface. Moreover, when the free functions are called, more code must be written (usually the namespaces are written explicitly). That might be seen as weak code bloating.

device.some_p2p_function(other_device);
cuda::device::some_p2p_function(device, other_device); // namespaces must be written here

Chaining member function calls is more readable than nesting many free function calls, but our API does not induce the user to write such code.

The reduced code redundancy would make user code more consistent, as there is only one way to, e.g., create an event.

I prefer to keep the member functions due to usability. What member functions would also be removed?
What is about event.record(stream)? I think it is convenient to write it in that way. If the removal of member functions is done consistently we might end up with C-style code because then our classes are only structs without any functionality, which are just passed to free functions.

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 25, 2021

weak code bloating.

@codecircuit : I wonder if ADL wouldn't allow writing just device::some_p2p_function(device, other_device). But that's a valid point I guess.

Chaining member function calls is more readable than nesting many free function calls, but our API does not induce the user to write such code.

You don't really get nesting with what I'm thinking of removing. If you create a stream, act on it, and immediately forget about it, e.g. my_dev.create_stream().enqueue.kernel_launch(whatever), then nothing will happen typically and you'd lose the stream.

Anyway, I think that generally I'll have to give up the idea of slimming down the API this way, because it's too many changes and might be too drastic.

@eyalroz
Copy link
Owner Author

eyalroz commented Nov 25, 2021

Well, we wouldn't end up with C-style code, since we still have our RAII classes; and namespaces; and overloading etc.

But ok, I guess I'm buying that. Then, perhaps you believe we should go another way? For example, allow for:

my_device.make_current().synchronize();

where some of the methods return *this as a (possibly const) device_t &?

@eyalroz
Copy link
Owner Author

eyalroz commented Dec 29, 2021

Anyway, not removing these methods. In fact, I'm adding a few more on the driver-wrappers branch, e.g. create_context(), create_module() for contexts etc.

@eyalroz eyalroz closed this as completed Dec 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants