Cannot rely on scoped_current_device_fallback_t #316

Closed
eyalroz opened this issue Apr 13, 2022 · 1 comment

eyalroz commented Apr 13, 2022

(This was exposed while looking into #313)

In several implementations of API wrapper functions which don't take a context handle, we use the context::current::detail_::scoped_current_device_fallback_t class to make sure we have some, any, current context when performing some operation. Example: cuda::memory::device::typed_set<T>().

Unfortunately - CUDA is crueller than we thought. In some, or all, of these cases it actually requires the context in which the relevant handles or addresses were created/allocated, as in our example. That means we have to somehow pass the relevant context (and perhaps device) handle into those functions - as parameters or via wrapper objects.

I am worried we might need to burden the memory region class with a context handle :-( ... and we may even want to hide some of the memory API functions which take raw pointers, or pointers + length only - since these will become quite unwieldy if they always need to take a context wrapper. Will we need to create a context memory member? Anyway, that looks like it might be a rather big change.

The functions using this class are currently:

  • cuda::memory::host::allocate()
  • cuda::memory::copy() <- this is the doozy... lots of functions depend on this one. We may also need to split this one into same-context and different-contexts variants.
  • cuda::memory::pointer::detail_::get_attribute()
  • cuda::memory::pointer::detail_::get_attributes()

I hope there aren't any more.


eyalroz commented Apr 15, 2022

It turns out that the point is not passing the correct context. Rather, it's a combination of two requirements:

  1. Some, any, context must be current
  2. The context in which the allocation was made must not have been destroyed before the allocation was used - even if, supposedly, the allocation is not "context-specific" (e.g. pinned host memory, managed memory).

and this affects managed memory copies / set-ings as well.
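
To make requirement 1 concrete, here is a minimal sketch of a scoped "make sure something is current" guard. It does not call the CUDA driver at all: `toy_driver`, its members and `scoped_existence_ensurer` are hypothetical stand-ins invented for illustration, and this is not the library's actual implementation. The idea is simply: if no context is current on entry, retain a primary context and push it; undo exactly that on exit; otherwise leave the existing current context alone.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model of the driver state (NOT the real CUDA driver API):
// a context stack plus a reference count on one primary context.
struct toy_driver {
    std::vector<int> context_stack;   // handles; an empty stack means "no current context"
    int primary_refcount = 0;         // references held on the toy primary context

    int current() const { return context_stack.empty() ? 0 : context_stack.back(); }
    int retain_primary() { ++primary_refcount; return 1; }  // 1 is the toy primary handle
    void release_primary() { --primary_refcount; }
};

// If nothing is current on construction, retain the primary context and make
// it current; on destruction, undo only what this object did.
class scoped_existence_ensurer {
public:
    explicit scoped_existence_ensurer(toy_driver& driver)
        : driver_(driver), pushed_(driver.context_stack.empty())
    {
        if (pushed_) {
            driver_.context_stack.push_back(driver_.retain_primary());
        }
    }

    ~scoped_existence_ensurer()
    {
        if (pushed_) {
            driver_.context_stack.pop_back();
            driver_.release_primary();
        }
    }

    scoped_existence_ensurer(const scoped_existence_ensurer&) = delete;
    scoped_existence_ensurer& operator=(const scoped_existence_ensurer&) = delete;

private:
    toy_driver& driver_;
    bool pushed_;   // true only if this guard pushed a context itself
};
```

Note that a guard like this only covers requirement 1; it says nothing about whether the context in which an allocation was made is still alive (requirement 2).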

eyalroz added a commit that referenced this issue Apr 15, 2022
…ext, primary contexts, and ensuring their existence in various circumstances:

* Renamed: `context::current::detail_::scoped_current_device_fallback_t` -> `scoped_existence_ensurer_t` `context::current::detail_::scoped_context_existence_ensurer`
* `context::current::scoped_override_t` now has a ctor which accepts `primary_context_t&&`s - to hold on to the PC reference which they are about to let go of.
* Moved: `context::current::scoped_override_t` is now implemented in the multi-wrapper implementations directory; consequently
    * Moved the implementations of `module_t::get_kernel()` and `module::create<Creator>` to the multi-wrapper directory, since they use `context::current::scoped_override_t`.
    * Added inclusion of `cuda/api/multi_wrapper_impls/module.hpp` to some example code.
* Made a device current in some examples to avoid having no current context when executing certain operations with no wrappers (e.g. memcpy with host-side addresses)
* When allocating managed or pinned-host memory, now increasing the reference count of some context by 1 (choosing the primary context of device 0, since that's the safest), and decreasing it again on destruction. That guarantees that operations involving that allocated memory will not occur while no context exists.
    * Corresponding comment changes on the `allocate()` and `free()` methods for pinned-host and managed memory.
* Factored out the code in `context_t::is_primary()` to a function, `cuda::context::current::detail_::is_primary`, which can now also be used via `cuda::context::current::is_primary()`.
* Kernel launch functions now ensure a launch only occurs / is enqueued within a current context (any context).
* Getting the current device now ensures its primary context is also active (which getting an arbitrary device does not do).
* Added doxygen comment for `device::detail_::wrap()` mentioning the primary context reference behavior.
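
The "hold a context reference for as long as the allocation lives" item above can be sketched roughly as follows. This is a toy model in plain C++ with no CUDA calls; `toy_primary_context` and `pinned_like_buffer` are hypothetical names made up for illustration, and ordinary heap memory stands in for pinned-host or managed memory. What it is meant to show is the ordering: retain before allocating, free before releasing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a device's primary context: it counts as
// "alive" for as long as somebody holds a reference to it.
struct toy_primary_context {
    int refcount = 0;
    bool alive() const { return refcount > 0; }
};

// A buffer which keeps the toy primary context alive for its whole lifetime.
class pinned_like_buffer {
public:
    pinned_like_buffer(toy_primary_context& pc, std::size_t size_in_bytes)
        : pc_(pc), size_(size_in_bytes)
    {
        ++pc_.refcount;                              // retain first...
        try {
            data_.reset(new unsigned char[size_]);   // ...then allocate
        } catch (...) {
            --pc_.refcount;                          // don't leak the reference on failure
            throw;
        }
    }

    ~pinned_like_buffer()
    {
        data_.reset();      // free the memory first...
        --pc_.refcount;     // ...and only then let go of the context
    }

    pinned_like_buffer(const pinned_like_buffer&) = delete;
    pinned_like_buffer& operator=(const pinned_like_buffer&) = delete;

    unsigned char* data() { return data_.get(); }
    std::size_t size() const { return size_; }

private:
    toy_primary_context& pc_;
    std::size_t size_;
    std::unique_ptr<unsigned char[]> data_;
};
```

With this shape, the context cannot drop to zero references while any such buffer exists, whichever context happens to be current when the buffer is later copied from or set.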
eyalroz added a commit that referenced this issue Apr 15, 2022
eyalroz added a commit that referenced this issue Apr 16, 2022
@eyalroz eyalroz closed this as completed in bb59a97 May 9, 2022
eyalroz added a commit that referenced this issue Jun 20, 2022