Version 0.6.4 : Minor changes and bug fixes
Changes since v0.6.3:
Features
- #528 Can now register memory areas with the CUDA driver as read-only (in addition to other parameters)
- #519 Quick & somewhat-dirty support for the use of "external memory resources" (these are mostly Direct3D or NVSCI buffers and semaphores, but semaphores etc. are not supported for now)
Bug fixes
- #535 copy_parameters_t::set_endpoint_untyped() now properly calling an inner set_endpoint_untyped()
- #527 copy_parameters_t::set_single_context no longer mistakenly taking an endpoint_t parameter
- #528 When mapping a region pair, not insisting on the same address on both the host and the device (which had made it impossible for this to succeed with older GPUs).
- #521 Avoid a compiler warning when overriding the context for the current scope
- #520 Removed unnecessary uses of a context-wrapper scoped-context-override
- #517 cuda::memory::typed_set() no longer mistakenly accepts values of size 8 (which have no special CUDA API call).
- #516 Corrected types and casting in stream_t::enqueue_t::write_single_value()
- #515 Resolved a case of missing includes when including only certain headers
- #514 Now providing a definition of cuda::memory::managed::allocate(device, num_bytes) (which was declared but not defined)
Other changes
- #522 Renamed synch to sync in multiple identifiers
- #529 program_t::add_registered_globals() can now take any container of any string-like type.
- #523 Now passing device_t's by const-ref in more cases, avoiding copying and enabling re-use of a reference to the primary context.
- #521 Reduce boilerplate + avoid warning when overriding context for the current scope
Build issues
- #504 Fixed build failure with cooperative_groups on GitHub Actions Windows runners and CUDA >= 11.7
Want to help me with testing 0.7? Drop me a line... (it will have CUDA execution graph support)