Version 0.6.4: Minor changes and bug fixes

@eyalroz eyalroz released this 26 Jul 21:21

Changes since v0.6.3:

Features

  • #528 Can now register memory areas with the CUDA driver as read-only (in addition to other parameters)
  • #519 Quick & somewhat-dirty support for the use of "external memory resources" (mostly Direct3D or NVSCI buffers; external semaphores etc. are not supported for now)

Bug fixes

  • #535 copy_parameters_t::set_endpoint_untyped() now properly calls an inner set_endpoint_untyped()
  • #527 copy_parameters_t::set_single_context no longer mistakenly takes an endpoint_t parameter
  • #528 When mapping a region pair, no longer insisting on the same address on both the host and the device (which had made it impossible for the mapping to succeed with older GPUs).
  • #521 Avoid a compiler warning when overriding the context for the current scope
  • #520 Removed unnecessary uses of a context-wrapper scoped-context-override
  • #517 cuda::memory::typed_set() no longer mistakenly accepts values of size 8 (which have no special CUDA API call).
  • #516 Corrected types and casting in stream_t::enqueue_t::write_single_value()
  • #515 Resolved a case of missing includes when including only certain headers
  • #514 Now providing a definition of cuda::memory::managed::allocate(device, num_bytes) (which was declared but not defined)
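
To illustrate the #517 item above: the CUDA driver offers dedicated memset calls only for 1-, 2- and 4-byte values (cuMemsetD8, cuMemsetD16, cuMemsetD32), so a typed set has to reject 8-byte values. The following is a minimal host-only sketch of one way to enforce that at compile time. It is not the library's implementation: has_dedicated_memset, typed_set_sketch and typed_set_example are hypothetical names, and std::fill stands in for the actual driver call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: true only for value sizes which have a dedicated
// driver memset call (cuMemsetD8 / cuMemsetD16 / cuMemsetD32).
template <typename T>
constexpr bool has_dedicated_memset()
{
    return sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4;
}

// Hypothetical host-only stand-in for a typed set. A real implementation
// would dispatch to the driver call matching sizeof(T); here std::fill is
// used so that the sketch runs without a GPU.
template <typename T>
void typed_set_sketch(T* start, const T& value, std::size_t num_elements)
{
    static_assert(std::is_trivially_copyable<T>::value,
        "only trivially-copyable values can be set bytewise");
    static_assert(has_dedicated_memset<T>(),
        "no dedicated memset call for values of this size (e.g. 8 bytes)");
    std::fill(start, start + num_elements, value);
}

inline void typed_set_example()
{
    std::uint16_t buffer[3] = {0, 0, 0};
    typed_set_sketch(buffer, std::uint16_t{0xBEEF}, 3);
    assert(buffer[0] == 0xBEEF && buffer[2] == 0xBEEF);
    // typed_set_sketch on a std::uint64_t buffer would fail to compile.
}
```

With this kind of check, passing a buffer of 8-byte values is a compile-time error rather than something silently accepted.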

Other changes

  • #522 Renamed synch to sync in multiple identifiers
  • #529 program_t::add_registered_globals() can now take any container of any string-like type.
  • #523 Now passing device_t's by const-ref in more cases, avoiding copying and enabling re-use of a reference to the primary context.
  • #521 Reduce boilerplate + avoid warning when overriding context for the current scope
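
As an illustration of the #529 item above, accepting "any container of any string-like type" can be done with a function template. The sketch below is not program_t's actual code: c_str_of, to_c_strings and name_list_example are hypothetical names, and it assumes each element is either a std::string or a const char*.

```cpp
#include <cassert>
#include <cstring>
#include <list>
#include <string>
#include <vector>

// Hypothetical overloads yielding a C string from two common string-like types.
inline const char* c_str_of(const char* s) { return s; }
inline const char* c_str_of(const std::string& s) { return s.c_str(); }

// Collects one C-string pointer per element of any iterable container.
// The pointers refer to the container's own elements, so the container
// must outlive the returned vector.
template <typename Container>
std::vector<const char*> to_c_strings(const Container& names)
{
    std::vector<const char*> result;
    for (const auto& name : names) {
        result.push_back(c_str_of(name));
    }
    return result;
}

inline void name_list_example()
{
    std::vector<std::string> as_strings = {"my_kernel", "my_global"};
    std::list<const char*> as_literals = {"my_kernel", "my_global"};
    auto from_strings = to_c_strings(as_strings);
    auto from_literals = to_c_strings(as_literals);
    assert(from_strings.size() == 2 && from_literals.size() == 2);
    assert(std::strcmp(from_strings[1], from_literals[1]) == 0);
}
```

Templating on the container type rather than fixing one (say, std::vector<std::string>) spares callers a conversion when their names already live in some other container.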

Build issues

  • #504 Fixed build failure with cooperative_groups on GitHub Actions Windows runners and CUDA >= 11.7

Want to help me with testing 0.7? Drop me a line... (it will have CUDA execution graph support)