Merged
2 changes: 1 addition & 1 deletion cuda_core/cuda/core/_version.py
@@ -2,4 +2,4 @@
#
# SPDX-License-Identifier: Apache-2.0

-__version__ = "0.4.2"
+__version__ = "0.5.0"
1 change: 1 addition & 0 deletions cuda_core/docs/source/api.rst
@@ -26,6 +26,7 @@ CUDA runtime
Event
MemoryResource
DeviceMemoryResource
GraphMemoryResource
PinnedMemoryResource
ManagedMemoryResource
LegacyPinnedMemoryResource
72 changes: 72 additions & 0 deletions cuda_core/docs/source/release/0.5.0-notes.rst
@@ -0,0 +1,72 @@
.. SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
.. SPDX-License-Identifier: Apache-2.0

.. currentmodule:: cuda.core.experimental

``cuda.core`` 0.5.0 Release Notes
=================================


Highlights
----------

- Added memory management support (allocation, deallocation, copy, and fill) for CUDA graphs.
- Added :class:`PinnedMemoryResource` and :class:`ManagedMemoryResource` for advanced memory management.
- Added peer access control to :class:`DeviceMemoryResource`.
- Reduced Python overhead and improved performance for calling :func:`launch`, constructing :class:`LaunchConfig`, and accessing :class:`DeviceMemoryResource` attributes.


Breaking Changes
----------------

Support for setting :attr:`VirtualMemoryResourceOptions.handle_type` to ``"win32"`` has been removed. Please reach out to us on GitHub if you have a use case for it.

The following APIs have been deprecated and will be removed in 0.6.0:

- ``cuda.core.experimental.system.driver_version`` has been replaced with
  ``cuda.core.experimental.system.get_driver_version()``.
- ``cuda.core.experimental.system.num_devices`` has been replaced with
  ``cuda.core.experimental.system.get_num_devices()``.
- ``cuda.core.experimental.system.devices`` has been replaced with
  ``cuda.core.experimental.Device.get_all_devices()``.
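
As a rough aid for migrating, the sketch below scans Python source text for the three deprecated attributes and prints the replacement named in the list above. The helper ``find_deprecated_uses`` and the ``REPLACEMENTS`` table are hypothetical names made up for this illustration; they are not part of ``cuda.core``, and the matching is a plain text search that assumes the code refers to the module as ``system``.

```python
import re

# Deprecated attribute -> suggested replacement, taken from the list above.
# Keys are matched as written after ``from cuda.core.experimental import system``.
# (Hypothetical helper table; not part of cuda.core.)
REPLACEMENTS = {
    "system.driver_version": "system.get_driver_version()",
    "system.num_devices": "system.get_num_devices()",
    "system.devices": "Device.get_all_devices()",
}


def find_deprecated_uses(source):
    """Return (line_number, deprecated_name, replacement) for each match."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in REPLACEMENTS.items():
            # Word boundaries keep ``system.devices`` from matching
            # longer names such as ``system.devices_cache``.
            if re.search(r"\b" + re.escape(old) + r"\b", line):
                hits.append((lineno, old, new))
    return hits


example = """\
from cuda.core.experimental import system
print(system.driver_version)
for dev in system.devices:
    print(dev)
"""

for lineno, old, new in find_deprecated_uses(example):
    print(f"line {lineno}: replace {old} with {new}")
```

A text search like this can report false positives (for example, inside comments or strings), so treat its output as a checklist to review by hand.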

Other changes:

- The :meth:`utils.StridedMemoryView.__init__` constructor is deprecated in favor of the new ``from_*`` classmethods (see the new features below).
- Support for Python 3.9 and 3.13t has been dropped.


New features
------------

- Added :class:`GraphMemoryResource` for allocating and deallocating memory when building a CUDA graph.
- Added :class:`PinnedMemoryResource` and :class:`PinnedMemoryResourceOptions` for managing host-pinned memory pools with optional IPC support.
- Added :class:`ManagedMemoryResource` and :class:`ManagedMemoryResourceOptions` for managing unified memory pools accessible from both host and device.
- Added :meth:`Buffer.fill` for efficient memory initialization, supporting ``int``, ``bytes``, and general buffer-protocol objects.
- :class:`Buffer` can now wrap external memory allocations with an owner object.
- Added alternative constructors :meth:`~utils.StridedMemoryView.from_buffer`, :meth:`~utils.StridedMemoryView.from_dlpack`, and :meth:`~utils.StridedMemoryView.from_cuda_array_interface`
  to :class:`~utils.StridedMemoryView`, along with a new :attr:`~utils.StridedMemoryView.size` property.
- Added :meth:`ProgramOptions.as_bytes` and :meth:`LinkerOptions.as_bytes` public APIs for converting options to backend-specific byte representations.
- Updated :class:`Device` constructor to accept either a :class:`Device` instance or a device ordinal (``int``).
- Added :meth:`Device.get_all_devices` classmethod.
- IPC-imported buffers can now be re-exported to other processes.
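
The move from a general-purpose constructor to explicit ``from_*`` classmethods follows a common Python pattern, sketched below on a toy class. ``ToyStridedView`` is a hypothetical stand-in built on the standard-library ``memoryview``; it is not ``StridedMemoryView``, its ``from_buffer`` does not take the same argument type as the real one, and treating ``size`` as the number of elements is an assumption made for the illustration.

```python
import math
import warnings


class ToyStridedView:
    """Hypothetical stand-in; NOT cuda.core's StridedMemoryView."""

    def __init__(self, obj=None):
        self.shape = ()
        self.itemsize = 1
        if obj is not None:
            # The direct-constructor path still works, but steers
            # callers toward the explicit classmethod.
            warnings.warn(
                "use ToyStridedView.from_buffer(...) instead",
                DeprecationWarning,
                stacklevel=2,
            )
            self._init_from_buffer(obj)

    def _init_from_buffer(self, obj):
        # memoryview accepts any object supporting the buffer protocol.
        mv = memoryview(obj)
        self.shape = mv.shape
        self.itemsize = mv.itemsize

    @classmethod
    def from_buffer(cls, obj):
        """Explicit constructor: the input kind is clear at the call site."""
        view = cls()
        view._init_from_buffer(obj)
        return view

    @property
    def size(self):
        # Assumed meaning for this sketch: the number of elements,
        # i.e. the product of the shape.
        return math.prod(self.shape)


view = ToyStridedView.from_buffer(bytearray(12))
print(view.shape, view.size)
```

Naming the input kind in the classmethod avoids having the constructor guess which protocol an arbitrary object supports.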


New examples
------------

None.


Fixes and enhancements
----------------------

- Most CUDA resources are now hashable.
- Python ``bool`` objects are now converted to C++ ``bool`` type when passed as kernel arguments (previously converted to ``int``).
- Restored v0.3.x :class:`MemoryResource` behaviors and missing memory resource attributes for backward compatibility.
- Added a warning when the multiprocessing start method is set to ``'fork'``.
- Fixed potential memory leaks when DLPack capsule creation is interrupted.
- Fixed :class:`VirtualMemoryResource` on Windows platforms.
- Fixed NVRTC program name handling on Windows to avoid filesystem issues.
- Improved test determinism by replacing OS sleep with a GPU nanosleep kernel in event timing tests.
- Fixed CUDA graph issues with ``cuda-python==12.6.*``.
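
To illustrate why the ``bool`` conversion matters, the sketch below uses the standard-library ``ctypes`` module as a host-side analogy. A kernel parameter declared as ``bool`` expects a smaller value than one declared as ``int``, and because Python's ``bool`` is a subclass of ``int``, a type dispatcher has to test for ``bool`` first. The function ``kernel_arg_ctype`` is a hypothetical helper written for this example; it is not how ``cuda.core`` prepares kernel arguments.

```python
import ctypes


def kernel_arg_ctype(value):
    """Pick a ctypes type for a Python scalar (illustrative only)."""
    # bool must be tested before int, because isinstance(True, int) is True.
    if isinstance(value, bool):
        return ctypes.c_bool
    if isinstance(value, int):
        return ctypes.c_int
    raise TypeError(f"unsupported argument type: {type(value).__name__}")


for arg in (True, 1):
    ctype = kernel_arg_ctype(arg)
    print(f"{arg!r} -> {ctype.__name__}, {ctypes.sizeof(ctype)} byte(s)")
```

If the two checks were swapped, ``True`` would be packed as an ``int``, which is the earlier behavior described above.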
48 changes: 0 additions & 48 deletions cuda_core/docs/source/release/0.5.x-notes.rst

This file was deleted.

2 changes: 1 addition & 1 deletion cuda_core/pixi.toml
@@ -68,7 +68,7 @@ cu12 = { features = ["cu12", "test", "cython-tests"], solve-group = "cu12" }
# TODO: check if these can be extracted from pyproject.toml
[package]
name = "cuda-core"
-version = "0.4.2"
+version = "0.5.0"

[package.build]
backend = { name = "pixi-build-python", version = "*" }