18 Aug 09:28

kmaehashi

25e552d

v13.6.0 Latest

This is the release note of v13.6.0. See here for the complete list of solved issues and merged PRs.

🌏 We just launched our LinkedIn page. Follow us for the latest news and updates!

💬 Join the Matrix chat to talk with developers and users and ask quick questions!

🙌 Help us sustain the project by sponsoring CuPy!

✨ Highlights

This release adds support for CUDA 13.x. Binary packages are available on PyPI: pip install cupy-cuda13x.
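
A minimal sketch of matching a wheel to the local CUDA major version follows. The helper names `wheel_for_cuda` and `decode_cuda_version` are hypothetical and not part of CuPy; the CuPy calls at the end (`cupy.cuda.is_available` and `cupy.cuda.runtime.runtimeGetVersion`) run only if CuPy is importable and a GPU is usable.

```python
# Hypothetical helpers (not part of CuPy) for choosing a v13.6.0 wheel.
WHEELS = {11: "cupy-cuda11x", 12: "cupy-cuda12x", 13: "cupy-cuda13x"}


def wheel_for_cuda(major):
    """Return the PyPI package name for a CUDA major version."""
    try:
        return WHEELS[major]
    except KeyError:
        raise ValueError(f"no CuPy v13 wheel for CUDA {major}.x") from None


def decode_cuda_version(version):
    """Split CUDA's integer version (1000 * major + 10 * minor)."""
    return version // 1000, (version % 1000) // 10


print(wheel_for_cuda(13))          # cupy-cuda13x
print(decode_cuda_version(13000))  # (13, 0)

try:
    import cupy
except ImportError:
    cupy = None  # CuPy is optional for this sketch

if cupy is not None and cupy.cuda.is_available():
    # Report the CUDA runtime version that the installed CuPy is using.
    print(decode_cuda_version(cupy.cuda.runtime.runtimeGetVersion()))
```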

📝 Changes

Enhancements

Update cupyx.scipy.special functions for SciPy 1.16 (#9246)
Add CUDA 13 support (#9300)
Support building NVTX on Windows without Nsight Systems (#9304)
Remove bundled header files (#9305)
Fix freqz for complex w (#9259)
Fix resample error message for SciPy 1.16 update (#9262)

Bug Fixes

Fix overflow in CUB reduction (#9254)
Fix lsmr type promotion rule for complex dtype (#9277)
Fix UnboundLocalError when blocking=True (#9282)
Allow host function call during CUDA graph capture (#9283)
CUDA 11.1 or earlier is no longer supported (#9285)

Documentation

Docs only: s/"recoreded"/recorded (#9288)
CUDA 13 Update docs (#9299)

Installation

[v13] Bump version to v13.6.0 (#9314)

Tests

CI: Introduce per-PR kernel cache (#9235)
Add test cases for batchwise solve_triangular (as xfail) (#9245)
Relax tolerance of test_hilbert (#9255)
Skip some signal q dtype tests (#9256)
Increase CPU memory limit of linux.cuda{128,129} CIs (#9261)
Support nD reductions for sparse arrays (#9268)
[v13] Missing backport of special function tests (#9269)
[v13] Wrong test skip condition of test_zscore_empty (#9270)
Support SciPy 1.16 on Windows (#9276)
Support SciPy 1.16 on Linux (#9284)
CI: NVTX1 removed from Windows machine image (#9303)
Fix CI failure in CUDA 12.4 (#9311)
[v13] Fix scipy version condition of COO matrix test (#9312)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@asi1024 @brycelelbach @Ellecee @emcastillo @kmaehashi @robertmaynard

Assets 48

cupy-13.6.0.tar.gz

sha256:3cba30ae3dd32b5d5c6536e710cb98015227cd4ba83c46b3f1825a7ae55b6667

3.17 MB 2025-08-16T09:10:25Z
cupy_cuda11x-13.6.0-cp310-cp310-manylinux2014_aarch64.whl

sha256:d107f0e5079c4ee72714f2b7e4fd8655f5d45418bcfd82727cdd16ab755f9351

107 MB 2025-08-16T09:09:43Z
cupy_cuda11x-13.6.0-cp310-cp310-manylinux2014_x86_64.whl

sha256:0d7f29e4644b468a00d1ef443e9394bba79932b57b8f77746c19d31dccd9ec94

94.4 MB 2025-08-16T09:09:51Z
cupy_cuda11x-13.6.0-cp310-cp310-win_amd64.whl

sha256:8c369423302a7cc654f5cc7ce8969cfb4fb70ffa54349e90c8fb841ffb822253

73.3 MB 2025-08-17T07:26:02Z
cupy_cuda11x-13.6.0-cp311-cp311-manylinux2014_aarch64.whl

sha256:a89efc831b561077e9d940474f77e1d84f81701a9456061c0da8a2a7907610c8

108 MB 2025-08-16T09:09:43Z
cupy_cuda11x-13.6.0-cp311-cp311-manylinux2014_x86_64.whl

sha256:6fbbc042580d3c6170a449a2643235bef16e74cc997a143767c47d4d6ce95ed2

95.2 MB 2025-08-16T09:09:51Z
cupy_cuda11x-13.6.0-cp311-cp311-win_amd64.whl

sha256:5097ea9f88b991e9abe31e93b1be0106ecc74f3f1fe43461803eb673574eb642

73.3 MB 2025-08-17T07:26:02Z
cupy_cuda11x-13.6.0-cp312-cp312-manylinux2014_aarch64.whl

sha256:389a94e1943457fd155836fe7f532fa713024cfd34f22141019718a42aad346e

108 MB 2025-08-16T09:09:43Z
cupy_cuda11x-13.6.0-cp312-cp312-manylinux2014_x86_64.whl

sha256:0418f788908985cb615d4c8fe1dda46bf0c09462626e643eda694b33002ae296

95 MB 2025-08-16T09:09:52Z
cupy_cuda11x-13.6.0-cp312-cp312-win_amd64.whl

sha256:fad3d7fa1e38638dbc5d6101d9c486cc8fe2dec4ce48f3f4709afcaf45770993

73.2 MB 2025-08-17T07:26:02Z
Source code (zip)

2025-08-18T09:28:00Z
Source code (tar.gz)

2025-08-18T09:28:00Z

11 Jul 04:59

kmaehashi

v13.5.1

f450813

v13.5.1

This is the release note of v13.5.1. This is a hot-fix release to address an issue related to the buffer protocol support for UMP added in v13.5.0 (#9223). See here for the complete list of solved issues and merged PRs.

📝 Changes

Bug Fixes

Fix buffer protocol to raise TypeError when it is not meant to be supported (#9222)

Installation

Bump version to v13.5.1 (#9224)
Fix long_description missing after pyproject.toml migration (#9231)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@kmaehashi @leofang

03 Jul 06:59

kmaehashi

v13.5.0

e30a0cc

v13.5.0

Note

2025-07-11: We have marked this release as "yanked" on PyPI to prevent new installations due to unexpected regressions. The hot-fix release v13.5.1 is available.
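
Because v13.5.0 is yanked, environments that already installed it may want to check for it. The following is only a sketch: `is_yanked`, `installed_cupy`, and the list of distribution names probed are illustrative assumptions, not an official tool.

```python
from importlib import metadata

YANKED = {"13.5.0"}
# Assumed, non-exhaustive list of CuPy distribution names to probe.
CANDIDATES = ("cupy", "cupy-cuda11x", "cupy-cuda12x")


def is_yanked(version):
    """True if the given version string is the yanked v13.5.0 release."""
    return version in YANKED


def installed_cupy():
    """Map each installed CuPy distribution name to its version string."""
    found = {}
    for name in CANDIDATES:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


for name, version in installed_cupy().items():
    status = "yanked, upgrade to 13.5.1 or later" if is_yanked(version) else "ok"
    print(f"{name} {version}: {status}")
```

A requirement specifier such as `cupy-cuda12x!=13.5.0` is one way to exclude the yanked version explicitly.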

This is the release note of v13.5.0. See here for the complete list of solved issues and merged PRs.

✨ Highlights

CuPy now supports NVIDIA CUDA 12.9 and AMD ROCm 6.4 platforms, and NumPy 2.3.
Unified Memory Programming support for HMM/ATS-enabled systems (such as the NVIDIA Grace Hopper Superchip) has been added. Refer to the documentation for usage.
Binary packages on PyPI (wheels) can now load NCCL packages installed via Pip (e.g., nvidia-nccl-cu12). In addition, Arm (aarch64) wheels are now built with NCCL support enabled.

Request for Comments

We are going to finalize the following RFC issues.

Drop support for cuDNN in CuPy v14 (#8215)
Update set of supported ROCm versions in CuPy v13/v14 (#8607)
Remove cupyx.tools.install_library in CuPy v14 (#9204)

📝 Changes

New Features

Support system allocated memory (#9033)

Enhancements

Fix rocThrust build for ROCm 6.3 (#9023)
Allow discovering cuTENSOR using major version (#9037)
Support FIPS enabled machines with MD5 hashing (#9055)
Update cutensornet accelerator based on cuquantum-python 25.03 deprecation (#9058)
Refactor hashing (#9059)
Raise user warning in both {to,from}Dlpack & Update the Interoperability page (#9061)
Allow build on ROCm 6.4 (#9100)
Migrate to pyproject.toml (#9135)
Support NCCL for aarch64 (#9141)
Support loading NCCL from Pip packages (#9208)
Support CUDA 12.9 and NCCL 2.26 (#9211)
Fix cupyx.scipy.stats.zscore for SciPy 1.15 (#9024)

Performance Improvements

Implement lazy load for cuquantum (#9104)

Bug Fixes

JIT: Support empty return (#9001)
API: Revert toDlpack() default to the old unversioned one (#9007)
BUG: Hot fix for numpy 2 support in some fusion paths (#9012)
Fix compilation error of cupy.inf in fusion2 (#9043)
Support Cython 3.1 (#9132)
Fix cupyx.scipy.linalg.expm (#9144)

Code Fixes

Fix get_typename to emit thrust::complex (#9054)

Documentation

Add an AI policy to prohibit misuse of the issue tracker (#9095)
Update ROCm docs (#9108)
Docs: Update build-time requirement of Cython (#9145)
Fix WARNING: Inline emphasis start-string without end-string (#9168)
Improve API reference list (#9189)
Bump supported NumPy version to v2.3 (#9203)

Installation

Limit Cython version to 3.0 or 3.1 (#9146)
Bump NumPy version restriction (#9166)
Make rebuild faster for development (#9196)
Bump version to v13.5.0 (#9212)

Tests

CI: Do not run full CI on CUDA 12.0/12.1/12.2 + Windows (#9000)
CI: Pin setuptools version on Windows (#9039)
Revert "CI: Pin setuptools version on Windows" (#9056)
Mark xfails in some spline tests for SciPy 1.15 (#9060)
Support SciPy 1.15 (#9063)
Skip some dtype checks with NumPy 2.x (#9064)
Skip tests for different behavior of integer overflow from NumPy 2 (#9072)
Skip some cupyx.scipy.special tests for SciPy 1.15 (#9073)
Skip some tests for numerical error from NumPy 2 (#9075)
Do gc.collect() in MemoryHook test code to avoid free hook to happen (#9093)
np.sum has numerical change in NumPy 2.3 (#9169)
Fix cp.empty(None) to raise TypeError (#9174)
CI: NumPy 2.3 (#9194)
Add NumPy 2.3 + windows CI (#9197)
Update pre-commit settings (#9202)
Add cupy.win.cuda129 CI (#9214)
Fix test trigger phrase for cupy.win.cuda129 CI (#9217)

Others

Allow specifying no libraries when generating wheel metadata (#9080)
Upgrade pre-commit hooks (#9156)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@asi1024 @Azusachan @EarlMilktea @ev-br @jakirkham @kmaehashi @leofang @MattTheCuber @rongou @seberg @yangcal

04 Apr 08:41

kmaehashi

v14.0.0a1

1e8ade1

v14.0.0a1 Pre-release

This is the release note of v14.0.0a1. See here for the complete list of solved issues and merged PRs.

✨ Highlights

This is the first alpha release of the CuPy v14 series, containing:

New type promotion rules and behaviors aligned with the NumPy 2 specification.
42 new NumPy/SciPy-compatible APIs, including cupy.concat, cupyx.scipy.interpolate.CubicSpline, cupyx.scipy.spatial.Delaunay, cupyx.scipy.ndimage.find_objects, and cupyx.scipy.special.lambertw. See the Comparison Table for the detailed coverage.

Binary packages are available for testing. Try installing now by:

$ pip install cupy-cuda12x --pre -U -f https://pip.cupy.dev/pre

🛠️ Changes without compatibility

CuPy v14’s behavior will be aligned with NumPy v2.
- Type promotion rules are now NEP50 compatible. See Changes to NumPy data type promotion.
- int is now 64-bit (int64) on Windows. See Windows default integer.
- APIs removed in NumPy v2 (see Changes to namespaces) are marked as deprecated in CuPy v14. They remain available in v14 to ease migration, but are planned for removal in the next major release (CuPy v15).
- The behavior of copy argument has been changed (#8545). See Adapting to changes in the copy keyword.
Support for Python 3.9, NumPy 1.22 and 1.23, SciPy 1.7, 1.8, and 1.9 has been dropped. (#8491)
cupy.random.choice may return different results from CuPy v13. (#8483)
Building CuPy from source code now requires Cython 3.0. (#8457)
cupyx.scipy.linalg.{tri,tril,triu} APIs were removed from CuPy to follow the latest SciPy specification. Use cupy.{tri,tril,triu} instead. (#8499)
NumPy fallback mode (cupyx.fallback_mode) has been removed as discussed in #8497. (#8816)
Legacy DLPack APIs (cupy.toDlpack and cupy.fromDlpack) are now marked deprecated. Use cupy.from_dlpack instead. See the documentation for the usage. (#8831)
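
The tri/tril/triu and DLPack items above could be migrated roughly as sketched below. The `REPLACEMENTS` table and the `demo` function are illustrative names only; the array part assumes a CUDA-capable GPU and returns `None` otherwise.

```python
# Old name -> replacement, taken from the list above.
REPLACEMENTS = {
    "cupyx.scipy.linalg.tri": "cupy.tri",
    "cupyx.scipy.linalg.tril": "cupy.tril",
    "cupyx.scipy.linalg.triu": "cupy.triu",
    "cupy.fromDlpack": "cupy.from_dlpack",
}

try:
    import cupy as cp
except ImportError:
    cp = None  # lets the sketch be imported without CuPy


def demo():
    """Apply the replacements to a small array; None if no GPU is usable."""
    if cp is None or not cp.cuda.is_available():
        return None
    a = cp.arange(9, dtype=cp.float32).reshape(3, 3)
    lower = cp.tril(a)          # instead of cupyx.scipy.linalg.tril(a)
    shared = cp.from_dlpack(a)  # instead of cupy.fromDlpack(a.toDlpack())
    # from_dlpack shares memory with the producer rather than copying it.
    same_memory = shared.data.ptr == a.data.ptr
    return lower.get().tolist(), same_memory


print(demo())
```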

📝 Changes

New Features

Add KDTree to cupyx.scipy.spatial (#7671)
Add neighbors option to RbfInterpolator (#7864)
ENH: cupyx/signal: add sweep_poly (#7873)
Add 2D Delaunay triangulation (#7985)
Add cupyx.signal.pulse_compression from cuSignal's non SciPy-compat API (#8022)
Add LinearNDInterpolator to cupyx.scipy.interpolate (#8035)
Add cupyx.signal.convolve1d3o from cuSignal's non SciPy-compat API (#8037)
Add cupyx.signal.{firfilter,firfilter_zi,firfilter2} (#8052)
Add cupyx.signal.{pulse_doppler, cfar_alpha} (#8057)
Add cupyx.signal.{complex_cepstrum,real_cepstrum,inverse_complex_cepstrum,minimum_phase} (#8062)
Add cupyx.signal.mvdr (#8077)
ENH: signal: add lanczos and kaiser_bessel_derived windows (#8081)
Add cupyx.signal.ca_cfar (#8087)
Add cupyx.signal.convolve1d2o (#8101)
Add cupyx.signal.freq_shift (#8128)
Add lambertw function (#8140)
Add cupyx.signal.channelize_poly (#8141)
Add cupyx.scipy.interpolate.CubicSpline (#8175)
Add apply_over_axes API (#8177)
Add cupy.put_along_axis API (#8199)
Add CloughTocher2DInterpolator to cupyx.scipy.interpolate (#8208)
Add NearestNDInterpolator to cupyx.scipy.interpolate (#8220)
Add NdBSpline to cupyx.scipy.interpolate (#8223)
ENH: cupyx/scipy/interpolate: add *UnivariateSpline for 1D smoothing splines (#8267)
Add NdBSpline based interpolation methods to RGI (#8276)
ENH: cupyx/interpolate: port interp1d from scipy (#8289)
Add batched solve_triangular (#8329)
Add Incomplete Elliptic Integrals to special (#8425)
Support system allocated memory (#8442)
Add CUDA graph debug function (#8502)
Add sici and shichi to special for sine and cosine integrals (#8620)
Update unique_xxx (nep52) (#8665)
Add cupyx.scipy.ndimage.find_objects (#8916)

Enhancements

Support for break and continue keywords in CuPy JIT (#8010)
Make cupyx.signal.radartools private (#8047)
Remove usages of numpy.float_ and numpy.complex_ (#8050)
Support cusparseLt 0.6.1 (#8074)
Add incontiguous support for cutensor functions (#8149)
Add complex support for the digamma function (#8163)
Fix expm(complex matrix) (#8206)
Add CutensorMg support (#8212)
Add cudaStreamCreateWithPriority (#8219)
Add the nearest method for percentile/quantile estimation (#8224)
Various Jitify improvements (#8235)
Support fallback algorithm for spgemm (#8252)
Bump to cuTENSOR 2.0.1 (#8282)
Preload cuTENSORMg (#8283)
Use weakref.finalize instead of __del__ for RandomState._generator destruction (#8315)
Support ROCm 6 (#8319)
cupyx: cleanup use of deprecated NumPy functionality (NumPy 2.0 compatibility) (#8320)
Add wright_bessel function to special (#8324)
MAINT: fft, linalg: add __all__ lists (#8333)
Cuda 12.5 Tests (#8337)
Add axes support in ndimage filters module (#8339)
MAINT: interpolate: update RBF to scipy 1.13 (#8343)
Make CuPy import under NumPy 2.0 (#8346)
Lazy-preload NCCL (#8360)
Fix map_coordinates recompilation condition (#8378)
Disable jitify for cub & Bump CCCL (#8412)
Use custom less instead of specializing thrust (#8446)
Port to Cython 3.0 (#8457)
Avoid using Jitify everywhere inside CuPy (#8467)
Get rid of pkg_resources (#8480)
Drop support for Python 3.9, NumPy 1.22 and 1.23, SciPy 1.7, 1.8 and 1.9 (#8491)
Remove deprecated cupyx.scipy.linalg.{tri,tril,triu} (#8499)
Use .toarray() instead of .A attribute (#8508)
Support half option in scipy.signal.minimum_phase (#8510)
Increase MAX_NDIM to 64 (#8511)
Support CUDA 12.6 (#8513)
Fallback to system headers for future CUDA 12.x versions (#8518)
Extend runtime header search logic to conda (#8519)
Support copy=None in cp.array / cp.asarray / cp.asanyarray (#8545)
Fix dtype rule of cupy.scipy.stats.entropy for SciPy 1.14 (#8547)
Support setuptools 74.0.0 or later (#8583)
Add NCCL_ERROR_REMOTE_ERROR to the set of errors from NCCL (#8662)
Replace numpy.ComplexWarning with cupy.exceptions.ComplexWarning (#8676)
ENH: Implement dlpack v1 (#8683)
Fix some NumPy 2.x CI failures (cont.) (#8695)
Bump CUDA version in cuda11x-cuda-python CI (#8737)
[ROCm 6.2.2] Conditionally define CUDA_SUCCESS only if it's not (#8793)
Remove fallback mode (#8816)
Raise user warning in both {to,from}Dlpack & Update the Interoperability page (#8831)
Use a custom Min/Max instead of specializing CUB (#8846)
Updating pylibraft pairwise_distance to cuvs (#8847)
add axes support for additional functions in cupyx.scipy.ndimage (from SciPy 1.15.0) (#8858)
Raise VisibleDeprecationWarning for wavelet functions (#8865)
Support CUDA 12.8 + Blackwell GPUs (sm_100, sm_120) (#8899)
Bump library installers for CUDA 12.8 (#8914)
Use CCCL 2.8.x branch + Use CUPY_CACHE_KEY in hash keys (#8919)
Use NVIDIA CCCL 2.8 latest w/CUDA 12.3 fix (#8924)
Use C++17 in JIT compile (#8940)
Restore CUB histogram and bincount (#8950)
Broaden usage of C++17 (#8952)
cupyx.scipy.distance: initialize output array with empty instead of zeros (#8971)
cupyx.scipy.spatial.distance.cdist remove explicit zeroing of user-provided output array (#8988)
Fix rocThrust build for ROCm 6.3 (#9022)
Allow discovering cuTENSOR using major version (#9030)
Update cutensornet accelerator based on cuquantum-python 25.03 deprecation (#9045)
Support FIPS enabled machines with MD5 hashing (#9053)
Refactor hashing (#9057)

Enhancements for NumPy & SciPy compatibility:

Fix scp.signal.{medfilt,medfilt2d} to raise ValueError for complex64 inputs (#8059)
Deprecate cupyx.scipy wavelet functions (#8061)
Fix csrmatrix.__pow__ to raise ValueError for non-int other (#8063)
Fix cupyx.scipy.special.betainc for invalid inputs (#8065)
scipy.special.{btdtr,btdtri} are deprecated since SciPy 1.12 (#8066)
Fix boxcox_llf for SciPy 1.12 changes (#8095)
NEP50 (#8323)
Resolve Ruff NPY errors - fix exception imports and asfarray usage in test code (#8455)
Fix sparse.linalg function signatures following SciPy 1.14 (#8526)
NumPy 2.0 compatibility: (partially) sync with NEP52 (#8531)
Fix dtype rule of special functions for SciPy 1.14 (#8532)
Fix cupy.histogram arg order to match NumPy (v1.24+) (#8559)
Make cupy.linalg.solve compatible with numpy v2 (#8629)
Silence FutureWarning emitted when rcond is missing (#8638)
Fix some NumPy 2.x CI failures (#8690)
Support kind arg. in sorting methods (#8708)
Fix cupy.percentile for NumPy 2.x (#8726)
Fix some NumPy 2.x CI failures (cupyx) (#8727)
Skip some tests incompatible with NumPy 2.2 (#8817)
Fix scipy.spmatrix.sign for complex dtype inputs (#8822)
Fix return type of cupy.where for scalar arguments for NumPy 2.0 (#8835)
Fix cupyx.scipy.special.logsumexp for NumPy 2.0 (#8836)
Fix cupy.cov (#8839)
Fix cupy.histogramdd for NumPy 2.x (#8873)
Raise ValueError upon attempts to create 3-dim sparse array (#8877)
Disable contiguous_check for COO/dense matmul test (#8878)...

Contributors

seberg, hmaarrfk, and 48 other contributors

21 Mar 07:28

kmaehashi

v13.4.1

6e3c9b7

v13.4.1

This is the release note of v13.4.1. This is a hot-fix release addressing several issues including DLPack compatibility with existing user code. See here for the complete list of solved issues and merged PRs.

📝 Changes

Bug Fixes

Revert toDlpack() default to the old unversioned one (#9011)
Hot fix for numpy 2 support in some fusion paths (#9016)
Fix compilation error of cupy.inf in fusion2 (#9044)

Tests

CI: Pin setuptools version on Windows (#9047)

Others

Bump version to v13.4.1 (#9051)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@asi1024 @kmaehashi @seberg

28 Feb 06:41

kmaehashi

v13.4.0

fca48bc

v13.4.0

This is the release note of v13.4.0. See here for the complete list of solved issues and merged PRs.

✨ Highlights

NVIDIA CUDA 12.8 Support

CuPy now supports CUDA 12.8 and the latest NVIDIA Blackwell architecture.

AMD ROCm 6.x Support

CuPy can now be built with AMD ROCm 6.x.

Python 3.13 Support

Binary packages for Python 3.13 are now available.

🛠️ Changes without compatibility

Cython 3.0 as build requirement (#8959)

To support Python 3.13, the CuPy codebase has been updated for Cython 3. Building CuPy from source now requires Cython 3.0 or later instead of Cython 0.29.x.

📝 Changes

New Features

Add cupyx.signal.mvdr (#8872)

Enhancements

Support ROCm 6 (#8608)
Support setuptools 74.0.0 or later (#8649)
Use custom less instead of specializing thrust (#8653)
Add NCCL_ERROR_REMOTE_ERROR to the set of errors from NCCL (#8667)
Replace numpy.ComplexWarning with cupy.exceptions.ComplexWarning (#8678)
Use weakref.finalize instead of __del__ for RandomState._generator destruction (#8680)
Implement dlpack v1 (#8722)
Fix some NumPy 2.x CI failures (cont.) (#8725)
Bump CUDA version in cuda11x-cuda-python CI (#8743)
ROCm 6.2.2: Conditionally define CUDA_SUCCESS only if it's not (#8799)
Raise VisibleDeprecationWarning for wavelet functions (#8868)
Use a custom Min/Max instead of specializing CUB (#8875)
Updating pylibraft pairwise_distance to cuvs (#8897)
Support CUDA 12.8 + Blackwell GPUs (sm_100, sm_120) (#8915)
Interpolate: update RBF to scipy 1.13 (#8939)
Use C++17 in JIT compile (#8941)
Bump library installers for CUDA 12.8 (#8943)
Use CCCL 2.8.x branch + Use CUPY_CACHE_KEY in hash keys (#8946)
Use NVIDIA CCCL 2.8 latest w/CUDA 12.3 fix (#8948)
Broaden usage of C++17 (#8958)
Port to Cython 3.0 (#8959)
cupyx.scipy.distance: initialize output array with empty instead of zeros (#8981)
cupyx.scipy.spatial.distance.cdist remove explicit zeroing of user-provided output array (#8990)
Skip sparse.linalg.{cg, cgs, gmres} tests for scipy>=1.14 (#8551)
cupyx.scipy.sparse tests for SciPy 1.14 (#8552)
Fix some NumPy 2.x CI failures (cupyx) (#8738)
Fix cupy.percentile for NumPy 2.x (#8752)
Skip some tests incompatible with NumPy 2.2 (#8830)
Disable contiguous_check for COO/dense matmul test (#8888)
Raise ValueError upon attempts to create 3-dim sparse array (#8889)
Skip a test for invalid scipy return value of invalid COO matmul (#8890)
Fix fft.fht following bug fix in SciPy 1.15 (#8891)
Support empty tuple indexing for sparse matrix (#8892)
Deprecate cupyx.scipy.linalg.kron (#8902)
Fix test for special.sph_harm to ignore DeprecationWarning (#8906)

Bug Fixes

Add nccl.broadcast 64-bit support (#8566)
Support building CuPy with setuptools 74 (#8577)
Fix order 'K' with shape given for *_like array creation (#8605)
hipPointerGetAttributes returns error when pointer is unregistered in ROCm 5.7 (#8609)
Guard for ROCm 6.x (#8611)
Fix HIP_VERSION unit (#8619)
Switch to using platform.machine() instead of platform.processor() (#8656)
Properly allocate in RNG when specified dtype is neither float32/float64 (#8658)
Use platform.machine() instead of platform.processor() (#8673)
Fix sosfilt state output shape when ndim < 2 (#8679)
Fix undefined inf/nan constant in CuPy JIT (#8712)
Fix bspline kernel to avoid out of bounds error (#8763)
Fix race during SoftLink initialization (#8787)
fix nanargmin and nanargmax's parameter order and pass optional parameters (#8791)
Fix crashes of quantile and percentile (#8811)
Fix handling of pinned memory (#8852)
Use /bigobj on Windows build (#8967)
Fix cupyx.scipy.spatial.distance's cdist for RAPIDS 24.12 compatibility (#8975)

Code Fixes

Upgrade pre-commit hooks to silence warnings (#8666)
Resolve import loop (#8714)
Resolve uncaught type warning (#8798)
Switch from .A attribute to .toarray() method (#8814)
Fix typo in _cretate_frame_tree (#8944)
Drop unneeded bytes copy of CUPY_CACHE_KEY (#8947)

Documentation

Add docs about CUDA headers (#8595)
Update fft.rst (#8617)
Update documentation to use pre-commit (#8650)
Add tips on Windows development in Contribution Guide (#8704)
Add notice about cupy.array_api removal (#8751)
Add CUDA 12.8 to docs (#8968)
Update list of supported versions (#8991)

Installation

Update conda-build CUDA detection logic for Setuptools 72.2.0 (#8652)
Use relative path of header files to generate cache key (#8930)
Fix minimum CUDA version check and update comments (#8938)
Bump version to v13.4.0 (#8993)

Tests

Relax test_firls atol (#8522)
Skip test_homomorphic in scipy>=1.14 (#8523)
Skip betaincinv test with SciPy 1.14.1 (#8553)
Skip special tests for SciPy 1.14 dtype rule changes (#8554)
Skip special.logsumexp test for empty input (#8555)
Skip cupy.scipy.stats.entropy tests for SciPy 1.14 dtype rule change (#8556)
Use setuptools==73.0.1 (#8569)
Revert CI timeout bump (#8571)
Support SciPy 1.13 and 1.14 (#8572)
Missing backport for sparse_array.A removal (#8573)
Skip test_log_expit SciPy 1.7 (#8576)
Catch ValueError (#8625)
Use testing.with_requires to skip broken tests (#8627)
CI: Update micro versions of Python (#8635)
Skip tests if scipy is not installed (#8637)
Accept OverflowError in TestCopytoFromScalar for NumPy v2 (#8643)
Skip more tests if scipy is not installed (#8645)
Update precommit (#8663)
Backport the changes introduced in #8690 (#8694)
CI: Fix apt repository URL for Ubuntu 22.04 (#8715)
Remove ndarray.ptp from fallback tests (#8744)
Temporary skip for NumPy 2.0 tests (#8745)
Relax tolerance of test_hilbert for NumPy 2.0 (#8746)
Bump SciPy version to 1.14 in Windows CI (#8764)
Add NumPy 2.x CI for Linux (#8768)
CI: support "skip-ci" label (#8841)
CI: Fix FlexCI compatibility (#8842)
Add NumPy 2.2 to CI (#8855)
Replace flake8 with ruff (#8859)
Support Optuna 4 (#8863)
Add testing.shaped_linspace (#8900)
Disable contiguous_check for some signal.cont2discrete tests (#8901)
Fix splines tests to remove unexpected skips (#8921)
Minor updates for sm120 (#8922)
Add CI for CUDA 12.8 (#8951)
Increase host memory in Windows CI, free GPU memory in example code (#8969)
Skip some signal tests for TypeError for inputs of np.longlong dtype (#8972)
Add CI for Python 3.13 and mpi4py v4 (#8974)
Pass locals dict to exec (#8985)

Others

Add backport reminder (#8684)
Fix script name of backport reminder (#8686)
Update pre-commit hooks (#8910)
Fix pull request project board workflows (#8929)
Regenerate coverage matrix (#8960)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@99991 @andfoy @asi1024 @Azusachan @bernhardmgruber @Berrysoft @chainer-ci @cjnolet @dagardner-nv @EarlMilktea @eltociear @ev-br @grlee77 @HollowMan6 @jakirkham @jemiryguo @kmaehashi @leofang @littlewu2508 @mohitreddy1996 @mroeschke @seberg @takagi

22 Aug 07:42

kmaehashi

v13.3.0

118ade4

v13.3.0

This is the release note of v13.3.0. See here for the complete list of solved issues and merged PRs.

✨ Highlights

Updated NVIDIA CCCL

The CCCL library bundled with CuPy has been updated to eliminate the Jitify preprocess phase. Users will no longer see the one-time performance warning (Jitify is performing a one-time only warm-up to populate the persistent cache, this may take a few seconds and will be improved in a future release...) unless explicitly requesting the use of Jitify (e.g., cupy.RawModule(..., jitify=True)).
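
To illustrate the distinction, here is a small sketch of compiling a raw kernel with `cupy.RawModule`. With the default `jitify=False` the Jitify path is not requested; passing `jitify=True` opts in explicitly. The kernel `scale` and the wrapper `run_scale` are made up for this example, which returns `None` when CuPy or a GPU is unavailable.

```python
try:
    import cupy as cp
except ImportError:
    cp = None  # lets the sketch be imported without CuPy

# Illustrative CUDA C kernel: y[i] = a * x[i]
SOURCE = r'''
extern "C" __global__
void scale(const float* x, float* y, float a, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i];
    }
}
'''


def run_scale(n=8, factor=2.0, use_jitify=False):
    """Compile and launch the kernel; return the result as a Python list."""
    if cp is None or not cp.cuda.is_available():
        return None
    # jitify=True would explicitly request the Jitify preprocessing path.
    module = cp.RawModule(code=SOURCE, jitify=use_jitify)
    kernel = module.get_function("scale")
    x = cp.arange(n, dtype=cp.float32)
    y = cp.empty_like(x)
    threads = 128
    blocks = (n + threads - 1) // threads
    # Scalars are passed as NumPy-typed values so their sizes match the C types.
    kernel((blocks,), (threads,), (x, y, cp.float32(factor), cp.int32(n)))
    return y.get().tolist()


print(run_scale())
```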

Enhanced NumPy 2.0 Compatibility

This release provides better interoperability with NumPy 2.0.

Support for CUDA 12.5 & 12.6

CuPy is now tested with CUDA 12.5 and 12.6.

RFC: Removing NumPy Fallback Mode in CuPy v14

The CuPy team is discussing the possibility of removing the NumPy fallback mode in CuPy v14. Feel free to join the discussion in #8497 if you have comments or use cases that rely on this feature.

📝 Changes

Enhancements

Support CUDA 12.5 (#8423)
Avoid using Jitify everywhere inside CuPy (#8473)
Disable jitify for cub & Bump CCCL (#8487)
Get rid of pkg_resources (#8496)
Unregister cupyx.scipy.linalg.{tri,tril,triu} from uarray (reverted in #8516) (#8506)
Use .toarray() instead of .A attribute (#8517)
Extend runtime header search logic to conda (#8520)
Support CUDA 12.6 (#8524)
Fallback to system headers for future CUDA 12.x versions (#8529)

Bug Fixes

Fix spline temp container size in make_interp_spline (#8390)
MAINT: Avoid using np.compat.integer_types (#8413)
Fix type dispatcher for arm64 (#8414)
Fix ndarray.get() not honoring current stream when layout is not contiguous (#8418)
Fix copyto for NumPy 2 compatibility (#8435)
Update compiler.py to avoid the popup of the nvcc.exe console (#8438)
Fix RandomState.seed() for NumPy 2 compatibility (#8439)
Fix the size of temporary CUB output space to consider its alignment (#8447)
Address KeyErrors from importlib_metadata (#8465)
upfirdn: mode=None -> mode="constant" (#8495)
Search header files from CTK wheel (#8504)
Fix CUDA version condition to use headers from wheel (#8507)
Do not unregister cupyx.scipy.linalg.{tri,tril,triu} from uarray (#8516)
Fix ROCm 4.3 binary package build broken (#8534)
Fix cudart header detection for conda (#8535)

Documentation

eigsh doc correction _eigen.py (#8383)
typo: coping -> copying (#8427)
Add CUDA 12.5 to list of supported platform (#8428)
Add comparison table for (cupyx.)scipy.sparse.*_matrix classes class methods (#8458)

Installation

Patch the build system to better support conda-build (#8464)

Tests

Bump NumPy/SciPy versions in cuda-example CI (#8420)
Support SciPy 1.12 (#8422)
Fix CUDA 11.2 CI failure on Linux (#8437)
Decrease number of threads to avoid "system error: excessive memory usage is detected" (#8462)
CI: skip CUDA 12.1/12.2/12.3/12.4 CI on "mini" trigger (#8469)
Resolve Ruff NPY errors - fix exception imports and asfarray usage in test code (#8471)
Skip some tests in aarch64 CI (#8490)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@andfoy @arkdong @asi1024 @bmerry @EarlMilktea @emcastillo @hmaarrfk @jakirkham @johnnynunez @kmaehashi @leofang @monzelr @seberg @swelborn @takagi @YanivDorGalron

13 Jun 05:29

takagi

v13.2.0

b127fb1

v13.2.0

This is the release note of v13.2.0. See here for the complete list of solved issues and merged PRs.

✨ Highlights

Support for NumPy 2.0 (#8357)

CuPy can now be imported under NumPy 2.0.

Lazily preloading NCCL (#8367)

CuPy now loads the NCCL shared library when cupy.cuda.nccl is imported, rather than at import cupy. This improves NCCL compatibility in mixed-library environments.
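
In practice the import order looks like the sketch below. The function name `nccl_runtime_version` is hypothetical; the sketch assumes `cupy.cuda.nccl` exposes `available` and `get_version()`, and it returns `None` when CuPy or NCCL cannot be loaded.

```python
def nccl_runtime_version():
    """Return NCCL's version integer, or None when NCCL is unavailable."""
    try:
        import cupy  # noqa: F401  (per the note above, NCCL is not loaded yet)
        import cupy.cuda.nccl as nccl  # the NCCL shared library is loaded here
    except (ImportError, OSError):
        return None
    # Assumption: builds without NCCL expose a placeholder with available=False.
    if not getattr(nccl, "available", True):
        return None
    return nccl.get_version()


print(nccl_runtime_version())
```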

📝 Changes

Enhancements

cupyx: cleanup use of deprecated NumPy functionality (NumPy 2.0 compatibility) (#8325)
make CuPy import under NumPy 2.0 (#8357)
Lazy-preload NCCL (#8367)

Bug Fixes

Fix overflow indexing ndarray generated with as_strided (#8349)
Fix CUB build error on win-64 (#8358)
Re-enable NVTX range coloring for NVTX3. (#8361)

Documentation

Update fft.rst (#8310)
Find and fix typos with codespell (#8344)
Add NumPy 2.0 on document (#8371)

Tests

[v13] Use the latest NumPy v1 for head CI (#8355)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@asi1024 @cclauss @ev-br @grlee77 @kmaehashi @leofang @macrocosme @romerojosh @takagi

19 Apr 07:40

takagi

v13.1.0

4c9821b

v13.1.0

This is the release note of v13.1.0. See here for the complete list of solved issues and merged PRs.

✨ Highlights

Support for CUDA 12.3 & 12.4 (#8286)

CuPy now supports CUDA 12.3 and 12.4. Binary packages are available for Linux (x86_64/aarch64) and Windows as cupy-cuda12x.

Fixed Regression on pre-Volta platforms (#8216)

This release fixes a regression in CuPy v13.0.0 that caused some CuPy functions to fail on pre-Volta platforms (compute capability < 7.0), such as NVIDIA Tesla P100 or GeForce GTX 1080.
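
One way to tell whether a device falls in the affected range is to inspect its compute capability. In this sketch `is_pre_volta` is a hypothetical helper; `cupy.cuda.Device().compute_capability` is assumed to return a string such as "61" for compute capability 6.1.

```python
def is_pre_volta(compute_capability):
    """True for compute capability below 7.0, given a string such as '61'."""
    return int(compute_capability) < 70


print(is_pre_volta("60"))  # Tesla P100 -> True
print(is_pre_volta("61"))  # GeForce GTX 1080 -> True
print(is_pre_volta("70"))  # first Volta generation -> False

try:
    import cupy
except ImportError:
    cupy = None  # CuPy is optional for this sketch

if cupy is not None and cupy.cuda.is_available():
    cc = cupy.cuda.Device().compute_capability  # current device
    print(cc, "pre-Volta" if is_pre_volta(cc) else "Volta or newer")
```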

📝 Changes

New Features

Add cupyx.signal.{complex_cepstrum,real_cepstrum,inverse_complex_cepstrum,minimum_phase} (#8096)
Add cupyx.signal.{firfilter,firfilter_zi,firfilter2} (#8107)
Add cupyx.signal.freq_shift (#8131)
Add cupyx.signal.channelize_poly (#8148)
Add cupyx.signal.ca_cfar (#8167)

Enhancements

Add incontiguous support for cutensor functions (#8168)
Remove usages of numpy.float_ and numpy.complex_ (#8181)
Fix expm(complex matrix) (#8214)
Various Jitify improvements (#8237)
Bump to cuTENSOR 2.0.1 (#8291)

NumPy-compatibility Improvements

Fix scp.signal.{medfilt,medfilt2d} to raise ValueError for complex64 inputs (#8084)
Fix boxcox_llf for SciPy 1.12 changes (#8132)
Deprecate cupyx.scipy wavelet functions (#8139)

Bug Fixes

Fix #7981, Update _nccl_comm.py (#8112)
Fix Flags not to allow setters (#8138)
Prevent angular brackets from appearing in Jitify's cache filename (#8160)
Set -arch in the compiler options unconditionally (#8161)
Allow cupy.show_config() without CUDA (#8192)
Fix jitify warmup kernel (#8216)
Fix: remove unnecessary include that causes deployment issue (#8217)
Fix build system for Thrust detection (#8230)
Fix: always switch to the submodule dir before checking git tag/commit (#8240)
Fix overflow of index calculation in random generator API (#8246)
Fix Generator API parallelism (#8247)
Fix CUB min/max initial values (#8266)
Fix jitify warmup kernel - Cont'd (#8270)

Documentation

Update conda installation guide (#8135)
Fix pdist docstring in order to specify that the returned matrix is condensed (#8187)
Replace license notice in cupyx.scipy.signal._spectral (#8271)
Update document for CUDA 12.3 and 12.4 (#8284)

Installation

Do not search for static libs (#8143)

Tests

Fix cupyx.scipy.special.betainc for invalid inputs (#8098)
Revert CI timeout changes (#8137)
Fix invalid vectorstrength tests (#8145)
Fix actions versions used in workflows to avoid node 16 deprecation warning (#8194)
Add CI to test cupy.show_config() pass without CUDA installed (#8195)
Add import test without CUDA Toolkit (#8231)
BUG: cupyx/scipy/signal: fix mpmath test (#8262)
Tentatively pin SciPy to v1.12 in CI (#8275)
Add support for CUDA 12.3 & 12.4 (#8286)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@andfoy @asi1024 @emcastillo @ev-br @jemiryguo @kmaehashi @leofang @takagi

18 Jan 05:54

emcastillo

v13.0.0

1bdaf31

v13.0.0

This is the release note of v13.0.0. See here for the complete list of solved issues and merged PRs.

This release note only covers changes made since the v13.0.0rc1 release. Check out our blog for highlights of the v13 release!

See the Upgrade Guide for the list of possible breaking changes in v13.

💬 Join the Matrix chat to talk with developers and users and ask quick questions!

🙌 Help us sustain the project by sponsoring CuPy!

📝 Changes

For all changes in v13, please refer to the release notes of the pre-releases (alpha1, beta1, rc1).

New Features

Add cupyx.signal.pulse_compression from cuSignal's non SciPy-compat API (#8039)
Add cupyx.signal.convolve1d3o from cuSignal's non SciPy-compat API (#8067)
Add cupyx.signal.{pulse_doppler, cfar_alpha} (#8069)
Add cupyx.signal.convolve1d2o (#8113)

Enhancements

Make cupyx.signal.radartools private (#8053)
Fix csrmatrix.__pow__ to raise ValueError for non-int other (#8085)

Performance Improvements

Speed up cupy environment duplicate detection (#8042)

Bug Fixes

Fix lfilter_zi and sosfilt_zi when any IIR coefficient is zero (#8036)
Fix argmax/argmin for large reduction axis (#8041)
Fix cupyx.scipy.fft.{dst,dstn} in type 2/3 (#8082)
Do not use from-import (#8114)

Code Fixes

Refactor convolve1d3o (#8100)
Refactor radartools (#8106)

Documentation

Generate signature for ufunc documentation (#8044)
Use modern dlpack interface in torch interoperability document (#8048)

Installation

Skip CUDA_PATH warning in Conda installation (#8076)
Bump version to v13.0.0 (#8119)

Tests

Bump stable branch to v13 (#8026)
Remove some signal.vectorstrength xfail tests (#8083)
Fix scipy.linalg not to raise DeprecationWarning for zero-size inputs (#8086)
scipy.special.{btdtr,btdtri} are deprecated since SciPy (#8094)
Refactor radartools tests (#8099)
Fix slow test (#8117)

👥 Contributors

The CuPy Team would like to thank all those who contributed to this release!

@andfoy @asi1024 @emcastillo @hauntsaninja @kmaehashi @takagi
