@kmaehashi kmaehashi released this Apr 17, 2018 · 222 commits to v4 since this release

This is the major release of CuPy v4.0.0. All of the updates since the previous major version (v2.5.0) can be found in the release notes below:

Summary of v4 update

  • We now provide wheel packages. Install the one that matches your CUDA version with one of the following commands:
$ pip install cupy-cuda80
$ pip install cupy-cuda90
$ pip install cupy-cuda91

If you already have an older version of CuPy installed, uninstall it before installing a wheel package. Note that these packages also include the cuDNN and NCCL binaries, so you do not need to install them yourself.

  • The memory pool is now the default allocator even when CuPy is used on its own, without Chainer (this does not affect Chainer users).
  • Many new functions have been added, including FFT support.
  • The version number is now aligned with that of Chainer. This means the “v3.x.x” series has been skipped.
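
As a rough illustration of the points above, the following sketch checks an installed wheel by running a small FFT and reading the usage of the default memory pool. It assumes that `cupy.fft.fft` and `cupy.get_default_memory_pool()` are available in the installed version. The helper name `smoke_test` is hypothetical and not part of CuPy, and it returns None when CuPy or a usable GPU is missing.

```python
def smoke_test():
    """Return a small report dict, or None if CuPy or a GPU is unavailable."""
    try:
        import cupy
    except ImportError:
        # No wheel installed, or the CUDA libraries could not be loaded.
        return None

    try:
        x = cupy.arange(8, dtype=cupy.float32)
        spectrum = cupy.fft.fft(x)  # FFT support
        pool = cupy.get_default_memory_pool()  # pool used by default
        return {
            "version": cupy.__version__,
            # The first FFT coefficient equals the sum of the input values.
            "fft_dc": float(spectrum[0].real),
            "pool_used_bytes": pool.used_bytes(),
        }
    except Exception:
        # Assumption: any failure here means there is no usable CUDA device.
        return None


if __name__ == "__main__":
    print(smoke_test())
```

If the report shows a non-zero `pool_used_bytes`, the arrays above were allocated through the memory pool rather than directly from the device.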

See the Upgrade Guide if you are migrating from CuPy v2 to v4.

Updates from the release candidate are as follows.

New Features

  • Implement cupy.show_config and cupyx.get_runtime_info (#1120)
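
A minimal sketch of how the two new functions might be called, for example when collecting details for a bug report. The wrapper name `print_cupy_config` is hypothetical. The sketch assumes both functions can be called without arguments, and it treats any failure as "no information available".

```python
def print_cupy_config():
    """Print CuPy's runtime configuration; return False if it is unavailable."""
    try:
        import cupy
        import cupyx
    except ImportError:
        print("CuPy is not installed")
        return False

    try:
        # Writes the runtime configuration to standard output.
        cupy.show_config()
        # Returns the same information as an object, e.g. for logging.
        info = cupyx.get_runtime_info()
        print(type(info).__name__)
    except Exception as exc:
        # Assumption: failures here come from a missing or broken CUDA setup.
        print("Could not collect runtime information:", exc)
        return False
    return True


if __name__ == "__main__":
    print_cupy_config()
```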

Enhancements

  • Support double precision atomicAdd on Maxwell or older GPUs (#1114, thanks @anaruse!)
  • Expose all supported dtypes from numpy (#1130)
  • Handle errors in cupy.show_config() (#1135)
  • Fix to capture CuDNNError in cupyx.runtime (#1151)

Bug Fixes

  • Fix diagflat failing when the argument is not a cupy.ndarray (#1058)
  • Fix moveaxis bug (#1059, thanks @fukatani!)
  • Fix duplicate declaration of EigMode in cuSPARSE (#1111)
  • Fix a.real and a.imag to return view (#1113)
  • Fix cupy.concatenate to support arrays with >= 2**31 elements (#1115)
  • Limit arch to the maximum value allowed in each NVRTC version (#1119)
  • Fix duplicate declaration of cudaError_t (#1145)
  • Use streams when calling libraries (#1153)
  • Fix cupy.linalg.inv() modifying its argument (#1154, thanks @hyabe!)
  • Do not use platform-specific CC (#1158)
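
Two of the fixes above (#1113 and #1154) change behavior that user code can observe, so a small check may help when upgrading. The sketch below is illustrative only. The helper names are hypothetical, and the fallback to NumPy when no GPU is available is an assumption made so that the expected semantics can be seen without a CUDA device.

```python
def load_array_module():
    """Return cupy if it imports and a GPU works, else numpy, else None."""
    try:
        import cupy
        cupy.zeros(1)  # probe: raises if there is no usable CUDA device
        return cupy
    except Exception:
        pass
    try:
        import numpy
        return numpy
    except ImportError:
        return None


def check_fixed_behaviour(xp):
    """Return (real_is_view, inv_keeps_input) for the array module xp."""
    a = xp.array([1 + 2j, 3 + 4j])
    a.real[0] = 10  # writes through to a only if .real is a view (#1113)
    real_is_view = bool(a[0] == 10 + 2j)

    m = xp.array([[2.0, 0.0], [0.0, 4.0]])
    before = m.copy()
    xp.linalg.inv(m)  # should leave m untouched (#1154)
    inv_keeps_input = bool((m == before).all())
    return real_is_view, inv_keeps_input


if __name__ == "__main__":
    xp = load_array_module()
    if xp is not None:
        print(xp.__name__, check_fixed_behaviour(xp))
```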

Documentation

  • Update documentation for chainer.backends.cuda (#1050)
  • Fix typo (#1051)
  • Fix typo (#1080)
  • Fix documentation of for_unsigned_dtypes (#1081)
  • Fix wrong references in the documentation (#1102)
  • Remove invalid argument description in cupy.tensordot (#1103)
  • Rewrite installation guide (#1127)
  • Enable flake8 in cupy/indexing/generate.py (#1146)
  • Fix documentation of r_ and c_ (#1149)
  • Fix documentation of MemoryHook (#1150)

Installation

  • Use --no-cache-dir in Dockerfile (#1061)
  • Avoid embedding CUDA_PATH to RPATH in wheels (#1083)

Examples

  • Avoid importing matplotlib just to set its backend to Agg, as in Chainer (#1054)

Tests

  • Remove platform-dependent dtype (#1092)
  • Remove nose dependency (#1126)