Pre-release

v4.0.0rc1

@hvy released this on Mar 20, 2018

This is the release candidate of v4. See here for the complete list of solved issues and merged PRs.

Announcements

  • We have started supporting CUDA 9.1! A new wheel package, cupy-cuda91, is also available from this release. You can install it with pip install cupy-cuda91.
  • The master branch has been switched to v5 development. The development of v4 will continue in the v4 branch.
  • The major release of v4 is planned for Apr. 17.
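
As a rough illustration of the wheel naming mentioned above, the sketch below derives a package name from a CUDA version string. Only cupy-cuda91 is named in these notes. The helper wheel_name is hypothetical and assumes that wheels for other CUDA versions follow the same cupy-cuda<major><minor> pattern, which should be checked against the installation guide before relying on it.

```python
def wheel_name(cuda_version):
    """Return a CuPy wheel name for a CUDA version string such as "9.1".

    Hypothetical helper: it assumes the "cupy-cuda<major><minor>" pattern
    seen in cupy-cuda91 also applies to other CUDA versions.
    """
    parts = cuda_version.split(".")
    # Only the major and minor components are used; a patch level is ignored.
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError("expected a version like '9.1', got %r" % cuda_version)
    return "cupy-cuda%s%s" % (parts[0], parts[1])


if __name__ == "__main__":
    # Build the pip command for a CUDA 9.1 environment.
    print("pip install " + wheel_name("9.1"))
```

The function only formats a name; it does not check that the wheel has actually been published.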

New Features

  • Implement cuDNN convolution interface (#715)
  • Support multi-dimensional arrays in solve (#845)
  • Add cupyx.rsqrt (#846)
  • Add destroy method to NcclCommunicator (#975)
  • Implement __setitem__ in fusion function (#1002)
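
To give a feel for what multi-dimensional support in solve (#845) might look like in practice, here is a minimal sketch that solves a stack of small linear systems in one call. It assumes cupy.linalg.solve follows the batched semantics of numpy.linalg.solve, and it falls back to NumPy when CuPy is not installed (CuPy itself needs a CUDA-capable GPU at run time). The array values are made up for illustration.

```python
try:
    import cupy as xp  # requires a CUDA-capable GPU at run time
except ImportError:
    import numpy as xp  # assumption: NumPy stands in when CuPy is absent

# Two independent 2x2 systems stacked along the first axis.
a = xp.array([[[2.0, 0.0], [0.0, 4.0]],
              [[1.0, 1.0], [0.0, 1.0]]])   # shape (2, 2, 2)
b = xp.array([[2.0, 4.0],
              [3.0, 1.0]])                 # shape (2, 2), one right-hand side per system

# Pass b as a stack of column vectors, shape (2, 2, 1), so that every
# system is solved in a single call, then drop the trailing axis again.
x = xp.linalg.solve(a, b[..., None])[..., 0]

# Check the result by substituting it back into a @ x = b.
residual = xp.matmul(a, x[..., None])[..., 0] - b
max_error = float(xp.abs(residual).max())
print(x.shape, max_error)
```

Passing the right-hand sides with an explicit trailing axis avoids any ambiguity between a stack of vectors and a single matrix.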

Bug Fixes

  • Fix overflow in indices when indexing (#758, thanks @yuyu2172!)
  • Fix matrix multiplication when matrices have duplicated entries (#834)
  • Fix multithread bug with CUDA driver API (#916)
  • Remove trailing NULL from values returned from NVRTC (#942)
  • Fix ndarray.diagonal to handle the axis2 argument correctly (#978, thanks @ronekko!)
  • Fix eliminate_zeros (#998)
  • Remove cudnn STATUS dict (#1012)
  • Fix temporary variables which are used when input_num is given (#1020)
  • Fix cupy.copyto ignoring the where argument when src is a scalar (#1028)
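
Two of the fixes above concern behavior that is easy to check by hand. The sketch below shows the results one would expect after the fixes, assuming CuPy matches NumPy here: copyto with a scalar src and a where mask writes only the masked positions, and diagonal honors an explicit axis2. It falls back to NumPy when CuPy is not installed, and the arrays are made up for illustration.

```python
try:
    import cupy as xp  # requires a CUDA-capable GPU at run time
except ImportError:
    import numpy as xp  # assumption: NumPy stands in when CuPy is absent

# copyto with a scalar source: only positions where the mask is True change.
dst = xp.zeros((2, 3))
mask = xp.array([[True, False, True],
                 [False, True, False]])
xp.copyto(dst, 7.0, where=mask)

# diagonal over axes 0 and 2 of a (2, 3, 4) array: entry [j, i] of the
# result is t[i, j, i], so the result has shape (3, 2).
t = xp.arange(24).reshape(2, 3, 4)
d = t.diagonal(offset=0, axis1=0, axis2=2)

print(dst)
print(d.shape)
```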

Installation

  • Fix to use rpath only when wheel libs are specified (#980)
  • Update to CUDA 8.0 and use CuPy wheels in Dockerfiles (#991)
  • Support CUDA 9.1 (#997)
  • Fix exception handling failure on Windows with Python 3.x (#1000)

Enhancements

  • Improve matrix inverse speed using LU decomposition (#695, #927, thanks @stevendbrown!)
  • Use CUDA version to decide whether to import cuSOLVER (#832)
  • Simplify fp16 code in carray.cuh (#870)
  • Fix potential error at the stride for loop over the j-axis in ReductionKernel (#874, thanks @grafi-tt!)
  • Hide Chunk class (#933)
  • Improve concatenate and other functions (#949)
  • Use nogil in FFT (#950)
  • Improve error message when import fails (#970)
  • Use current stream in array method (#981)
  • Change group argument name of create_convolution_descriptor (#988)
  • Use default stream in _scatter_op (#989)
  • Expose CUDNN_BN_MIN_EPSILON from cudnn.h (#1011)

Documents

  • Add upgrade guide for v2 & v4 (#884)
  • Add wheels to installation guide (#955)
  • Update array docstring (#982, thanks @juniorrojas!)
  • Prefer pip in documentation (#985)
  • Add Docker update information to upgrade guide (#993)
  • Remove unnecessary heading from reference (#996)
  • Add URL to the directory in the documentation (#999)
  • Fix misspellings of NumPy and CuPy (#1013)
  • Fix typo (#1015)
  • Fix scatter_add docs (#1025)
  • Add complex dtypes on the overview (#1026)
  • Document more on CuPy/NumPy difference (#1027)
  • Add SciPy license to document (#1037)
  • Fix broken link to numpy.sum (#1039)

Tests

  • Free huge memory in slow test; Fix sum test to avoid contiguousness difference between CuPy and NumPy (#971)
  • Add AppVeyor configuration (#1001)
  • Fix shaped_random for complex number (#1017)
  • Skip cuDNN tests when cuDNN is unavailable (#1041)
  • Add Codecov.io configuration (#1003)

Others

  • Remove outdated TODO (#1014)