Merge pull request #2896 from stuartarchibald/wip/release_notes_038

Release notes for 0.38.
sklam committed Apr 13, 2018
2 parents 66b90e4 + d326f9f commit a0390f660c7e78e144237d737f2089c9f0d77f0b
Showing with 137 additions and 0 deletions.
  1. +137 −0 CHANGE_LOG
@@ -1,3 +1,140 @@
Version 0.38.0
Following on from the bug fix focus of the last release, this release swings
back towards the addition of new features and usability improvements based on
community feedback. This release is comparatively large! Three key features/
changes to note are:
* Numba (via llvmlite) is now backed by LLVM 6.0; general vectorization is
improved as a result. A significant, long-standing LLVM bug that was causing
corruption was also found and fixed.
* Further considerable improvements in vectorization are made available as
Numba now supports Intel's short vector math library (SVML).
Try it out with `conda install -c numba icc_rt`.
* CUDA 8.0 is now the minimum supported CUDA version.
Other highlights include:
* Bug fixes to `parallel=True` have enabled more vectorization opportunities
when using the ParallelAccelerator technology.
* Much effort has gone into improving error reporting and the general usability
of Numba. This includes highlighted error messages and performance tips
documentation. Try it out with `conda install colorama`.
* A number of new NumPy functions are supported: `np.convolve`, `np.correlate`,
`np.reshape`, `np.transpose`, `np.random.permutation`, `np.real` and `np.imag`.
In addition, `np.searchsorted` now supports the `side` kwarg and `np.argsort`
now supports the `kind` kwarg with `quicksort` and `mergesort` available.
* The Numba extension API has gained the ability to operate more easily with
functions from Cython modules through the use of
`numba.extending.get_cython_function_address` to obtain function addresses
for direct use in `ctypes.CFUNCTYPE`.
* Numba now allows the passing of jitted functions (and containers of jitted
functions) as arguments to other jitted functions.
* The CUDA functionality has gained support for a larger selection of bit
manipulation intrinsics and for SELP, and has had a number of bugs fixed.
* Initial work to support the PPC64LE platform has been added. It should be
noted that this platform is not yet supported and some problems still exist.
It is hoped that any remaining issues will be fixed in the next release.
* Advanced users/compiler engineers can now define their own compilation
pipelines.
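
A few of the highlights above can be sketched in a short example: the `side` kwarg of `np.searchsorted`, the `kind` kwarg of `np.argsort`, and passing a jitted function as an argument to another jitted function. This is only an illustration and not taken from the release itself; the function names (`insertion_points`, `stable_order`, `square`, `apply_twice`) are hypothetical, and the `ImportError` fallback is an assumption added so the sketch still runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def insertion_points(sorted_values, probes):
    # `side` selects which end of a run of equal values is reported.
    left = np.searchsorted(sorted_values, probes, side='left')
    right = np.searchsorted(sorted_values, probes, side='right')
    return left, right


@njit
def stable_order(keys):
    # 'mergesort' is a stable sort: equal keys keep their original order.
    return np.argsort(keys, kind='mergesort')


@njit
def square(x):
    return x * x


@njit
def apply_twice(func, x):
    # `func` is a jitted function received as an ordinary argument.
    return func(func(x))


left, right = insertion_points(np.array([1, 2, 2, 3]), np.array([2]))
order = stable_order(np.array([2, 1, 2, 1]))
result = apply_twice(square, 3)
```

With `side='left'` the probe value 2 maps to the start of the run of 2s, and with `side='right'` to the position just past it.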
Enhancements:
* PR #2660: Support bools from cffi in nopython.
* PR #2741: Enhance error message for undefined variables.
* PR #2744: Add diagnostic error message to test suite discovery failure.
* PR #2748: Added Intel SVML optimizations as opt-out choice working by default
* PR #2762: Support transpose with axes arguments.
* PR #2777: Add support for np.correlate and np.convolve
* PR #2779: Implement np.random.permutation
* PR #2801: Passing jitted functions as args
* PR #2802: Support np.real() and np.imag()
* PR #2807: Expose `import_cython_function`
* PR #2821: Add kwarg 'side' to np.searchsorted
* PR #2822: Adds stable argsort
* PR #2832: Fixups for llvmlite 0.23/llvm 6
* PR #2836: Support `index` method on tuples
* PR #2839: Support for np.transpose and np.reshape.
* PR #2843: Custom pipeline
* PR #2847: Replace signed array access indices in unsigned prange loop body
* PR #2859: Add support for improved error reporting.
* PR #2880: This adds a github issue template.
* PR #2881: Build recipe to clone Intel ICC runtime.
* PR #2882: Update TravisCI to test SVML
* PR #2893: Add reference to the data buffer in array.ctypes object
* PR #2895: Move to CUDA 8.0
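
As a rough illustration of some of the entries listed above (PR #2777, PR #2802 and PR #2836), the following sketch calls `np.convolve`, `np.correlate`, `np.real`, `np.imag` and the tuple `index` method from jitted code. The function names are hypothetical, only the default modes of `np.convolve` and `np.correlate` are used, and the `ImportError` fallback is an assumption added so the sketch runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def convolve_and_correlate(a, v):
    # Default modes: full convolution, 'valid' correlation.
    return np.convolve(a, v), np.correlate(a, v)


@njit
def split_complex(z):
    # Real and imaginary parts of a complex array.
    return np.real(z), np.imag(z)


@njit
def position_of(values, target):
    # `index` on a homogeneous tuple returns the first matching position.
    return values.index(target)


a = np.array([1.0, 2.0, 3.0])
v = np.array([0.0, 1.0, 0.5])
full, valid = convolve_and_correlate(a, v)
re, im = split_complex(np.array([1 + 2j, 3 - 4j]))
pos = position_of((10, 20, 30), 20)
```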
Fixes:
* PR #2737: Fix #2007 (part 1). Empty array handling in np.linalg.
* PR #2738: Fix install_requires to allow pip getting pre-release version
* PR #2740: Fix 2208. Generate better error message.
* PR #2765: Fix Bit-ness
* PR #2780: PowerPC reference counting memory fences
* PR #2805: Fix six imports.
* PR #2813: Fix #2812: gufunc scalar output bug.
* PR #2814: Fix the build post #2727
* PR #2831: Attempt to fix #2473
* PR #2842: Fix issue with test discovery and broken CUDA drivers.
* PR #2850: Add rtsys init guard and test.
* PR #2852: Skip vectorization test with targets that are not x86
* PR #2856: Prevent printing to stdout in ``
* PR #2864: Correct C code to prevent compiler warnings.
* PR #2889: Attempt to fix #2386.
* PR #2891: Removed test skipping for inspect_cfg
* PR #2898: Add guard to parallel test on unsupported platforms
Documentation Updates:
* PR #2739: Explicitly state default value of error_model in docstring
* PR #2803: DOC: parallel vectorize requires signatures
* PR #2829: Add Python 2.7 EOL plan to docs
* PR #2838: Use automatic numbering syntax in list.
* PR #2877: Add performance tips documentation.
* PR #2883: Fix #2872: update rng doc about thread/fork-safety
ParallelAccelerator enhancements/fixes:
* PR #2727: Changes to enable vectorization in ParallelAccelerator.
* PR #2816: Array analysis for transpose with arbitrary arguments
* PR #2874: Fix dead code eliminator not to remove a call with side-effect
* PR #2886: Fix ParallelAccelerator arrayexpr repr
CUDA enhancements:
* PR #2734: More Constants From cuda.h
* PR #2767: Add len(..) Support to DeviceNDArray
* PR #2778: Add More Device Array API Functions to CUDA Simulator
* PR #2824: Add CUDA Primitives for Population Count
* PR #2835: Emit selp Instructions to Avoid Branching
* PR #2867: Full support for CUDA device attributes
CUDA fixes:
* PR #2768: Don't Compile Code on Every Assignment
* PR #2878: Fixes a Win64 issue with the test in Pr/2865
The following people contributed to this release.
* Abutalib Aghayev
* Alex Olivas
* Anton Malakhov
* Dong-hee Na
* Ehsan Totoni (core dev)
* John Zwinck
* Josh Wilson
* Kelsey Jordahl
* Nick White
* Olexa Bilaniuk
* Rik-de-Kort
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
Version 0.37.0
