Version 0.55.0 (13 January, 2022)
---------------------------------
This release includes a significant number of important dependency upgrades along
with a number of new features and bug fixes.
NOTE: Due to NumPy CVE-2021-33430 this release has bypassed the usual release
process so as to promptly provide a Numba release that supports NumPy 1.21. A
single release candidate (RC1) was made and a few issues were reported; these
are summarised below and will be fixed in a subsequent 0.55.1 release.
Known issues with this release:
* Incorrect result copying array-typed field of structured array (`#7693 <https://github.com/numba/numba/pull/7693>`_)
* Two issues in DebugInfo generation (`#7726 <https://github.com/numba/numba/pull/7726>`_, `#7730 <https://github.com/numba/numba/pull/7730>`_)
* Compilation failure for ``hash`` of floating point values on 32 bit Windows
when using Python 3.10 (`#7713 <https://github.com/numba/numba/pull/7713>`_).
Highlights of core dependency upgrades:
* Support for Python 3.10
* Support for NumPy 1.21
Python language support enhancements:
* Experimental support for ``isinstance``.
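As a rough illustration of the experimental ``isinstance`` support, the sketch
below branches on the type of an argument inside a jitted function. The
function name ``kind_code`` and the integer return codes are hypothetical, and
the ``try``/``except`` fallback is only there so the snippet still runs as
plain Python where Numba is not installed.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def kind_code(value):
    # Each call is specialised on the argument type, so the isinstance
    # checks select the branch matching that type.
    if isinstance(value, int):
        return 0
    elif isinstance(value, float):
        return 1
    else:
        return 2


print(kind_code(3), kind_code(2.5))
```

Since the feature is marked experimental, behaviour for less common types may
differ from the Python interpreter.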
NumPy features/enhancements:
The following functions are now supported:
* ``np.broadcast_to``
* ``np.float_power``
* ``np.cbrt``
* ``np.logspace``
* ``np.take_along_axis``
* ``np.average``
* ``np.argmin`` gains support for the ``axis`` kwarg.
* ``np.ndarray.astype`` gains support for types expressed as literal strings.
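A minimal sketch of a few of the additions listed above, used from a jitted
function. The function name ``numpy_additions`` and the sample data are made
up for illustration, and the fallback decorator is an assumption so that the
snippet also runs as plain NumPy code without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def numpy_additions(x):
    roots = np.cbrt(x)               # element-wise cube root
    powers = np.float_power(x, 2.0)  # element-wise power computed in floating point
    row_min = np.argmin(x, axis=1)   # argmin with the ``axis`` kwarg
    as_f32 = x.astype("float32")     # dtype given as a literal string
    return roots, powers, row_min, as_f32


x = np.array([[8.0, 27.0], [64.0, 1.0]])
roots, powers, row_min, as_f32 = numpy_additions(x)
print(roots, powers, row_min, as_f32.dtype)
```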
Highlights of core changes:
* For users of the Numba extension API, Numba now has a new error handling mode
whereby it will treat all exceptions that do not inherit from
``numba.errors.NumbaException`` as a "hard error" and immediately unwind the
stack. This makes it much easier to debug when writing ``@overload``\s etc
from the extension API as there's now no confusion between Python errors and
Numba errors. This feature can be enabled by setting the environment
variable: ``NUMBA_CAPTURED_ERRORS='new_style'``.
* The threading layer selection priority can now be changed via the environment
variable ``NUMBA_THREADING_LAYER_PRIORITY``.
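The two environment variables above could be set from Python as sketched
below. This assumes that Numba reads them when it is first imported, so they
are set beforehand. The space-separated priority string (highest priority
first) and the particular ordering are assumptions chosen for illustration;
check the environment variable reference for the exact accepted values.

```python
import os

# Opt in to the new error handling mode described above.
os.environ["NUMBA_CAPTURED_ERRORS"] = "new_style"

# Assumed format: threading layer names separated by spaces, highest
# priority first. The ordering here is only an example.
os.environ["NUMBA_THREADING_LAYER_PRIORITY"] = "omp tbb workqueue"

# Import Numba only after the variables are set, e.g.:
# import numba
```

The same settings can equally be exported in the shell before launching
Python.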
Highlights of changes for the CUDA target:
* Support for NVIDIA's CUDA Python bindings.
* Support for 16-bit floating point numbers and their basic operations via
intrinsics.
* Streams are provided in the ``Stream.async_done`` result, making it easier to
implement asynchronous work queues.
* Support for structured types in device arrays, character sequences in NumPy
arrays, and some array operations on nested arrays.
* Much underlying refactoring to align the CUDA target more closely with the
CPU target, which lays the groundwork for supporting the high level extension
API in CUDA in future releases.
Intel also kindly sponsored research and development into native debug (DWARF)
support and handling per-function compilation flags:
* Line number/location tracking is much improved.
* Numba's internal representations of containers (e.g. tuples, arrays) are now
encoded as structures.
* Numba's per-function compilation flags are encoded into the ABI field of the
mangled name of the function such that it's possible to compile and
differentiate between versions of the same function with different flags set.
General deprecation notices:
* There are no new general deprecations.
CUDA target deprecation notices:
* There are no new CUDA target deprecations.
Version support/dependency changes:
* Python 3.10 is supported.
* NumPy version 1.21 is supported.
* The minimum supported NumPy version is raised to 1.18 for runtime (compilation
however remains compatible with NumPy 1.11).
Pull-Requests:
* PR `#6075 <https://github.com/numba/numba/pull/6075>`_: add np.float_power and np.cbrt (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#7047 <https://github.com/numba/numba/pull/7047>`_: Support __hash__ for numpy.datetime64 (`Guilherme Leobas <https://github.com/guilhermeleobas>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7057 <https://github.com/numba/numba/pull/7057>`_: Fix #7041: Add charseq registry to CUDA target (`Graham Markall <https://github.com/gmarkall>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7082 <https://github.com/numba/numba/pull/7082>`_: Added Add/Sub between datetime64 array and timedelta64 scalar (`Nick Riasanovsky <https://github.com/njriasan>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7119 <https://github.com/numba/numba/pull/7119>`_: Add support for `np.broadcast_to` (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#7129 <https://github.com/numba/numba/pull/7129>`_: Add support for axis keyword argument to np.argmin() (`Itamar Turner-Trauring <https://github.com/itamarst>`_)
* PR `#7132 <https://github.com/numba/numba/pull/7132>`_: gh #7131 Support for astype with literal strings (`Nick Riasanovsky <https://github.com/njriasan>`_)
* PR `#7177 <https://github.com/numba/numba/pull/7177>`_: Add debug infomation support based on datamodel. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7185 <https://github.com/numba/numba/pull/7185>`_: Add get_impl_key as abstract method to types.Callable (`Alexey Kozlov <https://github.com/kozlov-alexey>`_)
* PR `#7186 <https://github.com/numba/numba/pull/7186>`_: Add support for np.logspace. (`Guoqiang QI <https://github.com/guoqiangqi>`_)
* PR `#7189 <https://github.com/numba/numba/pull/7189>`_: CUDA: Skip IPC tests on ARM (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7190 <https://github.com/numba/numba/pull/7190>`_: CUDA: Fix test_pinned on Jetson (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7192 <https://github.com/numba/numba/pull/7192>`_: Fix missing import in array.argsort impl and add more tests. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7196 <https://github.com/numba/numba/pull/7196>`_: Fixes for lineinfo emission (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7197 <https://github.com/numba/numba/pull/7197>`_: don't post to python announce on the first RC (`esc <https://github.com/esc>`_)
* PR `#7202 <https://github.com/numba/numba/pull/7202>`_: Initial implementation of np.take_along_axis (`Itamar Turner-Trauring <https://github.com/itamarst>`_)
* PR `#7203 <https://github.com/numba/numba/pull/7203>`_: remove duplicate changelog entries (`esc <https://github.com/esc>`_)
* PR `#7216 <https://github.com/numba/numba/pull/7216>`_: Update CHANGE_LOG for 0.54.0rc2 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7219 <https://github.com/numba/numba/pull/7219>`_: bump llvmlite dependency to 0.38.0dev0 for Numba 0.55.0dev0 (`esc <https://github.com/esc>`_)
* PR `#7220 <https://github.com/numba/numba/pull/7220>`_: update release checklist post 0.54rc1+2 (`esc <https://github.com/esc>`_)
* PR `#7221 <https://github.com/numba/numba/pull/7221>`_: Show GPU UUIDs in cuda.detect() output (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7222 <https://github.com/numba/numba/pull/7222>`_: CUDA: Warn when debug=True and opt=True (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7223 <https://github.com/numba/numba/pull/7223>`_: Replace assertion errors on IR assumption violation (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7226 <https://github.com/numba/numba/pull/7226>`_: Add support for structured types in Device Arrays (`Michael Collison <https://github.com/testhound>`_)
* PR `#7227 <https://github.com/numba/numba/pull/7227>`_: FIX: Typo (`Srinath Kailasa <https://github.com/skailasa>`_)
* PR `#7230 <https://github.com/numba/numba/pull/7230>`_: PR #7171 bugfix only (`stuartarchibald <https://github.com/stuartarchibald>`_ `Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#7234 <https://github.com/numba/numba/pull/7234>`_: add THREADING_LAYER_PRIORITY & NUMBA_THREADING_LAYER_PRIORITY (`Kolen Cheung <https://github.com/ickc>`_)
* PR `#7235 <https://github.com/numba/numba/pull/7235>`_: replace wordings of WIP by draft PR (`Kolen Cheung <https://github.com/ickc>`_)
* PR `#7236 <https://github.com/numba/numba/pull/7236>`_: CUDA: Skip managed alloc tests on ARM (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7237 <https://github.com/numba/numba/pull/7237>`_: fix a typo in a string (`Kolen Cheung <https://github.com/ickc>`_)
* PR `#7241 <https://github.com/numba/numba/pull/7241>`_: Set aliasing information for inplace_binops.. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#7242 <https://github.com/numba/numba/pull/7242>`_: FIX: typo (`Srinath Kailasa <https://github.com/skailasa>`_)
* PR `#7244 <https://github.com/numba/numba/pull/7244>`_: Implement partial literal propagation pass (support 'isinstance') (`Guilherme Leobas <https://github.com/guilhermeleobas>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7247 <https://github.com/numba/numba/pull/7247>`_: Solve memory leak to fix issue #7210 (`Siu Kwan Lam <https://github.com/sklam>`_ `Graham Markall <https://github.com/gmarkall>`_ `ysheffer <https://github.com/ysheffer>`_)
* PR `#7251 <https://github.com/numba/numba/pull/7251>`_: Fix #6001: typed.List ignores ctor arguments with JIT disabled (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7256 <https://github.com/numba/numba/pull/7256>`_: Fix link to the discourse forum in README (`Kenichi Maehashi <https://github.com/kmaehashi>`_)
* PR `#7257 <https://github.com/numba/numba/pull/7257>`_: Use normal list constructor in List.__new__() (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7260 <https://github.com/numba/numba/pull/7260>`_: Support typed lists in `heapq` (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7263 <https://github.com/numba/numba/pull/7263>`_: Updated issue URL for error messages #7261 (`DeviousLab <https://github.com/DeviousLab>`_)
* PR `#7265 <https://github.com/numba/numba/pull/7265>`_: Fix linspace to use np.divide and clamp to stop. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7266 <https://github.com/numba/numba/pull/7266>`_: CUDA: Skip multi-GPU copy test with peer access disabled (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7267 <https://github.com/numba/numba/pull/7267>`_: Fix #7258. Bug in SROA optimization (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7271 <https://github.com/numba/numba/pull/7271>`_: Update 3rd party license text. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7272 <https://github.com/numba/numba/pull/7272>`_: Allow annotations in njit-ed functions (`LunarLanding <https://github.com/LunarLanding>`_)
* PR `#7273 <https://github.com/numba/numba/pull/7273>`_: Update CHANGE_LOG for 0.54.0rc3. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7283 <https://github.com/numba/numba/pull/7283>`_: Added NPM to Glossary and linked to mentions (`Nihal Shetty <https://github.com/nihalshetty-boop>`_)
* PR `#7285 <https://github.com/numba/numba/pull/7285>`_: CUDA: Fix OOB in test_kernel_arg (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7288 <https://github.com/numba/numba/pull/7288>`_: Handle cval as a np attr in stencil generation. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7294 <https://github.com/numba/numba/pull/7294>`_: Continuation of PR #7280, fixing lifetime of TBB task_scheduler_handle (`Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7296 <https://github.com/numba/numba/pull/7296>`_: Fix generator lowering not casting to the actual yielded type (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7298 <https://github.com/numba/numba/pull/7298>`_: Use CBC to pin GCC to 7 on most linux and 9 on aarch64. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7304 <https://github.com/numba/numba/pull/7304>`_: Continue PR#3655: add support for np.average (`Hadia Ahmed <https://github.com/hadia206>`_ `slnguyen <https://github.com/slnguyen>`_)
* PR `#7307 <https://github.com/numba/numba/pull/7307>`_: Prevent mutation of arrays in global tuples. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7309 <https://github.com/numba/numba/pull/7309>`_: Update MapConstraint to handle type coercion for typed.Dict correctly. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7312 <https://github.com/numba/numba/pull/7312>`_: Fix #7302. Workaround missing pthread problem on ppc64le (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7315 <https://github.com/numba/numba/pull/7315>`_: Link ELF obj as DSO for radare2 disassembly CFG (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7316 <https://github.com/numba/numba/pull/7316>`_: Use float64 for consistent typing in heapq tests. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7317 <https://github.com/numba/numba/pull/7317>`_: In TBB tsh test switch os.fork for mp fork ctx (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7319 <https://github.com/numba/numba/pull/7319>`_: Update CHANGE_LOG for 0.54.0 final. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7329 <https://github.com/numba/numba/pull/7329>`_: Improve documentation in reference to CUDA local memory (`Sterling Baird <https://github.com/sgbaird>`_)
* PR `#7330 <https://github.com/numba/numba/pull/7330>`_: Cuda matmul docs (`Sterling Baird <https://github.com/sgbaird>`_)
* PR `#7340 <https://github.com/numba/numba/pull/7340>`_: Add size_t and ssize_t types (`Bruce Merry <https://github.com/bmerry>`_)
* PR `#7345 <https://github.com/numba/numba/pull/7345>`_: Add check for ipykernel file in IPython cache locator (`Sahil Gupta <https://github.com/sahil1105>`_)
* PR `#7347 <https://github.com/numba/numba/pull/7347>`_: fix:updated url for error report and feature rquest using issue template (`DEBARGHA SAHA <https://github.com/Stark-developer01>`_)
* PR `#7349 <https://github.com/numba/numba/pull/7349>`_: Allow arbitrary walk-back in reduction nodes to find inplace_binop. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#7359 <https://github.com/numba/numba/pull/7359>`_: Extend support for nested arrays inside numpy records (`Graham Markall <https://github.com/gmarkall>`_ `luk-f-a <https://github.com/luk-f-a>`_)
* PR `#7375 <https://github.com/numba/numba/pull/7375>`_: CUDA: Run doctests as part of numba.cuda.tests and fix test_cg (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7395 <https://github.com/numba/numba/pull/7395>`_: Fix #7394 and #6550 & Added test & improved error message (`MegaIng <https://github.com/MegaIng>`_)
* PR `#7397 <https://github.com/numba/numba/pull/7397>`_: Add option to catch only Numba `numba.core.errors` derived exceptions. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7398 <https://github.com/numba/numba/pull/7398>`_: Add support for arrayanalysis of tuple args. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#7403 <https://github.com/numba/numba/pull/7403>`_: Fix for issue 7402: implement missing numpy ufunc interface (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#7404 <https://github.com/numba/numba/pull/7404>`_: fix typo in literal_unroll docs (`esc <https://github.com/esc>`_)
* PR `#7419 <https://github.com/numba/numba/pull/7419>`_: insert missing backtick in comment (`esc <https://github.com/esc>`_)
* PR `#7422 <https://github.com/numba/numba/pull/7422>`_: Update Omitted Type to use Hashable Values as Keys for Caching (`Nick Riasanovsky <https://github.com/njriasan>`_)
* PR `#7429 <https://github.com/numba/numba/pull/7429>`_: Update CHANGE_LOG for 0.54.1 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7432 <https://github.com/numba/numba/pull/7432>`_: add github release task to checklist (`esc <https://github.com/esc>`_)
* PR `#7440 <https://github.com/numba/numba/pull/7440>`_: Refactor TargetConfig naming. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7441 <https://github.com/numba/numba/pull/7441>`_: Permit any string as a key in literalstrkeydict type. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7442 <https://github.com/numba/numba/pull/7442>`_: Add some diagnostics to SVML test failures. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7443 <https://github.com/numba/numba/pull/7443>`_: Refactor template selection logic for targets. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7444 <https://github.com/numba/numba/pull/7444>`_: use correct variable name in closure (`esc <https://github.com/esc>`_)
* PR `#7447 <https://github.com/numba/numba/pull/7447>`_: cleanup Numba metadata (`esc <https://github.com/esc>`_)
* PR `#7453 <https://github.com/numba/numba/pull/7453>`_: CUDA: Provide stream in async_done result (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7456 <https://github.com/numba/numba/pull/7456>`_: Fix invalid codegen for #7451. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7457 <https://github.com/numba/numba/pull/7457>`_: Factor out target registry selection logic (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7459 <https://github.com/numba/numba/pull/7459>`_: Include compiler flags in symbol mangling (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7460 <https://github.com/numba/numba/pull/7460>`_: Add FP16 support for CUDA (`Michael Collison <https://github.com/testhound>`_ `Graham Markall <https://github.com/gmarkall>`_)
* PR `#7461 <https://github.com/numba/numba/pull/7461>`_: Support NVIDIA's CUDA Python bindings (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7465 <https://github.com/numba/numba/pull/7465>`_: Update changelog for 0.54.1 release (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7477 <https://github.com/numba/numba/pull/7477>`_: Fix unicode operator.eq handling of Optional types. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7479 <https://github.com/numba/numba/pull/7479>`_: CUDA: Print format string and warn for > 32 print() args (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7483 <https://github.com/numba/numba/pull/7483>`_: NumPy 1.21 support (`Sebastian Berg <https://github.com/seberg>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7484 <https://github.com/numba/numba/pull/7484>`_: Fixed outgoing link to nvidia documentation. (`Dhruv Patel <https://github.com/DhruvPatel01>`_)
* PR `#7493 <https://github.com/numba/numba/pull/7493>`_: Consolidate TLS stacks in target configuration (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7496 <https://github.com/numba/numba/pull/7496>`_: CUDA: Use a single dispatcher class for all kinds of functions (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7498 <https://github.com/numba/numba/pull/7498>`_: refactor with-detection logic (`stuartarchibald <https://github.com/stuartarchibald>`_ `esc <https://github.com/esc>`_)
* PR `#7499 <https://github.com/numba/numba/pull/7499>`_: Add build scripts for CUDA testing on gpuCI (`Charles Blackmon-Luca <https://github.com/charlesbluca>`_ `Graham Markall <https://github.com/gmarkall>`_)
* PR `#7500 <https://github.com/numba/numba/pull/7500>`_: Update parallel.rst (`Julius Bier Kirkegaard <https://github.com/juliusbierk>`_)
* PR `#7506 <https://github.com/numba/numba/pull/7506>`_: Enhance Flags mangling/demangling (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7514 <https://github.com/numba/numba/pull/7514>`_: Fixup cuda debuginfo emission for 7177 (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7525 <https://github.com/numba/numba/pull/7525>`_: Make sure `demangle()` returns `str` type. (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7538 <https://github.com/numba/numba/pull/7538>`_: Fix `@overload_glue` performance regression. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7539 <https://github.com/numba/numba/pull/7539>`_: Fix str decode issue from merge #7525/#7506 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7546 <https://github.com/numba/numba/pull/7546>`_: Fix handling of missing const key in LiteralStrKeyDict (`Siu Kwan Lam <https://github.com/sklam>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7547 <https://github.com/numba/numba/pull/7547>`_: Remove 32bit linux scipy installation. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7548 <https://github.com/numba/numba/pull/7548>`_: Correct evaluation order in assert statement (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7552 <https://github.com/numba/numba/pull/7552>`_: Prepend the inlined function name to inlined variables. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7557 <https://github.com/numba/numba/pull/7557>`_: Python3.10 v2 (`stuartarchibald <https://github.com/stuartarchibald>`_ `esc <https://github.com/esc>`_)
* PR `#7560 <https://github.com/numba/numba/pull/7560>`_: Refactor with detection py310 (`Siu Kwan Lam <https://github.com/sklam>`_ `esc <https://github.com/esc>`_)
* PR `#7561 <https://github.com/numba/numba/pull/7561>`_: fix a typo (`Kolen Cheung <https://github.com/ickc>`_)
* PR `#7567 <https://github.com/numba/numba/pull/7567>`_: Update docs to note meetings are public. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7570 <https://github.com/numba/numba/pull/7570>`_: Update the docs and error message for errors when importing Numba. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7580 <https://github.com/numba/numba/pull/7580>`_: Fix #7507. catch `NotImplementedError` in `.get_function()` (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7581 <https://github.com/numba/numba/pull/7581>`_: Add support for casting from int enums (`Michael Collison <https://github.com/testhound>`_)
* PR `#7583 <https://github.com/numba/numba/pull/7583>`_: Make numba.types.Optional __str__ less verbose. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7588 <https://github.com/numba/numba/pull/7588>`_: Fix casting of start/stop in linspace (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7591 <https://github.com/numba/numba/pull/7591>`_: Remove deprecations (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7596 <https://github.com/numba/numba/pull/7596>`_: Fix max symbol match length for r2 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7597 <https://github.com/numba/numba/pull/7597>`_: Update gdb docs for new DWARF enhancements. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7603 <https://github.com/numba/numba/pull/7603>`_: Fix list.insert() for refcounted values (`Ehsan Totoni <https://github.com/ehsantn>`_)
* PR `#7605 <https://github.com/numba/numba/pull/7605>`_: Fix TBB 2021 DSO names on OSX/Win and make TBB reporting consistent (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7606 <https://github.com/numba/numba/pull/7606>`_: Ensure a prescribed threading layer can load in CI. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7610 <https://github.com/numba/numba/pull/7610>`_: Fix #7609. Type should not be mutated. (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7618 <https://github.com/numba/numba/pull/7618>`_: Fix the doc build: docutils 0.18 not compatible with pinned sphinx (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7626 <https://github.com/numba/numba/pull/7626>`_: Fix issues with package dependencies. (`stuartarchibald <https://github.com/stuartarchibald>`_ `esc <https://github.com/esc>`_)
* PR `#7627 <https://github.com/numba/numba/pull/7627>`_: PR 7321 continued (`stuartarchibald <https://github.com/stuartarchibald>`_ `Eric Wieser <https://github.com/eric-wieser>`_)
* PR `#7628 <https://github.com/numba/numba/pull/7628>`_: Move to using windows-2019 images in Azure (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7632 <https://github.com/numba/numba/pull/7632>`_: Capture output in CUDA matmul doctest (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7636 <https://github.com/numba/numba/pull/7636>`_: Copy prange loop header to after the parfor. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#7637 <https://github.com/numba/numba/pull/7637>`_: Increase the timeout on the SVML tests for loaded machines. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7645 <https://github.com/numba/numba/pull/7645>`_: In debuginfo, do not add noinline to functions marked alwaysinline (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7650 <https://github.com/numba/numba/pull/7650>`_: Move Azure builds to OSX 10.15 (`stuartarchibald <https://github.com/stuartarchibald>`_ `esc <https://github.com/esc>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
Authors:
* `Bruce Merry <https://github.com/bmerry>`_
* `Charles Blackmon-Luca <https://github.com/charlesbluca>`_
* `DeviousLab <https://github.com/DeviousLab>`_
* `Dhruv Patel <https://github.com/DhruvPatel01>`_
* `Todd A. Anderson <https://github.com/DrTodd13>`_
* `Ehsan Totoni <https://github.com/ehsantn>`_
* `Eric Wieser <https://github.com/eric-wieser>`_
* `esc <https://github.com/esc>`_
* `Graham Markall <https://github.com/gmarkall>`_
* `Guilherme Leobas <https://github.com/guilhermeleobas>`_
* `Guoqiang QI <https://github.com/guoqiangqi>`_
* `Hadia Ahmed <https://github.com/hadia206>`_
* `Kolen Cheung <https://github.com/ickc>`_
* `Itamar Turner-Trauring <https://github.com/itamarst>`_
* `Julius Bier Kirkegaard <https://github.com/juliusbierk>`_
* `Kenichi Maehashi <https://github.com/kmaehashi>`_
* `Alexey Kozlov <https://github.com/kozlov-alexey>`_
* `luk-f-a <https://github.com/luk-f-a>`_
* `LunarLanding <https://github.com/LunarLanding>`_
* `MegaIng <https://github.com/MegaIng>`_
* `Nihal Shetty <https://github.com/nihalshetty-boop>`_
* `Nick Riasanovsky <https://github.com/njriasan>`_
* `Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_
* `Sahil Gupta <https://github.com/sahil1105>`_
* `Sebastian Berg <https://github.com/seberg>`_
* `Sterling Baird <https://github.com/sgbaird>`_
* `Srinath Kailasa <https://github.com/skailasa>`_
* `Siu Kwan Lam <https://github.com/sklam>`_
* `slnguyen <https://github.com/slnguyen>`_
* `DEBARGHA SAHA <https://github.com/Stark-developer01>`_
* `stuartarchibald <https://github.com/stuartarchibald>`_
* `Michael Collison <https://github.com/testhound>`_
* `ysheffer <https://github.com/ysheffer>`_
Version 0.54.1 (7 October, 2021)
--------------------------------
This is a bugfix release for 0.54.0. It fixes a regression in structured array
type handling, a potential leak on initialization failure in the CUDA target, a
regression caused by Numba's vendored cloudpickle module resetting dynamic
classes and a few minor testing/infrastructure related problems.
* PR `#7348 <https://github.com/numba/numba/pull/7348>`_: test_inspect_cli: Decode exception with default (utf-8) codec (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7360 <https://github.com/numba/numba/pull/7360>`_: CUDA: Fix potential leaks when initialization fails (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7386 <https://github.com/numba/numba/pull/7386>`_: Ensure the NRT is initialized prior to use in external NRT tests. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7388 <https://github.com/numba/numba/pull/7388>`_: Patch cloudpickle to not reset dynamic class each time it is unpickled (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7393 <https://github.com/numba/numba/pull/7393>`_: skip azure pipeline test if file not present (`esc <https://github.com/esc>`_)
* PR `#7428 <https://github.com/numba/numba/pull/7428>`_: Fix regression #7355: cannot set items in structured array data types (`Siu Kwan Lam <https://github.com/sklam>`_)
Authors:
* `esc <https://github.com/esc>`_
* `Graham Markall <https://github.com/gmarkall>`_
* `Siu Kwan Lam <https://github.com/sklam>`_
* `stuartarchibald <https://github.com/stuartarchibald>`_
Version 0.54.0 (19 August, 2021)
--------------------------------
This release includes a significant number of new features, important
refactoring, critical bug fixes and a number of dependency upgrades.
Python language support enhancements:
* Basic support for ``f-strings``.
* ``dict`` comprehensions are now supported.
* The ``sum`` built-in function is implemented.
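The three language features above can be combined in a small jitted function,
sketched below. The name ``summarise`` is hypothetical, only integers are
formatted because f-string support is described as basic, and the fallback
decorator is an assumption so the snippet also runs without Numba.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def summarise(n):
    squares = {i: i * i for i in range(n)}         # dict comprehension
    total = sum([squares[i] for i in range(n)])    # ``sum`` built-in
    return f"n={n} total={total}"                  # basic f-string


summary = summarise(4)
print(summary)
```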
NumPy features/enhancements:
The following functions are now supported:
* ``np.clip``
* ``np.iscomplex``
* ``np.iscomplexobj``
* ``np.isneginf``
* ``np.isposinf``
* ``np.isreal``
* ``np.isrealobj``
* ``np.isscalar``
* ``np.random.dirichlet``
* ``np.rot90``
* ``np.swapaxes``
Also ``np.argmax`` has gained support for the ``axis`` keyword argument and it's
now possible to use ``0d`` NumPy arrays as scalars in ``__setitem__`` calls.
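A short sketch using some of the functions listed above, together with the
``axis`` keyword of ``np.argmax``. The function name ``numpy_054_demo`` and
the input matrix are illustrative only, and the fallback decorator is an
assumption so the snippet also runs as plain NumPy code without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator when Numba is unavailable.
    def njit(func):
        return func


@njit
def numpy_054_demo(m):
    clipped = np.clip(m, 0.0, 5.0)      # limit values to the range [0, 5]
    rotated = np.rot90(m)               # rotate by 90 degrees counterclockwise
    col_argmax = np.argmax(m, axis=0)   # index of the maximum in each column
    return clipped, rotated, col_argmax


m = np.array([[1.0, 9.0], [7.0, 4.0]])
clipped, rotated, col_argmax = numpy_054_demo(m)
print(clipped, rotated, col_argmax)
```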
Internal changes:
* Debugging support through DWARF has been fixed and enhanced.
* Numba now optimises the way in which locals are emitted to help reduce time
spent in LLVM's SROA passes.
CUDA target changes:
* Support for emitting ``lineinfo`` to be consumed by profiling tools such as
Nsight Compute
* Improved fastmath code generation for various trig, division, and other
functions
* Faster compilation using lazy addition of libdevice to compiled units
* Support for IPC on Windows
* Support for passing tuples to CUDA ufuncs
* Performance warnings:
* When making implicit copies by calling a kernel on arrays in host memory
* When occupancy is poor due to kernel or ufunc/gufunc configuration
* Support for implementing warp-aggregated intrinsics:
* Using support for more CUDA functions: ``activemask()``, ``lanemask_lt()``
* The ``ffs()`` function now works correctly!
* Support for ``@overload`` in the CUDA target
Intel kindly sponsored research and development that led to a number of new
features and internal support changes:
* Dispatchers can now be retargeted to a new target via a user defined context
manager.
* Support for custom NumPy array subclasses has been added (including an
overloadable memory allocator).
* An inheritance based model for targets that permits targets to share
``@overload`` implementations.
* Per function compiler flags with inheritance behaviours.
* The extension API now has support for overloading class methods via the
``@overload_classmethod`` decorator.
Deprecations:
* The ``ROCm`` target (for AMD ROC GPUs) has been moved to an "unmaintained"
status and a separate repository stub has been created for it at:
https://github.com/numba/numba-rocm
CUDA target deprecations and breaking changes:
* Relaxed strides checking is now the default when computing the contiguity of
device arrays.
* The ``inspect_ptx()`` method is deprecated. For use cases that obtain PTX for
further compilation outside of Numba, use ``compile_ptx()`` instead.
* Eager compilation of device functions (the case when ``device=True`` and a
signature is provided) is deprecated.
Version support/dependency changes:
* LLVM 11 is now supported on all platforms via llvmlite.
* The minimum supported Python version is raised to 3.7.
* NumPy version 1.20 is supported.
* The minimum supported NumPy version is raised to 1.17 for runtime (compilation
however remains compatible with NumPy 1.11).
* Vendor `cloudpickle <https://github.com/cloudpipe/cloudpickle>`_ `v1.6.0` --
now used for all ``pickle`` operations.
* TBB >= 2021 is now supported and all prior versions are unsupported (it is not
easily possible to maintain support across the ABI breaking changes).
Pull-Requests:
* PR `#4516 <https://github.com/numba/numba/pull/4516>`_: Make setitem accept 0d np-arrays (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#4610 <https://github.com/numba/numba/pull/4610>`_: Implement np.is* functions (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#5984 <https://github.com/numba/numba/pull/5984>`_: Handle idx and size unification in wrap_index manually. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#6468 <https://github.com/numba/numba/pull/6468>`_: Access ``replace_functions_map`` via PreParforPass instance (`Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_ `Reazul Hoque <https://github.com/reazulhoque>`_)
* PR `#6469 <https://github.com/numba/numba/pull/6469>`_: Add address space in pointer type (`Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_ `Reazul Hoque <https://github.com/reazulhoque>`_)
* PR `#6608 <https://github.com/numba/numba/pull/6608>`_: Support f-strings for common cases (`Ehsan Totoni <https://github.com/ehsantn>`_)
* PR `#6619 <https://github.com/numba/numba/pull/6619>`_: Improved fastmath code generation for trig, log, and exp/pow. (`Graham Markall <https://github.com/gmarkall>`_ `Michael Collison <https://github.com/testhound>`_)
* PR `#6681 <https://github.com/numba/numba/pull/6681>`_: Explicitly catch ``with..as`` and raise error. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6689 <https://github.com/numba/numba/pull/6689>`_: Fix setup.py build command detection (`Hannes Pahl <https://github.com/HPLegion>`_)
* PR `#6695 <https://github.com/numba/numba/pull/6695>`_: Enable negative indexing for cuda atomic operations (`Ashutosh Varma <https://github.com/ashutoshvarma>`_)
* PR `#6696 <https://github.com/numba/numba/pull/6696>`_: flake8: made more files flake8 compliant (`Ashutosh Varma <https://github.com/ashutoshvarma>`_)
* PR `#6698 <https://github.com/numba/numba/pull/6698>`_: Fix #6697: Wrong dtype when using np.asarray on DeviceNDArray (`Ashutosh Varma <https://github.com/ashutoshvarma>`_)
* PR `#6700 <https://github.com/numba/numba/pull/6700>`_: Add UUID to CUDA devices (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6709 <https://github.com/numba/numba/pull/6709>`_: Block matplotlib in test examples (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6718 <https://github.com/numba/numba/pull/6718>`_: doc: fix typo in rewrites.rst (extra iterates) (`Alexander-Makaryev <https://github.com/Alexander-Makaryev>`_)
* PR `#6720 <https://github.com/numba/numba/pull/6720>`_: Faster compile (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6730 <https://github.com/numba/numba/pull/6730>`_: Fix Typeguard error (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6731 <https://github.com/numba/numba/pull/6731>`_: Add CUDA-specific pipeline (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6735 <https://github.com/numba/numba/pull/6735>`_: CUDA: Don't parse IR for modules with llvmlite (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6736 <https://github.com/numba/numba/pull/6736>`_: Support for dict comprehension (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6742 <https://github.com/numba/numba/pull/6742>`_: Do not add overload function definitions to index. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6750 <https://github.com/numba/numba/pull/6750>`_: Bump to llvmlite 0.37 series (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6751 <https://github.com/numba/numba/pull/6751>`_: Suppress typeguard warnings that affect testing. (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6753 <https://github.com/numba/numba/pull/6753>`_: The check for internal types in RewriteArrayExprs (`Alexander-Makaryev <https://github.com/Alexander-Makaryev>`_)
* PR `#6755 <https://github.com/numba/numba/pull/6755>`_: install llvmlite from numba/label/dev (`esc <https://github.com/esc>`_)
* PR `#6758 <https://github.com/numba/numba/pull/6758>`_: patch to compile _devicearray.cpp with c++11 (`esc <https://github.com/esc>`_)
* PR `#6760 <https://github.com/numba/numba/pull/6760>`_: Fix scheduler bug where it rounds to 0 divisions for a chunk. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#6762 <https://github.com/numba/numba/pull/6762>`_: Glue wrappers to create @overload from split typing and lowering. (`stuartarchibald <https://github.com/stuartarchibald>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6766 <https://github.com/numba/numba/pull/6766>`_: Fix DeviceNDArray null shape issue (`Michael Collison <https://github.com/testhound>`_)
* PR `#6769 <https://github.com/numba/numba/pull/6769>`_: CUDA: Replace ``CachedPTX`` and ``CachedCUFunction`` with ``CUDACodeLibrary`` functionality (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6776 <https://github.com/numba/numba/pull/6776>`_: Fix issue with TBB interface causing warnings and parfors counting them (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6779 <https://github.com/numba/numba/pull/6779>`_: Fix wrap_index type unification. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#6786 <https://github.com/numba/numba/pull/6786>`_: Fix gufunc kwargs support (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6788 <https://github.com/numba/numba/pull/6788>`_: Add support for fastmath 32-bit floating point divide (`Michael Collison <https://github.com/testhound>`_)
* PR `#6789 <https://github.com/numba/numba/pull/6789>`_: Fix warnings struct ref typeguard (`stuartarchibald <https://github.com/stuartarchibald>`_ `Siu Kwan Lam <https://github.com/sklam>`_ `esc <https://github.com/esc>`_)
* PR `#6794 <https://github.com/numba/numba/pull/6794>`_: refactor and move create_temp_module into numba.tests.support (`Alexander-Makaryev <https://github.com/Alexander-Makaryev>`_)
* PR `#6795 <https://github.com/numba/numba/pull/6795>`_: CUDA: Lazily add libdevice to compilation units (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6798 <https://github.com/numba/numba/pull/6798>`_: CUDA: Add optional Driver API argument logging (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6799 <https://github.com/numba/numba/pull/6799>`_: Print Numba and llvmlite versions in sysinfo (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6800 <https://github.com/numba/numba/pull/6800>`_: Make a common standard API for querying ufunc impl (`Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6801 <https://github.com/numba/numba/pull/6801>`_: ParallelAccelerator will no longer convert StaticSetItem to SetItem because record arrays require StaticSetItems. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#6802 <https://github.com/numba/numba/pull/6802>`_: Add lineinfo flag to PTX and SASS compilation (`Graham Markall <https://github.com/gmarkall>`_ `Max Katz <https://github.com/maxpkatz>`_)
* PR `#6804 <https://github.com/numba/numba/pull/6804>`_: added runtime version to ``numba -s`` (`Kalyan <https://github.com/rawwar>`_)
* PR `#6808 <https://github.com/numba/numba/pull/6808>`_: #3468 continued: Add support for ``np.clip`` (`Graham Markall <https://github.com/gmarkall>`_ `Aaron Russell Voelker <https://github.com/arvoelke>`_)
* PR `#6809 <https://github.com/numba/numba/pull/6809>`_: #3203 additional info in cuda detect (`Kalyan <https://github.com/rawwar>`_)
* PR `#6810 <https://github.com/numba/numba/pull/6810>`_: Fix tiny formatting error in ROC kernel docs (`Felix Divo <https://github.com/felixdivo>`_)
* PR `#6811 <https://github.com/numba/numba/pull/6811>`_: CUDA: Remove test of runtime being a supported version (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6813 <https://github.com/numba/numba/pull/6813>`_: Mostly CUDA: Replace llvmpy API usage with llvmlite APIs (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6814 <https://github.com/numba/numba/pull/6814>`_: Improving context stack (`stuartarchibald <https://github.com/stuartarchibald>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6818 <https://github.com/numba/numba/pull/6818>`_: CUDA: Support IPC on Windows (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6822 <https://github.com/numba/numba/pull/6822>`_: Add support for np.rot90 (`stuartarchibald <https://github.com/stuartarchibald>`_ `Daniel Nagel <https://github.com/braniii>`_)
* PR `#6829 <https://github.com/numba/numba/pull/6829>`_: Fix accuracy of np.arange and np.linspace (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6830 <https://github.com/numba/numba/pull/6830>`_: CUDA: Use relaxed strides checking to compute contiguity (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6833 <https://github.com/numba/numba/pull/6833>`_: Raise TypeError exception if numpy array is cast to scalar (`Michael Collison <https://github.com/testhound>`_)
* PR `#6834 <https://github.com/numba/numba/pull/6834>`_: Remove illegal "debug" kw argument (`Shaun Cutts <https://github.com/shaunc>`_)
* PR `#6836 <https://github.com/numba/numba/pull/6836>`_: CUDA: Documentation updates (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6840 <https://github.com/numba/numba/pull/6840>`_: CUDA: Remove items deprecated in 0.53 + simulator test fixes (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6841 <https://github.com/numba/numba/pull/6841>`_: CUDA: Fix source location on kernel entry and enable breakpoints to be set on kernels by mangled name (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6843 <https://github.com/numba/numba/pull/6843>`_: cross-referenced Array type in docs (`Kalyan <https://github.com/rawwar>`_)
* PR `#6844 <https://github.com/numba/numba/pull/6844>`_: CUDA: Remove NUMBAPRO env var warnings, envvars.py + other small tidy-ups (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6848 <https://github.com/numba/numba/pull/6848>`_: Ignore .ycm_extra_conf.py (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6849 <https://github.com/numba/numba/pull/6849>`_: Add __hash__ for IntEnum (`Hannes Pahl <https://github.com/HPLegion>`_)
* PR `#6850 <https://github.com/numba/numba/pull/6850>`_: Fix up more internal warnings (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6854 <https://github.com/numba/numba/pull/6854>`_: PR 6096 continued (`stuartarchibald <https://github.com/stuartarchibald>`_ `Ivan Butygin <https://github.com/Hardcode84>`_)
* PR `#6861 <https://github.com/numba/numba/pull/6861>`_: updated reference to hsa with roc (`Kalyan <https://github.com/rawwar>`_)
* PR `#6867 <https://github.com/numba/numba/pull/6867>`_: Update changelog for 0.53.1 (`esc <https://github.com/esc>`_)
* PR `#6869 <https://github.com/numba/numba/pull/6869>`_: Implement builtin sum() (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6870 <https://github.com/numba/numba/pull/6870>`_: Add support for dispatcher retargeting using with-context (`stuartarchibald <https://github.com/stuartarchibald>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6871 <https://github.com/numba/numba/pull/6871>`_: Force text-align:left when using Annotate (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#6873 <https://github.com/numba/numba/pull/6873>`_: docs: Update reference to @jitclass location (`David Nadlinger <https://github.com/dnadlinger>`_)
* PR `#6876 <https://github.com/numba/numba/pull/6876>`_: Add trailing slashes to dir paths in CODEOWNERS (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6877 <https://github.com/numba/numba/pull/6877>`_: Add doc for recent target extension features (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6878 <https://github.com/numba/numba/pull/6878>`_: CUDA: Support passing tuples to ufuncs (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6879 <https://github.com/numba/numba/pull/6879>`_: CUDA: NumPy and string dtypes for local and shared arrays (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6880 <https://github.com/numba/numba/pull/6880>`_: Add attribute lower_extension to CPUContext (`Reazul Hoque <https://github.com/reazulhoque>`_)
* PR `#6883 <https://github.com/numba/numba/pull/6883>`_: Add support of np.swapaxes #4074 (`Daniel Nagel <https://github.com/braniii>`_)
* PR `#6885 <https://github.com/numba/numba/pull/6885>`_: CUDA: Explicitly specify objmode + looplifting for jit functions in cuda.random (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6886 <https://github.com/numba/numba/pull/6886>`_: CUDA: Fix parallel testing for all testsuite submodules (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6888 <https://github.com/numba/numba/pull/6888>`_: Get overload to consider compiler flags in cache lookup (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6889 <https://github.com/numba/numba/pull/6889>`_: Address guvectorize too slow for cuda target (`Michael Collison <https://github.com/testhound>`_)
* PR `#6890 <https://github.com/numba/numba/pull/6890>`_: fixes #6884 (`Kalyan <https://github.com/rawwar>`_)
* PR `#6898 <https://github.com/numba/numba/pull/6898>`_: Work on overloading by hardware target. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6911 <https://github.com/numba/numba/pull/6911>`_: CUDA: Add support for activemask(), lanemask_lt(), and nanosleep() (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6912 <https://github.com/numba/numba/pull/6912>`_: Prevent use of varargs in closure calls. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6913 <https://github.com/numba/numba/pull/6913>`_: Add runtests option to gitdiff on the common ancestor (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6915 <https://github.com/numba/numba/pull/6915>`_: Update _Intrinsic for sphinx to capture the inner docstring (`Guilherme Leobas <https://github.com/guilhermeleobas>`_)
* PR `#6917 <https://github.com/numba/numba/pull/6917>`_: Add type conversion for StringLiteral to unicode_type and test. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6918 <https://github.com/numba/numba/pull/6918>`_: Start section on commonly encountered unsupported parfors code. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6924 <https://github.com/numba/numba/pull/6924>`_: CUDA: Fix ``ffs`` (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6928 <https://github.com/numba/numba/pull/6928>`_: Add support for axis keyword arg to numpy.argmax() (`stuartarchibald <https://github.com/stuartarchibald>`_ `Itamar Turner-Trauring <https://github.com/itamarst>`_)
* PR `#6929 <https://github.com/numba/numba/pull/6929>`_: Fix CI failure when gitpython is missing. (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6935 <https://github.com/numba/numba/pull/6935>`_: fixes broken link in numba-runtime.rst (`Kalyan <https://github.com/rawwar>`_)
* PR `#6936 <https://github.com/numba/numba/pull/6936>`_: CUDA: Implement support for PTDS globally (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6937 <https://github.com/numba/numba/pull/6937>`_: Fix memory leak in bytes boxing (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6940 <https://github.com/numba/numba/pull/6940>`_: Fix function resolution for intrinsics across hardware. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6941 <https://github.com/numba/numba/pull/6941>`_: ABC the target descriptor and make consistent throughout. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6944 <https://github.com/numba/numba/pull/6944>`_: CUDA: Support for ``@overload`` (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6945 <https://github.com/numba/numba/pull/6945>`_: Fix issue with array analysis tests needing scipy. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6948 <https://github.com/numba/numba/pull/6948>`_: Refactor registry init. (`stuartarchibald <https://github.com/stuartarchibald>`_ `Graham Markall <https://github.com/gmarkall>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6953 <https://github.com/numba/numba/pull/6953>`_: CUDA: Fix and deprecate ``inspect_ptx()``, fix NVVM option setup for device functions (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6958 <https://github.com/numba/numba/pull/6958>`_: Inconsistent behavior of reshape between numpy and numba/cuda device array (`Lauren Arnett <https://github.com/laurenarnett>`_)
* PR `#6961 <https://github.com/numba/numba/pull/6961>`_: Update overload glue to deal with typing_key (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6964 <https://github.com/numba/numba/pull/6964>`_: Move minimum supported Python version to 3.7 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6966 <https://github.com/numba/numba/pull/6966>`_: Fix issue with TBB test detecting forks from incorrect state. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6971 <https://github.com/numba/numba/pull/6971>`_: Fix CUDA ``@intrinsic`` use (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6977 <https://github.com/numba/numba/pull/6977>`_: Vendor cloudpickle (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#6978 <https://github.com/numba/numba/pull/6978>`_: Implement operator.contains for empty Tuples (`Brandon T. Willard <https://github.com/brandonwillard>`_)
* PR `#6981 <https://github.com/numba/numba/pull/6981>`_: Fix LLVM IR parsing error on use of ``np.bool_`` in globals (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6983 <https://github.com/numba/numba/pull/6983>`_: Support Optional types in ufuncs. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6985 <https://github.com/numba/numba/pull/6985>`_: Implement static set/get items on records with integer index (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6986 <https://github.com/numba/numba/pull/6986>`_: document release checklist (`esc <https://github.com/esc>`_)
* PR `#6989 <https://github.com/numba/numba/pull/6989>`_: update threading docs for function loading (`esc <https://github.com/esc>`_)
* PR `#6990 <https://github.com/numba/numba/pull/6990>`_: Refactor hardware extension API to refer to "target" instead. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6991 <https://github.com/numba/numba/pull/6991>`_: Move ROCm target status to "unmaintained". (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#6995 <https://github.com/numba/numba/pull/6995>`_: Resolve issue where nan was being assigned to int type numpy array (`Michael Collison <https://github.com/testhound>`_)
* PR `#6996 <https://github.com/numba/numba/pull/6996>`_: Add constant lowering support for `SliceType`s (`Brandon T. Willard <https://github.com/brandonwillard>`_)
* PR `#6997 <https://github.com/numba/numba/pull/6997>`_: CUDA: Remove catch of NotImplementedError in target.py (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#6999 <https://github.com/numba/numba/pull/6999>`_: Fix errors introduced by the cloudpickle patch (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7003 <https://github.com/numba/numba/pull/7003>`_: More mainline fixes (`stuartarchibald <https://github.com/stuartarchibald>`_ `Graham Markall <https://github.com/gmarkall>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7004 <https://github.com/numba/numba/pull/7004>`_: Test extending the CUDA target (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7007 <https://github.com/numba/numba/pull/7007>`_: Made stencil compilation not fail for arrays of conflicting types. (`MegaIng <https://github.com/MegaIng>`_)
* PR `#7008 <https://github.com/numba/numba/pull/7008>`_: Added support for np.random.dirichlet with all size arguments (`Rishi Kulkarni <https://github.com/rishi-kulkarni>`_)
* PR `#7016 <https://github.com/numba/numba/pull/7016>`_: Docs: Add DALI to list of CAI-supporting libraries (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7018 <https://github.com/numba/numba/pull/7018>`_: Remove cu{blas,sparse,rand,fft} from library checks (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7019 <https://github.com/numba/numba/pull/7019>`_: Support NumPy 1.20 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7020 <https://github.com/numba/numba/pull/7020>`_: Fix #7017. Adds util class PickleCallableByPath (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7024 <https://github.com/numba/numba/pull/7024>`_: fixed llvmir usage in create_module method (`stuartarchibald <https://github.com/stuartarchibald>`_ `Kalyan <https://github.com/rawwar>`_)
* PR `#7027 <https://github.com/numba/numba/pull/7027>`_: Fix nrt debug print (`MegaIng <https://github.com/MegaIng>`_)
* PR `#7031 <https://github.com/numba/numba/pull/7031>`_: Fix inliner to use a single scope for all blocks (`Alexey Kozlov <https://github.com/kozlov-alexey>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7040 <https://github.com/numba/numba/pull/7040>`_: Add Github action to mark issues as stale (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7044 <https://github.com/numba/numba/pull/7044>`_: Fixes for LLVM 11 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7049 <https://github.com/numba/numba/pull/7049>`_: Make NumPy random module use @overload_glue (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7050 <https://github.com/numba/numba/pull/7050>`_: Add overload_classmethod (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7052 <https://github.com/numba/numba/pull/7052>`_: Fix string support in CUDA target (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7056 <https://github.com/numba/numba/pull/7056>`_: Change prange conversion approach to reuse header block. (`Todd A. Anderson <https://github.com/DrTodd13>`_)
* PR `#7061 <https://github.com/numba/numba/pull/7061>`_: Add ndarray allocator classmethod (`stuartarchibald <https://github.com/stuartarchibald>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7064 <https://github.com/numba/numba/pull/7064>`_: Testhound/host array performance warning (`Michael Collison <https://github.com/testhound>`_)
* PR `#7066 <https://github.com/numba/numba/pull/7066>`_: Fix #7065: Add expected exception messages for NumPy 1.20 to tests (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7068 <https://github.com/numba/numba/pull/7068>`_: Enhancing docs about PRNG seeding (`Jérome Eertmans <https://github.com/jeertmans>`_)
* PR `#7070 <https://github.com/numba/numba/pull/7070>`_: Improve the issue templates and pull request template. (`Guoqiang QI <https://github.com/guoqiangqi>`_)
* PR `#7080 <https://github.com/numba/numba/pull/7080>`_: Fix ``__eq__`` for Flags and cpu_options classes (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7087 <https://github.com/numba/numba/pull/7087>`_: Add note to docs about zero-initialization of variables. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7088 <https://github.com/numba/numba/pull/7088>`_: Initialize NUMBA_DEFAULT_NUM_THREADS with a batch scheduler aware value (`Thomas VINCENT <https://github.com/t20100>`_)
* PR `#7100 <https://github.com/numba/numba/pull/7100>`_: Replace deprecated call to cuDeviceComputeCapability (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7113 <https://github.com/numba/numba/pull/7113>`_: Temporarily disable debug env export. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7114 <https://github.com/numba/numba/pull/7114>`_: CUDA: Deprecate eager compilation of device functions (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7116 <https://github.com/numba/numba/pull/7116>`_: Fix various issues with dwarf emission: (`stuartarchibald <https://github.com/stuartarchibald>`_ `vlad-perevezentsev <https://github.com/vlad-perevezentsev>`_)
* PR `#7118 <https://github.com/numba/numba/pull/7118>`_: Remove print to stdout (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7121 <https://github.com/numba/numba/pull/7121>`_: Continue work on numpy subclasses (`Todd A. Anderson <https://github.com/DrTodd13>`_ `Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7122 <https://github.com/numba/numba/pull/7122>`_: Rtd/sphinx compat (`esc <https://github.com/esc>`_)
* PR `#7134 <https://github.com/numba/numba/pull/7134>`_: Move minimum LLVM version to 11. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7137 <https://github.com/numba/numba/pull/7137>`_: skip pycc test on Python 3.7 + macOS because of distutils issue (`esc <https://github.com/esc>`_)
* PR `#7138 <https://github.com/numba/numba/pull/7138>`_: Update the Azure default linux image to Ubuntu 18.04 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7141 <https://github.com/numba/numba/pull/7141>`_: Require llvmlite 0.37 as minimum supported. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7143 <https://github.com/numba/numba/pull/7143>`_: Update version checks in __init__ for np 1.17 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7145 <https://github.com/numba/numba/pull/7145>`_: Fix mainline (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7146 <https://github.com/numba/numba/pull/7146>`_: Fix ``inline_closurecall`` may not be imported (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7147 <https://github.com/numba/numba/pull/7147>`_: Revert "Workaround gitpython 3.1.18 dependency issue" (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7149 <https://github.com/numba/numba/pull/7149>`_: Fix issue in bytecode analysis where target and next are same. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7152 <https://github.com/numba/numba/pull/7152>`_: Fix iterators in CUDA (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7156 <https://github.com/numba/numba/pull/7156>`_: Fix ``ir_utils._max_label`` being updated incorrectly (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7160 <https://github.com/numba/numba/pull/7160>`_: Split parfors tests (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7161 <https://github.com/numba/numba/pull/7161>`_: Update README for 0.54 (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7162 <https://github.com/numba/numba/pull/7162>`_: CUDA: Fix linkage of device functions when compiling for debug (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7163 <https://github.com/numba/numba/pull/7163>`_: Split legalization pass to consider IR and features separately. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7165 <https://github.com/numba/numba/pull/7165>`_: Fix use of np.clip where out is not provided. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7189 <https://github.com/numba/numba/pull/7189>`_: CUDA: Skip IPC tests on ARM (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7190 <https://github.com/numba/numba/pull/7190>`_: CUDA: Fix test_pinned on Jetson (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7192 <https://github.com/numba/numba/pull/7192>`_: Fix missing import in array.argsort impl and add more tests. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7196 <https://github.com/numba/numba/pull/7196>`_: Fixes for lineinfo emission. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7203 <https://github.com/numba/numba/pull/7203>`_: remove duplicate changelog entries (`esc <https://github.com/esc>`_)
* PR `#7209 <https://github.com/numba/numba/pull/7209>`_: Clamp numpy (`esc <https://github.com/esc>`_)
* PR `#7216 <https://github.com/numba/numba/pull/7216>`_: Update CHANGE_LOG for 0.54.0rc2. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7223 <https://github.com/numba/numba/pull/7223>`_: Replace assertion errors on IR assumption violation (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7230 <https://github.com/numba/numba/pull/7230>`_: PR #7171 bugfix only (`Todd A. Anderson <https://github.com/DrTodd13>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7236 <https://github.com/numba/numba/pull/7236>`_: CUDA: Skip managed alloc tests on ARM (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7267 <https://github.com/numba/numba/pull/7267>`_: Fix #7258. Bug in SROA optimization (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7271 <https://github.com/numba/numba/pull/7271>`_: Update 3rd party license text. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7272 <https://github.com/numba/numba/pull/7272>`_: Allow annotations in njit-ed functions (`LunarLanding <https://github.com/LunarLanding>`_)
* PR `#7273 <https://github.com/numba/numba/pull/7273>`_: Update CHANGE_LOG for 0.54.0rc3. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7285 <https://github.com/numba/numba/pull/7285>`_: CUDA: Fix OOB in test_kernel_arg (`Graham Markall <https://github.com/gmarkall>`_)
* PR `#7294 <https://github.com/numba/numba/pull/7294>`_: Continuation of PR #7280, fixing lifetime of TBB task_scheduler_handle (`Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_ `stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7298 <https://github.com/numba/numba/pull/7298>`_: Use CBC to pin GCC to 7 on most linux and 9 on aarch64. (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7312 <https://github.com/numba/numba/pull/7312>`_: Fix #7302. Workaround missing pthread problem on ppc64le (`Siu Kwan Lam <https://github.com/sklam>`_)
* PR `#7317 <https://github.com/numba/numba/pull/7317>`_: In TBB tsh test switch os.fork for mp fork ctx (`stuartarchibald <https://github.com/stuartarchibald>`_)
* PR `#7319 <https://github.com/numba/numba/pull/7319>`_: Update CHANGE_LOG for 0.54.0 final. (`stuartarchibald <https://github.com/stuartarchibald>`_)
Authors:
* `Alexander-Makaryev <https://github.com/Alexander-Makaryev>`_
* `Todd A. Anderson <https://github.com/DrTodd13>`_
* `Hannes Pahl <https://github.com/HPLegion>`_
* `Ivan Butygin <https://github.com/Hardcode84>`_
* `MegaIng <https://github.com/MegaIng>`_
* `Sergey Pokhodenko <https://github.com/PokhodenkoSA>`_
* `Aaron Russell Voelker <https://github.com/arvoelke>`_
* `Ashutosh Varma <https://github.com/ashutoshvarma>`_
* `Ben Greiner <https://github.com/bnavigator>`_
* `Brandon T. Willard <https://github.com/brandonwillard>`_
* `Daniel Nagel <https://github.com/braniii>`_
* `David Nadlinger <https://github.com/dnadlinger>`_
* `Ehsan Totoni <https://github.com/ehsantn>`_
* `esc <https://github.com/esc>`_
* `Felix Divo <https://github.com/felixdivo>`_
* `Graham Markall <https://github.com/gmarkall>`_
* `Guilherme Leobas <https://github.com/guilhermeleobas>`_
* `Guoqiang QI <https://github.com/guoqiangqi>`_
* `Itamar Turner-Trauring <https://github.com/itamarst>`_
* `Jérome Eertmans <https://github.com/jeertmans>`_
* `Alexey Kozlov <https://github.com/kozlov-alexey>`_
* `Lauren Arnett <https://github.com/laurenarnett>`_
* `LunarLanding <https://github.com/LunarLanding>`_
* `Max Katz <https://github.com/maxpkatz>`_
* `Kalyan <https://github.com/rawwar>`_
* `Reazul Hoque <https://github.com/reazulhoque>`_
* `Rishi Kulkarni <https://github.com/rishi-kulkarni>`_
* `Shaun Cutts <https://github.com/shaunc>`_
* `Siu Kwan Lam <https://github.com/sklam>`_
* `stuartarchibald <https://github.com/stuartarchibald>`_
* `Thomas VINCENT <https://github.com/t20100>`_
* `Michael Collison <https://github.com/testhound>`_
* `vlad-perevezentsev <https://github.com/vlad-perevezentsev>`_
Version 0.53.1 (25 March, 2021)
-------------------------------
This is a bugfix release for 0.53.0. It contains the following four
pull-requests which fix two critical regressions and two build failures
reported by the openSUSE team:
* PR #6826 Fix regression on gufunc serialization
* PR #6828 Fix regression in CUDA: Set stream in mapped and managed array
device_setup
* PR #6837 Ignore warnings from packaging module when testing import behaviour.
* PR #6851 set non-reported llvm timing values to 0.0
Authors:
* Ben Greiner
* Graham Markall
* Siu Kwan Lam
* Stuart Archibald
Version 0.53.0 (11 March, 2021)
-------------------------------
This release continues to add new features, bug fixes and stability improvements
to Numba.
Highlights of core changes:
* Support for Python 3.9 (Stuart Archibald).
* Function sub-typing (Lucio Fernandez-Arjona).
* Initial support for dynamic ``gufuncs`` (i.e. from ``@guvectorize``)
(Guilherme Leobas).
* Parallel Accelerator (``@njit(parallel=True)``) now supports Fortran ordered
arrays (Todd A. Anderson and Siu Kwan Lam).
Intel also kindly sponsored research and development that led to two new
features:
* Exposing LLVM compilation pass timings for diagnostic purposes (Siu Kwan
Lam).
* An event system for broadcasting compiler events (Siu Kwan Lam).
Highlights of changes for the CUDA target:
* CUDA 11.2 onwards (versions of the toolkit using NVVM IR 1.6 / LLVM IR 7.0.1)
are now supported (Graham Markall).
* A fast cube root function is added (Michael Collison).
* Support for atomic ``xor``, increment, decrement and exchange is added, and
compare-and-swap is extended to support 64-bit integers (Michael Collison).
* Addition of ``cuda.is_supported_version()`` to check if the CUDA runtime
version is supported (Graham Markall).
* The CUDA dispatcher now shares infrastructure with the CPU dispatcher,
improving launch times for lazily-compiled kernels (Graham Markall).
* The CUDA Array Interface is updated to version 3, with support for streams
added (Graham Markall).
* Tuples and ``namedtuples`` can now be passed to kernels (Graham Markall).
* Initial support for Cooperative Groups is added, with support for Grid Groups
and Grid Sync (Graham Markall and Nick White).
* Support for ``math.log2`` and ``math.remainder`` is added (Guilherme Leobas).
General deprecation notices:
* There are no new general deprecations.
CUDA target deprecation notices:
* CUDA support on macOS is deprecated with this release (it still works, it is
just unsupported).
* The ``argtypes``, ``restypes``, and ``bind`` keyword arguments to the
``cuda.jit`` decorator, deprecated since 0.51.0, are removed.
* The ``Device.COMPUTE_CAPABILITY`` property, deprecated since 2014, has been
removed (use ``compute_capability`` instead).
* The ``to_host`` method of device arrays is removed (use ``copy_to_host``
instead).
General Enhancements:
* PR #4769: objmode complex type spelling (Siu Kwan Lam)
* PR #5579: Function subtyping (Lucio Fernandez-Arjona)
* PR #5659: Add support for parfors creating 'F'ortran layout Numpy arrays.
(Todd A. Anderson)
* PR #5936: Improve array analysis for user-defined data types. (Todd A.
Anderson)
* PR #5938: Initial support for dynamic gufuncs (Guilherme Leobas)
* PR #5958: Making typed.List a typing Generic (Lucio Fernandez-Arjona)
* PR #6334: Support attribute access from other modules (Farah Hariri)
* PR #6373: Allow Dispatchers to be cached (Eric Wieser)
* PR #6519: Avoid unnecessary ir.Del generation and removal (Ehsan Totoni)
* PR #6545: Refactoring ParforDiagnostics (Elena Totmenina)
* PR #6560: Add LLVM pass timer (Siu Kwan Lam)
* PR #6573: Improve ``__str__`` for typed.List when invoked from IPython shell
(Amin Sadeghi)
* PR #6575: Avoid temp variable assignments (Ehsan Totoni)
* PR #6578: Add support for numpy ``intersect1d`` and basic test cases
(``@caljrobe``)
* PR #6579: Python 3.9 support. (Stuart Archibald)
* PR #6580: Store partial typing errors in compiler state (Ehsan Totoni)
* PR #6626: A simple event system to broadcast compiler events (Siu Kwan Lam)
* PR #6635: Try to resolve dynamic getitems as static post unroll transform.
(Stuart Archibald)
* PR #6636: Adds llvm_lock event (Siu Kwan Lam)
* PR #6664: Adds tests for PR 5659 (Siu Kwan Lam)
* PR #6680: Allow getattr to work in objmode output type spec (Siu Kwan Lam)
Fixes:
* PR #6176: Remove references to deprecated numpy globals (Eric Wieser)
* PR #6374: Use Python 3 style OSError handling (Eric Wieser)
* PR #6402: Fix ``typed.Dict`` and ``typed.List`` crashing on parametrized types
(Andreas Sodeur)
* PR #6403: Add ``types.ListType.key`` (Andreas Sodeur)
* PR #6410: Fixes issue #6386 (Danny Weitekamp)
* PR #6425: Fix unicode join for issue #6405 (Teugea Ioan-Teodor)
* PR #6437: Don't pass reduction variables known in an outer parfor to inner
parfors when analyzing reductions. (Todd A. Anderson)
* PR #6453: Keep original variable names in metadata to improve diagnostics
(Ehsan Totoni)
* PR #6454: FIX: Fixes for literals (Eric Larson)
* PR #6463: Bump llvmlite to 0.36 series (Stuart Archibald)
* PR #6466: Remove the misspelling of finalize_dynamic_globals (Sergey
Pokhodenko)
* PR #6489: Improve the error message for unsupported Buffer in Buffer
situation. (Stuart Archibald)
* PR #6503: Add test to ensure Numba imports without warnings. (Stuart
Archibald)
* PR #6508: Defer requirements to setup.py (Siu Kwan Lam)
* PR #6521: Skip annotated jitclass test if typeguard is running. (Stuart
Archibald)
* PR #6524: Fix typed.List return value (Lucio Fernandez-Arjona)
* PR #6562: Correcting typo in numba sysinfo output (Nick Sutcliffe)
* PR #6574: Run parfor fusion if 2 or more parfors (Ehsan Totoni)
* PR #6582: Fix typed dict error with uninitialized padding bytes (Siu Kwan
Lam)
* PR #6584: Remove jitclass from ``__init__`` ``__all__``. (Stuart Archibald)
* PR #6586: Run closure inlining ahead of branch pruning in case of nonlocal
(Stuart Archibald)
* PR #6591: Fix inlineasm test failure. (Siu Kwan Lam)
* PR #6622: Fix 6534, handle unpack of assign-like tuples. (Stuart Archibald)
* PR #6652: Simplify PR-6334 (Siu Kwan Lam)
* PR #6653: Fix get_numba_envvar (Siu Kwan Lam)
* PR #6654: Fix #6632 support alternative dtype string spellings (Stuart
Archibald)
* PR #6685: Add Python 3.9 to classifiers. (Stuart Archibald)
* PR #6693: patch to compile _devicearray.cpp with c++11 (Valentin Haenel)
* PR #6716: Consider assignment lhs live if used in rhs (Fixes #6715) (Ehsan
Totoni)
* PR #6727: Avoid errors in array analysis for global tuples with non-int
(Ehsan Totoni)
* PR #6733: Fix segfault and errors in #6668 (Siu Kwan Lam)
* PR #6741: Enable SSA in IR inliner (Ehsan Totoni)
* PR #6763: use an alternative constraint for the conda packages (Valentin
Haenel)
* PR #6786: Fix gufunc kwargs support (Siu Kwan Lam)
CUDA Enhancements/Fixes:
* PR #5162: Specify synchronization semantics of CUDA Array Interface (Graham
Markall)
* PR #6245: CUDA Cooperative grid groups (Graham Markall and Nick White)
* PR #6333: Remove dead ``_Kernel.__call__`` (Graham Markall)
* PR #6343: CUDA: Add support for passing tuples and namedtuples to kernels
(Graham Markall)
* PR #6349: Refactor Dispatcher to remove unnecessary indirection (Graham
Markall)
* PR #6358: Add log2 and remainder implementations for cuda (Guilherme Leobas)
* PR #6376: Added a fixed seed in test_atomics.py for issue #6370 (Teugea
Ioan-Teodor)
* PR #6377: CUDA: Fix various issues in test suite (Graham Markall)
* PR #6409: Implement cuda atomic xor (Michael Collison)
* PR #6422: CUDA: Remove deprecated items, expect CUDA 11.1 (Graham Markall)
* PR #6427: Remove duplicate repeated definition of gufunc (Amit Kumar)
* PR #6432: CUDA: Use ``_dispatcher.Dispatcher`` as base Dispatcher class
(Graham Markall)
* PR #6447: CUDA: Add get_regs_per_thread method to Dispatcher (Graham Markall)
* PR #6499: CUDA atomic increment, decrement, exchange and compare and swap
(Michael Collison)
* PR #6510: CUDA: Make device array assignment synchronous where necessary
(Graham Markall)
* PR #6517: CUDA: Add NVVM test of all 8-bit characters (Graham Markall)
* PR #6567: Refactor llvm replacement code into separate function (Michael
Collison)
* PR #6642: Testhound/cuda cuberoot (Michael Collison)
* PR #6661: CUDA: Support NVVM70 / CUDA 11.2 (Graham Markall)
* PR #6663: Fix error caused by missing "-static" libraries defined for some
platforms (Siu Kwan Lam)
* PR #6666: CUDA: Add a function to query whether the runtime version is
supported. (Graham Markall)
* PR #6725: CUDA: Fix compile to PTX with debug for CUDA 11.2 (Graham Markall)
Documentation Updates:
* PR #5740: Add FAQ entry on how to create a MWR. (Stuart Archibald)
* PR #6346: DOC: add where to get dev builds from to FAQ (Eyal Trabelsi)
* PR #6418: docs: use https for homepage (``@imba-tjd``)
* PR #6430: CUDA docs: Add RNG example with 3D grid and strided loops (Graham
Markall)
* PR #6436: docs: remove typo in Deprecation Notices (Thibault Ballier)
* PR #6440: Add note about performance of typed containers from the interpreter.
(Stuart Archibald)
* PR #6457: Link to read the docs instead of numba homepage (Hannes Pahl)
* PR #6470: Adding PyCon Sweden 2020 talk on numba (Ankit Mahato)
* PR #6472: Document ``numba.extending.is_jitted`` (Stuart Archibald)
* PR #6495: Fix typo in literal list docs. (Stuart Archibald)
* PR #6501: Add doc entry on Numba's limited resources and how to help. (Stuart
Archibald)
* PR #6502: Add CODEOWNERS file. (Stuart Archibald)
* PR #6531: Update canonical URL. (Stuart Archibald)
* PR #6544: Minor typo / grammar fixes to 5 minute guide (Ollin Boer Bohan)
* PR #6599: docs: fix simple typo, consevatively -> conservatively (Tim Gates)
* PR #6609: Recommend miniforge instead of c4aarch64 (Isuru Fernando)
* PR #6671: Update environment creation example to python 3.8 (Lucio
Fernandez-Arjona)
* PR #6676: Update hardware and software versions in various docs. (Stuart
Archibald)
* PR #6682: Update deprecation notices for 0.53 (Stuart Archibald)
CI/Infrastructure Updates:
* PR #6458: Enable typeguard in CI (Siu Kwan Lam)
* PR #6500: Update bug and feature request templates. (Stuart Archibald)
* PR #6516: Fix RTD build by using conda. (Stuart Archibald)
* PR #6587: Add zenodo badge (Siu Kwan Lam)
Authors:
* Amin Sadeghi
* Amit Kumar
* Andreas Sodeur
* Ankit Mahato
* Chris Barnes
* Danny Weitekamp
* Ehsan Totoni (core dev)
* Eric Larson
* Eric Wieser
* Eyal Trabelsi
* Farah Hariri
* Graham Markall
* Guilherme Leobas
* Hannes Pahl
* Isuru Fernando
* Lucio Fernandez-Arjona
* Michael Collison
* Nick Sutcliffe
* Nick White
* Ollin Boer Bohan
* Sergey Pokhodenko
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
* Teugea Ioan-Teodor
* Thibault Ballier
* Tim Gates
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
* ``@caljrobe``
* ``@imba-tjd``
Version 0.52.0 (30 November, 2020)
----------------------------------
This release focuses on performance improvements, but also adds some new
features and contains numerous bug fixes and stability improvements.
Highlights of core performance improvements include:
* Intel kindly sponsored research and development into producing a new reference
  count pruning pass. This pass operates at the LLVM level and can prune a
  number of common reference counting patterns. This will improve performance
  for two primary reasons:
  * There will be less pressure on the atomic locks used to do the reference
    counting.
  * Removal of reference counting operations permits more inlining and the
    optimisation passes can in general do more with what is present.
  (Siu Kwan Lam).
* Intel also sponsored work to improve the performance of the
``numba.typed.List`` container, particularly in the case of ``__getitem__``
and iteration (Stuart Archibald).
* Superword-level parallelism vectorization is now switched on and the
optimisation pipeline has been lightly analysed and tuned so as to be able to
vectorize more and more often (Stuart Archibald).
Highlights of core feature changes include:
* The ``inspect_cfg`` method on the JIT dispatcher object has been
significantly enhanced and now includes highlighted output and interleaved
line markers and Python source (Stuart Archibald).
* The BSD operating system is now unofficially supported (Stuart Archibald).
* Numerous features/functionality improvements to NumPy support, including
  support for:
  * ``np.asfarray`` (Guilherme Leobas)
  * "subtyping" in record arrays (Lucio Fernandez-Arjona)
  * ``np.split`` and ``np.array_split`` (Isaac Virshup)
  * ``operator.contains`` with ``ndarray`` (``@mugoh``).
  * ``np.asarray_chkfinite`` (Rishabh Varshney).
  * NumPy 1.19 (Stuart Archibald).
  * the ``ndarray`` allocators, ``empty``, ``ones`` and ``zeros``, accepting a
    ``dtype`` specified as a string literal (Stuart Archibald).
* Booleans are now supported as literal types (Alexey Kozlov).
* On the CUDA target:
  * CUDA 9.0 is now the minimum supported version (Graham Markall).
  * Support for Unified Memory has been added (Max Katz).
  * Kernel launch overhead is reduced (Graham Markall).
  * Cudasim support for mapped array, memcopies and memset has been added (Mike
    Williams).
  * Access has been wired in to all libdevice functions (Graham Markall).
  * Additional CUDA atomic operations have been added (Michael Collison).
  * Additional math library functions (``frexp``, ``ldexp``, ``isfinite``)
    (Zhihao Yuan).
  * Support for ``power`` on complex numbers (Graham Markall).
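As a rough illustration of some of the NumPy additions listed above, the sketch
below uses a string-literal ``dtype``, ``np.array_split`` and the ``in``
operator on an ``ndarray`` inside a jitted function. The function name
``split_sums`` is made up for this example, and the ``ImportError`` fallback to
plain Python is an assumption added so the snippet also runs where Numba is not
installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: run as plain Python without Numba.
    def njit(func):
        return func


@njit
def split_sums(n):
    # dtype given as a string literal rather than as np.float64
    data = np.ones(n, dtype="float64")
    # array_split tolerates sections of unequal length
    parts = np.array_split(data, 3)
    total = 0.0
    for part in parts:
        total += part.sum()
    # operator.contains on an ndarray
    has_one = 1.0 in data
    return len(parts), total, has_one


result = split_sums(10)
```

Calling ``split_sums(10)`` splits ten ones into three chunks and sums them back
up, so the total equals the input length.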
Deprecations to note:
There are no new deprecations. However, note that "compatibility" mode, which
was added some 40 releases ago to help transition from 0.11 to 0.12+, has been
removed! Also, the shim to permit the import of ``jitclass`` from Numba's top
level namespace has now been removed as per the deprecation schedule.
General Enhancements:
* PR #5418: Add np.asfarray impl (Guilherme Leobas)
* PR #5560: Record subtyping (Lucio Fernandez-Arjona)
* PR #5609: Jitclass Infer Spec from Type Annotations (Ethan Pronovost)
* PR #5699: Implement np.split and np.array_split (Isaac Virshup)
* PR #6015: Adding BooleanLiteral type (Alexey Kozlov)
* PR #6027: Support operators inlining in InlineOverloads (Alexey Kozlov)
* PR #6038: Closes #6037, fixing FreeBSD compilation (László Károlyi)
* PR #6086: Add more accessible version information (Stuart Archibald)
* PR #6157: Add pipeline_class argument to @cfunc as supported by @jit. (Arthur
Peters)
* PR #6262: Support dtype from str literal. (Stuart Archibald)
* PR #6271: Support ``ndarray`` contains (``@mugoh``)
* PR #6295: Enhance inspect_cfg (Stuart Archibald)
* PR #6304: Support NumPy 1.19 (Stuart Archibald)
* PR #6309: Add suitable file search path for BSDs. (Stuart Archibald)
* PR #6341: Re roll 6279 (Rishabh Varshney and Valentin Haenel)
Performance Enhancements:
* PR #6145: Patch to fingerprint namedtuples. (Stuart Archibald)
* PR #6202: Speed up str(int) (Stuart Archibald)
* PR #6261: Add np.ndarray.ptp() support. (Stuart Archibald)
* PR #6266: Use custom LLVM refcount pruning pass (Siu Kwan Lam)
* PR #6275: Switch on SLP vectorize. (Stuart Archibald)
* PR #6278: Improve typed list performance. (Stuart Archibald)
* PR #6335: Split optimisation passes. (Stuart Archibald)
* PR #6455: Fix refprune on obfuscated refs and stabilize optimisation WRT
wrappers. (Stuart Archibald)
Fixes:
* PR #5639: Make UnicodeType inherit from Hashable (Stuart Archibald)
* PR #6006: Resolves incorrectly hoisted list in parfor. (Todd A. Anderson)
* PR #6126: fix version_info if version can not be determined (Valentin Haenel)
* PR #6137: Remove references to Python 2's long (Eric Wieser)
* PR #6139: Use direct syntax instead of the ``add_metaclass`` decorator (Eric
Wieser)
* PR #6140: Replace calls to utils.iteritems(d) with d.items() (Eric Wieser)
* PR #6141: Fix #6130 objmode cache segfault (Siu Kwan Lam)
* PR #6156: Remove callers of ``reraise`` in favor of using ``with_traceback``
directly (Eric Wieser)
* PR #6162: Move charseq support out of init (Stuart Archibald)
* PR #6165: #5425 continued (Amos Bird and Stuart Archibald)
* PR #6166: Remove Python 2 compatibility from numba.core.utils (Eric Wieser)
* PR #6185: Better error message on NotDefinedError (Luiz Almeida)
* PR #6194: Remove recursion from traverse_types (Radu Popovici)
* PR #6200: Workaround #5973 (Stuart Archibald)
* PR #6203: Make find_callname only lookup functions that are likely part of
NumPy. (Stuart Archibald)
* PR #6204: Fix unicode kind selection for getitem. (Stuart Archibald)
* PR #6206: Build all extension modules with -g -Wall -Werror on Linux x86,
provide -O0 flag option (Graham Markall)
* PR #6212: Fix for objmode recompilation issue (Alexey Kozlov)
* PR #6213: Fix #6177. Remove AOT dependency on the Numba package (Siu Kwan Lam)
* PR #6224: Add support for tuple concatenation to array analysis. (#5396
continued) (Todd A. Anderson)
* PR #6231: Remove compatibility mode (Graham Markall)
* PR #6254: Fix win-32 hashing bug (from Stuart Archibald) (Ray Donnelly)
* PR #6265: Fix #6260 (Stuart Archibald)
* PR #6267: speed up a couple of really slow unittests (Stuart Archibald)
* PR #6281: Remove numba.jitclass shim as per deprecation schedule. (Stuart
Archibald)
* PR #6294: Make return type propagate to all return variables (Andreas Sodeur)
* PR #6300: Un-skip tests that were skipped because of #4026. (Owen Anderson)
* PR #6307: Remove restrictions on SVML version due to bug in LLVM SVML CC
(Stuart Archibald)
* PR #6316: Make IR inliner tests not self mutating. (Stuart Archibald)
* PR #6318: PR #5892 continued (Todd A. Anderson, via Stuart Archibald)
* PR #6319: Permit switching off boundschecking when debug is on. (Stuart
Archibald)
* PR #6324: PR 6208 continued (Ivan Butygin and Stuart Archibald)
* PR #6337: Implements ``key`` on ``types.TypeRef`` (Andreas Sodeur)
* PR #6354: Bump llvmlite to 0.35. series. (Stuart Archibald)
* PR #6357: Fix enumerate invalid decref (Siu Kwan Lam)
* PR #6359: Fixes typed list indexing on 32bit (Stuart Archibald)
* PR #6378: Fix incorrect CPU override in vectorization test. (Stuart Archibald)
* PR #6379: Use O0 to enable inline and not affect loop-vectorization by later
O3... (Siu Kwan Lam)
* PR #6384: Fix failing tests to match on platform invariant int spelling.
(Stuart Archibald)
* PR #6390: Updates inspect_cfg (Stuart Archibald)
* PR #6396: Remove hard dependency on tbb package. (Stuart Archibald)
* PR #6408: Don't do array analysis for tuples that contain arrays. (Todd A.
Anderson)
* PR #6441: Fix ASCII flag in Unicode slicing (0.52.0rc2 regression) (Ehsan
Totoni)
* PR #6442: Fix array analysis regression in 0.52 RC2 for tuple of 1D arrays
(Ehsan Totoni)
* PR #6446: Fix #6444: pruner issues with reference stealing functions (Siu
Kwan Lam)
* PR #6450: Fix asfarray kwarg default handling. (Stuart Archibald)
* PR #6486: fix abstract base class import (Valentin Haenel)
* PR #6487: Restrict maximum version of python (Siu Kwan Lam)
* PR #6527: setup.py: fix py version guard (Chris Barnes)
CUDA Enhancements/Fixes:
* PR #5465: Remove macro expansion and replace uses with FE typing + BE lowering
(Graham Markall)
* PR #5741: CUDA: Add two-argument implementation of round() (Graham Markall)
* PR #5900: Enable CUDA Unified Memory (Max Katz)
* PR #6042: CUDA: Lower launch overhead by launching kernel directly (Graham
Markall)
* PR #6064: Lower math.frexp and math.ldexp in numba.cuda (Zhihao Yuan)
* PR #6066: Lower math.isfinite in numba.cuda (Zhihao Yuan)
* PR #6092: CUDA: Add mapped_array_like and pinned_array_like (Graham Markall)
* PR #6127: Fix race in reduction kernels on Volta, require CUDA 9, add syncwarp
with default mask (Graham Markall)
* PR #6129: Extend Cudasim to support most of the memory functionality. (Mike
Williams)
* PR #6150: CUDA: Turn on flake8 for cudadrv and fix errors (Graham Markall)
* PR #6152: CUDA: Provide wrappers for all libdevice functions, and fix typing
of math function (#4618) (Graham Markall)
* PR #6227: Raise exception when no supported architectures are found (Jacob
Tomlinson)
* PR #6244: CUDA Docs: Make workflow using simulator more explicit (Graham
Markall)
* PR #6248: Add support for CUDA atomic subtract operations (Michael Collison)
* PR #6289: Refactor atomic test cases to reduce code duplication (Michael
Collison)
* PR #6290: CUDA: Add support for complex power (Graham Markall)
* PR #6296: Fix flake8 violations in numba.cuda module (Graham Markall)
* PR #6297: Fix flake8 violations in numba.cuda.tests.cudapy module (Graham
Markall)
* PR #6298: Fix flake8 violations in numba.cuda.tests.cudadrv (Graham Markall)
* PR #6299: Fix flake8 violations in numba.cuda.simulator (Graham Markall)
* PR #6306: Fix flake8 in cuda atomic test from merge. (Stuart Archibald)
* PR #6325: Refactor code for atomic operations (Michael Collison)
* PR #6329: Flake8 fix for a CUDA test (Stuart Archibald)
* PR #6331: Explicitly state that NUMBA_ENABLE_CUDASIM needs to be set before
import (Graham Markall)
* PR #6340: CUDA: Fix #6339, performance regression launching specialized
kernels (Graham Markall)
* PR #6380: Only test managed allocations on Linux (Graham Markall)
Documentation Updates:
* PR #6090: doc: Add doc on direct creation of Numba typed-list (``@rht``)
* PR #6110: Update CONTRIBUTING.md (Stuart Archibald)
* PR #6128: CUDA Docs: Restore Dispatcher.forall() docs (Graham Markall)
* PR #6277: fix: cross2d wrong doc. reference (issue #6276) (``@jeertmans``)
* PR #6282: Remove docs on Python 2(.7) EOL. (Stuart Archibald)
* PR #6283: Add note on how public CI is impl and what users can do to help.
(Stuart Archibald)
* PR #6292: Document support for structured array attribute access
(Graham Markall)
* PR #6310: Declare unofficial \*BSD support (Stuart Archibald)
* PR #6342: Fix docs on literally usage. (Stuart Archibald)
* PR #6348: doc: fix typo in jitclass.rst ("initilising" -> "initialising")
(``@muxator``)
* PR #6362: Move llvmlite support in README to 0.35 (Stuart Archibald)
* PR #6363: Note that reference counted types are not permitted in set().
(Stuart Archibald)
* PR #6364: Move deprecation schedules for 0.52 (Stuart Archibald)
CI/Infrastructure Updates:
* PR #6252: Show channel URLs (Siu Kwan Lam)
* PR #6338: Direct user questions to Discourse instead of the Google Group.
(Stan Seibert)
* PR #6474: Add skip on PPC64LE for tests causing SIGABRT in LLVM. (Stuart
Archibald)
Authors:
* Alexey Kozlov
* Amos Bird
* Andreas Sodeur
* Arthur Peters
* Chris Barnes
* Ehsan Totoni (core dev)
* Eric Wieser
* Ethan Pronovost
* Graham Markall
* Guilherme Leobas
* Isaac Virshup
* Ivan Butygin
* Jacob Tomlinson
* Luiz Almeida
* László Károlyi
* Lucio Fernandez-Arjona
* Max Katz
* Michael Collison
* Mike Williams
* Owen Anderson
* Radu Popovici
* Ray Donnelly
* Rishabh Varshney
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
* Zhihao Yuan
* ``@jeertmans``
* ``@mugoh``
* ``@muxator``
* ``@rht``
Version 0.51.2 (September 2, 2020)
----------------------------------
This is a bugfix release for 0.51.1. It fixes a critical performance bug in the
CFG back edge computation algorithm that leads to exponential time complexity
arising in compilation for use cases with certain pathological properties.
* PR #6195: PR 6187 Continue. Don't visit already checked successors
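To show why skipping already checked successors matters, here is a small,
generic sketch of back edge detection on a control flow graph using an
iterative depth-first search. It is not Numba's implementation; the function
name ``find_back_edges`` and the dictionary-based graph representation are
assumptions made for this example. Marking nodes as finished means each node is
expanded once, so a chain of "diamond" shaped branches is handled in linear
time rather than once per path.

```python
def find_back_edges(succs, entry):
    """Return the set of (src, dst) back edges reachable from ``entry``.

    ``succs`` maps each node to a list of its successor nodes.
    """
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in succs}
    back_edges = set()
    state[entry] = ACTIVE
    # Each stack entry keeps its own iterator so the walk resumes where it left off.
    stack = [(entry, iter(succs[entry]))]
    while stack:
        node, successors = stack[-1]
        for dst in successors:
            if state[dst] == ACTIVE:
                # dst is on the current DFS path, so this edge closes a loop.
                back_edges.add((node, dst))
            elif state[dst] == UNSEEN:
                state[dst] = ACTIVE
                stack.append((dst, iter(succs[dst])))
                break
            # DONE successors were already checked and are skipped.
        else:
            state[node] = DONE
            stack.pop()
    return back_edges


def diamond_chain(count):
    """Build ``count`` chained diamonds with one edge from the exit to the entry."""
    graph = {}
    for i in range(count):
        top = 3 * i
        graph[top] = [top + 1, top + 2]
        graph[top + 1] = [top + 3]
        graph[top + 2] = [top + 3]
    graph[3 * count] = [0]
    return graph
```

With 30 chained diamonds there are 2**30 distinct paths from entry to exit, yet
``find_back_edges(diamond_chain(30), 0)`` only visits each of the 91 nodes once.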
Authors:
* Graham Markall
* Siu Kwan Lam (core dev)
Version 0.51.1 (August 26, 2020)
--------------------------------
This is a bugfix release for 0.51.0. It fixes a critical bug in caching, another
critical bug in the CUDA target initialisation sequence and also fixes some
compile time performance regressions:
* PR #6141: Fix #6130 objmode cache segfault
* PR #6146: Fix compilation slowdown due to controlflow analysis
* PR #6147: CUDA: Don't make a runtime call on import
* PR #6153: Fix for #6151. Make UnicodeCharSeq into str for comparison.
* PR #6168: Fix Issue #6167: Failure in test_cuda_submodules
Authors:
* Graham Markall
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
Version 0.51.0 (August 12, 2020)
--------------------------------
This release continues to add new features to Numba and also contains a
significant number of bug fixes and stability improvements.
Highlights of core feature changes include:
* The compilation chain is now based on LLVM 10 (Valentin Haenel).
* Numba has internally switched to prefer non-literal types over literal ones so
as to reduce function over-specialisation, with a view to speeding up compile
times (Siu Kwan Lam).
* On the CUDA target: Support for CUDA Toolkit 11, Ampere, and Compute
Capability 8.0; Printing of ``SASS`` code for kernels; Callbacks to Python
functions can be inserted into CUDA streams, and streams are async awaitable;
Atomic ``nanmin`` and ``nanmax`` functions are added; Fixes for various
miscompilations and segfaults. (mostly Graham Markall; call backs on
streams by Peter Würtz).
Intel also kindly sponsored research and development that led to some exciting
new features:
* Support for heterogeneous immutable lists and heterogeneous immutable string
key dictionaries. Also optional initial/construction value capturing for all
lists and dictionaries containing literal values (Stuart Archibald).
* A new pass-by-reference mutable structure extension type ``StructRef`` (Siu
Kwan Lam).
* Object mode blocks are now cacheable, with the side effect of numerous bug
fixes and performance improvements in caching. This also permits caching of
functions defined in closures (Siu Kwan Lam).
Deprecations to note:
To align with other targets, the ``argtypes`` and ``restypes`` kwargs to
``@cuda.jit`` are now deprecated, as is the ``bind`` kwarg. Further, the
``target`` kwarg to the ``numba.jit`` decorator family is deprecated.
General Enhancements:
* PR #5463: Add str(int) impl
* PR #5526: Impl. np.asarray(literal)
* PR #5619: Add support for multi-output ufuncs
* PR #5711: Division with timedelta input
* PR #5763: Support minlength argument to np.bincount
* PR #5779: Return zero array from np.dot when the arguments are empty.
* PR #5796: Add implementation for np.positive
* PR #5849: Setitem for records when index is StringLiteral, including literal
unroll
* PR #5856: Add support for conversion of inplace_binop to parfor.
* PR #5893: Allocate 1D iteration space one at a time for more even
distribution.
* PR #5922: Reduce objmode and unpickling overhead
* PR #5944: re-enable OpenMP in wheels
* PR #5946: Implement literal dictionaries and lists.
* PR #5956: Update numba_sysinfo.py
* PR #5978: Add structref as a mutable struct that is pass-by-ref
* PR #5980: Deprecate target kwarg for numba.jit.
* PR #6058: Add prefer_literal option to overload API
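Two of the enhancements above, the ``minlength`` argument to ``np.bincount``
(#5763) and ``str(int)`` (#5463), can be combined in a short sketch. The
function name ``histogram_label`` is invented for this example, and the
``ImportError`` fallback is an assumption added so the snippet also runs as
plain Python where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: run as plain Python without Numba.
    def njit(func):
        return func


@njit
def histogram_label(values, nbins):
    # minlength pads the result so it always has at least nbins entries
    counts = np.bincount(values, minlength=nbins)
    # str() applied to an integer inside compiled code
    label = "n=" + str(values.size)
    return counts, label


counts, label = histogram_label(np.array([0, 1, 1, 3]), 6)
```

Here ``counts`` has six entries even though the largest input value is 3.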
Fixes:
* PR #5674: Fix #3955. Allow `with objmode` to be cached
* PR #5724: Initialize process lock lazily to prevent multiprocessing issue
* PR #5783: Make np.divide and np.remainder code more similar
* PR #5808: Fix 5665 Block jit(nopython=True, forceobj=True) and suppress
njit(forceobj=True)
* PR #5834: Fix the is operator on Ellipsis
* PR #5838: Ensure ``Dispatcher.__eq__`` always returns a bool
* PR #5841: cleanup: Use PythonAPI.bool_from_bool in more places
* PR #5862: Do not leak loop iteration variables into the numba.np.npyimpl
namespace
* PR #5869: Update repomap
* PR #5879: Fix erroneous input mutation in linalg routines
* PR #5882: Type check function in jit decorator
* PR #5925: Use np.inf and -np.inf for max and min float values respectively.
* PR #5935: Fix default arguments with multiprocessing
* PR #5952: Fix "Internal error ... local variable 'errstr' referenced before
assignment during BoundFunction(...)"
* PR #5962: Fix SVML tests with LLVM 10 and AVX512
* PR #5972: fix flake8 for numba/runtests.py
* PR #5995: Update setup.py with new llvmlite versions
* PR #5996: Set lower bound for llvmlite to 0.33
* PR #6004: Fix problem in branch pruning with LiteralStrKeyDict
* PR #6017: Fixing up numba_do_raise
* PR #6028: Fix #6023
* PR #6031: Continue 5821
* PR #6035: Fix overspecialize of literal
* PR #6046: Fixes statement reordering bug in maximize fusion step.
* PR #6056: Fix issue on invalid inlining of non-empty build_list by
inline_arraycall
* PR #6057: fix aarch64/python_3.8 failure on master
* PR #6070: Fix overspecialized containers
* PR #6071: Remove f-strings in setup.py
* PR #6072: Fix for #6005
* PR #6073: Fixes invalid C prototype in helper function.
* PR #6078: Duplicate NumPy's PyArray_DescrCheck macro
* PR #6081: Fix issue with cross drive use and relpath.
* PR #6083: Fix bug in initial value unify.
* PR #6087: remove invalid sanity check from randrange tests
* PR #6089: Fix invalid reference to TypingError
* PR #6097: Add function code and closure bytes into cache key
* PR #6099: Restrict upper limit of TBB version due to ABI changes.
* PR #6101: Restrict lower limit of icc_rt version due to assumed SVML bug.
* PR #6107: Fix and test #6095
* PR #6109: Fixes an issue reported in #6094
* PR #6111: Decouple LiteralList and LiteralStrKeyDict from tuple
* PR #6116: Fix #6102. Problem with non-unique label.
CUDA Enhancements/Fixes:
* PR #5359: Remove special-casing of 0d arrays
* PR #5709: CUDA: Refactoring of cuda.jit and kernel / dispatcher abstractions
* PR #5732: CUDA Docs: document ``forall`` method of kernels
* PR #5745: CUDA stream callbacks and async awaitable streams
* PR #5761: Add implmentation for int types for isnan and isinf for CUDA
* PR #5819: Add support for CUDA 11 and Ampere / CC 8.0
* PR #5826: CUDA: Add function to get SASS for kernels
* PR #5846: CUDA: Allow disabling NVVM optimizations, and fix debug issues
* PR #5851: CUDA EMM enhancements - add default get_ipc_handle implementation,
skip a test conditionally
* PR #5852: CUDA: Fix ``cuda.test()``
* PR #5857: CUDA docs: Add notes on resetting the EMM plugin
* PR #5859: CUDA: Fix reduce docs and style improvements
* PR #6016: Fixes change of list spelling in a cuda test.
* PR #6020: CUDA: Fix #5820, adding atomic nanmin / nanmax
* PR #6030: CUDA: Don't optimize IR before sending it to NVVM
* PR #6052: Fix dtype for atomic_add_double testsuite
* PR #6080: CUDA: Prevent auto-upgrade of atomic intrinsics
* PR #6123: Fix #6121
Documentation Updates:
* PR #5782: Host docs on Read the Docs
* PR #5830: doc: Mention that caching uses pickle
* PR #5963: Fix broken link to numpy ufunc signature docs
* PR #5975: restructure communication section
* PR #5981: Document bounds-checking behavior in python deviations page
* PR #5993: Docs for structref
* PR #6008: Small fix so bullet points are rendered by sphinx
* PR #6013: emphasize cuda kernel functions are asynchronous
* PR #6036: Update deprecation doc from numba.errors to numba.core.errors
* PR #6062: Change references to numba.pydata.org to https
CI updates:
* PR #5850: Updates the "New Issue" behaviour to better redirect users.
* PR #5940: Add discourse badge
* PR #5960: Setting mypy on CI
Enhancements from user contributed PRs (with thanks!):
* Aisha Tammy added the ability to switch off TBB support at compile time in
#5821 (continued in #6031 by Stuart Archibald).
* Alexander Stiebing fixed a reference before assignment bug in #5952.
* Alexey Kozlov fixed a bug in tuple getitem for literals in #6028.
* Andrew Eckart updated the repomap in #5869, added support for Read the Docs
in #5782, fixed a bug in the ``np.dot`` implementation to correctly handle
empty arrays in #5779 and added support for ``minlength`` to ``np.bincount``
in #5763.
* ``@bitsisbits`` updated ``numba_sysinfo.py`` to handle HSA agents correctly in
#5956.
* Daichi Suzuo fixed a bug in the threading backend initialisation sequence such
that it is now correctly a lazy lock in #5724.
* Eric Wieser contributed a number of patches, particularly in enhancing and
  improving the ``ufunc`` capabilities:
  * #5359: Remove special-casing of 0d arrays
  * #5834: Fix the is operator on Ellipsis
  * #5619: Add support for multi-output ufuncs
  * #5841: cleanup: Use PythonAPI.bool_from_bool in more places
  * #5862: Do not leak loop iteration variables into the numba.np.npyimpl
    namespace
  * #5838: Ensure ``Dispatcher.__eq__`` always returns a bool
  * #5830: doc: Mention that caching uses pickle
  * #5783: Make np.divide and np.remainder code more similar
* Ethan Pronovost added a guard to prevent the common mistake of applying a jit
decorator to the same function twice in #5881.
* Graham Markall contributed many patches to the CUDA target, as follows:
  * #6052: Fix dtype for atomic_add_double tests
  * #6030: CUDA: Don't optimize IR before sending it to NVVM
  * #5846: CUDA: Allow disabling NVVM optimizations, and fix debug issues
  * #5826: CUDA: Add function to get SASS for kernels
  * #5851: CUDA EMM enhancements - add default get_ipc_handle implementation,
    skip a test conditionally
  * #5709: CUDA: Refactoring of cuda.jit and kernel / dispatcher abstractions
  * #5819: Add support for CUDA 11 and Ampere / CC 8.0
  * #6020: CUDA: Fix #5820, adding atomic nanmin / nanmax
  * #5857: CUDA docs: Add notes on resetting the EMM plugin
  * #5859: CUDA: Fix reduce docs and style improvements
  * #5852: CUDA: Fix ``cuda.test()``
  * #5732: CUDA Docs: document ``forall`` method of kernels
* Guilherme Leobas added support for ``str(int)`` in #5463 and
``np.asarray(literal value)`` in #5526.
* Hameer Abbasi deprecated the ``target`` kwarg for ``numba.jit`` in #5980.
* Hannes Pahl added a badge to the Numba github page linking to the new
discourse forum in #5940 and also fixed a bug that permitted illegal
combinations of flags to be passed into ``@jit`` in #5808.
* Kayran Schmidt emphasized that CUDA kernel functions are asynchronous in the
documentation in #6013.
* Leonardo Uieda fixed a broken link to the NumPy ufunc signature docs in #5963.
* Lucio Fernandez-Arjona added mypy to CI and started adding type annotations to
the code base in #5960, also fixed a (de)serialization problem on the
dispatcher in #5935, improved the undefined variable error message in #5876,
added support for division with timedelta input in #5711 and implemented
``setitem`` for records when the index is a ``StringLiteral`` in #5849.
* Ludovic Tiako documented Numba's bounds-checking behavior in the python
deviations page in #5981.
* Matt Roeschke changed all ``http`` references to ``https`` in #6062.
* ``@niteya-shah`` implemented ``isnan`` and ``isinf`` for integer types on the
CUDA target in #5761 and implemented ``np.positive`` in #5796.
* Peter Würtz added CUDA stream callbacks and async awaitable streams in #5745.
* ``@rht`` fixed an invalid import referred to in the deprecation documentation
in #6036.
* Sergey Pokhodenko updated the SVML tests for LLVM 10 in #5962.
* Shyam Saladi fixed a Sphinx rendering bug in #6008.
Authors:
* Aisha Tammy
* Alexander Stiebing
* Alexey Kozlov
* Andrew Eckart
* ``@bitsisbits``
* Daichi Suzuo
* Eric Wieser
* Ethan Pronovost
* Graham Markall
* Guilherme Leobas
* Hameer Abbasi
* Hannes Pahl
* Kayran Schmidt
* Kozlov, Alexey
* Leonardo Uieda
* Lucio Fernandez-Arjona
* Ludovic Tiako
* Matt Roeschke
* ``@niteya-shah``
* Peter Würtz
* Sergey Pokhodenko
* Shyam Saladi
* ``@rht``
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.50.1 (Jun 24, 2020)
-----------------------------
This is a bugfix release for 0.50.0. It fixes a critical bug in error reporting
and a number of other smaller issues:
* PR #5861: Added except for possible Windows get_terminal_size exception
* PR #5876: Improve undefined variable error message
* PR #5884: Update the deprecation notices for 0.50.1
* PR #5889: Fixes literally not forcing re-dispatch for inline='always'
* PR #5912: Fix bad attr access on certain typing templates breaking exceptions.
* PR #5918: Fix cuda test due to #5876
Authors:
* ``@pepping_dore``
* Lucio Fernandez-Arjona
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
Version 0.50.0 (Jun 10, 2020)
-----------------------------
This is a more usual release in comparison to the others that have been made in
the last six months. It comprises the result of a number of maintenance tasks
along with some new features and a lot of bug fixes.
Highlights of core feature changes include:
* The compilation chain is now based on LLVM 9.
* The error handling and reporting system has been improved to reduce the size
of error messages, and also improve quality and specificity.
* The CUDA target has more stream constructors available and a new function for
compiling to PTX without linking and loading the code to a device. Further,
the macro-based system for describing CUDA threads and blocks has been
replaced with standard typing and lowering implementations, for improved
debugging and extensibility.
IMPORTANT: The backwards compatibility shim, that was present in 0.49.x to
accommodate the refactoring of Numba's internals, has been removed. If a module
is imported from a moved location an ``ImportError`` will occur.
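Projects that need to run against Numba versions on both sides of the
refactoring could try the new import location first and fall back to the old
one. The following is only a sketch of that idea: the helper name
``import_first`` is hypothetical and not part of Numba, and the ``jitclass``
locations in the trailing comment are given purely as an example of a name
that moved.

```python
import importlib


def import_first(candidates):
    """Return the first attribute that can be imported from ``candidates``.

    ``candidates`` is a sequence of ``(module_name, attribute_name)`` pairs,
    ordered from the preferred (new) location to the legacy one.
    """
    errors = []
    for module_name, attribute_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attribute_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this location failed and try the next one.
            errors.append(f"{module_name}.{attribute_name}: {exc}")
    raise ImportError("no candidate location worked: " + "; ".join(errors))


# Hypothetical usage for a name that moved during the refactoring:
# jitclass = import_first([("numba.experimental", "jitclass"),
#                          ("numba", "jitclass")])
```

If every location fails, the helper raises an ``ImportError`` that lists each
attempt, which keeps the failure as visible as the one described above.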
General Enhancements:
* PR #5060: Enables np.sum for timedelta64
* PR #5225: Adjust interpreter to make conditionals predicates via bool() call.
* PR #5506: Jitclass static methods
* PR #5580: Revert shim
* PR #5591: Fix #5525 Add figure for total memory to ``numba -s`` output.
* PR #5616: Simplify the ufunc kernel registration
* PR #5617: Remove /examples from the Numba repo.
* PR #5673: Fix inliners to run all passes on IR and clean up correctly.
* PR #5700: Make it easier to understand type inference: add SSA dump, use for
``DEBUG_TYPEINFER``
* PR #5702: Fixes for LLVM 9
* PR #5722: Improve error messages.
* PR #5758: Support NumPy 1.18
Fixes:
* PR #5390: add error handling for lookup_module
* PR #5464: Jitclass drops annotations to avoid error
* PR #5478: Fix #5471. Issue with omitted type not recognized as literal value.
* PR #5517: Fix numba.typed.List extend for singleton and empty iterable
* PR #5549: Check type getitem
* PR #5568: Add skip to entrypoint test on windows
* PR #5581: Revert #5568
* PR #5602: Fix segfault caused by pop from numba.typed.List
* PR #5645: Fix SSA redundant CFG computation
* PR #5686: Fix issue with SSA not minimal
* PR #5689: Fix bug in unified_function_type (issue 5685)
* PR #5694: Skip part of slice array analysis if any part is not analyzable.
* PR #5697: Fix usedef issue with parfor loopnest variables.
* PR #5705: A fix for cases where SSA looks like a reduction variable.
* PR #5714: Fix bug in test
* PR #5717: Initialise Numba extensions ahead of any compilation starting.
* PR #5721: Fix array iterator layout.
* PR #5738: Unbreak master on buildfarm
* PR #5757: Force LLVM to use ZMM registers for vectorization.
* PR #5764: fix flake8 errors
* PR #5768: Interval example: fix import
* PR #5781: Moving record array examples to a test module
* PR #5791: Fix up no cgroups problem
* PR #5795: Restore refct removal pass and make it strict
* PR #5807: Skip failing test on POWER8 due to PPC CTR Loop problem.
* PR #5812: Fix side issue from #5792, @overload inliner cached IR being
mutated.
* PR #5815: Pin llvmlite to 0.33
* PR #5833: Fixes the source location appearing incorrectly in error messages.
CUDA Enhancements/Fixes:
* PR #5347: CUDA: Provide more stream constructors
* PR #5388: CUDA: Fix OOB write in test_round{f4,f8}
* PR #5437: Fix #5429: Exception using ``.get_ipc_handle(...)`` on array from
``as_cuda_array(...)``
* PR #5481: CUDA: Replace macros with typing and lowering implementations
* PR #5556: CUDA: Make atomic semantics match Python / NumPy, and fix #5458
* PR #5558: CUDA: Only release primary ctx if retained
* PR #5561: CUDA: Add function for compiling to PTX (+ other small fixes)
* PR #5573: CUDA: Skip tests under cuda-memcheck that hang it
* PR #5578: Implement math.modf for CUDA target
* PR #5704: CUDA Eager compilation: Fix max_registers kwarg
* PR #5718: CUDA lib path tests: unset CUDA_PATH when CUDA_HOME unset
* PR #5800: Fix LLVM 9 IR for NVVM
* PR #5803: CUDA Update expected error messages to fix #5797
Documentation Updates:
* PR #5546: DOC: Add documentation about cost model to inlining notes.
* PR #5653: Update doc with respect to try-finally case
Enhancements from user contributed PRs (with thanks!):
* Elias Kuthe fixed an issue with imports in the Interval example in #5768
* Eric Wieser simplified the ufunc kernel registration mechanism in #5616
* Ethan Pronovost patched a problem with ``__annotations__`` in ``jitclass`` in
#5464, fixed a bug that led to infinite loops in Numba's ``Type.__getitem__``
in #5549, fixed a bug in ``np.arange`` testing in #5714 and added support for
``@staticmethod`` to ``jitclass`` in #5506.
* Gabriele Gemmi implemented ``math.modf`` for the CUDA target in #5578
* Graham Markall contributed many patches, largely to the CUDA target, as
follows:
* #5347: CUDA: Provide more stream constructors
* #5388: CUDA: Fix OOB write in test_round{f4,f8}
* #5437: Fix #5429: Exception using ``.get_ipc_handle(...)`` on array from
``as_cuda_array(...)``
* #5481: CUDA: Replace macros with typing and lowering implementations
* #5556: CUDA: Make atomic semantics match Python / NumPy, and fix #5458
* #5558: CUDA: Only release primary ctx if retained
* #5561: CUDA: Add function for compiling to PTX (+ other small fixes)
* #5573: CUDA: Skip tests under cuda-memcheck that hang it
* #5648: Unset the memory manager after EMM Plugin tests
* #5700: Make it easier to understand type inference: add SSA dump, use for
``DEBUG_TYPEINFER``
* #5704: CUDA Eager compilation: Fix max_registers kwarg
* #5718: CUDA lib path tests: unset CUDA_PATH when CUDA_HOME unset
* #5800: Fix LLVM 9 IR for NVVM
* #5803: CUDA Update expected error messages to fix #5797
* Guilherme Leobas updated the documentation surrounding try-finally in #5653
* Hameer Abbasi added documentation about the cost model to the notes on
inlining in #5546
* Jacques Gaudin rewrote ``numba -s`` to produce and consume a dictionary of
output about the current system in #5591
* James Bourbeau updated min/argmin and max/argmax to handle non-leading nans
(via #5758)
* Lucio Fernandez-Arjona moved the record array examples to a test module in
#5781 and added ``np.timedelta64`` handling to ``np.sum`` in #5060
* Pearu Peterson fixed a bug in unified_function_type in #5689
* Sergey Pokhodenko fixed an issue impacting LLVM 10 regarding vectorization
widths on Intel SkyLake processors in #5757
* Shan Sikdar added error handling for ``lookup_module`` in #5390
* ``@toddrme2178`` added CI testing for NumPy 1.18 (via #5758)
Authors:
* Elias Kuthe
* Eric Wieser
* Ethan Pronovost
* Gabriele Gemmi
* Graham Markall
* Guilherme Leobas
* Hameer Abbasi
* Jacques Gaudin
* James Bourbeau
* Lucio Fernandez-Arjona
* Pearu Peterson
* Sergey Pokhodenko
* Shan Sikdar
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* ``@toddrme2178``
* Valentin Haenel (core dev)
Version 0.49.1 (May 7, 2020)
----------------------------
This is a bugfix release for 0.49.0. It fixes some residual issues with SSA
form, a critical bug in the branch pruning logic and a number of other smaller
issues:
* PR #5587: Fixed #5586 Threading Implementation Typos
* PR #5592: Fixes #5583 Remove references to cffi_support from docs and examples
* PR #5614: Fix invalid type in resolve for comparison expr in parfors.
* PR #5624: Fix erroneous rewrite of predicate to bit const on prune.
* PR #5627: Fixes #5623, SSA local def scan based on invalid equality
assumption.
* PR #5629: Fixes naming error in array_exprs
* PR #5630: Fix #5570. Incorrect race variable detection due to SSA naming.
* PR #5638: Make literal_unroll function work as a freevar.
* PR #5648: Unset the memory manager after EMM Plugin tests
* PR #5651: Fix some SSA issues
* PR #5652: Pin to sphinx=2.4.4 to avoid problem with C declaration
* PR #5658: Fix unifying undefined first class function types issue
* PR #5669: Update example in 5m guide WRT SSA type stability.
* PR #5676: Restore ``numba.types`` as public API
Authors:
* Graham Markall
* Juan Manuel Cruz Martinez
* Pearu Peterson
* Sean Law
* Stuart Archibald (core dev)
* Siu Kwan Lam (core dev)
Version 0.49.0 (Apr 16, 2020)
-----------------------------
This release is very large in terms of code changes. Large scale removal of
unsupported Python and NumPy versions has taken place along with a significant
amount of refactoring to simplify the Numba code base to make it easier for
contributors. Numba's intermediate representation has also undergone some
important changes to solve a number of long standing issues. In addition some
new features have been added and a large number of bugs have been fixed!
IMPORTANT: In this release Numba's internals have moved about a lot. A backwards
compatibility "shim" is provided for this release so as to not immediately break
projects using Numba's internals. If a module is imported from a moved location
the shim will issue a deprecation warning and suggest how to update the import
statement for the new location. The shim will be removed in 0.50.0!
Highlights of core feature changes include:
* Removal of all Python 2 related code and also updating the minimum supported
Python version to 3.6, the minimum supported NumPy version to 1.15 and the
minimum supported SciPy version to 1.0. (Stuart Archibald).
* Refactoring of the Numba code base. The code is now organised into submodules
by functionality. This cleans up Numba's top level namespace.
(Stuart Archibald).
* Introduction of an ``ir.Del`` free static single assignment form for Numba's
intermediate representation (Siu Kwan Lam and Stuart Archibald).
* An OpenMP-like thread masking API has been added for use with code using the
parallel CPU backends (Aaron Meurer and Stuart Archibald).
* For the CUDA target, all kernel launches now require a configuration. This
prevents accidental launches of kernels with the old default of a single
thread in a single block. The hard-coded autotuner is also now removed; such
tuning is deferred to CUDA API calls that provide the same functionality
(Graham Markall).
* The CUDA target also gained an External Memory Management plugin interface to
allow Numba to use another CUDA-aware library for all memory allocations and
deallocations (Graham Markall).
* The Numba Typed List container gained support for construction from iterables
(Valentin Haenel).
* Experimental support was added for first-class function types
(Pearu Peterson).
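As a small illustration of the typed list highlight above, the sketch below
builds a ``numba.typed.List`` directly from an iterable. It assumes Numba 0.49
or later; the ``except ImportError`` branch is only a stand-in (the builtin
``list``, which is not a typed container) so that the snippet still runs where
Numba is not installed.

```python
try:
    from numba.typed import List
except ImportError:
    # Assumption: fall back to the builtin list purely so the sketch runs
    # without Numba; this fallback is NOT a typed container.
    List = list

# Construct the list from any iterable instead of appending item by item.
squares = List(x * x for x in range(5))

first = squares[0]
count = len(squares)
```

Before this change a typed list had to be created empty and filled with
repeated ``append`` calls.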
Enhancements from user contributed PRs (with thanks!):
* Aaron Meurer added support for thread masking at runtime in #4615.
* Andreas Sodeur fixed a long standing bug that was preventing ``cProfile`` from
working with Numba JIT compiled functions in #4476.
* Arik Funke fixed error messages in ``test_array_reductions`` (#5278), fixed
an issue with test discovery (#5239), made it so the documentation would build
again on windows (#5453) and fixed a nested list problem in the docs in #5489.
* Antonio Russo fixed a SyntaxWarning in #5252.
* Eric Wieser added support for inferring the types of object arrays (#5348) and
iterating over 2D arrays (#5115), also fixed some compiler warnings due to
missing (void) in #5222. Also helped improve the "shim" and associated
warnings in #5485, #5488, #5498 and partly #5532.
* Ethan Pronovost fixed a problem with the shim erroneously warning for jitclass
use in #5454 and also prevented illegal return values in jitclass ``__init__``
in #5505.
* Gabriel Majeri added SciPy 2019 talks to the docs in #5106.
* Graham Markall changed the Numba HTML documentation theme to resolve a number
of long standing issues in #5346. Also contributed were a large number of CUDA
enhancements and fixes, namely:
* #5519: CUDA: Silence the test suite - Fix #4809, remove autojit, delete
prints
* #5443: Fix #5196: Docs: assert in CUDA only enabled for debug
* #5436: Fix #5408: test_set_registers_57 fails on Maxwell
* #5423: Fix #5421: Add notes on printing in CUDA kernels
* #5400: Fix #4954, and some other small CUDA testsuite fixes
* #5328: NBEP 7: External Memory Management Plugin Interface
* #5144: Fix #4875: Make #2655 test with debug expect to pass
* #5323: Document lifetime semantics of CUDA Array Interface
* #5061: Prevent kernel launch with no configuration, remove autotuner
* #5099: Fix #5073: Slices of dynamic shared memory all alias
* #5136: CUDA: Enable asynchronous operations on the default stream
* #5085: Support other itemsizes with view
* #5059: Docs: Explain how to use Memcheck with Numba, fixups in CUDA
documentation
* #4957: Add notes on overwriting gufunc inputs to docs
* Greg Jennings fixed an issue with ``np.random.choice`` not acknowledging the
RNG seed correctly in #3897/#5310.
* Guilherme Leobas added support for ``np.isnat`` in #5293.
* Henry Schreiner made the llvmlite requirements more explicit in
requirements.txt in #5150.
* Ivan Butygin helped fix an issue with parfors sequential lowering in
#5114/#5250.
* Jacques Gaudin fixed a bug for Python >= 3.8 in ``numba -s`` in #5548.
* Jim Pivarski added some hints for debugging entry points in #5280.
* John Kirkham added ``numpy.dtype`` coercion for the ``dtype`` argument to CUDA
device arrays in #5253.
* Leo Fang added a list of libraries that support ``__cuda_array_interface__``
in #5104.
* Lucio Fernandez-Arjona added ``getitem`` for the NumPy record type when the
index is a ``StringLiteral`` type in #5182 and improved the documentation
rendering via additions to the TOC and removal of numbering in #5450.
* Mads R. B. Kristensen fixed an issue with ``__cuda_array_interface__`` not
requiring the context in #5189.
* Marcin Tolysz added support for nested modules in AOT compilation in #5174.
* Mike Williams fixed some issues with NumPy records and ``getitem`` in the CUDA
simulator in #5343.
* Pearu Peterson added experimental support for first-class function types in
#5287 (and fixes in #5459, #5473/#5429, and #5557).
* Ravi Teja Gutta added support for ``np.flip`` in #4376/#5313.
* Rohit Sanjay fixed an issue with type refinement for unicode input supplied to
typed-list ``extend()`` (#5295) and fixed unicode ``.strip()`` to strip all
whitespace characters in #5213.
* Vladimir Lukyanov fixed an awkward bug in ``typed.dict`` in #5361, added a fix
to ensure the LLVM and assembly dumps are highlighted correctly in #5357 and
implemented a Numba IR Lexer and added highlighting to Numba IR dumps in
#5333.
* hdf fixed an issue with the ``boundscheck`` flag in the CUDA jit target in
#5257.
General Enhancements:
* PR #4615: Allow masking threads out at runtime
* PR #4798: Add branch pruning based on raw predicates.
* PR #5115: Add support for iterating over 2D arrays
* PR #5117: Implement ord()/chr()
* PR #5122: Remove Python 2.
* PR #5127: Calling convention adaptor for boxer/unboxer to call jitcode
* PR #5151: implement None-typed typed-list
* PR #5174: Nested modules https://github.com/numba/numba/issues/4739
* PR #5182: Add getitem for Record type when index is StringLiteral
* PR #5185: extract code-gen utilities from closures
* PR #5197: Refactor Numba, part I
* PR #5210: Remove more unsupported Python versions from build tooling.
* PR #5212: Adds support for viewing the CFG of the ELF disassembly.
* PR #5227: Immutable typed-list
* PR #5231: Added support for ``np.asarray`` to be used with
``numba.typed.List``
* PR #5235: Added property ``dtype`` to ``numba.typed.List``
* PR #5272: Refactor parfor: split up ParforPass
* PR #5281: Make IR ir.Del free until legalized.
* PR #5287: First-class function type
* PR #5293: np.isnat
* PR #5294: Create typed-list from iterable
* PR #5295: refine typed-list on unicode input to extend
* PR #5296: Refactor parfor: better exception from passes
* PR #5308: Provide ``numba.extending.is_jitted``
* PR #5320: refactor array_analysis
* PR #5325: Let literal_unroll accept types.Named*Tuple
* PR #5330: refactor common operation in parfor lowering into a new util
* PR #5333: Add: highlight Numba IR dump
* PR #5342: Support for tuples passed to parfors.
* PR #5348: Add support for inferring the types of object arrays
* PR #5351: SSA again
* PR #5352: Add shim to accommodate refactoring.
* PR #5356: implement allocated parameter in njit
* PR #5369: Make test ordering more consistent across feature availability
* PR #5428: Wip/deprecate jitclass location
* PR #5441: Additional changes to first class function
* PR #5455: Move to llvmlite 0.32.*
* PR #5457: implement repr for untyped lists
Fixes:
* PR #4476: Another attempt at fixing frame injection in the dispatcher tracing
path
* PR #4942: Prevent some parfor aliasing. Rename copied function var to prevent
recursive type locking.
* PR #5092: Fix 5087
* PR #5150: More explicit llvmlite requirement in requirements.txt
* PR #5172: fix version spec for llvmlite
* PR #5176: Normalize kws going into fold_arguments.
* PR #5183: pass 'inline' explicitly to overload
* PR #5193: Fix CI failure due to missing files when installed
* PR #5213: Fix ``.strip()`` to strip all whitespace characters
* PR #5216: Fix namedtuple mistreated by dispatcher as simple tuple
* PR #5222: Fix compiler warnings due to missing (void)
* PR #5232: Fixes a bad import that breaks master
* PR #5239: fix test discovery for unittest
* PR #5247: Continue PR #5126
* PR #5250: Part fix/5098
* PR #5252: Trivially fix SyntaxWarning
* PR #5276: Add prange variant to has_no_side_effect.
* PR #5278: fix error messages in test_array_reductions
* PR #5310: PR #3897 continued
* PR #5313: Continues PR #4376
* PR #5318: Remove AUTHORS file reference from MANIFEST.in
* PR #5327: Add warning if FNV hashing is found as the default for CPython.
* PR #5338: Remove refcount pruning pass
* PR #5345: Disable test failing due to removed pass.
* PR #5357: Small fix to have llvm and asm highlighted properly
* PR #5361: 5081 typed.dict
* PR #5431: Add tolerance to numba extension module entrypoints.
* PR #5432: Fix code causing compiler warnings.
* PR #5445: Remove undefined variable
* PR #5454: Don't warn for numba.experimental.jitclass
* PR #5459: Fixes issue 5448
* PR #5480: Fix for #5477, literal_unroll KeyError searching for getitems
* PR #5485: Show the offending module in "no direct replacement" error message
* PR #5488: Add missing ``numba.config`` shim
* PR #5495: Fix missing null initializer for variable after phi strip
* PR #5498: Make the shim deprecation warnings work on python 3.6 too
* PR #5505: Better error message if __init__ returns value
* PR #5527: Attempt to fix #5518
* PR #5529: PR #5473 continued
* PR #5532: Make ``numba.<mod>`` available without an import
* PR #5542: Fixes RC2 module shim bug
* PR #5548: Fix #5537 Removed reference to ``platform.linux_distribution``
* PR #5555: Fix #5515 by reverting changes to ArrayAnalysis
* PR #5557: First-class function call cannot use keyword arguments
* PR #5569: Fix RewriteConstGetitems not registering calltype for new expr
* PR #5571: Pin down llvmlite requirement
CUDA Enhancements/Fixes:
* PR #5061: Prevent kernel launch with no configuration, remove autotuner
* PR #5085: Support other itemsizes with view
* PR #5099: Fix #5073: Slices of dynamic shared memory all alias
* PR #5104: Add a list of libraries that support __cuda_array_interface__
* PR #5136: CUDA: Enable asynchronous operations on the default stream
* PR #5144: Fix #4875: Make #2655 test with debug expect to pass
* PR #5189: __cuda_array_interface__ not requiring context
* PR #5253: Coerce ``dtype`` to ``numpy.dtype``
* PR #5257: boundscheck fix
* PR #5319: Make user facing error string use abs path not rel.
* PR #5323: Document lifetime semantics of CUDA Array Interface
* PR #5328: NBEP 7: External Memory Management Plugin Interface
* PR #5343: Fix cuda spoof
* PR #5400: Fix #4954, and some other small CUDA testsuite fixes
* PR #5436: Fix #5408: test_set_registers_57 fails on Maxwell
* PR #5519: CUDA: Silence the test suite - Fix #4809, remove autojit, delete
prints
Documentation Updates:
* PR #4957: Add notes on overwriting gufunc inputs to docs
* PR #5059: Docs: Explain how to use Memcheck with Numba, fixups in CUDA
documentation
* PR #5106: Add SciPy 2019 talks to docs
* PR #5147: Update master for 0.48.0 updates
* PR #5155: Explain what inlining at Numba IR level will do
* PR #5161: Fix README.rst formatting
* PR #5207: Remove AUTHORS list
* PR #5249: fix target path for See also
* PR #5262: fix typo in inlining docs
* PR #5270: fix 'see also' in typeddict docs
* PR #5280: Added some hints for debugging entry points.
* PR #5297: Update docs with intro to {g,}ufuncs.
* PR #5326: Update installation docs with OpenMP requirements.
* PR #5346: Docs: use sphinx_rtd_theme
* PR #5366: Remove reference to Python 2.7 in install check output
* PR #5423: Fix #5421: Add notes on printing in CUDA kernels
* PR #5438: Update package deps for doc building.
* PR #5440: Bump deprecation notices.
* PR #5443: Fix #5196: Docs: assert in CUDA only enabled for debug
* PR #5450: Docs: remove numbers and add titles to TOC
* PR #5453: fix building docs on windows
* PR #5489: docs: fix rendering of nested bulleted list
CI updates:
* PR #5314: Update the image used in Azure CI for OSX.
* PR #5360: Remove Travis CI badge.
Authors:
* Aaron Meurer
* Andreas Sodeur
* Antonio Russo
* Arik Funke
* Eric Wieser
* Ethan Pronovost
* Gabriel Majeri
* Graham Markall
* Greg Jennings
* Guilherme Leobas
* hdf
* Henry Schreiner
* Ivan Butygin
* Jacques Gaudin
* Jim Pivarski
* John Kirkham
* Leo Fang
* Lucio Fernandez-Arjona
* Mads R. B. Kristensen
* Marcin Tolysz
* Mike Williams
* Pearu Peterson
* Ravi Teja Gutta
* Rohit Sanjay
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
* Vladimir Lukyanov
Version 0.48.0 (Jan 27, 2020)
-----------------------------
This release is particularly small as it was present to catch anything that
missed the 0.47.0 deadline (the deadline deliberately coincided with the end of
support for Python 2.7). The next release will be considerably larger.
The core changes in this release are dominated by the start of the clean up
needed for the end of Python 2.7 support, improvements to the CUDA target and
support for numerous additional unicode string methods.
Enhancements from user contributed PRs (with thanks!):
* Brian Wignall fixed more spelling typos in #4998.
* Denis Smirnov added support for string methods ``capitalize`` (#4823),
``casefold`` (#4824), ``swapcase`` (#4825), ``rsplit`` (#4834), ``partition``
(#4845) and ``splitlines`` (#4849).
* Elena Totmenina extended support for string methods ``startswith`` (#4867) and
added ``endswith`` (#4868).
* Eric Wieser made ``type_callable`` return the decorated function itself in
#4760
* Ethan Pronovost added support for ``np.argwhere`` in #4617
* Graham Markall contributed a large number of CUDA enhancements and fixes,
namely:
* #5068: Remove Python 3.4 backports from utils
* #4975: Make ``device_array_like`` create contiguous arrays (Fixes #4832)
* #5023: Don't launch ForAll kernels with 0 elements (Fixes #5017)
* #5016: Fix various issues in CUDA library search (Fixes #4979)
* #5014: Enable use of records and bools for shared memory, remove ddt, add
additional transpose tests
* #4964: Fix #4628: Add more appropriate typing for CUDA device arrays
* #5007: test_consuming_strides: Keep dev array alive
* #4997: State that CUDA Toolkit 8.0 required in docs
* James Bourbeau added the Python 3.8 classifier to setup.py in #5027.
* John Kirkham added a clarification to the ``__cuda_array_interface__``
documentation in #5049.
* Leo Fang fixed an indexing problem in ``dummyarray`` in #5012.
* Marcel Bargull fixed a build and test issue for Python 3.8 in #5029.
* Maria Rubtsov added support for string methods ``isdecimal`` (#4842),
``isdigit`` (#4843), ``isnumeric`` (#4844) and ``replace`` (#4865).
General Enhancements:
* PR #4760: Make type_callable return the decorated function
* PR #5010: merge string prs
This merge PR included the following:
* PR #4823: Implement str.capitalize() based on CPython
* PR #4824: Implement str.casefold() based on CPython
* PR #4825: Implement str.swapcase() based on CPython
* PR #4834: Implement str.rsplit() based on CPython
* PR #4842: Implement str.isdecimal
* PR #4843: Implement str.isdigit
* PR #4844: Implement str.isnumeric
* PR #4845: Implement str.partition() based on CPython
* PR #4849: Implement str.splitlines() based on CPython
* PR #4865: Implement str.replace
* PR #4867: Functionality extension str.startswith() based on CPython
* PR #4868: Add functionality for str.endswith()
* PR #5039: Disable help messages.
* PR #4617: Add coverage for ``np.argwhere``
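To give a feel for the string methods listed above, the sketch below uses
``str.partition``, ``str.capitalize`` and ``str.isdigit`` inside a jitted
function. It assumes Numba 0.48 or later; the function name ``parse_setting``
and the input string are made up for this example, and the
``except ImportError`` branch is a plain-Python stand-in used only where Numba
is not installed.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: a no-op decorator so the sketch still runs without Numba.
    def njit(func):
        return func


@njit
def parse_setting(line):
    # Split "key = value" at the first "=" and tidy up both halves.
    key, sep, value = line.partition("=")
    return key.strip().capitalize(), value.strip().isdigit()


name, is_number = parse_setting(" threads = 8 ")
```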
Fixes:
* PR #4724: Only use lives (and not aliases) to create post parfor live set.
* PR #4998: Fix more spelling typos
* PR #5024: Propagate semantic constants ahead of static rewrites.
* PR #5027: Add Python 3.8 classifier to setup.py
* PR #5046: Update setup.py and buildscripts for dependency requirements
* PR #5053: Convert from arrays to names in define() and don't invalidate for
multiple consistent defines.
* PR #5058: Permit mixed int types in wrap_index
* PR #5078: Catch the use of global typed-list in JITed functions
* PR #5092: Fix #5087, bug in bytecode analysis.
CUDA Enhancements/Fixes:
* PR #4964: Fix #4628: Add more appropriate typing for CUDA device arrays
* PR #4975: Make ``device_array_like`` create contiguous arrays (Fixes #4832)
* PR #4997: State that CUDA Toolkit 8.0 required in docs
* PR #5007: test_consuming_strides: Keep dev array alive
* PR #5012: Fix IndexError when accessing the "-1" element of dummyarray
* PR #5014: Enable use of records and bools for shared memory, remove ddt, add
additional transpose tests
* PR #5016: Fix various issues in CUDA library search (Fixes #4979)
* PR #5023: Don't launch ForAll kernels with 0 elements (Fixes #5017)
* PR #5068: Remove Python 3.4 backports from utils
Documentation Updates:
* PR #5049: Clarify what dictionary means
* PR #5062: Update docs for updated version requirements
* PR #5090: Update deprecation notices for 0.48.0
CI updates:
* PR #5029: Install optional dependencies for Python 3.8 tests
* PR #5040: Drop Py2.7 and Py3.5 from public CI
* PR #5048: Fix CI py38
Authors:
* Brian Wignall
* Denis Smirnov
* Elena Totmenina
* Eric Wieser
* Ethan Pronovost
* Graham Markall
* James Bourbeau
* John Kirkham
* Leo Fang
* Marcel Bargull
* Maria Rubtsov
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.47.0 (Jan 2, 2020)
-----------------------------
This release expands the capability of Numba in a number of important areas and
is also significant as it is the last major point release with support for
Python 2 and Python 3.5 included. The next release (0.48.0) will be for Python
3.6+ only! (This follows NumPy's deprecation schedule as specified in
`NEP 29 <https://numpy.org/neps/nep-0029-deprecation_policy.html>`_.)
Highlights of core feature changes include:
* Full support for Python 3.8 (Siu Kwan Lam)
* Opt-in bounds checking (Aaron Meurer)
* Support for ``map``, ``filter`` and ``reduce`` (Stuart Archibald)
Intel also kindly sponsored research and development that led to some exciting
new features:
* Initial support for basic ``try``/``except`` use (Siu Kwan Lam)
* The ability to pass functions created from closures/lambdas as arguments
(Stuart Archibald)
* ``sorted`` and ``list.sort()`` now accept the ``key`` argument (Stuart
Archibald and Siu Kwan Lam)
* A new compiler pass triggered through the use of the function
``numba.literal_unroll`` which permits iteration over heterogeneous tuples
and constant lists of constants. (Stuart Archibald)
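As a rough illustration of the ``numba.literal_unroll`` item above, the sketch
below iterates over a heterogeneous tuple inside a jitted function. It assumes
Numba 0.47 or later; the function name ``total`` is made up for this example,
and the ``except ImportError`` branch provides plain-Python stand-ins only so
that the snippet still runs where Numba is not installed.

```python
try:
    from numba import njit, literal_unroll
except ImportError:
    # Assumption: plain-Python stand-ins used only when Numba is unavailable.
    def njit(func):
        return func

    def literal_unroll(container):
        return container


@njit
def total(items):
    # The tuple mixes integers and floats; literal_unroll lets the compiler
    # handle each element with its own type.
    acc = 0.0
    for item in literal_unroll(items):
        acc += item
    return acc


result = total((1, 2.5, 3))
```

Without ``literal_unroll`` a loop over a tuple whose elements have different
types would generally fail to type, because the loop variable would have no
single type.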
Enhancements from user contributed PRs (with thanks!):
* Ankit Mahato added a reference to a new talk on Numba at PyCon India 2019 in
#4862
* Brian Wignall kindly fixed some spelling mistakes and typos in #4909
* Denis Smirnov wrote numerous methods to considerably enhance string support
including:
* ``str.rindex()`` in #4861
* ``str.isprintable()`` in #4836
* ``str.index()`` in #4860
* ``start/end`` parameters for ``str.find()`` in #4866
* ``str.isspace()`` in #4835
* ``str.isidentifier()`` #4837
* ``str.rpartition()`` in #4841
* ``str.lower()`` and ``str.islower()`` in #4651
* Elena Totmenina implemented ``str.isalnum()``, ``str.isalpha()`` and
``str.isascii()`` in #4839, #4840 and #4847 respectively.
* Eric Larson fixed a bug in literal comparison in #4710
* Ethan Pronovost updated the ``np.arange`` implementation in #4770 to allow
the use of the ``dtype`` key word argument and also added ``bool``
implementations for several types in #4715.
* Graham Markall fixed some issues with the CUDA target, namely:
* #4931: Added physical limits for CC 7.0 / 7.5 to CUDA autotune
* #4934: Fixed bugs in TestCudaWarpOperations
* #4938: Improved errors / warnings for the CUDA vectorize decorator
* Guilherme Leobas fixed a typo in the ``urem`` implementation in #4667
* Isaac Virshup contributed a number of patches that fixed bugs, added support
for more NumPy functions and enhanced Python feature support. These
contributions included:
* #4729: Allow array construction with mixed type shape tuples
* #4904: Implementing ``np.lcm``
* #4780: Implement np.gcd and math.gcd
* #4779: Make slice constructor more similar to python.
* #4707: Added support for slice.indices
* #4578: Clarify numba ufunc supported features
* James Bourbeau fixed some issues with tooling: #4794 added ``setuptools`` as a
dependency and #4501 added pre-commit hooks for ``flake8`` compliance.
* Leo Fang made ``numba.dummyarray.Array`` iterable in #4629
* Marc Garcia fixed the ``numba.jit`` parameter name signature_or_function in
#4703
* Marcelo Duarte Trevisani patched the llvmlite requirement to ``>=0.30.0`` in
#4725
* Matt Cooper fixed a long standing CI problem in #4737 by removing maxParallel
from Azure Pipelines.
* Matti Picus fixed an issue with ``collections.abc`` in #4734
* Rob Ennis patched a bug in ``np.interp`` ``float32`` handling in #4911
* VDimir fixed a bug in array transposition layouts in #4777 and re-enabled and
fixed some idle tests in #4776.
* Vyacheslav Smirnov enabled support for ``str.istitle()`` in #4645
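Several of the contributions above add string methods and ``math.gcd`` to jitted code. The sketch below exercises a few of them. The function names are hypothetical, and the ``ImportError`` fallback is an assumption so that the sketch also runs as plain Python without Numba.

```python
import math

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the functions as ordinary Python.
    def njit(func):
        return func


@njit
def summarize(text):
    # str.lower(), str.isspace(), str.index() and str.rpartition()
    return text.lower(), text.isspace(), text.index("b"), text.rpartition("-")


@njit
def common_divisor(a, b):
    # math.gcd on integer arguments
    return math.gcd(a, b)
```

As in CPython, ``str.index()`` is expected to raise ``ValueError`` when the substring is absent, so the input here is chosen to contain it.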
General Enhancements:
* PR #4432: Bounds checking
* PR #4501: Add pre-commit hooks
* PR #4536: Handle kw args in inliner when callee is a function
* PR #4599: Permits closures to become functions, enables map(), filter()
* PR #4611: Implement method title() for unicode based on Cpython
* PR #4645: Enable support for istitle() method for unicode string
* PR #4651: Implement str.lower() and str.islower()
* PR #4652: Implement str.rfind()
* PR #4695: Refactor `overload*` and support `jit_options` and `inline`
* PR #4707: Added support for slice.indices
* PR #4715: Add `bool` overload for several types
* PR #4729: Allow array construction with mixed type shape tuples
* PR #4755: Python3.8 support
* PR #4756: Add parfor support for ndarray.fill.
* PR #4768: Update typeconv error message to ask for sys.executable.
* PR #4770: Update `np.arange` implementation with `@overload`
* PR #4779: Make slice constructor more similar to python.
* PR #4780: Implement np.gcd and math.gcd
* PR #4794: Add setuptools as a dependency
* PR #4802: put git hash into build string
* PR #4803: Better compiler error messages for improperly used reduction
variables.
* PR #4817: Typed list implement and expose allocation
* PR #4818: Typed list faster copy
* PR #4835: Implement str.isspace() based on CPython
* PR #4836: Implement str.isprintable() based on CPython
* PR #4837: Implement str.isidentifier() based on CPython
* PR #4839: Implement str.isalnum() based on CPython
* PR #4840: Implement str.isalpha() based on CPython
* PR #4841: Implement str.rpartition() based on CPython
* PR #4847: Implement str.isascii() based on CPython
* PR #4851: Add graphviz output for FunctionIR
* PR #4854: Python3.8 looplifting
* PR #4858: Implement str.expandtabs() based on CPython
* PR #4860: Implement str.index() based on CPython
* PR #4861: Implement str.rindex() based on CPython
* PR #4866: Support params start/end for str.find()
* PR #4874: Bump to llvmlite 0.31
* PR #4896: Specialise arange dtype on arch + python version.
* PR #4902: basic support for try except
* PR #4904: Implement np.lcm
* PR #4910: loop canonicalisation and type aware tuple unroller/loop body
versioning passes
* PR #4961: Update hash(tuple) for Python 3.8.
* PR #4977: Implement sort/sorted with key.
* PR #4987: Add `is_internal` property to all Type classes.
Fixes:
* PR #4090: Update to LLVM8 memset/memcpy intrinsic
* PR #4582: Convert sub to add and div to mul when doing the reduction across
the per-thread reduction array.
* PR #4648: Handle 0 correctly as slice parameter.
* PR #4660: Remove multiply defined variables from all blocks' equivalence sets.
* PR #4672: Fix pickling of dufunc
* PR #4710: BUG: Comparison for literal
* PR #4718: Change get_call_table to support intermediate Vars.
* PR #4725: Requires llvmlite >=0.30.0
* PR #4734: prefer to import from collections.abc
* PR #4736: fix flake8 errors
* PR #4776: Fix and enable idle tests from test_array_manipulation
* PR #4777: Fix transpose output array layout
* PR #4782: Fix issue with SVML (and knock-on function resolution effects).
* PR #4785: Treat 0d arrays like scalars.
* PR #4787: fix missing incref on flags
* PR #4789: fix typos in numba/targets/base.py
* PR #4791: fix typos
* PR #4811: fix spelling in now-failing tests
* PR #4852: windowing test should check equality only up to double precision
errors
* PR #4881: fix refining list by using extend on an iterator
* PR #4882: Fix return type in arange and zero step size handling.
* PR #4885: suppress spurious RuntimeWarning about ufunc sizes
* PR #4891: skip the xfail test for now. Py3.8 CFG refactor seems to have
changed the test case
* PR #4892: regex needs to accept singular form of "argument"
* PR #4901: fix typed list equals
* PR #4909: Fix some spelling typos
* PR #4911: np.interp bugfix for float32 handling
* PR #4920: fix creating list with JIT disabled
* PR #4921: fix creating dict with JIT disabled
* PR #4935: Better handling of prange with multiple reductions on the same
variable.
* PR #4946: Improve the error message for `raise <string>`.
* PR #4955: Move overload of literal_unroll to avoid circular dependency that
breaks Python 2.7
* PR #4962: Fix test error on windows
* PR #4973: Fixes a bug in the relabelling logic in literal_unroll.
* PR #4978: Fix overload_method problem with stararg
* PR #4981: Add ind_to_const to enable fewer equivalence classes.
* PR #4991: Continuation of #4588 (Let dead code removal handle removing more of
the unneeded code after prange conversion to parfor)
* PR #4994: Remove xfail for test which has since had underlying issue fixed.
* PR #5018: Fix #5011.
* PR #5019: skip pycc test on Python 3.8 + macOS because of distutils issue
CUDA Enhancements/Fixes:
* PR #4629: Make numba.dummyarray.Array iterable
* PR #4675: Bump cuda array interface to version 2
* PR #4741: Update choosing the "CUDA_PATH" for windows
* PR #4838: Permit ravel('A') for contig device arrays in CUDA target
* PR #4931: Add physical limits for CC 7.0 / 7.5 to autotune
* PR #4934: Fix fails in TestCudaWarpOperations
* PR #4938: Improve errors / warnings for cuda vectorize decorator
Documentation Updates:
* PR #4418: Directed graph task roadmap
* PR #4578: Clarify numba ufunc supported features
* PR #4655: fix sphinx build warning
* PR #4667: Fix typo on urem implementation
* PR #4669: Add link to ParallelAccelerator paper.
* PR #4703: Fix numba.jit parameter name signature_or_function
* PR #4862: Addition of PyCon India 2019 talk on Numba
* PR #4947: Document jitclass with numba.typed use.
* PR #4958: Add docs for `try..except`
* PR #4993: Update deprecations for 0.47
CI Updates:
* PR #4737: remove maxParallel from Azure Pipelines
* PR #4767: pin to 2.7.16 for py27 on osx
* PR #4781: WIP/runtest cf pytest
Authors:
* Aaron Meurer
* Ankit Mahato
* Brian Wignall
* Denis Smirnov
* Ehsan Totoni (core dev)
* Elena Totmenina
* Eric Larson
* Ethan Pronovost
* Giovanni Cavallin
* Graham Markall
* Guilherme Leobas
* Isaac Virshup
* James Bourbeau
* Leo Fang
* Marc Garcia
* Marcelo Duarte Trevisani
* Matt Cooper
* Matti Picus
* Rob Ennis
* Rujal Desai
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* VDimir
* Valentin Haenel (core dev)
* Vyacheslav Smirnov
Version 0.46.0
--------------
This release significantly reworked one of the main parts of Numba, the compiler
pipeline, to make it more extensible and easier to use. The purpose of this was
to continue enhancing Numba's ability for use as a compiler toolkit. In a
similar vein, Numba now has an extension registration mechanism to allow other
Numba-using projects to automatically have their Numba JIT compilable functions
discovered. There were also a number of other related compiler toolkit
enhancements added along with some more NumPy features and a lot of bug fixes.
This release has updated the CUDA Array Interface specification to version 2,
which clarifies the `strides` attribute for C-contiguous arrays and specifies
the treatment for zero-size arrays. The implementation in Numba has been
changed and may affect downstream packages relying on the old behavior
(see issue #4661).
Enhancements from user contributed PRs (with thanks!):
* Aaron Meurer fixed some Python issues in the code base in #4345 and #4341.
* Ashwin Srinath fixed a CUDA performance bug via #4576.
* Ethan Pronovost added support for triangular indices functions in #4601 (the
NumPy functions ``tril_indices``, ``tril_indices_from``, ``triu_indices``, and
``triu_indices_from``).
* Gerald Dalley fixed a tear down race occurring in Python 2.
* Gregory R. Lee fixed the use of deprecated ``inspect.getargspec``.
* Guilherme Leobas contributed five PRs, adding support for ``np.append`` and
``np.count_nonzero`` in #4518 and #4386. The typed List was fixed to accept
unsigned integers in #4510. #4463 made a fix to NamedTuple internals and #4397
updated the docs for ``np.sum``.
* James Bourbeau added a new feature to permit the automatic application of the
`jit` decorator to a whole module in #4331. Also some small fixes to the docs
and the code base were made in #4447 and #4433, and a fix to inplace array
operation in #4228.
* Jim Crist fixed a bug in the rendering of patched errors in #4464.
* Leo Fang updated the CUDA Array Interface contract in #4609.
* Pearu Peterson added support for Unicode based NumPy arrays in #4425.
* Peter Andreas Entschev fixed a CUDA concurrency bug in #4581.
* Lucio Fernandez-Arjona extended Numba's ``np.sum`` support to now accept the
``dtype`` kwarg in #4472.
* Pedro A. Morales Maries added support for ``np.cross`` in #4128 and also added
the necessary extension ``numba.numpy_extensions.cross2d`` in #4595.
* David Hoese, Eric Firing, Joshua Adelman, and Juan Nunez-Iglesias all made
documentation fixes in #4565, #4482, #4455, #4375 respectively.
* Vyacheslav Smirnov and Rujal Desai enabled support for ``count()`` on unicode
strings in #4606.
General Enhancements:
* PR #4113: Add rewrite for semantic constants.
* PR #4128: Add np.cross support
* PR #4162: Make IR comparable and legalize it.
* PR #4208: R&D inlining, jitted and overloaded.
* PR #4331: Automatic JIT of called functions
* PR #4353: Inspection tool to check what numba supports
* PR #4386: Implement np.count_nonzero
* PR #4425: Unicode array support
* PR #4427: Entrypoints for numba extensions
* PR #4467: Literal dispatch
* PR #4472: Allow dtype input argument in np.sum
* PR #4513: New compiler.
* PR #4518: add support for np.append
* PR #4554: Refactor NRT C-API
* PR #4556: 0.46 scheduled deprecations
* PR #4567: Add env var to disable performance warnings.
* PR #4568: add np.array_equal support
* PR #4595: Implement numba.cross2d
* PR #4601: Add triangular indices functions
* PR #4606: Enable support for count() method for unicode string
Fixes:
* PR #4228: Fix inplace operator error for arrays
* PR #4282: Detect and raise unsupported on generator expressions
* PR #4305: Don't allow the allocation of mutable objects written into a
container to be hoisted.
* PR #4311: Avoid deprecated use of inspect.getargspec
* PR #4328: Replace GC macro with function call
* PR #4330: Loosen up typed container casting checks
* PR #4341: Fix some coding lines at the top of some files (utf8 -> utf-8)
* PR #4345: Replace "import \*" with explicit imports in numba/types
* PR #4346: Fix incorrect alg in isupper for ascii strings.
* PR #4349: test using jitclass in typed-list
* PR #4361: Add allocation hoisting info to LICM section at diagnostic L4
* PR #4366: Offset search box to avoid wrapping on some pages with Safari.
Fixes #4365.
* PR #4372: Replace all "except BaseException" with "except Exception".
* PR #4407: Restore the "free" conda channel for NumPy 1.10 support.
* PR #4408: Add lowering for constant bytes.
* PR #4409: Add exception chaining for better error context
* PR #4411: Name of type should not contain user facing description for debug.
* PR #4412: Fix #4387. Limit the number of return types for recursive functions
* PR #4426: Fixed two module teardown races in py2.
* PR #4431: Fix and test numpy.random.random_sample(n) for np117
* PR #4463: NamedTuple - Raises an error on non-iterable elements
* PR #4464: Add a newline in patched errors
* PR #4474: Fix liveness for remove dead of parfors (and other IR extensions)
* PR #4510: Make List.__getitem__ accept unsigned parameters
* PR #4512: Raise specific error at typing time for iteration on >1D array.
* PR #4532: Fix static_getitem with Literal type as index
* PR #4547: Update to inliner cost model information.
* PR #4557: Use specific random number seed when generating arbitrary test data
* PR #4559: Adjust test timeouts
* PR #4564: Skip unicode array tests on ppc64le that trigger an LLVM bug
* PR #4621: Fix packaging issue due to missing numba/cext
* PR #4623: Fix issue 4520 due to storage model mismatch
* PR #4644: Updates for llvmlite 0.30.0
CUDA Enhancements/Fixes:
* PR #4410: Fix #4111. cudasim mishandling recarray
* PR #4576: Replace use of `np.prod` with `functools.reduce` for computing size
from shape
* PR #4581: Prevent taking the GIL in ForAll
* PR #4592: Fix #4589. Just pass NULL for b2d_func for constant dynamic
sharedmem
* PR #4609: Update CUDA Array Interface & Enforce Numba compliance
* PR #4619: Implement math.{degrees, radians} for the CUDA target.
* PR #4675: Bump cuda array interface to version 2
Documentation Updates:
* PR #4317: Add docs for ARMv8/AArch64
* PR #4318: Add supported platforms to the docs. Closes #4316
* PR #4375: Add docstrings to inspect methods
* PR #4388: Update Python 2.7 EOL statement
* PR #4397: Add note about np.sum
* PR #4447: Minor parallel performance tips edits
* PR #4455: Clarify docs for typed dict with regard to arrays
* PR #4482: Fix example in guvectorize docstring.
* PR #4541: fix two typos in architecture.rst
* PR #4548: Document numba.extending.intrinsic and inlining.
* PR #4565: Fix typo in jit-compilation docs
* PR #4607: add dependency list to docs
* PR #4614: Add documentation for implementing new compiler passes.
CI Updates:
* PR #4415: Make 32bit incremental builds on linux not use free channel
* PR #4433: Removes stale azure comment
* PR #4493: Fix Overload Inliner wrt CUDA Intrinsics
* PR #4593: Enable Azure CI batching
Contributors:
* Aaron Meurer
* Ashwin Srinath
* David Hoese
* Ehsan Totoni (core dev)
* Eric Firing
* Ethan Pronovost
* Gerald Dalley
* Gregory R. Lee
* Guilherme Leobas
* James Bourbeau
* Jim Crist
* Joshua Adelman
* Juan Nunez-Iglesias
* Leo Fang
* Lucio Fernandez-Arjona
* Pearu Peterson
* Pedro A. Morales Marie
* Peter Andreas Entschev
* Rujal Desai
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
* Vyacheslav Smirnov
Version 0.45.1
--------------
This patch release addresses some regressions reported in the 0.45.0 release and
adds support for NumPy 1.17:
* PR #4325: accept scalar/0d-arrays
* PR #4338: Fix #4299. Parfors reduction vars not deleted.
* PR #4350: Use process level locks for fork() only.
* PR #4354: Try to fix #4352.
* PR #4357: Fix np1.17 isnan, isinf, isfinite ufuncs
* PR #4363: Fix np.interp for np1.17 nan handling
* PR #4371: Fix np1.17 random function non-aliasing
Contributors:
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
* Valentin Haenel (core dev)
Version 0.45.0
--------------
In this release, Numba gained an experimental :ref:`numba.typed.List
<feature-typed-list>` container as a future replacement of the :ref:`reflected
list <feature-reflected-list>`. In addition, functions decorated with
``parallel=True`` can now be cached to reduce compilation overhead associated
with the auto-parallelization.
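A minimal sketch of the experimental typed list follows, assuming it is imported as ``numba.typed.List``. The function name ``total`` is hypothetical, and the fallback to a plain ``list`` is an assumption so the sketch also runs where Numba is not installed.

```python
try:
    from numba import njit
    from numba.typed import List
except ImportError:
    # Assumption: plain-Python stand-ins when Numba is not installed.
    def njit(func):
        return func

    List = list


@njit
def total(values):
    acc = 0
    for v in values:
        acc += v
    return acc


# Build the container on the Python side, then pass it to jitted code.
items = List()
for v in (3, 4, 5):
    items.append(v)
```

The element type of the typed list is taken from the first value appended, so all later values should have a compatible type.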
Enhancements from user contributed PRs (with thanks!):
* James Bourbeau added the Numba version to reportable error messages in #4227,
added the ``signature`` parameter to ``inspect_types`` in #4200, improved the
docstring of ``normalize_signature`` in #4205, and fixed #3658 by adding
reference counting to ``register_dispatcher`` in #4254
* Guilherme Leobas implemented the dominator tree and dominance frontier
algorithms in #4216 and #4149, respectively.
* Nick White fixed the issue with ``round`` in the CUDA target in #4137.
* Joshua Adelman added support for determining if a value is in a `range`
(i.e. ``x in range(...)``) in #4129, and added windowing functions
(``np.bartlett``, ``np.hamming``, ``np.blackman``, ``np.hanning``,
``np.kaiser``) from NumPy in #4076.
* Lucio Fernandez-Arjona added support for ``np.select`` in #4077
* Rob Ennis added support for ``np.flatnonzero`` in #4157
* Keith Kraus extended the ``__cuda_array_interface__`` with an optional mask
attribute in #4199.
* Gregory R. Lee replaced deprecated use of ``inspect.getargspec`` in #4311.
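The ``x in range(...)`` support mentioned above can be sketched as follows. The function name ``on_grid`` is hypothetical, and the ``ImportError`` fallback is an assumption so that the sketch also runs as plain Python without Numba.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the function as ordinary Python.
    def njit(func):
        return func


@njit
def on_grid(x, start, stop, step):
    # Membership test against a range object inside jitted code
    grid = range(start, stop, step)
    return x in grid
```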
General Enhancements:
* PR #4328: Replace GC macro with function call
* PR #4311: Avoid deprecated use of inspect.getargspec
* PR #4296: Slacken window function testing tol on ppc64le
* PR #4254: Add reference counting to register_dispatcher
* PR #4239: Support len() of multi-dim arrays in array analysis
* PR #4234: Raise informative error for np.kron array order
* PR #4232: Add unicodetype db, low level str functions and examples.
* PR #4229: Make hashing cacheable
* PR #4227: Include numba version in reportable error message
* PR #4216: Add dominator tree
* PR #4200: Add signature parameter to inspect_types
* PR #4196: Catch missing imports of internal functions.
* PR #4180: Update use of unlowerable global message.
* PR #4166: Add tests for PR #4149
* PR #4157: Support for np.flatnonzero
* PR #4149: Implement dominance frontier for SSA for the Numba IR
* PR #4148: Call branch pruning in inline_closure_call()
* PR #4132: Reduce usage of inttoptr
* PR #4129: Support contains for range
* PR #4112: better error messages for np.transpose and tuples
* PR #4110: Add range attrs, start, stop, step
* PR #4077: Add np select
* PR #4076: Add numpy windowing functions support (np.bartlett, np.hamming,
np.blackman, np.hanning, np.kaiser)
* PR #4095: Support ir.Global/FreeVar in find_const()
* PR #3691: Make TypingError abort compiling earlier
* PR #3646: Log internal errors encountered in typeinfer
Fixes:
* PR #4303: Work around scipy bug 10206
* PR #4302: Fix flake8 issue on master
* PR #4301: Fix integer literal bug in np.select impl
* PR #4291: Fix pickling of jitclass type
* PR #4262: Resolves #4251 - Fix bug in reshape analysis.
* PR #4233: Fixes issue revealed by #4215
* PR #4224: Fix #4223. Looplifting error due to StaticSetItem in objectmode
* PR #4222: Fix bad python path.
* PR #4178: Fix unary operator overload, check with unicode impl
* PR #4173: Fix return type in np.bincount with weights
* PR #4153: Fix slice shape assignment in array analysis
* PR #4152: fix status check in dict lookup
* PR #4145: Use callable instead of checking __module__
* PR #4118: Fix inline assembly support on CPU.
* PR #4088: Resolves #4075 - parfors array_analysis bug.
* PR #4085: Resolves #3314 - parfors array_analysis bug with reshape.
CUDA Enhancements/Fixes:
* PR #4199: Extend `__cuda_array_interface__` with optional mask attribute,
bump version to 1
* PR #4137: CUDA - Fix round Builtin
* PR #4114: Support 3rd party activated CUDA context
Documentation Updates:
* PR #4317: Add docs for ARMv8/AArch64
* PR #4318: Add supported platforms to the docs. Closes #4316
* PR #4295: Alter deprecation schedules
* PR #4253: fix typo in pysupported docs
* PR #4252: fix typo on repomap
* PR #4241: remove unused import
* PR #4240: fix typo in jitclass docs
* PR #4205: Update return value order in normalize_signature docstring
* PR #4237: Update doc links to point to latest not dev docs.
* PR #4197: hyperlink repomap
* PR #4170: Clarify docs on accumulating into arrays in prange
* PR #4147: fix docstring for DictType iterables
* PR #3951: A guide to overloading
CI Updates:
* PR #4300: AArch64 has no faulthandler package
* PR #4273: pin to MKL BLAS for testing to get consistent results
* PR #4209: Revert previous network tol patch and try with conda config
* PR #4138: Remove tbb before Azure test only on Python 3, since it was already
removed for Python 2
Contributors:
* Ehsan Totoni (core dev)
* Gregory R. Lee
* Guilherme Leobas
* James Bourbeau
* Joshua L. Adelman
* Keith Kraus
* Lucio Fernandez-Arjona
* Nick White
* Rob Ennis
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.44.1
--------------
This patch release addresses some regressions reported in the 0.44.0 release:
- PR #4165: Fix #4164 issue with NUMBAPRO_NVVM.
- PR #4172: Abandon branch pruning if an arg name is redefined. (Fixes #4163)
- PR #4183: Fix #4156. Problem with defining in-loop variables.
Version 0.44.0
--------------
IMPORTANT: In this release a few significant deprecations (and some less
significant ones) are being made. Users are encouraged to read the related
documentation.
General enhancements in this release include:
- Numba is backed by LLVM 8 on all platforms apart from ppc64le, which, due to
bugs, remains on the LLVM 7.x series.
- Numba's dictionary support now includes type inference for keys and values.
- The .view() method now works for NumPy scalar types.
- Newly supported NumPy functions added: np.delete, np.nanquantile, np.quantile,
np.repeat, np.shape.
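To illustrate type inference for dictionary keys and values, the sketch below starts from an empty ``{}`` inside a jitted function and lets the first insertion determine the types. The function name ``count_char`` is hypothetical, and the ``ImportError`` fallback is an assumption so the sketch also runs without Numba.

```python
try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the function as ordinary Python.
    def njit(func):
        return func


@njit
def count_char(text, target):
    counts = {}  # key/value types are inferred from how the dict is used
    for ch in text:
        if ch in counts:
            counts[ch] += 1
        else:
            counts[ch] = 1
    if target in counts:
        return counts[target]
    return 0
```

The function returns a plain integer rather than the dictionary itself, which keeps the example independent of how dictionaries are passed back to the interpreter.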
In addition, considerable effort has been made to fix some long standing bugs and
a large number of other bugs; the "Fixes" section is very large this time!
Enhancements from user contributed PRs (with thanks!):
- Max Bolingbroke added support for the selective use of ``fastmath`` flags in
#3847.
- Rob Ennis made min() and max() work on iterables in #3820 and added
np.quantile and np.nanquantile in #3899.
- Sergey Shalnov added numerous unicode string related features, zfill in #3978,
ljust in #4001, rjust and center in #4044 and strip, lstrip and rstrip in
#4048.
- Guilherme Leobas added support for np.delete in #3890
- Christoph Deil exposed the Numba CLI via ``python -m numba`` in #4066 and made
numerous documentation fixes.
- Leo Schwarz wrote the bulk of the code for jitclass default constructor
arguments in #3852.
- Nick White enhanced the CUDA backend to use min/max PTX instructions where
possible in #4054.
- Lucio Fernandez-Arjona implemented the unicode string ``__mul__`` function in
#3952.
- Dimitri Vorona wrote the bulk of the code to implement getitem and setitem for
jitclass in #3861.
General Enhancements:
* PR #3820: Min max on iterables
* PR #3842: Unicode type iteration
* PR #3847: Allow fine-grained control of fastmath flags to partially address #2923
* PR #3852: Continuation of PR #2894
* PR #3861: Continuation of PR #3730
* PR #3890: Add support for np.delete
* PR #3899: Support for np.quantile and np.nanquantile
* PR #3900: Fix 3457 :: Implements np.repeat
* PR #3928: Add .view() method for NumPy scalars
* PR #3939: Update icc_rt clone recipe.
* PR #3952: __mul__ for strings, initial implementation and tests
* PR #3956: Type-inferred dictionary
* PR #3959: Create a view for string slicing to avoid extra allocations
* PR #3978: zfill operation implementation
* PR #4001: ljust operation implementation
* PR #4010: Support `dict()` and `{}`
* PR #4022: Support for llvm 8
* PR #4034: Make type.Optional str more representative
* PR #4041: Deprecation warnings
* PR #4044: rjust and center operations implementation
* PR #4048: strip, lstrip and rstrip operations implementation
* PR #4066: Expose numba CLI via python -m numba
* PR #4081: Impl `np.shape` and support function for `asarray`.
* PR #4091: Deprecate the use of iternext_impl without RefType
CUDA Enhancements/Fixes:
* PR #3933: Adds `.nbytes` property to CUDA device array objects.
* PR #4011: Add .inspect_ptx() to cuda device function
* PR #4054: CUDA: Use min/max PTX Instructions
* PR #4096: Update env-vars for CUDA libraries lookup
Documentation Updates:
* PR #3867: Code repository map
* PR #3918: adding Joris' Fosdem 2019 presentation
* PR #3926: order talks on applications of Numba by date
* PR #3943: fix two small typos in vectorize docs
* PR #3944: Fixup jitclass docs
* PR #3990: mention preprint repo in FAQ. Fixes #3981
* PR #4012: Correct runtests command in contributing.rst
* PR #4043: fix typo
* PR #4047: Ambiguous Documentation fix for guvectorize.
* PR #4060: Remove remaining mentions of autojit in docs
* PR #4063: Fix annotate example in docstring
* PR #4065: Add FAQ entry explaining Numba project name
* PR #4079: Add Documentation for atomicity of typed.Dict
* PR #4105: Remove info about CUDA ENVVAR potential replacement
Fixes:
* PR #3719: Resolves issue #3528. Adds support for slices when not using parallel=True.
* PR #3727: Remove dels for known dead vars.
* PR #3845: Fix mutable flag transmission in .astype
* PR #3853: Fix some minor issues in the C source.
* PR #3862: Correct boolean reinterpretation of data
* PR #3863: Comments out the appveyor badge
* PR #3869: fixes flake8 after merge
* PR #3871: Add assert to ir.py to help enforce correct structuring
* PR #3881: fix preparfor dtype transform for datetime64
* PR #3884: Prevent mutation of objmode fallback IR.
* PR #3885: Updates for llvmlite 0.29
* PR #3886: Use `safe_load` from pyyaml.
* PR #3887: Add tolerance to network errors by permitting conda to retry
* PR #3893: Fix casting in namedtuple ctor.
* PR #3894: Fix array inliner for multiple array definition.
* PR #3905: Cherrypick #3903 to main
* PR #3920: Raise better error if unsupported jump opcode found.
* PR #3927: Apply flake8 to the numpy related files
* PR #3935: Silence DeprecationWarning
* PR #3938: Better error message for unknown opcode
* PR #3941: Fix typing of ufuncs in parfor conversion
* PR #3946: Return variable renaming dict from inline_closurecall
* PR #3962: Fix bug in alignment computation of `Record.make_c_struct`
* PR #3967: Fix error with pickling unicode
* PR #3964: Unicode split algo versioning
* PR #3975: Add handler for unknown locale to numba -s
* PR #3991: Permit Optionals in ufunc machinery
* PR #3995: Remove assert in type inference causing poor error message.
* PR #3996: add is_ascii flag to UnicodeType
* PR #4009: Prevent zero division error in np.linalg.cond
* PR #4014: Resolves #4007.
* PR #4021: Add a more specific error message for invalid write to a global.
* PR #4023: Fix handling of titles in record dtype
* PR #4024: Do a check if a call is const before saying that an object is multiply defined.
* PR #4027: Fix issue #4020. Turn off no_cpython_wrapper flag when compiling for…
* PR #4033: [WIP] Fixing wrong dtype of array inside reflected list #4028
* PR #4061: Change IPython cache dir name to numba_cache
* PR #4067: Delete examples/notebooks/LinearRegr.py
* PR #4070: Catch writes to global typed.Dict and raise.
* PR #4078: Check tuple length
* PR #4084: Fix missing incref on optional return None
* PR #4089: Make the warnings fixer flush work for warning comparing on type.
* PR #4094: Fix function definition finding logic for commented def
* PR #4100: Fix alignment check on 32-bit.
* PR #4104: Use PEP 508 compliant env markers for install deps
Contributors:
* Benjamin Zaitlen
* Christoph Deil
* David Hirschfeld
* Dimitri Vorona
* Ehsan Totoni (core dev)
* Guilherme Leobas
* Leo Schwarz
* Lucio Fernandez-Arjona
* Max Bolingbroke
* NanduTej
* Nick White
* Ravi Teja Gutta
* Rob Ennis
* Sergey Shalnov
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
* Valentin Haenel (core dev)
Version 0.43.1
--------------
This is a bugfix release that provides minor changes to fix a bug in branch
pruning and bugs in `np.interp` functionality, and to fully accommodate the
NumPy 1.16 release series.
* PR #3826: NumPy 1.16 support
* PR #3850: Refactor np.interp
* PR #3883: Rewrite pruned conditionals as their evaluated constants.
Contributors:
* Rob Ennis
* Siu Kwan Lam (core dev)
* Stuart Archibald (core dev)
Version 0.43.0
--------------
In this release, the major new features are:
- Initial support for statically typed dictionaries
- Improvements to `hash()` to match Python 3 behavior
- Support for the heapq module
- Ability to pass C structs to Numba
- More NumPy functions: asarray, trapz, roll, ptp, extract
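As an example of the ``heapq`` support, the sketch below heapifies a list and pops items in ascending order inside a jitted function. The function name ``heap_sorted`` is hypothetical, and the ``ImportError`` fallback is an assumption so the sketch also runs as plain Python without Numba.

```python
import heapq

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the function as ordinary Python.
    def njit(func):
        return func


@njit
def heap_sorted(n):
    # Build a list of integers in scrambled order.
    heap = [(i * 7) % n for i in range(n)]
    heapq.heapify(heap)
    out = []
    while len(heap) > 0:
        # heappop always removes the smallest remaining item.
        out.append(heapq.heappop(heap))
    return out
```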
NOTE:
The vast majority of NumPy 1.16 behaviour is supported, however
``datetime`` and ``timedelta`` use involving ``NaT`` matches the behaviour
present in earlier releases. The ufunc suite has not been extended to
accommodate the two new time computation related additions present in NumPy
1.16. In addition the functions ``ediff1d`` and ``interp`` have known minor
issues in replicating outputs exactly when ``NaN``'s occur in certain input
patterns.
General Enhancements:
* PR #3563: Support for np.roll
* PR #3572: Support for np.ptp
* PR #3592: Add dead branch prune before type inference.
* PR #3598: Implement np.asarray()
* PR #3604: Support for np.interp
* PR #3607: Some simplification to lowering
* PR #3612: Exact match flag in dispatcher
* PR #3627: Support for np.trapz
* PR #3630: np.where with broadcasting
* PR #3633: Support for np.extract
* PR #3657: np.max, np.min, np.nanmax, np.nanmin - support for complex dtypes
* PR #3661: Access C Struct as Numpy Structured Array
* PR #3678: Support for str.split and str.join
* PR #3684: Support C array in C struct
* PR #3696: Add intrinsic to help debug refcount
* PR #3703: Implementations of type hashing.
* PR #3715: Port CPython3.7 dictionary for numba internal use
* PR #3716: Support inplace concat of strings
* PR #3718: Add location to ConstantInferenceError exceptions.
* PR #3720: improve error msg about invalid signature
* PR #3731: Support for heapq
* PR #3754: Updates for llvmlite 0.28
* PR #3760: Overloadable operator.setitem
* PR #3775: Support overloading operator.delitem
* PR #3777: Implement compiler support for dictionary
* PR #3791: Implement interpreter-side interface for numba dict
* PR #3799: Support refcount'ed types in numba dict
CUDA Enhancements/Fixes:
* PR #3713: Fix the NvvmSupportError message when CC too low
* PR #3722: Fix #3705: slicing error with negative strides
* PR #3755: Make cuda.to_device accept readonly host array
* PR #3773: Adapt library search to accommodate multiple locations
Documentation Updates:
* PR #3651: fix link to berryconda in docs
* PR #3668: Add Azure Pipelines build badge
* PR #3749: DOC: Clarify when prange is different from range
* PR #3771: fix a few typos
* PR #3785: Clarify use of range as function only.
* PR #3829: Add docs for typed-dict
Fixes:
* PR #3614: Resolve #3586
* PR #3618: Skip gdb tests on ARM.
* PR #3643: Remove support_literals usage
* PR #3645: Enforce and fix that AbstractTemplate.generic must be returning a Signature
* PR #3648: Fail on @overload signature mismatch.
* PR #3660: Added Ignore message to test numba.tests.test_lists.TestLists.test_mul_error
* PR #3662: Replace six with numba.six
* PR #3663: Removes coverage computation from travisci builds
* PR #3672: Avoid leaking memory when iterating over uniform tuple
* PR #3676: Fixes constant string lowering inside tuples
* PR #3677: Ensure all referenced compiled functions are linked properly
* PR #3692: Fix test failure due to overly strict test on floating point values.
* PR #3693: Intercept failed import to help users.
* PR #3694: Fix memory leak in enumerate iterator
* PR #3695: Convert return of None from intrinsic implementation to dummy value
* PR #3697: Fix for issue #3687
* PR #3701: Fix array.T analysis (fixes #3700)
* PR #3704: Fixes for overload_method
* PR #3706: Don't push call vars recursively into nested parfors. Resolves #3686.
* PR #3710: Set as non-hoistable if a mutable variable is passed to a function in a loop. Resolves #3699.
* PR #3712: parallel=True to use better builtin mechanism to resolve call types. Resolves issue #3671
* PR #3725: Fix invalid removal of dead empty list
* PR #3740: add uintp as a valid type to the tuple operator.getitem
* PR #3758: Fix target definition update in inlining
* PR #3782: Raise typing error on yield optional.
* PR #3792: Fix non-module object used as the module of a function.
* PR #3800: Bugfix for np.interp
* PR #3808: Bump macro to include VS2014 to fix py3.5 build
* PR #3809: Add debug guard to debug only C function.
* PR #3816: Fix array.sum(axis) 1d input return type.
* PR #3821: Replace PySys_WriteStdout with PySys_FormatStdout to ensure no truncation.
* PR #3830: Getitem should not return optional type
* PR #3832: Handle single string as path in find_file()
Contributors:
* Ehsan Totoni
* Gryllos Prokopis
* Jonathan J. Helmus
* Kayla Ngan
* lalitparate
* luk-f-a
* Matyt
* Max Bolingbroke
* Michael Seifert
* Rob Ennis
* Siu Kwan Lam
* Stan Seibert
* Stuart Archibald
* Todd A. Anderson
* Tao He
* Valentin Haenel
Version 0.42.1
--------------
Bugfix release to fix the incorrect hash in OSX wheel packages.
No change in source code.
Version 0.42.0
--------------
In this release the major features are:
- The capability to launch and attach the GDB debugger from within a jitted
function.
- The upgrading of LLVM to version 7.0.0.
We added a draft of the project roadmap to the developer manual. The roadmap is
for informational purposes only as priorities and resources may change.
Here are some enhancements from contributed PRs:
- #3532. Daniel Wennberg improved the ``cuda.{pinned, mapped}`` API so that
the associated memory is released immediately at the exit of the context
manager.
- #3531. Dimitri Vorona enabled the inlining of jitclass methods.
- #3516. Simon Perkins added the support for passing numpy dtypes (i.e.
``np.dtype("int32")``) and their type constructor (i.e. ``np.int32``) into
a jitted function.
- #3509. Rob Ennis added support for ``np.corrcoef``.
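As a rough illustration of the ``np.corrcoef`` support listed above, the sketch below calls it from a jitted function. The function name ``correlation`` and the sample data are made up for this example, and the ``try``/``except`` shim is only an assumption so that the snippet still runs as plain Python where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python without Numba
    def njit(func):
        return func


@njit
def correlation(x, y):
    # np.corrcoef returns the 2x2 correlation matrix for two 1D inputs;
    # the off-diagonal entry is the correlation between x and y.
    return np.corrcoef(x, y)[0, 1]


x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0  # exact linear relationship, so the correlation is ~1.0
r = correlation(x, y)
```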
A regression issue (#3554, #3461) relating to making an empty slice in parallel
mode is resolved by #3558.
General Enhancements:
* PR #3392: Launch and attach gdb directly from Numba.
* PR #3437: Changes to accommodate LLVM 7.0.x
* PR #3509: Support for np.corrcoef
* PR #3516: Typeof dtype values
* PR #3520: Fix @stencil ignoring cval if out kwarg supplied.
* PR #3531: Fix jitclass method inlining and avoid unnecessary increfs
* PR #3538: Avoid future C-level assertion error due to invalid visibility
* PR #3543: Avoid implementation error being hidden by the try-except
* PR #3544: Add `long_running` test flag and feature to exclude tests.
* PR #3549: ParallelAccelerator caching improvements
* PR #3558: Fixes array analysis for inplace binary operators.
* PR #3566: Skip alignment tests on armv7l.
* PR #3567: Fix unifying literal types in namedtuple
* PR #3576: Add special copy routine for NumPy out arrays
* PR #3577: Fix example and docs typos for `objmode` context manager.
* PR #3580: Use alias information when determining whether it is safe to
reorder statements.
* PR #3583: Use `ir.unknown_loc` for unknown `Loc`, as #3390 with tests
* PR #3587: Fix llvm.memset usage changes in llvm7
* PR #3596: Fix Array Analysis for Global Namedtuples
* PR #3597: Warn users if threading backend init unsafe.
* PR #3605: Add guard for writing to read only arrays from ufunc calls
* PR #3606: Improve the accuracy of error message wording for undefined type.
* PR #3611: gdb test guard needs to ack ptrace permissions
* PR #3616: Skip gdb tests on ARM.
CUDA Enhancements:
* PR #3532: Unregister temporarily pinned host arrays at once
* PR #3552: Handle broadcast arrays correctly in host->device transfer.
* PR #3578: Align cuda and cuda simulator kwarg names.
Documentation Updates:
* PR #3545: Fix @njit description in 5 min guide
* PR #3570: Minor documentation fixes for numba.cuda
* PR #3581: Fixing minor typo in `reference/types.rst`
* PR #3594: Changing `@stencil` docs to correctly reflect `func_or_mode` param
* PR #3617: Draft roadmap as of Dec 2018
Contributors:
* Aaron Critchley
* Daniel Wennberg
* Dimitri Vorona
* Dominik Stańczak
* Ehsan Totoni (core dev)
* Iskander Sharipov
* Rob Ennis
* Simon Muller
* Simon Perkins
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
Version 0.41.0
--------------
This release adds the following major features:
* Diagnostics showing the optimizations done by ParallelAccelerator
* Support for profiling Numba-compiled functions in Intel VTune
* Additional NumPy functions: partition, nancumsum, nancumprod, ediff1d, cov,
conj, conjugate, tri, tril, triu
* Initial support for Python 3 Unicode strings
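A minimal sketch of two of the newly supported NumPy functions used inside a jitted function follows. The helper name ``running_totals`` and the input values are hypothetical, and the import shim is an assumption that lets the snippet run as plain Python when Numba is missing.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python without Numba
    def njit(func):
        return func


@njit
def running_totals(a):
    # nancumsum treats NaN as zero while accumulating.
    running = np.nancumsum(a)
    # ediff1d gives the differences between consecutive elements.
    steps = np.ediff1d(running)
    return running, steps


values = np.array([1.0, np.nan, 2.0, 4.0])
running, steps = running_totals(values)
# running -> [1.0, 1.0, 3.0, 7.0]
# steps   -> [0.0, 2.0, 4.0]
```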
General Enhancements:
* PR #1968: armv7 support
* PR #2983: invert mapping b/w binop operators and the operator module #2297
* PR #3160: First attempt at parallel diagnostics
* PR #3307: Adding NUMBA_ENABLE_PROFILING envvar, enabling jit event
* PR #3320: Support for np.partition
* PR #3324: Support for np.nancumsum and np.nancumprod
* PR #3325: Add location information to exceptions.
* PR #3337: Support for np.ediff1d
* PR #3345: Support for np.cov
* PR #3348: Support user pipeline class in with lifting
* PR #3363: string support
* PR #3373: Improve error message for empty imprecise lists.
* PR #3375: Enable overload(operator.getitem)
* PR #3402: Support negative indexing in tuple.
* PR #3414: Refactor Const type
* PR #3416: Optimized usage of alloca out of the loop
* PR #3424: Updates for llvmlite 0.26
* PR #3462: Add support for `np.conj/np.conjugate`.
* PR #3480: np.tri, np.tril, np.triu - default optional args
* PR #3481: Permit dtype argument as sole kwarg in np.eye
CUDA Enhancements:
* PR #3399: Add max_registers Option to cuda.jit
Continuous Integration / Testing:
* PR #3303: CI with Azure Pipelines
* PR #3309: Workaround race condition with apt
* PR #3371: Fix issues with Azure Pipelines
* PR #3362: Fix #3360: `RuntimeWarning: 'numba.runtests' found in sys.modules`
* PR #3374: Disable openmp in wheel building
* PR #3404: Azure Pipelines templates
* PR #3419: Fix cuda tests and error reporting in test discovery
* PR #3491: Prevent faulthandler installation on armv7l
* PR #3493: Fix CUDA test that used negative indexing behaviour that's fixed.
* PR #3495: Start Flake8 checking of Numba source
Fixes:
* PR #2950: Fix dispatcher to only consider contiguous-ness.
* PR #3124: Fix 3119, raise for 0d arrays in reductions
* PR #3228: Reduce redundant module linking
* PR #3329: Fix AOT on windows.
* PR #3335: Fix memory management of __cuda_array_interface__ views.
* PR #3340: Fix typo in error name.
* PR #3365: Fix the default unboxing logic
* PR #3367: Allow non-global reference to objmode() context-manager
* PR #3381: Fix global reference in objmode for dynamically created function
* PR #3382: CUDA_ERROR_MISALIGNED_ADDRESS Using Multiple Const Arrays
* PR #3384: Correctly handle very old versions of colorama
* PR #3394: Add 32bit package guard for non-32bit installs
* PR #3397: Fix with-objmode warning
* PR #3403: Fix label offset in call inline after parfor pass
* PR #3429: Fixes raising of user defined exceptions for exec(<string>).
* PR #3432: Fix error due to function naming in CI in py2.7
* PR #3444: Fixed TBB's single thread execution and test added for #3440
* PR #3449: Allow matching non-array objects in find_callname()
* PR #3455: Change getiter and iternext to not be pure. Resolves #3425
* PR #3467: Make ir.UndefinedType singleton class.
* PR #3478: Fix np.random.shuffle sideeffect
* PR #3487: Raise unsupported for kwargs given to `print()`
* PR #3488: Remove dead script.
* PR #3498: Fix stencil support for boolean as return type
* PR #3511: Fix handling make_function literals (regression of #3414)
* PR #3514: Add missing unicode != unicode
* PR #3527: Fix complex math sqrt implementation for large negative values
* PR #3530: This adds an arg check for the pattern supplied to Parfors.
* PR #3536: Sets list dtor linkage to `linkonce_odr` to fix visibility in AOT
Documentation Updates:
* PR #3316: Update 0.40 changelog with additional PRs
* PR #3318: Tweak spacing to avoid search box wrapping onto second line
* PR #3321: Add note about memory leaks with exceptions to docs. Fixes #3263
* PR #3322: Add FAQ on CUDA + fork issue. Fixes #3315.
* PR #3343: Update docs for argsort, kind kwarg partially supported.
* PR #3357: Added mention of njit in 5minguide.rst
* PR #3434: Fix parallel reduction example in docs.
* PR #3452: Fix broken link and mark up problem.
* PR #3484: Size Numba logo in docs in em units. Fixes #3313
* PR #3502: just two typos
* PR #3506: Document string support
* PR #3513: Documentation for parallel diagnostics.
* PR #3526: Fix 5 min guide with respect to @njit decl
Contributors:
* Alex Ford
* Andreas Sodeur
* Anton Malakhov
* Daniel Stender
* Ehsan Totoni (core dev)
* Henry Schreiner
* Marcel Bargull
* Matt Cooper
* Nick White
* Nicolas Hug
* rjenc29
* Siu Kwan Lam (core dev)
* Stan Seibert (core dev)
* Stuart Archibald (core dev)
* Todd A. Anderson (core dev)
Version 0.40.1
--------------
This is a PyPI-only patch release to ensure that PyPI wheels can enable the
TBB threading backend, and to disable the OpenMP backend in the wheels.
Limitations of manylinux1 and variation in user environments can cause
segfaults when OpenMP is enabled on wheel builds. Note that this release has
no functional changes for users who obtained Numba 0.40.0 via conda.
Patches:
* PR #3338: Accidentally left Anton off contributor list for 0.40.0
* PR #3374: Disable OpenMP in wheel building
* PR #3376: Update 0.40.1 changelog and docs on OpenMP backend
Version 0.40.0
--------------
This release adds a number of major features:
* A new GPU backend: kernels for AMD GPUs can now be compiled using the ROCm
driver on Linux.
* The thread pool implementation used by Numba for automatic multithreading
is configurable to use TBB, OpenMP, or the old "workqueue" implementation.
(TBB is likely to become the preferred default in a future release.)
* New documentation on thread and fork-safety with Numba, along with overall
improvements in thread-safety.
* Experimental support for executing a block of code inside a nopython mode
function in object mode.
* Parallel loops now allow arrays as reduction variables
* CUDA improvements: FMA, faster float64 atomics on supporting hardware,
records in const memory, and improved datetime dtype support
* More NumPy functions: vander, tri, triu, tril, fill_diagonal
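The array reduction feature above can be sketched roughly as follows: an array accumulated with ``+=`` inside a ``prange`` loop is treated as a reduction variable. The function name ``column_totals`` and the data are invented for illustration, and the fallback definitions are an assumption so the snippet also runs serially without Numba.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # assumption: serial plain-Python fallback without Numba
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


@njit(parallel=True)
def column_totals(data):
    totals = np.zeros(data.shape[1])
    for i in prange(data.shape[0]):
        # The whole array is the reduction variable: each parallel
        # iteration adds one row into it.
        totals += data[i]
    return totals


data = np.arange(6.0).reshape(3, 2)  # rows: [0, 1], [2, 3], [4, 5]
totals = column_totals(data)
# totals -> [6.0, 9.0]
```

The loop body only updates ``totals`` through ``+=``, which is what allows the iterations to be combined in any order.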
General Enhancements:
* PR #3017: Add facility to support with-contexts
* PR #3033: Add support for multidimensional CFFI arrays
* PR #3122: Add inliner to object mode pipeline
* PR #3127: Support for reductions on arrays.
* PR #3145: Support for np.fill_diagonal
* PR #3151: Keep a queue of references to last N deserialized functions. Fixes #3026
* PR #3154: Support use of list() if typeable.
* PR #3166: Objmode with-block
* PR #3179: Updates for llvmlite 0.25
* PR #3181: Support function extension in alias analysis
* PR #3189: Support literal constants in typing of object methods
* PR #3190: Support passing closures as literal values in typing
* PR #3199: Support inferring stencil index as constant in simple unary expressions
* PR #3202: Threading layer backend refactor/rewrite/reinvention!
* PR #3209: Support for np.tri, np.tril and np.triu
* PR #3211: Handle unpacking in building tuple (BUILD_TUPLE_UNPACK opcode)
* PR #3212: Support for np.vander
* PR #3227: Add NumPy 1.15 support
* PR #3272: Add MemInfo_data to runtime._nrt_python.c_helpers
* PR #3273: Refactor. Removing thread-local-storage based context nesting.
* PR #3278: compiler threadsafety lockdown
* PR #3291: Add CPU count and CFS restrictions info to numba -s.
CUDA Enhancements:
* PR #3152: Use cuda driver api to get best blocksize for best occupancy
* PR #3165: Add FMA intrinsic support
* PR #3172: Use float64 add Atomics, Where Available
* PR #3186: Support Records in CUDA Const Memory
* PR #3191: CUDA: fix log size
* PR #3198: Fix GPU datetime timedelta types usage
* PR #3221: Support datetime/timedelta scalar argument to a CUDA kernel.
* PR #3259: Add DeviceNDArray.view method to reinterpret data as a different type.
* PR #3310: Fix IPC handling of sliced cuda array.
ROCm Enhancements:
* PR #3023: Support for AMDGCN/ROCm.
* PR #3108: Add ROC info to `numba -s` output.
* PR #3176: Move ROC vectorize init to npyufunc
* PR #3177: Add auto_synchronize support to ROC stream
* PR #3178: Update ROC target documentation.
* PR #3294: Add compiler lock to ROC compilation path.
* PR #3280: Add wavebits property to the HSA Agent.
* PR #3281: Fix ds_permute types and add tests
Continuous Integration / Testing:
* PR #3091: Remove old recipes, switch to test config based on env var.
* PR #3094: Add higher ULP tolerance for products in complex space.
* PR #3096: Set exit on error in incremental scripts
* PR #3109: Add skip to test needing jinja2 if no jinja2.
* PR #3125: Skip cudasim only tests
* PR #3126: add slack, drop flowdock
* PR #3147: Improve error message for arg type unsupported during typing.
* PR #3128: Fix recipe/build for jetson tx2/ARM
* PR #3167: In build script activate env before installing.
* PR #3180: Add skip to broken test.
* PR #3216: Fix libcuda.so loading in some container setup
* PR #3224: Switch to new Gitter notification webhook URL and encrypt it
* PR #3235: Add 32bit Travis CI jobs
* PR #3257: This adds scipy/ipython back into windows conda test phase.
Fixes:
* PR #3038: Fix random integer generation to match results from NumPy.
* PR #3045: Fix #3027 - Numba reassigns sys.stdout
* PR #3059: Handler for known LoweringErrors.
* PR #3060: Adjust attribute error for NumPy functions.
* PR #3067: Abort simulator threads on exception in thread block.
* PR #3079: Implement +/-(types.boolean) Fix #2624
* PR #3080: Compute np.var and np.std correctly for complex types.
* PR #3088: Fix #3066 (array.dtype.type in prange)
* PR #3089: Fix invalid ParallelAccelerator hoisting issue.
* PR #3136: Fix #3135 (lowering error)
* PR #3137: Fix for issue3103 (race condition detection)
* PR #3142: Fix Issue #3139 (parfors reuse of reduction variable across prange blocks)
* PR #3148: Remove dead array equal @infer code
* PR #3153: Fix canonicalize_array_math typing for calls with kw args
* PR #3156: Fixes issue with missing pygments in testing and adds guards.
* PR #3168: Py37 bytes output fix.
* PR #3171: Fix #3146. Fix CFUNCTYPE void* return-type handling
* PR #3193: Fix setitem/getitem resolvers
* PR #3222: Fix #3214. Mishandling of POP_BLOCK in while True loop.
* PR #3230: Fixes liveness analysis issue in looplifting
* PR #3233: Fix return type difference for 32bit ctypes.c_void_p
* PR #3234: Fix types and layout for `np.where`.
* PR #3237: Fix DeprecationWarning about imp module
* PR #3241: Fix #3225. Normalize 0-d array to scalar in typing of indexing code.
* PR #3256: Fix #3251: Move imports of ABCs to collections.abc for Python >= 3.3
* PR #3292: Fix issue3279.
* PR #3302: Fix error due to mismatching dtype
Documentation Updates:
* PR #3104: Workaround for #3098 (test_optional_unpack Heisenbug)
* PR #3132: Adds an ~5 minute guide to Numba.
* PR #3194: Fix docs RE: np.random generator fork/thread safety
* PR #3242: Page with Numba talks and tutorial links
* PR #3258: Allow users to choose the type of issue they are reporting.
* PR #3260: Fixed broken link
* PR #3266: Fix cuda pointer ownership problem with user/externally allocated pointer
* PR #3269: Tweak typography with CSS
* PR #3270: Update FAQ for functions passed as arguments
* PR #3274: Update installation instructions
* PR #3275: Note pyobject and voidptr are types in docs
* PR #3288: Do not need to call parallel optimizations "experimental" anymore
* PR #3318: Tweak spacing to avoid search box wrapping onto second line
Contributors:
* Anton Malakhov
* Alex Ford
* Anthony Bisulco
* Ehsan Totoni (core dev)
* Leonard Lausen
* Matthew Petroff