esc edited this page Aug 9, 2022 · 1 revision

Numba Meeting: 2022-08-09

Attendees: brandon willard, Graham Markall, Kaustubh Chaudhari, LI Da, Luk, Shannon Quinn, Todd A. Anderson, stuart, Siu Kwan Lam, Val, Jim Pivarski

NOTE: All communication is subject to the Numba Code of Conduct.

Please refer to this calendar for the next meeting date.

0. Discussion

  • Dropping support for old Python versions
  • Dropping support for win-32 and linux-32
  • Dropping support for old NumPy versions: https://github.com/numba/numba/pull/8283#issuecomment-1197505150
    • Support only NumPy 1.19 and upwards?
    • Need to check on the platform support
    • Will this have any implications?
    • Consensus (runtime): Numba 0.57 will support only NumPy 1.19 and above
    • Consensus (build time): probably OK to build against NumPy 1.19
  • Issue #8309 / PR #8310 - CUDA: Atomic addition on complex components does not work in 0.56:
    • This is a regression due to changes in PR #7999, which made array.real and array.imag overloads
    • This is a problem because:
      • The overloads return arrays (not permitted in CUDA device functions)
      • Overload can't be resolved from the low-level API (used to implement CUDA atomics)
    • One fix (in PR #8310) is to revert this change and use the low-level API for .real and .imag again
      • Pros: it works, and is a relatively simple fix
      • Cons: it is a step backwards in the core implementation
    • Another fix is to make overloads returning arrays work on CUDA
      • Pros: Retains forward direction in the core, a general CUDA improvement
      • Cons: Lots of work - needs NRT, change to call convention, and rewriting of many CUDA implementations as overloads
      • Siu tried doing this but got stuck on (1) target_extension problems (active context not CUDA) and (2) _generate_real_imag_attr.<locals>.intrin_typing.<locals>.codegen (an @intrinsic) not found in lowering registry.
    • Does a third way exist?
    • Resolution: Merge #8310 after review for 0.56.1.
      • Afterwards, look into the issues preventing it from working with overloads
  • PR #8294: CUDA: Add trig ufunc support
    • Works for the CUDA target, but adding simulator support is difficult.
    • Does it need simulator support?
      • Option 1: use __array_ufunc__ to fix implementation.
      • Option 2: Try making the "Fake within CUDA kernel array" an actual ndarray subclass.
      • Option 3: If neither of the above approaches succeeds, discuss again next week whether simulator support is required.
  • Compilation speed benchmark - Luk
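
As a rough illustration of Option 1 for PR #8294 above, the sketch below shows how NumPy's `__array_ufunc__` protocol lets a non-ndarray wrapper take part in ufunc calls such as `np.sin`. The class name `FakeDeviceArray` and its `_ary` attribute are hypothetical stand-ins, not the simulator's actual types, and the real fix may look quite different.

```python
import numpy as np


class FakeDeviceArray:
    """Hypothetical stand-in for an in-kernel array wrapper (assumption)."""

    def __init__(self, ary):
        self._ary = np.asarray(ary)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap any wrapper inputs to plain ndarrays.
        unwrapped = tuple(
            x._ary if isinstance(x, FakeDeviceArray) else x for x in inputs
        )
        # NumPy passes `out` as a tuple here; unwrap those targets too so
        # results are written into the wrapped storage.
        if "out" in kwargs:
            kwargs["out"] = tuple(
                o._ary if isinstance(o, FakeDeviceArray) else o
                for o in kwargs["out"]
            )
        result = getattr(ufunc, method)(*unwrapped, **kwargs)
        # Re-wrap array results so callers keep seeing the wrapper type.
        if isinstance(result, np.ndarray):
            return FakeDeviceArray(result)
        return result


angles = FakeDeviceArray([0.0, np.pi / 2])
sines = np.sin(angles)            # dispatched through __array_ufunc__

cosines = FakeDeviceArray(np.empty(2))
np.cos(angles, out=cosines)       # writes into the wrapper's storage
```

The wrapper does not need to subclass `ndarray` for this dispatch to happen, which is the main difference from Option 2.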

1. New Issues

  • #8303 - Allow multiple outputs for guvectorize on CUDA target
  • #8304 - Python 3.11
  • #8305 - Len of two concatenated Bytes objects is 0
  • #8307 - Ambiguous overloads are allowed for calls to jitted functions inside other jitted functions
    • Needs poking, else it is operating in "surprise mode".
  • #8309 - Numba 0.56 does not support atomic.add on arrays of complexs anymore
  • #8311 - presence of while loop breaks literal_unroll compilation
  • #8314 - Numba not vectorizing 2d copy loop
  • #8317 - LoweringError
  • #8322 - Obsolete pycc script/code
    • Answer is yes: please remove.
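
For context on #8309, the host-side NumPy sketch below shows what `.real` and `.imag` give back for a complex array: strided float views that share memory with the original. This only illustrates the "returns an array" behaviour noted in the discussion; it uses plain NumPy on the CPU, not Numba's CUDA implementation, and the variable names are illustrative.

```python
import numpy as np

values = np.zeros(4, dtype=np.complex128)

# Each attribute is a float64 array viewing every other 8-byte slot
# of the 16-byte complex elements.
real_view = values.real
imag_view = values.imag

shares_real = np.shares_memory(real_view, values)
shares_imag = np.shares_memory(imag_view, values)

# Updating a component through its view updates the complex element.
real_view[0] += 1.5
imag_view[0] += 2.5
print(values[0])
```

Because each component is addressed through an array view, an atomic update on `arr.real[i]` depends on how that view is produced, which is where the overload-based implementation and the low-level API differ.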

Closed Issues

  • #8312 - numba's np.full_like is inconsistent with numpy's
  • #8313 - Segfault in nested loop with aliased array

2. New PRs

  • #8306 - Fix len of two concatenated Bytes objects bug
  • #8308 - CUDA: Support for multiple signatures
  • #8310 - CUDA: Fix Issue #8309 - atomics don't work on complex components
  • #8315 - Add get_local_mem_per_thread method to Dispatcher
  • #8316 - Fix error handling in float unboxing
  • #8318 - Cleanup old support for Python 3.6 and earlier code
  • #8319 - Bump minimum supported Python version to 3.8
  • #8320 - Add name support for GUFuncs
  • #8321 - Fix literal_unroll pass erroneously exiting on non-conformant loop.
  • #8323 - Remove mk_unique_var from array_analysis.py
  • #8324 - Remove use of mk_unique_var in untyped_passes.py
  • #8325 - Remove use of mk_unique_var in stencil.py
  • #8326 - Remove mk_unique_var from parfor_lowering.py
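
PR #8319 raises the minimum Python version to 3.8, and the discussion settled on NumPy 1.19 as the runtime floor for 0.57. A minimal sketch of a version gate for such floors follows; the names `MIN_PYTHON`, `MIN_NUMPY`, `parse_version` and `is_supported` are hypothetical and this is not Numba's actual check.

```python
import re

# Floors taken from the minutes; the constant names are assumptions.
MIN_PYTHON = (3, 8)
MIN_NUMPY = (1, 19)


def parse_version(text):
    """Return the leading (major, minor) of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version_text, minimum):
    """True if the version is at or above the (major, minor) minimum."""
    return parse_version(version_text) >= minimum


for candidate in ("1.9.3", "1.18.5", "1.19.0", "1.23.1"):
    print(candidate, is_supported(candidate, MIN_NUMPY))
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9" sorts after "1.19", which would wrongly accept an older release.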

Closed PRs

3. Next Release: Version 0.57.0/0.40.0, RC Jan 2023
