Siu Kwan Lam edited this page Dec 13, 2018 · 1 revision

Numba Meeting: 2018-12-04

Attendees: Siu, Ehsan, Stuart, Stan

0. Feature Discussion

  • LLVM 7
    • OSX + Win issue with numba cache test
    • Seems to be related to low core count
  • Caching questions
    • multiprocess caching relies on atomic file move
    • known limitation: the cache expiration check doesn't check other files
    • caching a function that takes functions as arguments
      • currently function-as-argument is just syntactic sugar for using function-as-global
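
To illustrate the atomic-file-move point above, here is a minimal sketch of a cache writer that stays safe when several processes write the same entry. It is not Numba's actual caching code: the names `save_cache_entry` and `load_cache_entry` and the use of pickle are assumptions made for illustration. The idea is to write to a temporary file in the same directory and then rename it over the final path, so a reader sees either the old entry or the complete new one.

```python
import os
import pickle
import tempfile


def save_cache_entry(path, payload):
    """Hypothetical helper: write payload so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename below
    # stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(payload, f)
        # os.replace overwrites the destination in one step; on POSIX
        # the rename is atomic.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_cache_entry(path):
    """Hypothetical helper: return the cached payload, or None if absent."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return None
```

If two processes race, the last rename wins, but neither can leave a half-written entry behind.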

1. New issues

  • #3560 - Hide messages from deliberate malloc impl overflow test as they can be alarming
  • #3559 - RunTimeWarning is sometimes raised in vectorized function and should not
    • may have to do with np.isnan, or other FP flags
  • #3557 - numpy.ndarray.copy() incorrect when operating on less common array layouts
  • #3555 - "RuntimeError: missing Environment" using print in second call to cached numba function
  • #3554 - Slicing array[:0] fails with parallel=True in 0.41
    • Fixed by PR #3558
  • #3553 - CUDA device->host transfer with non-contiguous arrays overwrites adjacent elements
    • Fix by detecting situation and launching custom copy kernel
  • #3551 - Workqueue scheduler hangs on prange calling function containing prange
    • how would runtime detection work? need to generate both versions? (Probably too hard)
    • just try to inline function call inside prange body
  • #3550 - Host-to-device copy of broadcast arrays mangles data
  • #3548 - Performance regression from 0.38 to 0.41
  • #3546 - [Feature Discussion] Nested Reflected Types
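
The non-contiguous transfer issues above (#3553, #3550) share a general hazard that can be sketched in pure Python. This is not Numba's CUDA code: the function names and the flat-list model of a host buffer are assumptions made for illustration. A bulk copy that treats a strided destination as contiguous overwrites elements that are not part of the view, while a copy that detects the stride and writes element by element leaves the gaps untouched. The second path loosely mirrors the "detect the situation and launch a custom copy kernel" idea noted for #3553.

```python
def naive_copy(dst, src_values, offset=0):
    """Incorrect for strided views: treats the destination as contiguous."""
    dst[offset:offset + len(src_values)] = src_values


def copy_to_view(dst, src_values, offset=0, stride=1):
    """Hypothetical helper: pick a copy strategy based on the view's stride."""
    if stride == 1:
        # Contiguous destination: a single bulk copy is safe.
        dst[offset:offset + len(src_values)] = src_values
    else:
        # Strided destination: write only the elements the view owns.
        for i, value in enumerate(src_values):
            dst[offset + i * stride] = value


# The "view" is every second element of an 8-element buffer (like a[::2]).
host_bad = [-1] * 8
naive_copy(host_bad, [10, 20, 30, 40])

host_ok = [-1] * 8
copy_to_view(host_ok, [10, 20, 30, 40], offset=0, stride=2)
```

In `host_bad` the values land in positions 0 through 3, clobbering indices 1 and 3, which do not belong to the view; in `host_ok` they land at indices 0, 2, 4 and 6.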

Already Closed

  • #3547 - NotImplementedError: make_function - during tuple unpacking

2. Open PRs

New

  • #3561 - Fixing the boolean copy from data
  • #3558 - Fixes array analysis for inplace binary operators.
  • #3556 - [WIP] Support for np.trim_zeros
  • #3552 - Handle broadcast arrays correctly in host->device transfer.
  • #3549 - ParallelAccelerator caching improvements
    • Siu needs to add skip tests
  • #3545 - Fix @njit description in 5 min guide
    • Ready to merge
  • #3543 - Avoid implementation error being hidden by the try-except
    • Ready to merge

Already Merged

  • #3544 - Add long_running test flag and feature to exclude tests.

Old

  • #3538 - Avoid future C-level assertion error due to invalid visibility

    • Ready to merge
  • #3516 - [WIP] Typeof dtype values

    • Siu needs to re-review
  • #3519 - WIP: fix-3457 support of numpy repeat.

  • #3520 - Fix @stencil ignoring cval if out kwarg supplied.

    • Discussion ongoing
  • #3468 - Add support for np.clip and ndarray.clip.

    • depends on a fix to @overload_method for kwargs
  • #3437 - Changes to accommodate LLVM 7.0.x

  • #3450 - [WIP] generated_jit for CUDA kernels

  • #3392 - Launch and attach gdb directly from Numba.

    • Ready for review
  • #3390 - typeinfer: use unknown_loc object instead of string literal

  • #3162 - Support constant dtype string in nopython mode in functions like numpy.empty.

    • Need to resolve #3195
  • #3134 - [WIP] Cfunc x86 abi

    • Needs re-review
  • #3046 - Pairwise sum implementation.

  • #2999 - Support LowLevelCallable

  • #2942 - Fix linkage nature (declspec(dllexport)) of some test functions

  • #2894 - [WIP] Implement jitclass default constructor arguments.

  • #2817 - [WIP] Emit LLVM optimization remarks

Merged old PRs

  • #3449 - [WIP] Allow matching non-array objects in find_callname()
    • merged
  • #3414 - [WIP] Refactor Const type
    • merged
  • #3399 - Add max_registers Option to cuda.jit
    • merged
  • #3397 - Fix with-objmode warning
    • merged
  • #2950 - Fix dispatcher to only consider contiguous-ness.
    • merged
  • #3385 - conda recipe: whitelist libiomp5.dylib
    • merged
  • #3382 - CUDA_ERROR_MISALIGNED_ADDRESS Using Multiple Const Arrays
    • merged

===========================

4. Next Release: Version 0.42, RC=Dec 10, Final=Dec 17, 2018

  • LLVM 7
  • Other string features/fixes
  • Caching bugs