Stan Seibert edited this page Aug 13, 2018 · 1 revision

Numba Meeting: 2018-08-09

Attendees: Siu, Stan,

1. New issues

  • #3208 - Inconsistent behavior for objects allocated in intrinsics used in two JITed functions
    • Usage error; Siu has responded
  • #3207 - Pass c_char or c_char_p to ctypes function in numba
    • c_char is straightforward and would simplify Numba internals
    • c_char_p has potential nul-termination and lifetime questions
  • #3206 - @guvectorize copy_to_host() wasting too much time
    • Stan can respond
    • The reported time is kernel execution, not copy time
  • #3205 - Performance case study
    • Likely culprit is Numba logic to handle negative indexing
    • Try different approaches to establish a best practice for signaling non-negative indices to the compiler, so that it can avoid if-statements.
    • This could be parallelized as well, possibly with some changes to ParallelAccelerator.
  • #3204 - AssertionError while looking up variables
    • Duplicate of loop-lifting bug reported periodically
    • Stuart has figured out the problem and has most of a fix
  • #3203 - numba.cuda.detect() feature request
    • numba -s diagnostics would be helpful
  • #3201 - Instantiate default arguments for extension functions early in the pipeline
    • Useful, how to implement?
    • Normalization pass to populate these defaults?
  • #3200 - NVIDIA cub support
    • Could be big request
    • Needs thinking and response
  • #3196 - datetime64, timedelta64 support on GPU
    • See PR
  • #3195 - Problems with support_literals and the Const type
    • This issue is notes from discussion last meeting
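
The nul-termination and lifetime questions noted for #3207 can be seen with the standard-library ctypes module alone. The sketch below does not involve Numba, and the helper names `describe_c_char` and `read_through_pointer` are hypothetical, chosen only for illustration. It shows that `c_char` is a plain one-byte value, while a `c_char_p` read stops at the first NUL and only borrows memory owned by another object.

```python
import ctypes


def describe_c_char():
    """Return (size in bytes, value) for a single c_char."""
    ch = ctypes.c_char(b"A")
    return ctypes.sizeof(ch), ch.value


def read_through_pointer(payload):
    """Copy payload into a buffer and read it back through a c_char_p."""
    # The buffer owns len(payload) + 1 bytes (a trailing NUL is appended).
    buf = ctypes.create_string_buffer(payload)
    # The pointer borrows buf's memory; it is valid only while buf is alive.
    ptr = ctypes.cast(buf, ctypes.c_char_p)
    # buf.raw returns every byte; ptr.value stops at the first NUL.
    return len(buf), buf.raw, ptr.value


if __name__ == "__main__":
    print(describe_c_char())
    print(read_through_pointer(b"ab\x00cd"))
```

Any compiled code that accepts a `c_char_p` would presumably have to decide who keeps the underlying buffer alive and whether embedded NUL bytes are allowed, which is why `c_char` looks like the simpler case.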

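The negative-indexing cost suspected in #3205 can be sketched in plain Python. This is an illustration of the idea only, not Numba's implementation: `wrap_index`, `sum_window_signed`, and `sum_window_nonnegative` are hypothetical names. The first loop pays a comparison on every access because each index might be negative; the second validates the range once up front, so the loop body needs no branch.

```python
def wrap_index(idx, size):
    """Normalize a possibly negative index, as Python indexing semantics require."""
    if idx < 0:  # this comparison is the per-access branch under discussion
        idx += size
    return idx


def sum_window_signed(values, start, width):
    """Sum width items from start; every access runs the wraparound check."""
    size = len(values)
    total = 0
    for i in range(start, start + width):
        total += values[wrap_index(i, size)]
    return total


def sum_window_nonnegative(values, start, width):
    """Sum width items from start; one range check replaces the per-access branch."""
    if start < 0 or start + width > len(values):
        raise IndexError("window must lie within [0, len(values)]")
    total = 0
    for i in range(start, start + width):
        total += values[i]  # indices are known to be non-negative here
    return total


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(sum_window_signed(data, -2, 2))
    print(sum_window_nonnegative(data, 1, 3))
```

Whether a given compiler can actually drop the branch depends on how the non-negativity is communicated to it; the notes leave that best practice open.
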
2. Open PRs

New

  • 3202 - [WIP] TBB + backend refactor... DO NOT MERGE!
    • Going to replace the current thread backend due to various hard-to-fix issues in the implementation
    • two backends: TBB (one known issue), OpenMP
    • Comparing both to each other
    • Will check platform compatibility
    • Nested parallelism?
  • 3199 Support inferring stencil index as constant in simple unary expressions
    • Ready to merge
    • Stuart will give quick look and merge
  • 3198 Fix GPU datetime timedelta types usage
    • Ready to merge

Old

  • 3186 Support Records in CUDA Const Memory
    • needs Siu to check something
  • 3181 Support function extension in alias analysis
    • Todd to review
  • 3172 Use float64 add Atomics, Where Available
    • needs review by Siu
    • Does this affect distributed usage when the cluster is not homogeneous?
  • 3171 Fix #3146. Fix CFUNCTYPE void* return-type handling
    • Stuart will review
  • 3166 [WIP] Objmode with-block
    • Still in progress
  • 3162 Support constant dtype string in nopython mode in functions like numpy.empty.
    • Blocked until we figure out 3195 or limit the scope.
  • 3160 First attempt at parallel diagnostics
    • Looking for feedback on utility of output, impl is a bit hacky
  • 3153 Fix canonicalize_array_math typing for calls with kw args
    • Siu will review
  • 3145 support for np.fill_diagonal
    • Stuart will review and decide how much strange NumPy behavior to replicate
  • 3142 Issue3139
    • Stuart will run through build farm
  • 3137 Fix for issue3103
    • Stuart will run through build farm
  • 3134 [WIP] Cfunc x86 abi
    • Needs re-review
  • 3128 WIP: Fix recipe for jetson tx2/ARM
    • Will merge when ready
  • 3127 Support for reductions on arrays.
    • Stuart will run through build farm
  • 3124 Fix 3119, raise for 0d arrays in reductions
    • Stuart needs to fix.
  • 3093 [WIP] Singledispatch overload support for cuda array interface.
    • Needs review
  • 3046 Pairwise sum implementation.
  • 3017 Add facility to support with-contexts
  • #2999 Support LowLevelCallable
  • #2983 [WIP] invert mapping b/w binop operators and the operator module
  • #2950 Fix dispatcher to only consider contiguous-ness.
  • #2942 Fix linkage nature (declspec(dllexport)) of some test functions
  • #2894 [WIP] Implement jitclass default constructor arguments.
  • #2817 [WIP] Emit LLVM optimization remarks

===========================

3. Feature Discussion

  • Have buildbot talk to Slack?
  • Ask users to disable gitter.im pushes from Travis CI on private forks?
    • Stan will fix

4. Next Release: Version 0.40, RC=Sept 3, 2018, Final=Sept 10, 2018

  • Experimental python mode blocks
  • Refactored threadpool interface
  • AMD GPU backend
  • Parallel diagnostics
  • Usual collection of bug fixes