Valentin Haenel edited this page Mar 30, 2021 · 1 revision

Numba Meeting: 2021-03-30

Attendees:

NOTE: All communication is subject to the Numba Code of Conduct.

0. Feature Discussion/admin

  • 0.53.1 status
    • Need to do a GH release
    • Release notebooks are reviewed, need minor updates then can be promoted.
  • Mentorship program
    • 5 participants at present
  • Should Numba behave more like a C-extension? #6862
  • Graham proposed API for adding call site attributes: numba/llvmlite#702
  • Related to #6862: noop_ufunc(inp, out=outarray)
    • if outarray is initialized to np.arange(N)
    • noop_ufunc does not write to the output array.
    • would outarray retain the same values after the call?

1. New Issues

  • #6872 - try: if array.shape[1] - tuple index out of range. works without numba, doesn't work with @njit(fastmath=True, nogil=True, cache=True)
  • #6864 - guvectorize returns different output after upgrading to Numpy 1.20
    • TODO: Ask numpy devs
  • **** #6862 - Program crashes with ACCESS_VIOLATION. Windows only.
    • Should Numba behave more like a C-extension?
    • A Python 3.9 interpreter loaded a Python 2.7 DLL into memory. Numba, via LLVM, picked up symbols from the Python 2.7 library instead of the 3.9 one, which caused the segfault.
    • LLVM symbol lookup differs from what the OS does.
    • More general issue, could well occur with e.g. BLAS.
    • Ignore for now. Document the problem as a known limitation. TODO: Add a task issue for this.
  • #6859 - Parallel prange fails on higher number of iterations and is slower
    • TODO: need a quick doc update to mention that list.append (or any other container mutation) is not thread-safe
  • #6857 - Implement np.nan_to_num
  • #6855 - Rollaxis
  • #6853 - Error "The use of yield in a closure is unsupported"
    • Make better error msg... mention "generator"

Closed Issues

  • #6868 - CUDA: Eagerly compiling kernels with tuple parameters causes a segfault
  • #6865 - 0.53.1 Checklist
  • #6863 - Try command
  • #6860 - Doc error: old reference to hsa
  • #6858 - ImportError: DLL load failed

2. New PRs

  • #6873 - docs: Update reference to @jitclass location
  • #6871 - Force text-align:left when using Annotate
  • #6870 - Alternative to #6770 for more flexibility
  • #6869 - Implement builtin sum()
  • ** #6856 - Avoid recompiling inlined overload impl
    • need reproducer/benchmark
  • #6852 - nanvar with ddof-v1

Closed PRs

  • #6867 - Update changelog for 0.53.1
  • #6866 - Changes for 0.53.1
  • #6861 - updated reference to hsa with roc
  • #6854 - PR 6096 continued
  • #6851 - set non-reported llvm timing values to 0.0

3. Next Release: Version 0.54.0/0.37.0, RC=May 2021

4. Upcoming tasks

llvmlite call site attributes

PR: https://github.com/numba/llvmlite/pull/702

Existing: attributes on declaration arguments

Example IR:

declare noalias i32 @"my_func"(i32 zeroext %".1", i32 dereferenceable(5) dereferenceable_or_null(10) %".2", double %".3", i32* nonnull align 4 %".4")

This is generated with:

func = self.function()
func.args[0].add_attribute("zeroext")
func.args[1].attributes.dereferenceable = 5
func.args[1].attributes.dereferenceable_or_null = 10
func.args[3].attributes.align = 4
func.args[3].add_attribute("nonnull")
func.return_value.add_attribute("noalias")
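
The snippet above relies on a `self.function()` helper from the llvmlite test suite. A self-contained sketch, assuming that helper builds an `i32 (i32, i32, double, i32*)` function named `my_func` (the module name `attrs_demo` is made up for illustration), could look like this:

```python
from llvmlite import ir

int32 = ir.IntType(32)
dbl = ir.DoubleType()

# Assumed equivalent of the test helper: a declaration with four arguments.
module = ir.Module(name="attrs_demo")
fnty = ir.FunctionType(int32, (int32, int32, dbl, ir.PointerType(int32)))
func = ir.Function(module, fnty, name="my_func")

# Attributes on the formal arguments of the declaration.
func.args[0].add_attribute("zeroext")
func.args[1].attributes.dereferenceable = 5
func.args[1].attributes.dereferenceable_or_null = 10
func.args[3].attributes.align = 4
func.args[3].add_attribute("nonnull")

# Attribute on the return value, matching the "declare noalias i32" prefix.
func.return_value.add_attribute("noalias")

# Printing the module shows the generated declaration.
module_text = str(module)
print(module_text)
```

The exact spelling of the pointer type in the output may vary between llvmlite versions, so the printed text is best compared with the example IR by attribute rather than character by character.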

Proposed: attributes on call site arguments

Example IR:

call void @"fun"(i32* noalias sret %"retval", i32 42, i32* noalias %"other")

Example API:

fun = ir.Function(builder.function.module, fun_ty, 'fun')
fun.args[0].add_attribute('sret')
retval = builder.alloca(int32, name='retval')
other = builder.alloca(int32, name='other')
builder.call(
    fun,
    (retval, ir.Constant(int32, 42), other),
    arg_attrs={
        0: ir.ArgumentAttributes(('sret', 'noalias')),
        2: ir.ArgumentAttributes('noalias'),
    }
)

Alternatives?

Create an actual arguments class, similar to the formal arguments used in declarations, so that attributes can be added later. Example of declaration from test_function_attribute in llvmlite/tests/test_ir.py, line 141:

func = self.function()
func.args[0].add_attribute("zeroext")
func.args[1].attributes.dereferenceable = 5
func.args[1].attributes.dereferenceable_or_null = 10
func.args[3].attributes.align = 4
func.args[3].add_attribute("nonnull")

Thoughts

  • PR #702's proposed API works, but is asymmetric with attributes on arguments in declarations.
  • The alternative is more work.
  • Accept the PR #702 API, noting that we might want to expand on it with the alternative proposal later? (or not expand on it later if no need...)

Notes

Stuart:

  • Additive, not invasive
  • Doesn't preclude doing it the other way later
  • Better than maintaining a hack

Siu:

a = ir.ArgumentAttributes()
a.align = 4

On removal of tail:

  • Siu: tail is a bug not a feature, so OK to remove
  • Stuart: Put a notice in the release notes that it's gone.