Minutes_2023_08_01
esc edited this page Aug 1, 2023 · 1 revision
Attendees: Jim Pivarski, Guilherme, Matthew Murray, Siu Kwan Lam, Da Li, Kaustubh, Graham Markall, Todd A. Anderson, Val, Ianna Osborne
FPOC (last week): Guilherme
FPOC (incoming): Graham
NOTE: All communication is subject to the Numba Code of Conduct.
Please refer to this calendar for the next meeting date.
- 0.58rc1 Release
- PRs to go:
- NumPy 1.25 support merged: https://github.com/numba/numba/pull/9105
- Some dependent follow ups
- Still aiming to tag next Monday and push out RC artifacts
- No very high risk items
- PRs that may move to the next milestone:
- Cache invalidation: https://github.com/numba/numba/pull/8396
- Support GUFunc inside jit: https://github.com/numba/numba/pull/8984
- Sync with Stuart next Monday, aiming not to delay release.
- maybe talk about LPython?
- https://lpython.org/blog/2023/07/lpython-novel-fast-retargetable-python-compiler/
- Numba + CUDA interface prototype: https://github.com/lcompilers/lpython/compare/main...gmarkall:lpython:nvvm?expand=1#diff-471d874c2f8ebdebd76e25083cc809cb229060d20d2820c6210bd808d16451eb with small Numba patch: https://github.com/numba/numba/compare/main...gmarkall:numba:lpython?expand=1
- Observations on benchmarks:
- Numba and LPython same speed on x86 for summation benchmark, but Numba slower on M1
- Some benchmarks use dicts and lists, which have handwritten LLVM code implementations that may be more optimized for specific cases
- Conclusion: there are things for Numba to learn from both of these.
- Question: does the LPython compiled code require any external dependencies / shared libs?
- Potentially rolling in some other LLVM IR to compiled code
- Jim: LPython starts from an AST, planning to investigate how this works with a decorator.
- PR 9095: Support dtype keyword in arange_parallel_impl (Todd)
- (Discussion clarifying PR)
- PR 9108: Add noalias option to jit decorator (Todd)
- All tests passing except for `test_use_of_ir_uknown_loc`.
- Question: Why is the test now searching for "Unknown location" in the output?
- Next steps: check with Stuart next week
- Calling a cfunc from C++ performs differently depending on whether it is called through the cfunc object or through cfunc.address (Da)
- avg= 7.13159 ns/call (dummy c++)
- avg= 232.824 ns/call (cfunc)
- avg= 7.27619 ns/call (casted cfunc)
- cfunc:
      auto cfunc = py::reinterpret_borrow<py::function>(py_module.attr("add_cfunc"));
- address:
      typedef int32_t (*c_func)(int32_t, int32_t);
      c_func cfunc_address = reinterpret_cast<c_func>(py_module.attr("add_cfunc").attr("address").cast<intptr_t>());
- Generated LLVM IR:
      ; Function Attrs: nofree norecurse nounwind writeonly
      define i32 @_ZN8__main__3addB2v2B44c8tJTIcFHzwl2ILiXkcBV0KBSiNiHkkANTZhEkUQPZoAEdd(double* noalias nocapture %retptr, { i8*, i32, i8* }** noalias nocapture readnone %excinfo, double %arg.x, double %arg.y) local_unnamed_addr #0 {
      entry:
        %.6 = fadd double %arg.x, %arg.y
        store double %.6, double* %retptr, align 8
        ret i32 0
      }

      ; Function Attrs: nofree norecurse nounwind writeonly
      define double @cfunc._ZN8__main__3addB2v2B44c8tJTIcFHzwl2ILiXkcBV0KBSiNiHkkANTZhEkUQPZoAEdd(double %.1, double %.2) local_unnamed_addr #0 {
      entry:
        %.4 = alloca double, align 8
        store double 0.000000e+00, double* %.4, align 8
        %.8 = call i32 @_ZN8__main__3addB2v2B44c8tJTIcFHzwl2ILiXkcBV0KBSiNiHkkANTZhEkUQPZoAEdd(double* nonnull %.4, { i8*, i32, i8* }** undef, double %.1, double %.2) #2
        %.18 = load double, double* %.4, align 8
        ret double %.18
      }
- Calling from `cfunc` calls through Python, whereas `cfunc.address` is more direct.
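The address-based path discussed above can be illustrated with a minimal, standard-library-only sketch. It does not use Numba or pybind11 and does not reproduce the timings: `add`, `ADD_PROTO`, `add_callback` and `add_from_address` are hypothetical names, and the callee here is still a Python function. It only shows the step of taking a raw integer address and casting it back to a typed function pointer, which is what the `reinterpret_cast` in the notes does with `add_cfunc.address`.

```python
import ctypes

# Hypothetical stand-in for the compiled `add_cfunc` from the notes.
def add(x, y):
    return x + y

# C signature matching the typedef in the notes:
#   int32_t (*)(int32_t, int32_t)
ADD_PROTO = ctypes.CFUNCTYPE(ctypes.c_int32, ctypes.c_int32, ctypes.c_int32)

# Wrap the Python function as a C-callable entry point.
# The reference must be kept alive for as long as the address is used.
add_callback = ADD_PROTO(add)

# Raw integer address of the entry point (assumed analogue of `.address`).
address = ctypes.cast(add_callback, ctypes.c_void_p).value

# Rebuild a typed callable from the bare integer address
# (analogue of reinterpret_cast<c_func>(address) on the C++ side).
add_from_address = ADD_PROTO(address)

result = add_from_address(2, 3)
print(result)
```

With a real compiled function behind the address, a call made this way would go straight to native code without building Python argument objects, which is consistent with the casted call timing close to the plain C++ function in the measurements above.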
- numba#9091 - Reminder to add deprecation warning for new_style error capture
- numba#9092 - Still problematic: non-deterministic NaN values in scipy.integrate.solve_ivp when compiling function with numba #8931
- numba#9097 - Dispatch error
- numba#9098 - tuple of tuple arguments not allowed in parfor loop
- numba#9102 - Towncrier rendering
- numba#9103 - Potential memory leak? (Potentially related to `Generator` type)
- numba#9104 - Ommited keyword argument can't be a literal
- numba#9107 - No implementation of function `Function(<built-in function mul>) found for signature: >>> mul(array(float32, 1d, C), array(float32, 1d, C))`
- numba#9109 - various seg faults on Debian arm64 with numba 0.57.1
- numba#9110 - ValueError: cannot compute fingerprint of empty list
- numba#9089 - Fix segfault on passing `None` for args in PythonAPI.call
- numba#9090 - Add deprecation notice for new_style error capturing.
- numba#9094 - Add support for a 'max' level to NUMBA_OPT environment variable.
- numba#9095 - Support dtype keyword in arange_parallel_impl
- numba#9099 - Make all parameter names in `@overload`s match the API being overloaded.
- numba#9100 - Add towncrier news snippets for PRs that are missing them.
- numba#9101 - Add misc script to find missing towncrier news files
- numba#9106 - CUDA: Add overloads generated by specialization to the current dispatcher.
- numba#9108 - Add noalias option to jit decorator.
- numba#9111 - Fixes ReST syntax error in PR#9099
- numba#9112 - Fixups for PR#9100
- numba#9113 - Add support for np.diagflat
- numba#9114 - update np min to 122
- numba#9093 - Updates the minimum supported NumPy to 1.22.
- numba#9096 - Debug azure-ci conda json problem
- merged - numba#9105 - NumPy 1.25 support (PR #9011) continued