Minutes_2020_10_13

Public Numba Meeting: 2020-10-12

Attendees: Stuart, Siu, Todd A, Alexander, Angelina, Ethan, Hameer, Ivan, Jim, Juan, Michael, Val, Graham

0. Feature Discussion

Type annotation discussion

Siu: Opening remarks on type annotations.
Siu: Background on type annotations: Numba core developers are not hugely familiar with type annotations in large projects. Most use has been in small projects, so we probably lack background information about best practice, style, usage, etc.
Siu: Some concerns around:
- the readability issues as a result of adding type annotations
- numerous ways to spell the same thing and lack of consensus
- Numba already has a type system and "type" is a hugely overloaded term :)
Siu: Floor opened, please educate us on type annotations...
Ethan: WRT readability, there's some discussion in the Discourse threads and in the discussion after the meeting announcement. With the current formatting standards on the repo, type hints are likely to impact readability, but other formatting can make this better. There are other places where methods/args etc. imply what the args are (it's known information based on internals), but for external/new contributors it's not hugely obvious. A lot of time is spent working out what is going on, and code may not be commented, which makes this harder still.
Siu: What tools etc do you use?
Ethan: VSCode
Siu: Are there any extra plugins to help make this smoother?
Ethan: Just the python one with mypy, standard set up will do auto complete etc.
Stuart: Does anyone have experience of the maintenance liability?
- CI: mypy
- Maintenance: keeping type annotation correct
Jim P: Has heard that the ability to check is less useful if only some small amount is annotated, perhaps >80% needed (this is a fabricated number for the purposes of discussion) to make it so that the code base "sorts itself out".
Todd A: Any runtime implications? (Siu: none)
Siu: If 80% needed that means that the benefits are felt really late (unless lax annotations are used). There are a few projects out there trying to add runtime checking (typeguard), is there anything we can turn on early, perhaps within testing?
Ethan: Has never used typeguard directly, but other projects have found it has the right effect. It would (potentially) impact runtime behaviour, so checking during tests might be the thing to do. There may be edge cases, and there are quite a few (Stuart ed) "strange" things type-wise. It's also possible to put checks explicitly in testing for this sort of thing. As coverage of annotations goes up, value goes up. WRT benefits earlier, the distribution of annotations is unlikely to be scattered; it will naturally be more focussed, e.g. directory by directory. This gets a higher type annotation density earlier.
Alexander: public pandas API is entirely annotated. It has helped a bit when it comes to looking at the source code esp. in conjunction with the documentation.
Ethan: WRT maintainability, is this any different to e.g. docstrings with the same information? In this case it's actually checkable to some degree.
Siu: It's not actually going to be possible to rely on mypy for "various reasons" but runtime check could help with this and is a way to ensure correctness without having to burden the reviewers with manually checking so much.
Ethan: Could integrate this with CI.
Alexander: These are two different tools, type annotation checks and runtime checks on specific types in the source are orthogonal.
Siu: Coupling type annotation with runtime checks means one uses the other and it doesn't impact source, and can be kept out of prod.
Stuart: More convinced; time will tell. Is it better to spend time on other contributions (docstrings, long documentation, etc.)? Is there a way to hide type annotations?
Ethan: Can make type stubs for annotations so they are not in the actual file, but can't see the benefit of decoupling.
Stuart:
Jim P: For cases where there are loads of really long args, it seems like a typedef (whatever that is in typing) would probably help.
Stuart: in pandas code, there's a lot of compound types.
Jim P: WRT comparison to pandas: pandas is a different kind of project; it's much more shallow in terms of call stack. There's a large number of functions that users interact with, whereas Numba has a tiny surface exposed to users (@jit), so it's a different project/problem in terms of the ratio of functions on the interface to the depth of the call stack. Is there perhaps a better project to compare to?
Ethan: RE: needing to get used to it. Has been through that, about a year ago, but thinks the learning curve is relatively short. RE compound types: they would probably be helpful, but as there's so little annotated it's not obvious what the repeated pattern is yet. For some work projects the difficulty with compound types is that if they are defined locally all the time, that's not helpful. It's been useful to create a single module with types to hold the compounds across the whole code base; this technique could be applied again here (requires more things done to spot patterns).
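
To illustrate the "single module with types" idea from Jim P's and Ethan's comments, here is a minimal sketch using `typing` aliases. The module name, the alias names (`Shape`, `ArgTypes`, `OverloadTable`) and both helper functions are hypothetical and not taken from the Numba code base; they only show how a long compound annotation could be spelled once and reused.

```python
# Hypothetical shared alias module, e.g. `_type_aliases.py` (name is illustrative).
from typing import Callable, Dict, Optional, Sequence, Tuple

# Compound types spelled once and reused across the code base.
Shape = Tuple[int, ...]
ArgTypes = Sequence[type]
OverloadTable = Dict[Tuple[type, ...], Callable[..., object]]


def lookup_overload(
    table: OverloadTable, argtypes: ArgTypes
) -> Optional[Callable[..., object]]:
    """Return the implementation registered for these argument types, if any."""
    return table.get(tuple(argtypes))


def total_size(shape: Shape) -> int:
    """Number of elements implied by a shape tuple."""
    size = 1
    for extent in shape:
        size *= extent
    return size
```

With the aliases in one place, a signature reads `table: OverloadTable` rather than repeating the nested `Dict[Tuple[...], Callable[...]]` at every call site, which is the readability concern raised earlier in the discussion.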
Stuart: Are there any "champions" for this that could help out with review, expertise etc.? This could help with maintenance costs etc.
Ethan: Yes, see https://numba.discourse.group/t/coordinating-type-annotations/219. Can this be made more efficient though?
Siu: There's a cost to reviewing this given the amount of fire fighting etc.; would be very pleased for others to take it on too.
Ethan: Beyond adding typeguard, is there an annotation PR structure that'd make it easier to review?
Siu: A lot of it is having tools to instill confidence:
Hameer: Perhaps add typeguard right now to strictly verify this.
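
The idea of coupling annotations with runtime checks during testing can be sketched with the standard library alone. The decorator below is a simplified stand-in written for illustration; it is not typeguard's API and only checks annotations that are plain classes. The names `check_annotations` and `scale` are hypothetical.

```python
import functools
import inspect
from typing import get_type_hints


def check_annotations(func):
    """Check arguments and the return value against plain-class annotations.

    Simplified stand-in for a runtime checker: generic annotations such as
    List[int], and *args/**kwargs parameters, are skipped rather than checked.
    """
    hints = get_type_hints(func)
    sig = inspect.signature(func)
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if sig.parameters[name].kind in variadic:
                continue
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name}={value!r} is not {expected.__name__}")
        result = func(*args, **kwargs)
        expected = hints.get("return")
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(f"return value {result!r} is not {expected.__name__}")
        return result

    return wrapper


# Applied only in the test suite, so production code paths are unaffected.
@check_annotations
def scale(value: float, factor: int) -> float:
    return value * factor
```

Under this sketch `scale(2.0, 3)` passes, while `scale(2.0, "3")` raises `TypeError` before the body runs. Applying such a wrapper only from the test suite matches the point made above that the checks can be kept out of production.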

Enhanced inspect_cfg

Stuart's demo, notebook to follow, perhaps.

More on extensions

Defer to next time or on discourse

1. New Issues

#6352 - CUDA: Local memory kernel launch causes excessive memory allocation
#6350 - NumbaWarning: The TBB threading layer requires TBB version 2019.5 or later i.e., TBB_INTERFACE_VERSION >= 11005. Found TBB_INTERFACE_VERSION = 9002. The TBB threading layer is disabled.
- perhaps to add new warning category so user can filter it out.
#6347 - numba causing periodic process stalls
#6345 - Python 3.9 Support
#6344 - literal_unroll doesn't seem to handle a list - yields unknown problem.
#6339 - CUDA: Performance regression launching specialized kernels
- has PR
#6336 - TypeRef should implement key
- has PR
#6332 - Numba 0.52.0RC1 / llvmlite 0.35.0RC1 checklist
#6328 - Support for Unified (Managed) Memory in the EMM Plugin interface
#6327 - Provide method to attach managed memory to a stream
#6323 - Import Error with Numba _typeconv
#6321 - Indexing within numpy array slow for high index values
#6320 - Symbol not found: _do_scheduling_unsigned
- tricky problem to deal with the symbols
#6314 - functools.reduce doesn't work with numpy functions in parallel accelerator
#6313 - [Request] Implement numpy.ufunc.METHOD
#6312 - Equality check between numpy string array and string element fails
#6311 - Support subclass from a jitclass
#6308 - pytest doctest can't discover functions with @guvectorize decorator
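
For #6350 above, the note about adding a new warning category so that users can filter it out could look roughly like the following. This is a sketch only: `NumbaWarningStandIn` stands in for Numba's base warning class so the example runs without Numba installed, and `ThreadingLayerWarning` and `probe_tbb` are hypothetical names, not existing Numba API. The version numbers are the ones quoted in the issue title.

```python
import warnings


class NumbaWarningStandIn(UserWarning):
    """Stand-in for Numba's base warning class (assumption, for illustration)."""


class ThreadingLayerWarning(NumbaWarningStandIn):
    """Hypothetical dedicated category for disabled-threading-layer messages."""


def probe_tbb(interface_version: int, required: int = 11005) -> bool:
    """Return True if TBB is usable, else warn under the dedicated category."""
    if interface_version >= required:
        return True
    warnings.warn(
        f"TBB_INTERFACE_VERSION >= {required} required, found "
        f"{interface_version}; the TBB threading layer is disabled.",
        ThreadingLayerWarning,
        stacklevel=2,
    )
    return False


# User side: silence just this category and leave other warnings visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", category=ThreadingLayerWarning)
    usable = probe_tbb(9002)  # filtered out by the dedicated category
    warnings.warn("unrelated", NumbaWarningStandIn)  # still recorded
```

Because the new category subclasses the base class, existing filters on the base class would keep working, while users who only want to hide the TBB message can target the subclass.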

Closed Issues

#6330 - CUDA Simulator not working
#6317 - ImportError: cannot import name 'numpy_support' from 'numba'
#6315 - mainline @ 159510a is failing CI
#6302 - Implement np.gradient
#6301 - Implement np.logspace and np.geomspace

2. New PRs

#6351 - Dev/6335+6278
#6349 - Refactor Dispatcher to remove unnecessary indirection
#6348 - doc: fix typo in jitclass.rst ("initilising" -> "initialising")
#6346 - DOC: add where to get dev builds from to FAQ
#6343 - CUDA: Add support for passing tuples and namedtuples to kernels
#6341 - Re roll 6279
#6340 - CUDA: Fix #6339, performance regression launching specialized kernels
#6338 - Direct user questions to Discourse instead of the Google Group.
#6337 - Implements key on types.TypeRef
#6335 - Split optimisation passes.
#6334 - [WIP] pass in the builder with get_attr
#6333 - Remove dead _Kernel.call
#6326 - set build and test dependency to llvmlite=0.35.0dev0_llvm_with_new_svml_patch
#6319 - Permit switching off boundschecking when debug is on.
#6318 - PR #5892 continued
#6307 - Remove restrictions on SVML version due to bug in LLVM SVML CC
#6305 - Fixes #5871. Segfault when calling pyapi.call() passing args as None

Closed PRs

#6342 - Fix docs on literally usage.
#6331 - Explicitly state that NUMBA_ENABLE_CUDASIM needs to be set before import
#6329 - Flake8 fix for a CUDA test
#6325 - Refactor code for atomic operations
#6324 - PR 6208 continued
#6322 - PR 6127 debug
#6316 - Make IR inliner tests not self mutating.
#6310 - Declare unofficial *BSD support
#6309 - Add suitable file search path for BSDs.
#6306 - Fix flake8 in cuda atomic test from merge.
#6304 - Support NumPy 1.19
#6303 - Merge cuda flake8 prs

3. Next Release: Version 0.52.0, RC=7th Oct, Final=RC+=~3weeks?

Requests for 0.52
- Fast(er) typed.List/typed.Dict? Doesn't have to be in 0.52, next 3-5 months is fine.
