Minutes_2020_01_14
Attendees: Stuart, Graham, Aaron, Pearu, Todd, Val
- NBEP for GPU memory manager https://github.com/numba/numba/issues/3247#issuecomment-572948356
  - Draft is a self-contained doc.
  - Looking for feedback to see if it matches usage expectations.
  - Secondary concern is to update it with the Numba changes that would be required.
  - The interface is currently designed as register-and-opt-in; once registered, the plugin takes over memory management (see the sketch after this list).
  - Suggested other libraries that provide an allocator to test as a plugin: RAPIDS RMM; CuPy is on the TODO list.
  - Feedback from now on is welcome; PRs or issues are fine.
  - Post on Gitter and the mailing list; Numba account to tweet.
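As a point of reference for the discussion, a minimal sketch of the register-and-opt-in flow, assuming an interface along the lines of the draft NBEP; the names `BaseCUDAMemoryManager` and `set_memory_manager` are taken as assumptions from the proposal, not a finalized API:

```python
# Hedged sketch of an external memory manager plugin per the draft NBEP.
# Class/function names here are assumptions from the proposal, not final.
from numba import cuda


class ExternalMemoryManager(cuda.BaseCUDAMemoryManager):  # assumed base class
    """Delegates device allocations to an external allocator (e.g. RMM)."""

    def initialize(self):
        # One-time setup of the external allocator for the current context.
        pass

    def memalloc(self, size):
        # A real plugin would allocate `size` bytes via the external library
        # and wrap the returned device pointer for Numba; stubbed out here.
        raise NotImplementedError("sketch only")

    # ... a real plugin would implement the rest of the plugin interface ...


# Opt in: once registered, the plugin takes over device memory management.
cuda.set_memory_manager(ExternalMemoryManager)
```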
- `__cuda_array_interface__` https://github.com/numba/numba/issues/4933#issuecomment-572934894
  - Choices (more than one is possible):
    - a) safe default with implicit synchronization
    - b) unsafe default; let the user synchronize (sketched below)
    - c) a new interface to provide a) or b)
  - Graham noted that if there is e.g. asynchronous memory management on streams at some point, this will have to be addressed.
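For option b), a minimal sketch of what "user synchronizes" means in practice with today's interface (requires a CUDA device; the array and stream are illustrative):

```python
import numpy as np
from numba import cuda

# Producer side: an asynchronous copy on a non-default stream.
stream = cuda.stream()
d_arr = cuda.to_device(np.arange(10, dtype=np.float32), stream=stream)

# Option b): the export makes no implicit guarantee, so the user must
# synchronize the producing stream before handing the buffer to a consumer.
stream.synchronize()

# A consumer reads the dict and wraps the raw device pointer directly.
desc = d_arr.__cuda_array_interface__
print(desc["shape"], desc["typestr"], desc["data"])  # e.g. (10,) <f4 (ptr, False)
```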
- Lint NumPy funcs vs. Numba overloads (#2558)
  - No takers.
- #4724 parfor DCE-in-header patch: conclusion reached, write docs, merge?!
  - OK to merge.
- #4615 thread masking state
  - Ready for review.
- #4967 First-class function support (see the sketch after this item)
  - Siu left feedback.
  - Some issues around the generality of type objects resolved.
  - Should the first part land in 0.48? (Q. for Siu)
  - Docs needed for the release.
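For context on what #4967 enables, a minimal sketch of the intended usage (not taken from the PR itself): a jitted function passed as a runtime value to another jitted function.

```python
from numba import njit


@njit
def add_one(x):
    return x + 1


@njit
def apply(f, x):
    # f arrives as a first-class function value rather than a
    # compile-time constant (the capability #4967 adds).
    return f(x)


print(apply(add_one, 41))  # -> 42
```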
- Pearu: talks about whether `struct` is of interest.
- #5067 - Avoidable loop vectorization failure with float datatype (x3.5 performance penalty)
  - Maybe related to: https://github.com/numba/numba/issues/2196
  - Need to check LLVM 9.
- **** #5065 - Calling jitted function with *args in a prange fails
  - Regression; likely related to bytecode changes around tuples (a guessed repro is sketched below).
  - Todd will take a look.
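A guess at the failing pattern, reconstructed from the issue title rather than the actual reproducer in #5065: a star-args call to a jitted function inside a `prange` loop.

```python
import numpy as np
from numba import njit, prange


@njit
def inner(a, b):
    return a + b


@njit(parallel=True)
def outer(arr):
    out = np.empty_like(arr)
    for i in prange(arr.shape[0]):
        args = (arr[i], 1.0)
        out[i] = inner(*args)  # the *args call inside prange is the suspect
    return out


outer(np.ones(4))
```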
- #5064 - overload inlining fails for builtin operators
  - New failure.
- ** #5063 - Is there any interest in CFG for native assembly?
  - Can accept if it highlights/warns about indirect jumps.
  - Make sure it's OK cross-platform.
  - Take a look at radare2 (see the sketch after this item).
  - Will it work for CUDA?
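One possible way to evaluate radare2 for this: drive it from Python via the `r2pipe` bindings and emit a per-function CFG. The binary path and symbol name below are placeholders.

```python
import r2pipe

r2 = r2pipe.open("libexample.so")   # placeholder compiled artifact
r2.cmd("aaa")                       # radare2 full analysis: functions, blocks, xrefs
dot = r2.cmd("agfd @ sym.example")  # one function's CFG in Graphviz dot format
print(dot)                          # indirect jumps would need highlighting/warnings
r2.quit()
```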
- #5054 - how can I do this?
- #5052 - Factor common infrastructure between CUDA simulator and hardware target implementations
- #5051 - error
- #5050 - Can @jitclass support numpy.array([string])?
- #5045 - Looplift fail on if-branch after for-loop in py3.8
- **** #5043 - @overload cannot replace previously defined or built-in implementations
  - Need a policy (illustrated below).
  - Probably ban replacement for safety; lift this later.
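To illustrate the policy question, a hypothetical example (not from the issue): an `@overload` that targets a function Numba already implements internally. Under the proposed policy, registering such a replacement would be banned for safety.

```python
import numpy as np
from numba.extending import overload


@overload(np.sum)
def my_sum(a):
    # np.sum already has a built-in Numba implementation; the proposed
    # policy would reject this redefinition instead of silently ignoring it.
    def impl(a):
        total = 0
        for v in a.flat:
            total += v
        return total
    return impl
```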
- **** #5042 - Make DummyType type factory for testing
  - Discussion is conditional on the API gist.
- #5041 - Catch the use of global TypedList in JITed functions.
- #5066 - Improve max() translation on x86-64
- #5057 - Cuda device api: copy_to_host might not cast boolean correctly
- #5056 - ValueError: Argument types for wrap_index must match in @njit(parallel=True) when using int32 as index
- #5055 - Shared memory not support boolean type
- #5038 - Switch off report requesting in error messages for non-lowering problems
- #5068 - Remove Python 3.4 backports from utils
- #5062 - Update docs for updated version requirements
- #5061 - Prevent kernel launch with no configuration, remove autotuner
- #5060 - [WIP] enables np.sum for timedelta64
- #5059 - Docs: Explain how to use Memcheck with Numba, fixups in CUDA documentation
- #5053 - Convert from arrays to names in define() and don't invalidate for multiple consistent defines.
- #5047 - Add del's to reductions.
- #5044 - R&D thread masking
Actions:
- Share the plan for the chainsawing.
- TODOs for others, orthogonal to the chainsaw work.
- Ask Siu about the Function type PR going into 0.48.
- Stu to look at radare2 for the asm CFG.
- #5058 - Permit mixed int types in wrap_index
- #5049 - Clarify what dictionary means
- #5048 - Fix CI py38
- #5046 - Update setup.py and buildscripts for dependency requirements
- #5040 - Drop Py2.7 and Py3.5 from public CI
- #5039 - Disable help messages.
- Requests for 0.48
  - TBD