
Backwards incompatible ideas for a major release

Sebastian Berg edited this page Nov 16, 2020 · 26 revisions

This is a collection of ideas for changes that would break backwards compatibility and be inappropriate for anything but a major release. If/when we do make a major release, we can then go through this list and see what can be taken along.

Not every idea listed here strictly requires a major release.

Python APIs

  • Make the default integer type int64, at least on Python 3, no matter what long is on the system (from @seberg). (Making the default integer equal to intp may be the simpler option, since it is at least predictable and easier to reason about.)
  • Overhaul the casting rules to avoid things like uint64 + int64 -> float64. Perhaps use "C-like" casting instead. See https://github.com/numpy/numpy/issues/12525.
  • Casting rules for arithmetic are value-dependent for scalars (https://github.com/numpy/numpy/issues/6240)
  • result_type(int, str) should be object, not str. (seberg: Or shouldn't it simply raise an error, i.e. never implicitly go to object? – In which case, is simple deprecation viable?)
  • Require explicitly writing dtype=object to get an object dtype array.
  • Find a solution to the issues created by PyArray_Return: most numpy functions, importantly ufuncs, convert 0-D array results to scalars when returning. The fix could be a breaking change that always returns arrays, or a more complex solution. Possible steps forward that do not require breakage (immediately) are discussed in https://github.com/numpy/numpy/issues/13105.
  • Make the ufunc out argument force a higher precision loop (maybe possible without a major version increase?). https://mail.python.org/pipermail/numpy-discussion/2019-September/080106.html
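
The promotion, object-inference, and 0-D return behaviours listed above can be observed directly. The following is a minimal sketch, not an official reproduction from the linked issues; the variable names are illustrative, and the default-integer lines are printed without an expected value because the result depends on the platform and the numpy version.

```python
import numpy as np

# uint64 + int64: no integer dtype covers both ranges, so the result is float64.
mixed = np.result_type(np.uint64, np.int64)
print(mixed)  # float64

# The float64 result silently loses integer precision for large values.
a = np.array([2**63 + 1], dtype=np.uint64)
b = np.array([0], dtype=np.int64)
total = a + b
print(total.dtype, int(total[0]) == 2**63 + 1)  # float64 False

# Default integer: platform and version dependent, so only compare and print.
default_int = np.array([1, 2, 3]).dtype
print(default_int, default_int == np.dtype(np.intp))

# Object dtype is inferred without being requested explicitly.
inferred = np.array([1, None])
print(inferred.dtype)  # object

# PyArray_Return: a ufunc applied to 0-D arrays hands back a scalar,
# not a 0-D ndarray.
zero_d = np.array(1.5)
returned = np.add(zero_d, zero_d)
print(isinstance(returned, np.ndarray), isinstance(returned, np.generic))  # False True
```

Passing dtype=object explicitly, as in np.array([1, None], dtype=object), yields the same array as the inferred case; the idea in the list is only that the explicit form would become mandatory.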

C APIs

  • Extend the ndarray struct in order to speed up and clean up buffer handling.
  • Delete the sigint header and related functions (technically an ABI and API break, but a loud one, and probably nobody will notice).
  • Remove NPY_CHAR (see https://github.com/numpy/numpy/issues/2801 and linked PRs/issues)
  • Dtype cleanup ideas (see https://github.com/numpy/numpy/issues/2899)
  • Make the PyArray_Descr and PyUfunc_Object structs opaque like we did with PyArray_Object, extracting PyArray_Descr_Fields etc. This allows us to make API changes more easily later.
  • Modify NPY_SORTKIND to allow different sorting algorithms (timsort, radixsort). This requires a change in the size of PyArray_ArrFuncs. See https://github.com/numpy/numpy/pull/12586
  • Increase NPY_MAXARGS to more than 32, see https://github.com/numpy/numpy/issues/4398.
  • Implement radixsort once the sort ABI can be changed, see https://github.com/numpy/numpy/pull/12586
  • Remove the promise to handle NULL as Py_None in object arrays (we do not use this, and it crashes hard, so we could probably do it without a major release as well). (Sebastian: The exception would be "uninitialized data", i.e. cleared data. That is, keep supporting writes to buffers that are NULL'ed; this is easier if DECREF is not used unless the writer knows the buffer/data is not freshly initialized. Any function that reads could assume that NULL is not possible.)

"Recompile the world release" (Break C-ABI)

  • The elsize slot in PyArray_Descr should be npy_intp or ssize_t, not a plain int.