dev call 20220203

Florian Angerer edited this page Feb 4, 2022 · 1 revision

Dev Call: 3 February 2022

Present

Antonio, Ronan, Matti, Tim, Florian, Stepan

Cython progress

  • Upstream PR is soon to come.
  • Current approach: Florian introduced a backend abstraction that is used to emit API-specific calls. There are three backends: (1) CApiBackend, (2) HPyBackend, and (3) CombinedBackend. The CApiBackend and HPyBackend emit code that uses the respective API directly, e.g. calls to PyDict_GetItem (C API) or HPy_GetItem (HPy). The CombinedBackend uses metaprogramming: in this case it would emit a new macro, __Pyx_PyDict_GetItem, that is defined differently depending on the compilation environment. In this way, the CombinedBackend produces code that is suitable for both the C API and HPy.

  • Fixing the last bugs to be able to run the tests hpy_basic and builtin_abs in all possible combinations, i.e., (compiling for C API, HPy) x (CApiBackend, HPyBackend, CombinedBackend).
  • Many HPy API functions are still missing; Florian implemented workarounds for them. He will open issues/PRs to discuss/add them.
  • There are many open issues concerning how to restructure Cython that need to be discussed on the PR (e.g. usage of Py_INCREF to make a reference owned).
  • There is a new method flag in Python 3.9 (METH_METHOD; https://www.python.org/dev/peps/pep-0573) that might help.
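
The three-backend idea above can be sketched in a few lines of Python. This is only an illustration of the approach, not Florian's actual Cython code: the method names (dict_get_item, preamble), the guard macro EXAMPLE_USE_HPY, and the variable name ctx are assumptions made for this sketch. Only the backend class names and the three API/macro names come from the notes.

```python
from itertools import product


class CApiBackend:
    """Emits C code that calls the CPython C API directly."""
    name = "CApiBackend"

    def preamble(self):
        return ""  # nothing extra needed

    def dict_get_item(self, obj, key):
        return f"PyDict_GetItem({obj}, {key})"


class HPyBackend:
    """Emits C code that calls HPy directly."""
    name = "HPyBackend"

    def preamble(self):
        return ""

    def dict_get_item(self, obj, key):
        # 'ctx' is assumed to be the HPyContext variable in scope.
        return f"HPy_GetItem(ctx, {obj}, {key})"


class CombinedBackend:
    """Emits a macro call; the macro is resolved at C compile time."""
    name = "CombinedBackend"
    GUARD = "EXAMPLE_USE_HPY"  # hypothetical guard macro, not from the notes

    def preamble(self):
        # Reference-ownership differences between the two calls are
        # ignored here; they are one of the open issues in the notes.
        return "\n".join([
            f"#ifdef {self.GUARD}",
            "  #define __Pyx_PyDict_GetItem(o, k) HPy_GetItem(ctx, o, k)",
            "#else",
            "  #define __Pyx_PyDict_GetItem(o, k) PyDict_GetItem(o, k)",
            "#endif",
        ])

    def dict_get_item(self, obj, key):
        return f"__Pyx_PyDict_GetItem({obj}, {key})"


TARGETS = ("C API", "HPy")
BACKENDS = (CApiBackend(), HPyBackend(), CombinedBackend())


def build_matrix():
    """Enumerate (compile target, backend) pairs as listed in the notes."""
    return [(target, backend.name) for target, backend in product(TARGETS, BACKENDS)]


if __name__ == "__main__":
    for backend in BACKENDS:
        print(backend.name, "->", backend.dict_get_item("d", "k"))
    print(len(build_matrix()), "combinations")
```

The point of the combined variant is that the generated C file is the same for both targets; the choice moves from Cython's code generation time to the C preprocessor.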

Binary stability

  • HPy arg clinic (which might be introduced in the future) could have an impact on binary compatibility.
  • We should probably have just one context function per purpose and add wrappers that delegate to the context function.
  • What should be the general strategy for adding ctx functions? Antonio's view: decide case by case, but we should not always go for the most generic API, in particular if performance suffers.
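
The "one context function plus wrappers" idea can be modelled as follows. This is a rough Python stand-in for what would be C in practice; the class Context and the names ctx_get_item, get_item, get_item_i and get_item_s are hypothetical and only illustrate the delegation pattern.

```python
class Context:
    """Stand-in for a context: exactly one function per purpose."""

    def __init__(self):
        self.calls = []  # records which context slots were used

    def ctx_get_item(self, obj, key):
        self.calls.append("ctx_get_item")
        return obj[key]


# Convenience wrappers live outside the context, so they do not add
# entries to the binary interface; they only delegate.
def get_item(ctx, obj, key):
    return ctx.ctx_get_item(obj, key)


def get_item_i(ctx, obj, index: int):
    """Integer-index convenience wrapper."""
    return ctx.ctx_get_item(obj, index)


def get_item_s(ctx, obj, key: str):
    """String-key convenience wrapper."""
    return ctx.ctx_get_item(obj, key)


if __name__ == "__main__":
    ctx = Context()
    print(get_item_i(ctx, [10, 20, 30], 1))
    print(get_item_s(ctx, {"a": 1}, "a"))
    print(ctx.calls)
```

Under this design the context only grows when a wrapper cannot be expressed efficiently through the generic function, which matches the case-by-case strategy: a specialised context function would be added only where the generic path costs too much.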

General Discussions

  • Guido van Rossum stated that HPy is not moving very fast; we should probably communicate better and do more blog posts. Matplotlib could be our next demo/blog post.
  • Maybe we should target Google Summer of Code with some HPy topics. That would need a sponsoring org (PSF, Numfocus, ...).
  • Fetching HPyContext out of nowhere: we could add some API that allows you to get a global context, but if you use it, it will be the only context. This would enable an easier migration path but disallow some core features like sub-interpreters or debug mode.
  • Antonio suggests having a compile-time flag for that. We want two modes anyway: legacy and pure. So we could decide (at compile time) that you need a get-context function. A problem might be that a single C extension would then make that decision for the whole process. But does it matter?
  • Having a context in tp_traverse is still a problem for PyPy (e.g. it will crash as soon as someone tries to allocate memory) because it is called from the GC. One possibility would be to provide a restricted context that only allows read-only access to objects.
  • Numpy has a restricted API (Array API) without dtypes that allow objects in your arrays. Matti thinks that object and structured dtypes are usually used by mistake. Array API reference: https://data-apis.org/array-api/latest/ Interesting blog post: https://labs.quansight.org/blog/2021/11/pydata-extensibility-vision/
  • Antonio, Matti, Ronan agree that PyPy cannot support calling tp_traverse on the main thread (as Graalpython would do). The GC must do it.
  • OTOH: the top 100 packages don't really need the context in tp_traverse. So the way forward is maybe to "fix" Numpy. This would need (as discussed before) some information about the layout of the data (basically, where the objects are in raw memory).
  • Stepan suggests that if get-context only works in CPython ABI mode but fails in universal mode (at runtime), we could support Numpy with restrictions. This is basically the same as Antonio said (it's a compile-time decision). We would need to fail at array-creation time because we cannot raise in tp_traverse.
  • Numpy doesn't support obj-/structured arrays when using the new Array API.
  • Summary: Full support in CPython ABI mode; no obj-/struct arrays (like with Array API) in universal mode. For that we need to add a flag to know if the array contains objects.
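
The summary can be illustrated with a small Python model of the proposed check. It is a sketch under assumptions, not Numpy or HPy code: AbiMode, DType, Array, create_array and the contains_objects flag are hypothetical names. It shows the two points from the discussion: the restriction is enforced at array-creation time (since tp_traverse cannot raise), and the array carries a flag saying whether it holds objects.

```python
from dataclasses import dataclass
from enum import Enum


class AbiMode(Enum):
    CPYTHON = "cpython"      # full support, including object arrays
    UNIVERSAL = "universal"  # no obj-/struct arrays, like the Array API


@dataclass(frozen=True)
class DType:
    name: str
    has_object: bool = False  # True for object/structured-with-object dtypes


@dataclass
class Array:
    dtype: DType
    length: int
    contains_objects: bool  # the flag the summary asks for


def create_array(dtype: DType, length: int, mode: AbiMode) -> Array:
    """Create an array, rejecting object dtypes in universal mode.

    The check happens here because a traverse function called by the
    GC has no way to report an error.
    """
    if mode is AbiMode.UNIVERSAL and dtype.has_object:
        raise TypeError(
            f"dtype {dtype.name!r} holds objects; not supported in universal mode"
        )
    return Array(dtype=dtype, length=length, contains_objects=dtype.has_object)


if __name__ == "__main__":
    print(create_array(DType("float64"), 4, AbiMode.UNIVERSAL))
    print(create_array(DType("object", has_object=True), 4, AbiMode.CPYTHON))
    try:
        create_array(DType("object", has_object=True), 4, AbiMode.UNIVERSAL)
    except TypeError as exc:
        print("rejected:", exc)
```

With the flag stored on the array, a traverse implementation could skip arrays that contain no objects without needing a context at all.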