HPy design trade‐offs

Stepan Sindelar edited this page Nov 9, 2023 · 1 revision

Definitions:

  • call-site: a Python->C transition. A call-site lasts until the C code returns to the original Python entry point on the same stack frame. Calls can nest: when C calls into Python and Python then calls into C again, the inner call is a separate (although nested) call-site.

HPyContext

HPyContext, as it exists now, serves several purposes. They should be considered separately: another ABI/API design may merge them as HPy does, but it does not have to.

  • Function table:

    • General advantage of a function table over linking with a shared library: the shared library is not needed to load the binary, see https://github.com/namhyung/uftrace/blob/master/utils/script-python.c
    • Minor advantage: versioning feels less messy, although it can also be done when linking with a shared library
    • Advantages of function table per call-site (what HPyContext is now)
      • Possibility to restrict the API that is accessible to specific entry-points, such as tp_traverse
      • Possibility to run multiple incompatible Pythons (i.e., not subinterpreters) in one process (multiple CPython versions, GraalPy and CPython together, etc.)
      • Advanced optimizations/tracing/debugging, e.g., a specialized PyNumber_Add for a call-site that is known to add integers
      • Limitation: callbacks that happen asynchronously, independently of the call-site duration
        • I think this is an inherent limitation if we want to keep the advantages, but one can provide an API to get a "global" (or per-extension) context; as long as the user does not use it, the advantages still hold
        • This is an ill-defined concept with subinterpreters. Example: an extension sets up an asynchronous callback; what should happen if the extension is loaded in two subinterpreters and the callback fires?
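The per-call-site restriction mentioned above can be illustrated with a minimal sketch. Everything in it is hypothetical (the `CallCtx` struct, `make_full_ctx`, `make_traverse_ctx`, and the `Pair` type are invented for illustration and are not HPy's actual definitions). It only assumes that the interpreter hands a different table to a tp_traverse-like entry point than to an ordinary method.

```c
#include <stddef.h>

/* Hypothetical per-call-site context: a function table passed as an argument. */
typedef struct CallCtx {
    /* NULL means "not available at this call-site"; a real implementation
       would more likely install a stub that reports an error. */
    long (*number_add)(struct CallCtx *ctx, long a, long b);
    void (*visit)(struct CallCtx *ctx, void *field);
    int visited;  /* bookkeeping for this sketch only */
} CallCtx;

static long ctx_add(CallCtx *ctx, long a, long b) { (void)ctx; return a + b; }
static void ctx_visit(CallCtx *ctx, void *field) { if (field) ctx->visited++; }

/* Interpreter side: context for ordinary entry points exposes the full API. */
CallCtx make_full_ctx(void) {
    CallCtx c = { ctx_add, ctx_visit, 0 };
    return c;
}

/* Interpreter side: context for a traverse-like entry point exposes only visit. */
CallCtx make_traverse_ctx(void) {
    CallCtx c = { NULL, ctx_visit, 0 };
    return c;
}

/* Extension side: every entry point receives the context of its call-site. */
typedef struct Pair { void *first; void *second; } Pair;

long pair_method(CallCtx *ctx, long a, long b) {
    return ctx->number_add(ctx, a, b);
}

void pair_traverse(CallCtx *ctx, Pair *p) {
    ctx->visit(ctx, p->first);
    ctx->visit(ctx, p->second);
}
```

Because the extension only ever reaches the interpreter through the table it was given, the interpreter decides per entry point what is reachable, without any change to the extension's source.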
    • Function table per extension:
      • Can be implemented without passing the function table around as an argument. Each extension must export a function, say SetVTable, which sets a static global variable embedded in the extension.
      • Possible to have different versions of context per extension
      • Possible to "decorate" context per extension (decorations: debug mode, tracing mode, ...)
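A rough sketch of the per-extension variant, assuming a one-entry table: the extension exports `SetVTable`, stores the pointer in a static global, and its functions take no context argument. The `VTable` layout, `get_vtable`, and the tracing decoration are assumptions made up for this example. Only the name `SetVTable` comes from the notes above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function table; a real one would have many more entries. */
typedef struct VTable {
    long (*number_add)(long a, long b);
} VTable;

/* ---- extension side ---- */

/* Static global embedded in the extension. */
static const VTable *g_vtable = NULL;

/* Exported function the interpreter calls after loading the extension. */
void SetVTable(const VTable *vt) { g_vtable = vt; }

/* An extension function: no context argument is passed around. */
long ext_sum3(long a, long b, long c) {
    assert(g_vtable != NULL);
    return g_vtable->number_add(g_vtable->number_add(a, b), c);
}

/* ---- interpreter side ---- */

static long plain_add(long a, long b) { return a + b; }

/* "Decorated" variant: counts calls, then delegates to the plain one. */
static int g_trace_count = 0;
static long traced_add(long a, long b) {
    g_trace_count++;
    return plain_add(a, b);
}

static const VTable plain_vtable = { plain_add };
static const VTable traced_vtable = { traced_add };

/* The interpreter picks a table per extension, e.g. to enable tracing. */
const VTable *get_vtable(int tracing) {
    return tracing ? &traced_vtable : &plain_vtable;
}

int trace_count(void) { return g_trace_count; }
```

The interpreter would call `SetVTable(get_vtable(1))` for an extension it wants to trace and `SetVTable(get_vtable(0))` for the others; the extension code is identical in both cases.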
  • Storage for "global" objects, such as Py_True

    • Goal: avoid/abstract global state
    • This can also be implemented with function calls: (get_)Py_True().
    • Context feels more natural, because the constants in the context work like "hidden" arguments, so they are the caller's responsibility. We do not want borrowed references, so if get_Py_True() returns a fresh reference, the caller must decref/close/... it, which is cumbersome to do for every constant. The context provides a "constants-like" API while still avoiding global state.
    • This must be at least per-subinterpreter (so, implementation-wise, it can probably be thread-local)
    • Per-call-site advantages are the same as for the function table per call-site (especially: can restrict accessibility, multiple Pythons in one process)
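The difference between the two ways of reaching a constant can be sketched as follows. All names (`Obj`, `Handle`, `get_True`, `handle_close`, `ConstCtx`) are hypothetical stand-ins and not real API; the reference count is simulated with a plain integer only to make the bookkeeping visible.

```c
#include <stddef.h>

/* Hypothetical object and handle types with a simulated reference count. */
typedef struct Obj { int refcount; int value; } Obj;
typedef Obj *Handle;

static Obj true_obj = { 1, 1 };

/* Variant A: an accessor that returns a fresh reference. */
Handle get_True(void) { true_obj.refcount++; return &true_obj; }
void handle_close(Handle h) { h->refcount--; }

/* Variant B: the constant is a field of the context, owned by the caller. */
typedef struct ConstCtx { Handle h_True; } ConstCtx;

ConstCtx make_const_ctx(void) {
    ConstCtx c = { &true_obj };
    return c;
}

/* With variant A, every use of the constant needs a matching close. */
int is_true_a(Handle h) {
    Handle t = get_True();
    int result = (h == t);
    handle_close(t);  /* forgetting this leaks a reference */
    return result;
}

/* With variant B, the constant behaves like a hidden argument: nothing to close. */
int is_true_b(const ConstCtx *ctx, Handle h) {
    return h == ctx->h_True;
}

int true_refcount(void) { return true_obj.refcount; }
```

In both variants the extension never touches a global symbol for the constant, but only variant A makes the extension responsible for releasing it.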
  • Storage for any interpreter specific data

    • The point is that when the Python interpreter is called back via some API, it also receives the context, so it can use the data stored in it
    • To distinguish current thread/subinterpreter
    • To quickly retrieve something that is call-site/extension/thread/subinterpreter specific (TODO: do we have some good examples?)
    • For alternative Pythons based on other language runtimes, whose FFI already requires a context argument (JNI, NAPI for V8, ...)
      • Different runtimes will have different lifetimes for their context argument. Per call-site is the most generic.
      • JNIEnv is per thread. In code that was not called from Java, one must "attach" the thread before using the JNIEnv, and should "detach" the thread afterwards
      • TODO: research other FFIs and give more examples
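How the interpreter might recover its own data from the context when it is called back can be sketched like this. The `InterpState` struct, the `interp_data` field, and the function names are assumptions for illustration only. The extension treats the field as opaque and simply passes the context back.

```c
#include <stddef.h>

/* Hypothetical interpreter-private state (per thread, subinterpreter, ...). */
typedef struct InterpState { int id; int calls; } InterpState;

typedef struct DataCtx {
    void *interp_data;  /* opaque to the extension */
    long (*number_add)(struct DataCtx *ctx, long a, long b);
} DataCtx;

/* Interpreter side: the callee finds its state through the context,
   without consulting any global or thread-local variable. */
static long interp_add(DataCtx *ctx, long a, long b) {
    InterpState *st = (InterpState *)ctx->interp_data;
    st->calls++;
    return a + b;
}

DataCtx make_data_ctx(InterpState *st) {
    DataCtx c = { st, interp_add };
    return c;
}

/* Extension side: just forwards the context it was given. */
long ext_double(DataCtx *ctx, long x) {
    return ctx->number_add(ctx, x, x);
}
```

Two contexts built from two different `InterpState` values keep their counters separate, which is the property needed to tell threads or subinterpreters apart.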