Hard to reason about which functions are or are not part of the stable ABI #50

Open
koubaa opened this issue Jun 15, 2023 · 3 comments
koubaa commented Jun 15, 2023

Functions are added to the stable ABI over time, and just given the C definition of a module, auditing to see what function calls are stable and what aren't is not trivial.

I think the core issue is that there is only one header that all functions are supposed to be defined in, which is Python.h

I think the stable ABI could be defined in another header, PythonABI.h, where Python.h includes PythonABI.h for compatibility reasons.

Taking this further, we can have a PythonABI1.h, PythonABI2.h, PythonABI3.h, etc, and Python.h includes the latest one, while modules that want to target a specific ABI with long-term guarantees can directly pick one of these.

Now, to make sure you only use the stable ABI, or which stable ABI, you need only look at the include block at the top of the module.

I'm not sure how feasible this is, but I think it is worth doing.

Contributor

encukou commented Jun 19, 2023

Define Py_LIMITED_API to limit Python.h to only what's in the stable ABI for a given version. See the docs.
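To illustrate the mechanism, here is a minimal sketch of how a header can gate declarations on Py_LIMITED_API. It deliberately does not include Python.h: every `demo_*` / `DEMO_*` name is hypothetical and stands in for a real declaration. The version value uses the same hex layout as PY_VERSION_HEX (0x030A0000 is 3.10), and the `#if` pattern is modelled on the guards used in CPython's headers.

```c
/* Extension author's side: ask for the limited API as of Python 3.10.
   In a real module this line goes before #include <Python.h>. */
#define Py_LIMITED_API 0x030A0000

/* ---- stand-in for a header; all demo_* names are hypothetical ---- */

/* Part of every version of the limited API: always declared. */
static inline int demo_always_available(void) { return 1; }

/* Added to the limited API in 3.10: declared in a full build, or when
   the requested limited API version is at least 3.10. */
#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030A0000
#define DEMO_HAS_ADDED_IN_310 1
static inline int demo_added_in_310(void) { return 310; }
#else
#define DEMO_HAS_ADDED_IN_310 0
#endif

/* Added in 3.12: hidden here, because 3.10 was requested. */
#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030C0000
#define DEMO_HAS_ADDED_IN_312 1
static inline int demo_added_in_312(void) { return 312; }
#else
#define DEMO_HAS_ADDED_IN_312 0
#endif

/* Not in the limited API at all: hidden whenever the macro is defined. */
#ifndef Py_LIMITED_API
#define DEMO_HAS_UNSTABLE 1
static inline int demo_unstable(void) { return -1; }
#else
#define DEMO_HAS_UNSTABLE 0
#endif
```

With this setup, a call to `demo_added_in_312()` or `demo_unstable()` fails at compile time, so the single `#define` at the top of the module is what an auditor needs to look at.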

Contents of the limited API are listed in the docs, and the authoritative list is in the source, although the format is subject to change. There's also abi3info (https://pypi.org/project/abi3info/), a third-party project that exposes the info.

As for C headers, I don't think they're a good source format for information about API. You'd need several of them (as you propose), and they can only be parsed by a C compiler (unless you use some well-defined subset of C).

See #7 for a more general issue.

Author

koubaa commented Jun 19, 2023

@encukou Thanks for the note about Py_LIMITED_API. I agree it's very powerful; I had forgotten about it. I also take your point in #7 about a manifest. Maybe the set of entry points is generated from a set of ABI manifests (or, alternatively, the stable ABI functions are defined to point to namespaced symbols in libpython.so).

In the latter case, if libpython.so exports the function PyObject_New_ABI2

then if the user includes the line:

#define PY_ABI_VERSION 2

before including the C header(s),

PyObject_New is defined as an alias to PyObject_New_ABI2 rather than the exported symbol PyObject_New.
Perhaps the API used to initialize the module could also be marked with an ABI tag, so that the interpreter knows which ABI the module is based on.
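A rough sketch of that aliasing idea, in plain C so it stands alone. PY_ABI_VERSION is the hypothetical macro from this comment, and the `DemoObject_New*` functions are stand-ins for the two symbols the library would export; nothing here is existing CPython API.

```c
/* Hypothetical opt-in from the proposal; defined before the header part. */
#define PY_ABI_VERSION 2

/* ---- stand-in for the header; DemoObject_* names are hypothetical ---- */

/* Stand-ins for two exported symbols. Each returns the ABI it belongs
   to (0 = the unversioned symbol) so the selection is observable. */
static inline int DemoObject_New(void) { return 0; }
static inline int DemoObject_New_ABI2(void) { return 2; }

/* When the user opted into ABI 2, redirect the public name to the
   versioned symbol. Without the opt-in the name is left untouched. */
#if defined(PY_ABI_VERSION) && PY_ABI_VERSION == 2
#define DemoObject_New DemoObject_New_ABI2
#endif

/* ---- module code: written against the public name only ---- */
static inline int demo_module_allocate(void)
{
    return DemoObject_New(); /* expands to DemoObject_New_ABI2() here */
}
```

The module source never mentions the `_ABI2` suffix; only the `#define` at the top decides which exported symbol the compiled module ends up referencing.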

I think my point is less about headers but more about where to house an alternative ABI that the python library actually exports, and additional headers was the first thing that came to mind from the developer experience point of view.

Contributor

encukou commented Jun 20, 2023

You are explaining a solution. Could you focus more on identifying the problem that needs solving?

Your example is about a change that alters the ABI but keeps the same API (including behaviour). That's pretty rare, but what you propose is possible, and used, today. (Just recently, the implementation of Py_INCREF was changed to depend on Py_LIMITED_API, similar to what you propose.)

As for “marking” the module with an ABI tag, we already did that: PyModule_Create is a macro that calls PyModule_Create2 which checks the version. But:

  • The version number -- PYTHON_API_VERSION -- is undocumented and hasn't been updated since 2006.
  • Multi-phase init (PyModuleDef_Init) doesn't include a version check.
  • If we want to do this we probably want two versions: what the module was built for (Py_LIMITED_API) and what the module was built with (PY_VERSION_HEX) -- the latter would allow deprecating old versions in unforeseen ways. (Which we'll probably want to do at some point in the long term.)
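
For concreteness, a two-version check along the lines of the last bullet might look roughly like the sketch below. This is not what PyModule_Create2 does today; the `demo_*` type, function and status names are hypothetical, and the policy (reject modules that need a newer ABI, reject modules built with a toolchain older than some cutoff) is just one possible reading.

```c
#include <stdint.h>

/* Hypothetical tag a module would carry. Both fields use the
   PY_VERSION_HEX layout, e.g. 0x030B0000 for 3.11. */
typedef struct {
    uint32_t built_for;  /* the Py_LIMITED_API value the module targeted */
    uint32_t built_with; /* the PY_VERSION_HEX of the headers used to build */
} demo_abi_tag;

typedef enum {
    DEMO_ABI_OK,
    DEMO_ABI_NEEDS_NEWER_RUNTIME, /* module targets an ABI newer than us */
    DEMO_ABI_BUILD_TOO_OLD        /* built with headers we no longer accept */
} demo_abi_status;

/* runtime_version: version of the running interpreter.
   oldest_supported_build: cutoff below which builds are rejected, which
   is what would let old builds be deprecated later. */
static inline demo_abi_status
demo_check_abi(demo_abi_tag tag, uint32_t runtime_version,
               uint32_t oldest_supported_build)
{
    if (tag.built_for > runtime_version) {
        return DEMO_ABI_NEEDS_NEWER_RUNTIME;
    }
    if (tag.built_with < oldest_supported_build) {
        return DEMO_ABI_BUILD_TOO_OLD;
    }
    return DEMO_ABI_OK;
}
```

Keeping the two fields separate is the point: a module built for 3.8 with 3.11 headers passes a cutoff of 3.9, while the same source built with 3.8 headers would not.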

The fact that we used to do it but it kinda fell by the wayside suggests that we don't need it.
IMO, if we use something like Mark's proposal that encodes ABI details in the symbol names, plus change the name on behaviour changes (see #39 (comment)), we don't really need the check. (That is, at runtime we could rely on “symbol not found” linker errors. Versions are still needed for packaging/distribution, so e.g. pip can select a compatible wheel.)
