Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tp_bases of object is NULL: undocumented or unintentional behavior change in 3.12? #105020

Closed
rwgk opened this issue May 27, 2023 · 18 comments
Closed
Assignees
Labels
3.12 bugs and security fixes docs Documentation in the Doc dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@rwgk
Copy link

rwgk commented May 27, 2023

Documentation

See #103912 (comment) for full context.

Specific problem:

It appears tp_bases of object was an empty tuple in the past, but is now NULL, could that be? Is that intentional?

If intentional, could it please be documented? I think the behavior change is likely to trip up fair a number of people.

I looked in https://github.com/python/cpython/blob/edd0cb8e77d7b65f5a9c2c69dc81f9c4514878d5/Doc/whatsnew/3.12.rst but don't see tp_bases being mentioned there.

I also verified that the tp_bases == nullptr patch is still needed with current 3.12 HEAD (edd0cb8).

Linked PRs

@rwgk rwgk added the docs Documentation in the Doc dir label May 27, 2023
@sunmy2019
Copy link
Member

sunmy2019 commented May 27, 2023

NULL means it is stored in the interpreter. To get the correct value, you will need an internal function

Objects/typeobject.c: lookup_tp_bases

or

Objects/typeobject.c: _PyType_GetBases

These changes and APIs usages are private and undocumented.

@ronaldoussoren ronaldoussoren added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.12 bugs and security fixes labels May 28, 2023
@ronaldoussoren
Copy link
Contributor

The same appears to be true for tp_dict of static builtin classes, and that change is also not documented.

For both changes there's no clean alternative other than using the private API (_PyType_GetDict for tp_dict and _PyType_GetBases for tp_bases). One could use PyAttr_GetAtring(type, "__dict__") to access tp_dict instead, but that's not necessarily equivalent when subtypes are involved.

I'm running into this issue while porting PyObjC to 3.12.

@ronaldoussoren
Copy link
Contributor

This is related to #94673

@ericsnowcurrently, do you recall why there are no public accessors to replace directly accessing tp_bases and tp_dict?

ronaldoussoren added a commit to ronaldoussoren/pyobjc that referenced this issue May 28, 2023
In particular, access tp_dict directly is unsafe in 3.12,
use a helper for that (the helper function is a private
API in 3.12 beta 1 hence the need for a helper function in
pyobjc-compat)

See also python/cpython#105020
@ronaldoussoren
Copy link
Contributor

PyObjC 9.2 will use _PyType_GetDict with Python 3.12 to get it to work without invasive changes. It would be nice if that API could be moved to the unstable or even stable tier as there's currently no way to get the same functionality with a public API. The same is true for _PyType_GetBases (use case for the OP).

In particular, the only public way I've found to get this information is to use PyObject_GetAttr for the corresponding dunder name. That doesn't fit my use-case for two reasons:

  1. Semantically this is not the same because types can override __getattribute__ and could return different information.

    In PyObjC the PyObject_GetAttr path could lead to unintentional recursion because I have a custom __dict__ descriptor on some types to lazily initialise those types.

  2. The PyObject_GetAttr alternative results in differences in refcount (its result is a new reference, while accessing the corresponding tp_ slot is a borrowed reference). That leeds to fairly large changes to correctly handle refcounts in all paths through the code (some of which is from the Python 2.1 era).

@rwgk, sorry about hijacking your issue, but the problem we're running into is basically the same with to different slots on PyTypeObject and should have a similar solution.

@rwgk
Copy link
Author

rwgk commented May 29, 2023

sorry about hijacking your issue

No worries about that, on the contrary, I feel like I should edit the title. Something like:

Urgently needed: Guidance for binding tool maintainers: How to deal with the consequences of #103912

Based on the comments I saw thus far, I'm sorry to say, it seems like this wasn't actually planned out.

@gpshead
Copy link
Member

gpshead commented May 30, 2023

@ericsnowcurrently

@gpshead gpshead added the type-bug An unexpected behavior, bug, or error label May 30, 2023
@ericsnowcurrently ericsnowcurrently self-assigned this May 30, 2023
@ericsnowcurrently
Copy link
Member

I'm working on a fix.

@ericsnowcurrently
Copy link
Member

do you recall why there are no public accessors to replace directly accessing tp_bases and tp_dict?

I don't, though it seem very sensible for us to add getter functions to the public API.

@ericsnowcurrently
Copy link
Member

gh-105115 restores tp_bases (and tp_mro) back to where they were pre- gh-103912, by making them immortal for static builtin types. That should resolve the reported problems.

FYI, I totally missed that the two fields were part of the documented public API for PyTypeObject. Breaking compatibility is on me. Sorry about that.

@rwgk
Copy link
Author

rwgk commented May 30, 2023

Awesome, thank you @ericsnowcurrently!

ericsnowcurrently added a commit that referenced this issue May 31, 2023
…tic Builtin Types (gh-105115)

In gh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry.  However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses).  We address that here by reverting back to shared objects, making them immortal in the process.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 31, 2023
…ll Static Builtin Types (pythongh-105115)

In pythongh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry.  However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses).  We address that here by reverting back to shared objects, making them immortal in the process.
(cherry picked from commit 7be667d)

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
@sunmy2019
Copy link
Member

Thanks! @ericsnowcurrently

@ronaldoussoren
Copy link
Contributor

@ericsnowcurrently, what's the right way forward for accessing tp_dict?

I'm currently using _PyType_GetDict, but that's a private API and I'd prefer to avoid those.

Three options:

  1. add PyType_GetDict as public API and keep _PyType_GetDict as an alias for code that's already adjust to the beta

  2. rename the private API to PyType_GetDict

  3. rename the private API to an unstable one like PyUnstable_Type_GetDict

I'd prefer the second option unless there's a good reason to assume that returning a borrowed reference from PyType_GetDict will be problematic in the future.

Also: Is this a change that can still get into 3.12, even during the beta cycle?

@ericsnowcurrently
Copy link
Member

what's the right way forward for accessing tp_dict?

If we keep it in the internal headers then we can re-name it however we like, whenever we like. 😄 Moving it to the public API would be okay if it returned the dict proxy object. However, it returns the actual dict, which complicates things (but doesn't necessarily rule it out). Doing so would require @Yhg1s to approve.

As to the complexity--when it comes to the static builtin types, giving the actual dict to C-API users would potentially expose us to future pain. For example, we may eventually statically initialize that dict, which means it must be treated as immutable. It's unclear what the actual impact would really be though, so it doesn't seem like a simple answer is reachable quickly. 😞

The safest bet would be to leave the getter function in the internal C-API for 3.12, possibly dropping the leading underscore.

@ronaldoussoren
Copy link
Contributor

Leaving the function in the internal API leaves me in the position that I have to use a private API to do something that used to be part of the (implicit) public API, just one small step removed from having to resort to poking in internals when someone notices that the API is not used by stdlib extension modules and removes the export for the symbol (e.g. visibility=hidden).

BTW. Returning a dict proxy from a public API for this wouldn't work for me, both because of the refcounting implications mention in an earlier comment and because AFAIK the PyDict_ API doesn't work on a mapping proxy. Both could be worked around, but would require significant changes to code that has been stable for a long time.

I'll try to raise this issue with our RM over the weekend, trying to communicate in a couple of minutes each morning and/or evening doesn't really work.

@ericsnowcurrently
Copy link
Member

Just to make sure we're on the same page, in what cases do you need access to the tp_dict of static builtin types? They are the only ones where tp_dict is NULL.

ericsnowcurrently pushed a commit that referenced this issue Jun 1, 2023
…All Static Builtin Types (gh-105115) (gh-105124)

In gh-103912 we added tp_bases and tp_mro to each PyInterpreterState.types.builtins entry.  However, doing so ignored the fact that both PyTypeObject fields are public API, and not documented as internal (as opposed to tp_subclasses).  We address that here by reverting back to shared objects, making them immortal in the process.
(cherry picked from commit 7be667d)

Co-authored-by: Eric Snow ericsnowcurrently@gmail.com
@ericsnowcurrently
Copy link
Member

@ronaldoussoren, it would probably be best to move this discussion to a separate issue. Feel free to create one, or I'll do it tomorrow if you haven't already. Thanks!

@rwgk
Copy link
Author

rwgk commented Jun 2, 2023

This is to confirm that current pybind11 master builds with the 3.12 branch, and all pybind11 unit tests, excl. the tests that depend on numpy, also pass.

Thanks again @ericsnowcurrently!

cpython commit e6d5e63 (HEAD -> 3.12, upstream/3.12)
pybind11 commit d0232b119f5f0cf2bf6cecacaf099c301c811ca8 (HEAD -> master, upstream/master)
gcc (Debian 12.2.0-14) 12.2.0

@ericsnowcurrently
Copy link
Member

Thanks for letting us know about this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes docs Documentation in the Doc dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants