Better handling of accessing an object's __dict__ attribute. #651

Open
markshannon opened this issue Feb 7, 2024 · 5 comments

Comments

@markshannon
Member

Currently when the __dict__ attribute of an object is accessed we transfer ownership of the values array from the object to the dict.

Accessing the __dict__ of an object is uncommon, though not rare, and it is highly disruptive to optimizations.
We currently attempt to mitigate this by dematerializing the __dict__, but that has a few failings:

  • It bulks out a common and performance-critical uop, _CHECK_MANAGED_OBJECT_HAS_VALUES
  • It isn't that effective
  • It is not thread safe

Rather than attempting to get rid of the dictionary, let's change the object and values so that the presence of a __dict__ doesn't impact the fast path.

What that means is that an object would still retain a pointer to the values, even if the dict were present.
Inlining the values would help here. Otherwise we need an extra pointer, bulking out the pre-header even more.

Memory management becomes a little more complex, as we need to make sure that we don't free the values while there is still a reference to them. We can use reference counting, but we will only need a single bit, as the values can only be referred to by the object and/or the dict.
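
The single-bit ownership rule described above can be modeled in a few lines of Python. This is only a toy sketch under stated assumptions, not CPython's implementation: `Values`, `Obj`, `MaterializedDict`, the `shared` flag, and the `drop()`/`release()` methods are all hypothetical names invented for illustration, and the model ignores dict resizing entirely.

```python
class Values:
    """Toy stand-in for the values array (hypothetical, not a CPython type)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.shared = False  # the single "reference count" bit
        self.freed = False

    def release(self):
        """Called by an owner (object or dict) that is going away."""
        if self.shared:
            # The other owner is still alive: just clear the bit.
            self.shared = False
        else:
            # The last owner is gone: now it is safe to free.
            self.freed = True


class MaterializedDict:
    """Toy dict that refers to the same values array as the object."""

    def __init__(self, values):
        self.values = values

    def drop(self):
        self.values.release()


class Obj:
    """Toy object that always keeps its own pointer to the values."""

    def __init__(self, size):
        self.values = Values(size)
        self.dict = None

    def get_dict(self):
        # Materializing the dict sets the bit instead of moving ownership.
        if self.dict is None:
            self.dict = MaterializedDict(self.values)
            self.values.shared = True
        return self.dict

    def load_slot(self, index):
        # Fast path: the same code whether or not a dict exists.
        return self.values.slots[index]

    def drop(self):
        self.values.release()


obj = Obj(2)
obj.values.slots[0] = "x"
d = obj.get_dict()   # values now have two owners
obj.drop()           # the dict still refers to the values
d.drop()             # last owner gone, values can be freed
```

Because only two owners are possible, one flag is enough to tell "both alive" from "one left", which is why a full reference count is not needed in this model.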

@brandtbucher thoughts?

@gvanrossum
Collaborator

The dematerialization is what happens in _PyObject_MakeInstanceAttributesFromDict(), right?

Is the new design shown by any of the pictures in #72 (comment)?

@brandtbucher
Member

The idea here is that we won't need to either dematerialize the __dict__ or hop through it to get to the values? In every case, __dict__ or no dict, we can just use the extra pointer from the header?

I like it. Getting rid of the tricky PyDictOrValues stuff will be nice, too.

@carljm

carljm commented Feb 14, 2024

What happens when a materialized dict has to be resized? Doesn't that mean the dict will always become combined, so it won't have a (valid) PyDictValues at all anymore? So won't we still have to support the case where we have no values to point to, just a dict?

EDIT: clarified in offline discussion, we will still have to handle the combined-dict case, so this won't be able to get rid of PyDictOrValues.

@gvanrossum
Collaborator

These pictures are nearly right, according to @markshannon
#72 (comment)

@dg-pb

dg-pb commented Jun 16, 2024

Is this what is causing slowdowns of attribute access after modifications to __dict__?

https://discuss.python.org/t/1-attrdict-and-2-argparse-namespace-performance/53805
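
One way to probe this question locally is a small timing comparison between an instance whose __dict__ was never touched and one that was modified through it. The sketch below is an assumption-laden illustration, not a reproduction of the linked thread: the `Point` class and the `time_attribute_reads` helper are hypothetical, and whether any difference appears, and how large it is, depends on the CPython version and the machine.

```python
import timeit


class Point:
    def __init__(self):
        self.x = 1
        self.y = 2


def time_attribute_reads(obj, number=200_000):
    """Return the seconds taken to evaluate obj.x `number` times."""
    return timeit.timeit("obj.x", globals={"obj": obj}, number=number)


untouched = Point()

touched = Point()
touched.__dict__["z"] = 3  # modify the instance through its __dict__

t_untouched = time_attribute_reads(untouched)
t_touched = time_attribute_reads(touched)
print(f"untouched: {t_untouched:.4f}s  touched: {t_touched:.4f}s")
```

Running the comparison several times, and on more than one interpreter version, gives a better picture than a single measurement.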
