Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-104078: Improve performance of PyObject_HasAttrString #104079

Merged
merged 2 commits into from May 3, 2023

Conversation

itamaro
Copy link
Contributor

@itamaro itamaro commented May 2, 2023

fixes gh-104078

microbenchmark on main

python -m pyperf timeit -s '
import _testcapi
hasattr_string = _testcapi.hasattr_string

class A:
  def __init__(self):
    self.attr = 1

a = A()' 'hasattr_string(a, "noattr")'
.....................
Mean +- std dev: 487 ns +- 7 ns

microbenchmark with PR

python -m pyperf timeit -s '
import _testcapi
hasattr_string = _testcapi.hasattr_string

class A:
  def __init__(self):
    self.attr = 1

a = A()' 'hasattr_string(a, "noattr")'
.....................
Mean +- std dev: 149 ns +- 4 ns

about 3.3X faster

@itamaro itamaro marked this pull request as ready for review May 2, 2023 04:26
@carljm
Copy link
Member

carljm commented May 2, 2023

How does the performance comparison look for an attribute that does exist?

@itamaro
Copy link
Contributor Author

itamaro commented May 2, 2023

How does the performance comparison look for an attribute that does exist?

it's about 1.2x faster

python -m pyperf timeit -s '
import _testcapi
hasattr_string = _testcapi.hasattr_string

class A:
  def __init__(self):
    self.attr = 1

a = A()' 'hasattr_string(a, "attr")'

# main
Mean +- std dev: 187 ns +- 4 ns

# PR
Mean +- std dev: 156 ns +- 4 ns

Copy link
Member

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provides a significant performance improvement in both attribute-exists and (especially) attribute-doesn't-exist cases, without making the code any harder to understand.

There's a bit more duplication with PyObject_GetAttrString, but these two methods are right next to each other and it's logical for them to have parallel structure.

It's also logical -- and perhaps more likely to avoid behavioral inconsistency -- for PyObject_HasAttrString to go through PyObject_HasAttr.

@carljm carljm added the performance Performance or resource usage label May 3, 2023
@hauntsaninja hauntsaninja merged commit 8d34031 into python:main May 3, 2023
23 checks passed
@malemburg
Copy link
Member

I wonder why you are seeing those speedups. The PR essentially looks like it's manually inlining the code - something the compiler would normally do by itself where needed.

Also: The test will only check the second part of the PR, since Python instances have the tp_getattro set, but not tp_getattr.

Related to this, I believe the original and new code for checking the tp_getattr slot are not quite correct, since in all other lookup code, tp_getattro is checked and used before tp_getattr. I don't remember the details around what is supposed to happen when both are set, but if they are, the PyObject_HasAttrString() code will use a different path than e.g. PyObject_HasAttr().

@carljm
Copy link
Member

carljm commented May 3, 2023

I wonder why you are seeing those speedups. The PR essentially looks like it's manually inlining the code - something the compiler would normally do by itself where needed.

The important change in the PR is that it makes PyObject_HasAttrString go through PyObject_HasAttr, where previously it went through PyObject_GetAttrString and thus PyObject_GetAttr (in the common case). This is not just an inlining, it's an important change because PyObject_HasAttr is better optimized for "we don't need the value, we just need to know it exists" -- especially in the case where it doesn't exist. (I also think it's better for PyObject_HasAttrString to go through PyObject_HasAttr simply in terms of maintaining a consistent behavior between the two.)

Also: The test will only check the second part of the PR, since Python instances have the tp_getattro set, but not tp_getattr.

Yes, I noticed this, but a) I'm not sure how common it is for types to set tp_getattr (it is deprecated), and b) the only change to the tp_getattr case in the PR is to remove one layer of function call (we call tp_getattr ourselves now instead of going through PyObject_GetAttrString), so I would expect the impact on the tp_getattr case is neutral or slightly improved.

Related to this, I believe the original and new code for checking the tp_getattr slot are not quite correct, since in all other lookup code, tp_getattro is checked and used before tp_getattr. I don't remember the details around what is supposed to happen when both are set, but if they are, the PyObject_HasAttrString() code will use a different path than e.g. PyObject_HasAttr().

I wondered about this too. I assume the reason this was done this way originally is because tp_getattr takes a char* instead of a PyUnicodeObject, so preferring it for PyObject_HasAttrString and PyObject_GetAttrString avoids the need to box the attribute name. It is documented that tp_getattr "should point to a function that acts the same as tp_getattro", so I guess the idea is that it shouldn't matter which we prefer. In any case, this PR doesn't change the behavior here.

@malemburg
Copy link
Member

Thanks for added insights, @carljm. I agree that it's better for PyObject_HasAttrString() go through PyObject_HasAttr(). Existence of an attribute is often easier and faster to check than actually getting the value (esp. for properties).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance of PyObject_HasAttrString
5 participants