Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-93911: Specialize LOAD_ATTR for custom __getattr__ and __getattribute__ #93988

Merged
merged 21 commits into from
Aug 17, 2022

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-Spinner Fidget-Spinner commented Jun 18, 2022

Linked to #93911.

For __getattr__, we specialize as per normal as long as the specialized bytecode can succeed without raising AttributeError.

For __getattribute__, a specialized instruction is created to inline the call to __getattribute__.

@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jun 18, 2022

A very significant (15%) increase in hit rates for test_typing. That module represents the best case for this specialisation because nearly all its classes use __getattr__. This is also the highest pyperformance specialization failure apart from property which was fixed by the other PR.

.\PCbuild\amd64\python_d.exe -m test test_typing -R 3:3

On main:
opcode[106].specializable : 1
    opcode[106].specialization.success : 3833
    opcode[106].specialization.failure : 3648
    opcode[106].specialization.hit : 861058
    opcode[106].specialization.deferred : 520385
    opcode[106].specialization.miss : 91899
    opcode[106].specialization.deopt : 1598
    opcode[106].execution_count : 119454
    opcode[106].specialization.failure_kinds[2] : 1918
    opcode[106].specialization.failure_kinds[4] : 214
    opcode[106].specialization.failure_kinds[5] : 9
    opcode[106].specialization.failure_kinds[9] : 171
    opcode[106].specialization.failure_kinds[11] : 251
    opcode[106].specialization.failure_kinds[12] : 207
    opcode[106].specialization.failure_kinds[14] : 148
    opcode[106].specialization.failure_kinds[17] : 201
    opcode[106].specialization.failure_kinds[19] : 22
    opcode[106].specialization.failure_kinds[21] : 5
    opcode[106].specialization.failure_kinds[22] : 14
    opcode[106].specialization.failure_kinds[23] : 39
    opcode[106].specialization.failure_kinds[25] : 18
    opcode[106].specialization.failure_kinds[27] : 464


On this PR:
opcode[106].specializable : 1
    opcode[106].specialization.success : 4694
    opcode[106].specialization.failure : 2298
    opcode[106].specialization.hit : 991871
    opcode[106].specialization.deferred : 355737
    opcode[106].specialization.miss : 128358
    opcode[106].specialization.deopt : 2267
    opcode[106].execution_count : 156019
    opcode[106].specialization.failure_kinds[2] : 46
    opcode[106].specialization.failure_kinds[4] : 214
    opcode[106].specialization.failure_kinds[5] : 9
    opcode[106].specialization.failure_kinds[9] : 171
    opcode[106].specialization.failure_kinds[11] : 251
    opcode[106].specialization.failure_kinds[12] : 207
    opcode[106].specialization.failure_kinds[14] : 155
    opcode[106].specialization.failure_kinds[17] : 201
    opcode[106].specialization.failure_kinds[19] : 22
    opcode[106].specialization.failure_kinds[21] : 5
    opcode[106].specialization.failure_kinds[22] : 13
    opcode[106].specialization.failure_kinds[23] : 39
    opcode[106].specialization.failure_kinds[25] : 18
    opcode[106].specialization.failure_kinds[27] : 464
    opcode[106].specialization.failure_kinds[28] : 516

Python/ceval.c Outdated
Comment on lines 3750 to 3757
if (res) {
SET_TOP(NULL);
STACK_GROW((oparg & 1));
SET_TOP(res);
Py_DECREF(owner);
JUMPBY(INLINE_CACHE_ENTRIES_LOAD_ATTR);
DISPATCH();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware that specialized opcodes should have as few branches as possible. But these two opcodes need to respect the getattro semantics.

I could move these out to a function, but the branch would still be there.

@markshannon
Copy link
Member

This doesn't seem right to me.
__getattribute__ allows attribute lookup to be overridden.
__getattr__ is for looking attributes after normal lookup has failed.

The existence of __getattr__ shouldn't prevent specialization of normal attribute lookup.
Consider this class

>>> class C:
...     def __init__(self):
...         self.a = 1
...     b = 2
...     def __getattr__(self, name):
...         if name == "c":
...             return 3
...         raise AttributeError(name)
... 
>>> c = C()
>>> c.a
1
>>> c.b
2
>>> c.c
3
>>> c.d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __getattr__
AttributeError: d

We should specialize c.a and c.b; the presence of __getattr__ shouldn't matter.

Interestingly, the current behavior of Python is to check for both __getattribute__ and __getattr__ before calling either of them.

def get_b(self, name):
    print("In get_b")
    if name == "b":
        return "b"
    else:
        raise AttributeError()

class A:
    def __getattribute__(self, name):
        print("In A.__getattribute__")
        if name == "a":
            return "a"
        else:
            type(self).__getattr__ = get_b
            raise AttributeError()

try:
    print(A().b)
except Exception as e:
    print("Raises", type(e), e)
print()
try:
    print(A().b)
except Exception as e:
    print("Raises", type(e), e)

cause the following to be printed:

In A.__getattribute__
Raises <class 'AttributeError'> 

In A.__getattribute__
In get_b
b

Which is odd, but ideal for specialization, as we don't need to check for __getattr__ at runtime. We can do it at specialization time.

@markshannon
Copy link
Member

Here's what I think we should do.

  1. Specialize as normal for classes with a __getattr__ method (in most cases).
  2. Specialize for classes that have a Python __getattribute__ method, and do not have a __getattr__, much like BINARY_SUBSCR_GETITEM specializes for classes with a __getitem__method.

Most specializations of LOAD_ATTR deopt if they can't find the attribute. These should work just fine with a __getattr__ method.
Some specializations of LOAD_ATTR, like LOAD_ATTR_PROPERTY can raise an exception and might not be valid if a class has a __getattr__.
We should be able to determine this at specialization time.

@Fidget-Spinner
Copy link
Member Author

Alright. analyze_descriptor will need to be a lot more robust then. It will need to detect cases where it sees a __getattr__ but the attribute is present.

@markshannon
Copy link
Member

By checking for tp->tp_getattro == slot_tp_getattro and tp->tp_getattro == slot_tp_getattr_hook we can avoid most of the expensive calls to _PyType_Lookup when analyzing the class.

tp->tp_getattro == slot_tp_getattro implies that __getattribute__ is defined but __getattr__ is not. (I think).
tp->tp_getattro == slot_tp_getattr_hook implies that __getattr__ is set.
Therefore we only need to call _PyType_Lookup(tp, &_Py_ID(__getattribute__)) if one of the above is true.

@Fidget-Spinner
Copy link
Member Author

It's alarming that specialisation misses increased more than hits (~75k vs 55k). Maybe it means that for most types with __getattr__, it's expected that there's nothing there? This needs more analysis.

python -m test test_typing -R 3:3

Main 71868a0066a90519ccc6aeb75673ae882aaa03f0:
opcode[106].specializable : 1
    opcode[106].specialization.success : 3839
    opcode[106].specialization.failure : 3608
    opcode[106].specialization.hit : 892801
    opcode[106].specialization.deferred : 517029
    opcode[106].specialization.miss : 91724
    opcode[106].specialization.deopt : 1596
    opcode[106].execution_count : 119325
    opcode[106].specialization.failure_kinds[2] : 1918
    opcode[106].specialization.failure_kinds[4] : 213
    opcode[106].specialization.failure_kinds[5] : 9
    opcode[106].specialization.failure_kinds[9] : 172
    opcode[106].specialization.failure_kinds[11] : 254
    opcode[106].specialization.failure_kinds[12] : 165
    opcode[106].specialization.failure_kinds[14] : 148
    opcode[106].specialization.failure_kinds[17] : 203
    opcode[106].specialization.failure_kinds[19] : 20
    opcode[106].specialization.failure_kinds[21] : 5
    opcode[106].specialization.failure_kinds[22] : 14
    opcode[106].specialization.failure_kinds[23] : 39
    opcode[106].specialization.failure_kinds[25] : 18
    opcode[106].specialization.failure_kinds[27] : 463

This PR 2a931a8faad07375af647f5ed0e8d3548bcb3d37:
opcode[106].specializable : 1
    opcode[106].specialization.success : 5321
    opcode[106].specialization.failure : 2387
    opcode[106].specialization.hit : 944885
    opcode[106].specialization.deferred : 391933
    opcode[106].specialization.miss : 164726
    opcode[106].specialization.deopt : 2937
    opcode[106].execution_count : 192307
    opcode[106].specialization.failure_kinds[2] : 568
    opcode[106].specialization.failure_kinds[4] : 273
    opcode[106].specialization.failure_kinds[5] : 9
    opcode[106].specialization.failure_kinds[9] : 172
    opcode[106].specialization.failure_kinds[11] : 287
    opcode[106].specialization.failure_kinds[12] : 165
    opcode[106].specialization.failure_kinds[14] : 148
    opcode[106].specialization.failure_kinds[17] : 234
    opcode[106].specialization.failure_kinds[19] : 20
    opcode[106].specialization.failure_kinds[21] : 5
    opcode[106].specialization.failure_kinds[22] : 14
    opcode[106].specialization.failure_kinds[23] : 39
    opcode[106].specialization.failure_kinds[25] : 51
    opcode[106].specialization.failure_kinds[27] : 468

@markshannon
Copy link
Member

The numbers look correct. The total, hit + deferred + miss is unchanged and the number of deferred drops from 34% to 26%, which is a decent improvement. The increase in misses from 6% to 10% may be genuine, due to type instability. You are running this on the tests after all.

Have you tried to gather stats for pyperformance?

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I'd like to assert our assumptions about tp_getattro.

Python/ceval.c Outdated
Py_INCREF(f);
_PyInterpreterFrame *new_frame = _PyFrame_PushUnchecked(tstate, f);
SET_TOP(NULL);
int push_null = !(oparg & 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We push NULL if oparg & 1 (and the line below should be STACK_SHRINK(1-push_null))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about:

int shrink_stack = !(oparg & 1);
STACK_SHRINKG(shrink_stack);

The behaviour is equivalent but it does save a subtract operation.

Python/specialize.c Outdated Show resolved Hide resolved
Python/specialize.c Outdated Show resolved Hide resolved
Python/specialize.c Outdated Show resolved Hide resolved
Python/specialize.c Outdated Show resolved Hide resolved
@@ -568,8 +574,47 @@ analyze_descriptor(PyTypeObject *type, PyObject *name, PyObject **descr, int sto
}
else {
if (type->tp_getattro != PyObject_GenericGetAttr) {
*descr = NULL;
return GETSET_OVERRIDDEN;
getattrofunc getattro_slot = type->tp_getattro;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if our assumption that getattro_slot == _Py_slot_tp_getattr_hook implies that there is a __getattr__ is true.
It might be that the slot hasn't been updated, even though that seems unlikely.

Could you add an assert that _PyType_Lookup(type, &_Py_ID(__getattr__)) != NULL if has_getattr?

Copy link
Member Author

@Fidget-Spinner Fidget-Spinner Jun 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_Py_slot_tp_getattr_hook handles this case by overwriting itself back to the simpler _Py_slot_tp_getattro. So I think that assert would fail. But in this case would it matter? If there's no __getattr__ isn't that a good thing? It doesn't change the fact that we can just do normal attribute lookup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we checking for here?

This is what I think we want to do w.r.t overriding these methods:

  • Neither __getattribute__ nor __getattr__ are overridden, which we already handle.
  • Both are overridden. Rare, don't specialize.
  • __getattribute__ is overridden, __getattr__ is not, and __getattribute__ is a Python function, which we add a new specialization for.
  • __getattr__ is overridden, two cases:
    • We expect a value on the instance. We can treat this as a normal instance lookup, as we de-opt if the value is NULL, falling back to the full lookup.
    • No attribute on the instance. This is going to be rare, so don't specialize.

Is that correct?

If so, could you set two booleans has_getattr and has_getattributefirst, then do the case analysis, it would be clearer (to me at least).

Include/internal/pycore_typeobject.h Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a worthwhile improvement.

A few suggestions.

Python/specialize.c Outdated Show resolved Hide resolved
Python/specialize.c Outdated Show resolved Hide resolved
Python/specialize.c Outdated Show resolved Hide resolved
@@ -568,8 +574,47 @@ analyze_descriptor(PyTypeObject *type, PyObject *name, PyObject **descr, int sto
}
else {
if (type->tp_getattro != PyObject_GenericGetAttr) {
*descr = NULL;
return GETSET_OVERRIDDEN;
getattrofunc getattro_slot = type->tp_getattro;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we checking for here?

This is what I think we want to do w.r.t overriding these methods:

  • Neither __getattribute__ nor __getattr__ are overridden, which we already handle.
  • Both are overridden. Rare, don't specialize.
  • __getattribute__ is overridden, __getattr__ is not, and __getattribute__ is a Python function, which we add a new specialization for.
  • __getattr__ is overridden, two cases:
    • We expect a value on the instance. We can treat this as a normal instance lookup, as we de-opt if the value is NULL, falling back to the full lookup.
    • No attribute on the instance. This is going to be rare, so don't specialize.

Is that correct?

If so, could you set two booleans has_getattr and has_getattributefirst, then do the case analysis, it would be clearer (to me at least).

Python/ceval.c Outdated Show resolved Hide resolved
Python/specialize.c Outdated Show resolved Hide resolved
@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jul 20, 2022

What are we checking for here?
,,,
This is what I think we want to do w.r.t overriding these methods:

Is that correct?

Yes.

If so, could you set two booleans has_getattr and has_getattributefirst, then do the case analysis, it would be clearer (to me at least).

There's one more special case for custom __getattribute__: where it's set to the defaut lookup (PyObject_GenericGetAttr). In such cases if there's no __getattr__, we can just treat this as a normal attribute lookup. This is why it matters whether it's a custom getattribute or not.

@markshannon
Copy link
Member

There's one more special case for custom getattribute: where it's set to the defaut lookup (PyObject_GenericGetAttr). In such cases if there's no getattr, we can just treat this as a normal attribute lookup. This is why it matters whether it's a custom getattribute or not.

Do you mean

class C:
    __getattribute__ = object.__getattribute__

?

Isn't that the same as not overriding at all?

@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented Jul 20, 2022

There's one more special case for custom getattribute: where it's set to the defaut lookup (PyObject_GenericGetAttr). In such cases if there's no getattr, we can just treat this as a normal attribute lookup. This is why it matters whether it's a custom getattribute or not.

Do you mean

class C:
    __getattribute__ = object.__getattribute__

?

Isn't that the same as not overriding at all?

Yes I was trying to say that it is the same in that case :).

Python/specialize.c Outdated Show resolved Hide resolved
Fidget-Spinner and others added 2 commits July 20, 2022 23:34
Co-Authored-By: Mark Shannon <mark@hotpy.org>
@Fidget-Spinner
Copy link
Member Author

Ping @markshannon . Is there anything left for me to do?

Python/specialize.c Outdated Show resolved Hide resolved
@markshannon
Copy link
Member

One minor issue, and we should probably run the buildbots again.
Other than that, looks good.

@Fidget-Spinner Fidget-Spinner added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 16, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @Fidget-Spinner for commit cb7d01e 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 16, 2022
@Fidget-Spinner Fidget-Spinner added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 16, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @Fidget-Spinner for commit def326e 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 16, 2022
@@ -594,7 +595,9 @@ analyze_descriptor(PyTypeObject *type, PyObject *name, PyObject **descr, int sto
getattribute != interp->callable_cache.object__getattribute__;
has_getattr = _PyType_Lookup(type, &_Py_ID(__getattr__)) != NULL;
if (has_custom_getattribute) {
if (!has_getattr && Py_IS_TYPE(getattribute, &PyFunction_Type)) {
if (getattro_slot == _Py_slot_tp_getattro &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An assertion about which slot is present during specialisation was failing in one of the buildbots. Apparently there are cases where the other slot (which potentially calls __getattr__) still stuck around even with no __getattr__. This may mean the slot had not been called yet (seems strange considering specialised code is supposed to be hot).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which assertion?
Does that mean the assertion is incorrect, or that this test was?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly, that assert is wrong, in the sense that it's OK if type->tp_getattro == _Py_slot_tp_getattr_hook provided that there it would convert itself to _Py_slot_tp_getattro.

Having said that, it is probably easier to reason about if we assert type->tp_getattro == _Py_slot_tp_getattro, and the impact on performance of the extra test will be negligible.

@markshannon markshannon merged commit 7276ca2 into python:main Aug 17, 2022
@Fidget-Spinner Fidget-Spinner deleted the specialize_overridden branch August 17, 2022 14:33
@gvanrossum
Copy link
Member

Hey @Fidget-Spinner and @markshannon, I just noticed that LOAD_ATTR_GETATTRIBUTE_OVERRIDDEN, introduced here to support __getattribute__, is never used in the pyperformance benchmark suite. Is that specialization at all important? I know __getattr__ is important -- but __getattribute__ does seem like something rare enough that we might not bother. Does either of you recall why it's needed?

@markshannon
Copy link
Member

markshannon commented Oct 4, 2023

FTR (from a discussion on discord) __getattribute__ was used in _DeprecatedType in the typing module, but not since 3.12.

I think the specialization is worth keeping, but we could move it into the "long tail" discussed in faster-cpython/ideas#447

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants