New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-31336: Speed up type creation, which is highly dominated by slow dict lookups. #3279
Conversation
Objects/typeobject.c
Outdated
return NULL; | ||
} | ||
} | ||
|
||
res = NULL; | ||
/* keep a strong reference to mro because type->tp_mro can be replaced | ||
during PyDict_GetItem(dict, name) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the comment.
@@ -2994,7 +3010,7 @@ _PyType_Lookup(PyTypeObject *type, PyObject *name) | |||
assert(PyType_Check(base)); | |||
dict = ((PyTypeObject *)base)->tp_dict; | |||
assert(dict && PyDict_Check(dict)); | |||
res = PyDict_GetItem(dict, name); | |||
res = _PyDict_GetItem_KnownHash(dict, name, hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyDict_GetItem()
always clears errors. Call PyErr_Clear()
if _PyDict_GetItem_KnownHash()
returns NULL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see well the difference between PyDict_GetItem() and _PyDict_GetItem_KnownHash() in term of performance. Maybe the difference is that PyDict_GetItem() calls PyErr_Fetch/PyErr_Restore. In that case, PyDict_GetItemWithError() would have the same speed, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using PyDict_GetItemWithError()
adds only a half of the speed up.
The difference is:
- PyDict_GetItem() calls PyThreadState_GET/PyErr_Fetch/PyErr_Restore.
- PyDict_GetItem() checks for str and reads a hash every time. I don't understand well why this affects performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KnownHash is extremely short in comparison and probably gets inlined and streamlined with LTO. Substantially less branching.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
…hained name lookups in _PyType_Lookup().
Nice catches, but certainly, I didn't expect the Spanish Inquisition! |
Nobody expects the Spanish Inquisition! @serhiy-storchaka: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to use the _Py_IDENTIFIER API rather than using _PyDict_GetItem_KnownHash().
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
…m update_one_slot().
…r ceases to be a valid base type, there'll probably be larger code sections to change than this one.
I agree that using the Instead, I followed Naokis idea of spliting |
Objects/typeobject.c
Outdated
@@ -6958,8 +6990,14 @@ update_one_slot(PyTypeObject *type, slotdef *p) | |||
return p; | |||
} | |||
do { | |||
descr = _PyType_Lookup(type, p->name_strobj); | |||
descr = _PyType_LookupUncached(type, p->name_strobj, &error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to add a comment explaining why the uncached lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure enough. Added.
…common in Python 3 now. Also make the step from "return NULL" error handling to "goto error" reference cleanup explicit.
nbases = 1; | ||
} | ||
else | ||
Py_INCREF(bases); | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any effect of this change? I tested class C: pass
and class C(object): pass
and didn't see any difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consider this part a cleanup that makes it clearer what operations are needed when, and how error cases are dealt with.
I didn't measure any speed difference either.
Objects/typeobject.c
Outdated
if (res != NULL) | ||
break; | ||
if (PyErr_Occurred()) | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaks mro
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Objects/typeobject.c
Outdated
if (!PyUnicode_CheckExact(name) || | ||
(hash = ((PyASCIIObject *) name)->hash) == -1) | ||
{ | ||
hash = PyObject_Hash(name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyObject_Hash()
can call user code. mro
is a borrowed reference here, it should be increfed before calling PyObject_Hash()
.
Or calculate the hash before getting mro
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I'll move the hash() call up.
Objects/typeobject.c
Outdated
/* Internal API to look for a name through the MRO, bypassing the method cache. | ||
This returns a borrowed reference, and might set an exception! */ | ||
static PyObject * | ||
_PyType_LookupUncached(PyTypeObject *type, PyObject *name, int *error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Local functions usually use different name convention. All lower characters and no Py
prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to hint at the name of the existing function. A different name would be less clear, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this rather adds a confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed the function to find_name_in_mro()
.
Objects/typeobject.c
Outdated
} | ||
|
||
Py_hash_t hash; | ||
*error = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively you could return a special value for signalling error. E.g. (PyObject *)(-1). Test what is faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can set error
to -1
on exceptions. That will distinguish all three cases: ok, error with exception, error without exception. Only -1
will need a call to PyErr_Clear()
then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike things like (PyObject *)(-1)
because I wouldn't want to assume that they are not valid pointer values on any system whatsoever.
Distinguish between "error" and "error with exception" cases in _PyType_LookupUncached(). Fix a reference leak of "mro" on lookup errors. Resolves issues found by Serhiy Storchaka.
@serhiy-storchaka: thank you for your excellent feedback. |
Objects/typeobject.c
Outdated
/* Internal API to look for a name through the MRO, bypassing the method cache. | ||
This returns a borrowed reference, and might set an exception! */ | ||
static PyObject * | ||
_PyType_LookupUncached(PyTypeObject *type, PyObject *name, int *error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this rather adds a confusion.
Objects/typeobject.c
Outdated
in a context that propagates the exception out. | ||
*/ | ||
PyErr_Clear(); | ||
PyType_Ready(type) < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either move PyType_Ready()
to the previous line (if the resulting line will be not too long), or indent it as in the current code and move {
to a new line for readability (new PEP 7 rule).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cleaned it up a little.
Objects/typeobject.c
Outdated
in a context that propagates the exception out. | ||
*/ | ||
if (error == -1) | ||
PyErr_Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyErr_Clear()
always is called after _PyType_LookupUncached()
if error == -1
. Why not call it inside _PyType_LookupUncached()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is which interface is better: should the function itself always swallow any exceptions, or should the callers decide how to deal with them? The comments in the original _PyType_Lookup()
implementation suggest that others have already been as unhappy as me about the interface of that function, so why keep making people unhappy?
Objects/typeobject.c
Outdated
the same type will call it again -- hopefully | ||
in a context that propagates the exception out. | ||
*/ | ||
if (error == -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add braces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
…isting) API names. Suggested by Serhiy Storchaka.
The
In the failing test, it should find The correct place to fix this is obviously |
That being said, this can obviously be made to work again by ignoring all exceptions also in the new helper function, just as |
I pushed a fix that eats lookup exceptions, exactly like |
…g lookup exceptions, just like "_PyType_Lookup()" did previously.
c52e47a
to
4efde8e
Compare
@@ -0,0 +1,2 @@ | |||
Speed up class creation by reducing overhead in the necessary special method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add speed up estimation: 10-20%
What was the cause of this failure? Could this failure be fixed by fixing ElementTree? |
See my comment above. Yes, ET should definitely be fixed. I pushed #3545. |
Feel free to revert 4efde8e if you think that fixing cElementTree is enough here. |
…ce eating lookup exceptions, just like "_PyType_Lookup()" did previously." This reverts commit 4efde8e.
…) from functions that swallow live exceptions.
I've reverted that change and added |
Note that both AppVeyor and Travis will be happy again once #3545 is merged. |
I don't recall why I was opposed to the change, but it changed a lot in the meanwhile :-)
@@ -2994,7 +3010,7 @@ _PyType_Lookup(PyTypeObject *type, PyObject *name) | |||
assert(PyType_Check(base)); | |||
dict = ((PyTypeObject *)base)->tp_dict; | |||
assert(dict && PyDict_Check(dict)); | |||
res = PyDict_GetItem(dict, name); | |||
res = _PyDict_GetItem_KnownHash(dict, name, hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see well the difference between PyDict_GetItem() and _PyDict_GetItem_KnownHash() in term of performance. Maybe the difference is that PyDict_GetItem() calls PyErr_Fetch/PyErr_Restore. In that case, PyDict_GetItemWithError() would have the same speed, no?
if (error == -1) { | ||
/* It is unlikely by not impossible that there has been an exception | ||
during lookup. Since this function originally expected no errors, | ||
we ignore them here in order to keep up the interface. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that "Since this function originally expected no errors" matters here. Why not reporting the exception to the caller and handle it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working on a patch that cleans up much of the mess around _PyType_Lookup()
. See https://bugs.python.org/issue31465
"KnownHash is extremely short in comparison and probably gets inlined and streamlined with LTO. Substantially less branching." oh nice, good to know! |
if (mro == NULL) { | ||
*error = 1; | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is the only non-exception raising error case left, I actually wonder if this is really an error case. If there is no MRO in a ready-ied type, isn't that just fine from the point of view of a lookup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I just proposed a minor coding style change.
The change enhances error handling, nice.
I don't understand everything, but I trust Python test suite to make sure that the change doesn't break anything :-)
if (bases == NULL) | ||
goto error; | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 7 nitpick: when you modify code, it's better to add { ... } to if blocks.
This gives me around 20% better performance for
class Test: pass
. Admittedly, that's a micro-benchmark, but the change is obvious enough to not merit staying slow.https://bugs.python.org/issue31336