bpo-31336: Speed up type creation, which is highly dominated by slow dict lookups. #3279

scoder · 2017-09-04T11:39:56Z

This gives me around 20% better performance for class Test: pass. Admittedly, that's a micro-benchmark, but the change is obvious enough to not merit staying slow.

https://bugs.python.org/issue31336

serhiy-storchaka · 2017-09-04T12:44:18Z

Objects/typeobject.c

+            return NULL;
+        }
+    }
+
    res = NULL;
    /* keep a strong reference to mro because type->tp_mro can be replaced
       during PyDict_GetItem(dict, name)  */


Update the comment.

serhiy-storchaka · 2017-09-04T12:51:18Z

Objects/typeobject.c

@@ -2994,7 +3010,7 @@ _PyType_Lookup(PyTypeObject *type, PyObject *name)
        assert(PyType_Check(base));
        dict = ((PyTypeObject *)base)->tp_dict;
        assert(dict && PyDict_Check(dict));
-        res = PyDict_GetItem(dict, name);
+        res = _PyDict_GetItem_KnownHash(dict, name, hash);


PyDict_GetItem() always clears errors. Call PyErr_Clear() if _PyDict_GetItem_KnownHash() returns NULL.

Already changed.

I don't quite see the difference between PyDict_GetItem() and _PyDict_GetItem_KnownHash() in terms of performance. Maybe the difference is that PyDict_GetItem() calls PyErr_Fetch/PyErr_Restore. In that case, PyDict_GetItemWithError() would have the same speed, no?

Using PyDict_GetItemWithError() gives only half of the speedup.

The difference is:

PyDict_GetItem() calls PyThreadState_GET/PyErr_Fetch/PyErr_Restore.

PyDict_GetItem() checks for str and reads a hash every time. I don't fully understand why this affects performance.

KnownHash is extremely short in comparison and probably gets inlined and streamlined with LTO. Substantially less branching.
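To make the "hash once, probe many dicts" idea concrete, here is a minimal, self-contained sketch. It is a toy model and not CPython code: toy_dict, toy_hash, toy_get_known_hash and toy_lookup_chain are hypothetical names, and the table is a tiny fixed-size open-addressing map. It only illustrates passing a precomputed hash down an MRO-like chain. As noted above, part of the real gain comes from skipping type checks and error-state handling, which this model does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8  /* power of two, toy-sized */

typedef struct {
    const char *keys[TABLE_SIZE];   /* NULL marks an empty slot */
    int values[TABLE_SIZE];
} toy_dict;

static unsigned long hash_calls = 0;  /* counts how often we hash */

/* FNV-1a style string hash; a stand-in for the hash of a name. */
static unsigned long toy_hash(const char *s)
{
    unsigned long h = 2166136261UL;
    hash_calls++;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619UL;
    }
    return h;
}

/* Probe one table with a hash supplied by the caller. */
static const int *toy_get_known_hash(const toy_dict *d, const char *key,
                                     unsigned long hash)
{
    size_t i;
    for (i = 0; i < TABLE_SIZE; i++) {
        size_t slot = (hash + i) & (TABLE_SIZE - 1);
        if (d->keys[slot] == NULL) {
            return NULL;                 /* empty slot: key is absent */
        }
        if (strcmp(d->keys[slot], key) == 0) {
            return &d->values[slot];
        }
    }
    return NULL;
}

/* Insert or overwrite a key (linear probing, no resizing). */
static void toy_set(toy_dict *d, const char *key, int value)
{
    unsigned long hash = toy_hash(key);
    size_t i;
    for (i = 0; i < TABLE_SIZE; i++) {
        size_t slot = (hash + i) & (TABLE_SIZE - 1);
        if (d->keys[slot] == NULL || strcmp(d->keys[slot], key) == 0) {
            d->keys[slot] = key;
            d->values[slot] = value;
            return;
        }
    }
    assert(0 && "toy_dict is full");
}

/* Walk a chain of tables (think: one dict per MRO entry), hashing once. */
static const int *toy_lookup_chain(const toy_dict *const *chain, size_t n,
                                   const char *key)
{
    unsigned long hash = toy_hash(key);   /* computed a single time */
    size_t i;
    for (i = 0; i < n; i++) {
        const int *res = toy_get_known_hash(chain[i], key, hash);
        if (res != NULL) {
            return res;
        }
    }
    return NULL;
}
```

In this model the number of hash computations stays at one per lookup regardless of how many tables are in the chain.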

bedevere-bot · 2017-09-04T12:53:16Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I didn't expect the Spanish Inquisition!. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

scoder · 2017-09-04T13:34:24Z

Nice catches, but certainly, I didn't expect the Spanish Inquisition!

bedevere-bot · 2017-09-04T13:34:26Z

Nobody expects the Spanish Inquisition!

@serhiy-storchaka: please review the changes made to this pull request.

vstinner

I would prefer to use the _Py_IDENTIFIER API rather than using _PyDict_GetItem_KnownHash().

bedevere-bot · 2017-09-04T17:18:50Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I didn't expect the Spanish Inquisition!. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

scoder · 2017-09-05T18:00:12Z

I agree that using the _Py_IDENTIFIER API would be nice, but changing the whole setup is more work than I would currently like to invest into this. This can still be done later. My changes do not make it any harder.

Instead, I followed Naoki's idea of splitting _PyType_Lookup() into two functions: one that does the lookup, and one that handles the method caching and calls the other on misses.
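The split described here can be sketched as a cached front end over an uncached helper. The following is a toy model under stated assumptions, not the code from this pull request: find_cached, find_uncached and the fixed name table are hypothetical, the cache is a small direct-mapped array, and only hits are cached.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SIZE 4

typedef struct {
    const char *name;   /* NULL means the slot is empty */
    int value;
} cache_entry;

static cache_entry cache[CACHE_SIZE];
static unsigned slow_lookups = 0;   /* how often the slow path ran */

/* Slow path: a linear scan, standing in for the walk over the MRO. */
static int find_uncached(const char *name, int *found)
{
    static const struct { const char *name; int value; } table[] = {
        {"__init__", 1}, {"__repr__", 2}, {"__eq__", 3},
    };
    size_t i;
    assert(name != NULL && found != NULL);
    slow_lookups++;
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(table[i].name, name) == 0) {
            *found = 1;
            return table[i].value;
        }
    }
    *found = 0;
    return 0;
}

/* Map a name to a cache slot. */
static size_t cache_slot(const char *name)
{
    size_t h = 0;
    while (*name) {
        h = h * 31 + (unsigned char)*name++;
    }
    return h % CACHE_SIZE;
}

/* Fast path: consult the cache, fall back to the slow path on a miss.
   The cache stores the caller's pointer, so names must outlive it
   (string literals in this sketch). */
static int find_cached(const char *name, int *found)
{
    cache_entry *e;
    int value;
    assert(name != NULL && found != NULL);
    e = &cache[cache_slot(name)];
    if (e->name != NULL && strcmp(e->name, name) == 0) {
        *found = 1;
        return e->value;
    }
    value = find_uncached(name, found);
    if (*found) {
        e->name = name;
        e->value = value;
    }
    return value;
}
```

A caller that must not see stale cache contents, such as slot updating during type creation in the discussion above, would call the uncached helper directly.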

pitrou · 2017-09-05T19:31:07Z

Objects/typeobject.c

@@ -6958,8 +6990,14 @@ update_one_slot(PyTypeObject *type, slotdef *p)
        return p;
    }
    do {
-        descr = _PyType_Lookup(type, p->name_strobj);
+        descr = _PyType_LookupUncached(type, p->name_strobj, &error);


Would be nice to add a comment explaining why the uncached lookup.

Sure enough. Added.

serhiy-storchaka · 2017-09-10T18:47:19Z

Objects/typeobject.c

        nbases = 1;
    }
-    else
-        Py_INCREF(bases);
+    else {


Is there any effect of this change? I tested class C: pass and class C(object): pass and didn't see any difference.

I consider this part a cleanup that makes it clearer what operations are needed when, and how error cases are dealt with.
I didn't measure any speed difference either.

serhiy-storchaka · 2017-09-10T18:54:37Z

Objects/typeobject.c

        if (res != NULL)
            break;
+        if (PyErr_Occurred())
+            return NULL;


Leaks mro.

serhiy-storchaka · 2017-09-10T18:56:43Z

Objects/typeobject.c

+    if (!PyUnicode_CheckExact(name) ||
+        (hash = ((PyASCIIObject *) name)->hash) == -1)
+    {
+        hash = PyObject_Hash(name);


PyObject_Hash() can call user code. mro is a borrowed reference here, it should be increfed before calling PyObject_Hash().

Or calculate the hash before getting mro.

Good catch. I'll move the hash() call up.
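The hazard and the chosen fix can be illustrated with a small reference-counting model. This is a sketch with hypothetical types (toy_mro, toy_type) and a deliberately hostile hash function, not the actual typeobject.c code. The point is the ordering: run anything that may execute user code first, then read the mro and hold a strong reference while using it.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int refcnt; int length; } toy_mro;
typedef struct { toy_mro *mro; } toy_type;   /* owns one reference to mro */

static toy_mro *mro_new(int length)
{
    toy_mro *m = malloc(sizeof(*m));
    if (m == NULL) {
        abort();
    }
    m->refcnt = 1;
    m->length = length;
    return m;
}

static void mro_decref(toy_mro *m)
{
    assert(m->refcnt > 0);
    if (--m->refcnt == 0) {
        free(m);
    }
}

/* Stand-in for a hash call that runs arbitrary user code: it replaces
   type->mro and drops the type's reference to the old one. */
static long evil_hash(toy_type *type)
{
    toy_mro *old = type->mro;
    type->mro = mro_new(old->length + 1);
    mro_decref(old);               /* old mro is freed here */
    return 42;
}

/* Safe ordering: hash first, read type->mro afterwards, and keep a
   strong reference for as long as the mro is in use. */
static int safe_lookup_length(toy_type *type)
{
    long hash = evil_hash(type);   /* may replace type->mro */
    toy_mro *mro = type->mro;      /* read only after user code ran */
    int length;
    (void)hash;
    mro->refcnt++;                 /* strong reference for the walk */
    length = mro->length;          /* ... the MRO walk would go here ... */
    mro_decref(mro);
    return length;
}
```

Had the function read type->mro before calling evil_hash(), the borrowed pointer would refer to freed memory by the time it was used.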

serhiy-storchaka · 2017-09-10T19:02:33Z

Objects/typeobject.c

+/* Internal API to look for a name through the MRO, bypassing the method cache.
+   This returns a borrowed reference, and might set an exception! */
+static PyObject *
+_PyType_LookupUncached(PyTypeObject *type, PyObject *name, int *error)


Local functions usually use a different naming convention: all lowercase characters and no Py prefix.

I was trying to hint at the name of the existing function. A different name would be less clear, I think.

IMHO this rather adds confusion.

I renamed the function to find_name_in_mro().

serhiy-storchaka · 2017-09-10T19:09:36Z

Objects/typeobject.c

-    }
-
+    Py_hash_t hash;
+    *error = 1;


Alternatively you could return a special value for signalling error. E.g. (PyObject *)(-1). Test what is faster.

I can set error to -1 on exceptions. That will distinguish all three cases: ok, error with exception, error without exception. Only -1 will need a call to PyErr_Clear() then.

I dislike things like (PyObject *)(-1) because I wouldn't want to assume that they are not valid pointer values on any system whatsoever.
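A three-valued error out-parameter of the kind discussed here might look like the following sketch. Everything in it is a hypothetical model: exception_set stands in for the interpreter's error indicator, toy_find() for the lookup helper, and an empty name is used as an arbitrary trigger for the "raised" case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { LOOKUP_OK = 0, LOOKUP_FAILED = 1, LOOKUP_RAISED = -1 };

static int exception_set = 0;   /* toy stand-in for the error indicator */

typedef struct {
    int ready;              /* 0 models a type without an MRO */
    const char *names[4];   /* NULL-terminated list of known names */
} toy_type;

/* Returns the matching name or NULL. *error tells the caller why:
   LOOKUP_OK     - the search ran (NULL then means "not found"),
   LOOKUP_FAILED - the search could not run, no exception set,
   LOOKUP_RAISED - the search failed and an exception is set. */
static const char *toy_find(const toy_type *type, const char *name,
                            int *error)
{
    size_t i;
    assert(type != NULL && name != NULL && error != NULL);
    if (!type->ready) {
        *error = LOOKUP_FAILED;
        return NULL;
    }
    if (name[0] == '\0') {          /* arbitrary "raising" condition */
        exception_set = 1;
        *error = LOOKUP_RAISED;
        return NULL;
    }
    *error = LOOKUP_OK;
    for (i = 0; type->names[i] != NULL; i++) {
        if (strcmp(type->names[i], name) == 0) {
            return type->names[i];
        }
    }
    return NULL;
}

/* A caller that keeps the old "never raises" contract: only the
   LOOKUP_RAISED case needs the error indicator cleared. */
static const char *toy_find_quiet(const toy_type *type, const char *name)
{
    int error;
    const char *res = toy_find(type, name, &error);
    if (error == LOOKUP_RAISED) {
        exception_set = 0;          /* analogue of PyErr_Clear() */
    }
    return res;
}
```

Compared with a sentinel such as (PyObject *)(-1), this keeps the return value a plain "pointer or NULL" and makes no assumption about which addresses are invalid on a given platform.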

scoder · 2017-09-10T20:12:57Z

@serhiy-storchaka: thank you for your excellent feedback.
I applied the changes and couldn't see a performance degradation from them.

serhiy-storchaka · 2017-09-10T20:21:23Z

Objects/typeobject.c

+/* Internal API to look for a name through the MRO, bypassing the method cache.
+   This returns a borrowed reference, and might set an exception! */
+static PyObject *
+_PyType_LookupUncached(PyTypeObject *type, PyObject *name, int *error)


IMHO this rather adds confusion.

serhiy-storchaka · 2017-09-10T20:26:16Z

Objects/typeobject.c

-               in a context that propagates the exception out.
-            */
-            PyErr_Clear();
+                PyType_Ready(type) < 0) {


Either move PyType_Ready() to the previous line (if the resulting line is not too long), or indent it as in the current code and move { to a new line for readability (new PEP 7 rule).

I cleaned it up a little.

serhiy-storchaka · 2017-09-10T20:29:26Z

Objects/typeobject.c

+           in a context that propagates the exception out.
+        */
+        if (error == -1)
+            PyErr_Clear();


PyErr_Clear() is always called after _PyType_LookupUncached() if error == -1. Why not call it inside _PyType_LookupUncached()?

The question is which interface is better: should the function itself always swallow any exceptions, or should the callers decide how to deal with them? The comments in the original _PyType_Lookup() implementation suggest that others have already been as unhappy as me about the interface of that function, so why keep making people unhappy?

serhiy-storchaka · 2017-09-10T20:31:08Z

Objects/typeobject.c

+           the same type will call it again -- hopefully
+           in a context that propagates the exception out.
+        */
+        if (error == -1)


Add braces.

scoder · 2017-09-13T12:40:04Z

The XMLParser.__init__() code in _elementtree.c does this:

    self->handle_start = PyObject_GetAttrString(target, "start");
    self->handle_data = PyObject_GetAttrString(target, "data");
    self->handle_end = PyObject_GetAttrString(target, "end");
    self->handle_comment = PyObject_GetAttrString(target, "comment");
    self->handle_pi = PyObject_GetAttrString(target, "pi");
    self->handle_close = PyObject_GetAttrString(target, "close");
    self->handle_doctype = PyObject_GetAttrString(target, "doctype");
    PyErr_Clear();

In the failing test, it should find close but not pi. Thus, it looks up close with a live AttributeError set from the pi lookup. Since _PyDict_GetItem_KnownHash() may or may not set an exception, we have to check for a live exception after calling it, and that check finds the old exception from the previous attribute lookup and decides that its own lookup failed.

The correct place to fix this is obviously _elementtree.c (please do), but if one module has this bug, it's unlikely to be the only one. What do you think? To me, it would feel wrong to allow this misbehaviour inside of CPython by explicitly handling it somehow.
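The failure mode described above can be reproduced in a small model. This is a sketch with hypothetical names (probe, find_in_chain, get_attr, get_optional_attr) and a global flag standing in for the interpreter's error indicator. It is not the _elementtree.c or typeobject.c code, and get_optional_attr() is only one possible shape for a caller-side fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *const *name_list;   /* NULL-terminated name array */

static int exception_set = 0;   /* toy stand-in for the error indicator */

/* Probe one namespace. A plain miss leaves the indicator untouched. */
static const char *probe(name_list names, const char *name)
{
    for (; *names != NULL; names++) {
        if (strcmp(*names, name) == 0) {
            return *names;
        }
    }
    return NULL;
}

/* Walk a chain of namespaces. After each miss it checks the indicator
   to tell "not here" from "lookup raised", so a stale exception from
   an earlier call is mistaken for a failure of this lookup. */
static const char *find_in_chain(const name_list *chain, size_t n,
                                 const char *name)
{
    size_t i;
    for (i = 0; i < n; i++) {
        const char *res = probe(chain[i], name);
        if (res != NULL) {
            return res;
        }
        if (exception_set) {
            return NULL;
        }
    }
    return NULL;
}

/* Attribute lookup: a miss sets the "AttributeError". */
static const char *get_attr(const name_list *chain, size_t n,
                            const char *name)
{
    const char *res = find_in_chain(chain, n, name);
    if (res == NULL) {
        exception_set = 1;
    }
    return res;
}

/* Caller-side discipline: never enter with a live exception, and clear
   the error of a failed optional lookup before the next one. */
static const char *get_optional_attr(const name_list *chain, size_t n,
                                     const char *name)
{
    const char *res;
    assert(!exception_set);
    res = get_attr(chain, n, name);
    if (res == NULL) {
        exception_set = 0;
    }
    return res;
}
```

With a chain where close lives in the second namespace, calling get_attr() for pi and then for close returns NULL for both, while the same sequence through get_optional_attr() finds close.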

scoder · 2017-09-13T12:54:02Z

That being said, this can obviously be made to work again by ignoring all exceptions also in the new helper function, just as _PyType_Lookup() previously did. Sounds wrong to do that, but it keeps things working that worked before.

scoder · 2017-09-13T13:05:53Z

I pushed a fix that eats lookup exceptions, exactly like _PyType_Lookup() previously did.

serhiy-storchaka · 2017-09-13T13:48:52Z

Misc/NEWS.d/next/Core and Builtins/2017-09-13-12-04-23.bpo-31336.gi2ahY.rst

@@ -0,0 +1,2 @@
+Speed up class creation by reducing overhead in the necessary special method


Add speed up estimation: 10-20%

serhiy-storchaka · 2017-09-13T13:50:27Z

What was the cause of this failure? Could this failure be fixed by fixing ElementTree?

scoder · 2017-09-13T14:31:19Z

See my comment above. Yes, ET should definitely be fixed. I pushed #3545.

scoder · 2017-09-13T18:06:02Z

Feel free to revert 4efde8e if you think that fixing cElementTree is enough here.

scoder · 2017-09-14T08:19:49Z

I've reverted that change and added assert()s instead. Fixing the behaviour of _PyType_Lookup() is a separate issue (I've created a ticket in the tracker).

scoder · 2017-09-14T13:16:30Z

Note that both AppVeyor and Travis will be happy again once #3545 is merged.

I don't recall why I was opposed to the change, but it changed a lot in the meanwhile :-)

vstinner · 2017-09-15T12:47:27Z

Objects/typeobject.c

@@ -2994,7 +3010,7 @@ _PyType_Lookup(PyTypeObject *type, PyObject *name)
        assert(PyType_Check(base));
        dict = ((PyTypeObject *)base)->tp_dict;
        assert(dict && PyDict_Check(dict));
-        res = PyDict_GetItem(dict, name);
+        res = _PyDict_GetItem_KnownHash(dict, name, hash);


I don't quite see the difference between PyDict_GetItem() and _PyDict_GetItem_KnownHash() in terms of performance. Maybe the difference is that PyDict_GetItem() calls PyErr_Fetch/PyErr_Restore. In that case, PyDict_GetItemWithError() would have the same speed, no?

vstinner · 2017-09-15T12:48:40Z

Objects/typeobject.c

+            if (error == -1) {
+                /* It is unlikely by not impossible that there has been an exception
+                   during lookup. Since this function originally expected no errors,
+                   we ignore them here in order to keep up the interface. */


I don't think that "Since this function originally expected no errors" matters here. Why not report the exception to the caller and let the caller handle it?

I'm working on a patch that cleans up much of the mess around _PyType_Lookup(). See https://bugs.python.org/issue31465

vstinner · 2017-09-15T14:55:58Z

"KnownHash is extremely short in comparison and probably gets inlined and streamlined with LTO. Substantially less branching." oh nice, good to know!

scoder · 2017-09-16T15:37:28Z

Objects/typeobject.c

        if (mro == NULL) {
+            *error = 1;
            return NULL;


Since this is the only error case left that does not raise an exception, I actually wonder whether it is really an error case. If there is no MRO in a readied type, isn't that just fine from the point of view of a lookup?

scoder · 2017-09-18T07:28:40Z

From my POV, this is ready for merging.
All further changes regarding the general exception handling in _PyType_Lookup() are tracked by bpo-31465 and #3616.

vstinner

LGTM. I just proposed a minor coding style change.

The change enhances error handling, nice.

I don't understand everything, but I trust Python test suite to make sure that the change doesn't break anything :-)

vstinner · 2017-09-26T13:02:59Z

Objects/typeobject.c

        if (bases == NULL)
-            goto error;
+            return NULL;


PEP 7 nitpick: when you modify code, it's better to add { ... } to if blocks.

Speed up type creation, which is highly dominated by slow dict lookups.

0a865c9

the-knights-who-say-ni added the CLA signed label Sep 4, 2017

bedevere-bot added the awaiting review label Sep 4, 2017

serhiy-storchaka requested changes Sep 4, 2017

bedevere-bot added awaiting changes and removed awaiting review labels Sep 4, 2017

Stefan Behnel added 2 commits September 4, 2017 15:14

Update comment after changing the function call that it refers to.

c4779e3

Ignore any errors (however unlikely) that may happen during the mro c…

ff2ea63

…hained name lookups in _PyType_Lookup().

bedevere-bot added awaiting change review and removed awaiting changes labels Sep 4, 2017

vstinner requested changes Sep 4, 2017

bedevere-bot added awaiting changes and removed awaiting change review labels Sep 4, 2017

Stefan Behnel added 2 commits September 5, 2017 19:07

Extract non-caching code from _PyType_Lookup() to use it directly fro…

76b1986

…m update_one_slot().

Avoid some useless overhead for the no-basetype case. If "object" eve…

6f98485

…r ceases to be a valid base type, there'll probably be larger code sections to change than this one.

pitrou reviewed Sep 5, 2017

Stefan Behnel added 2 commits September 6, 2017 07:19

Add comment.

d9b726e

Avoid uselessly searching empty bases for a metaclass. This is quite …

be15255

…common in Python 3 now. Also make the step from "return NULL" error handling to "goto error" reference cleanup explicit.

serhiy-storchaka reviewed Sep 10, 2017

Avoid unsafe handling of borrowed "mro" reference during hash() call.

0736226

Distinguish between "error" and "error with exception" cases in _PyType_LookupUncached(). Fix a reference leak of "mro" on lookup errors. Resolves issues found by Serhiy Storchaka.

serhiy-storchaka reviewed Sep 10, 2017

Stefan Behnel added 3 commits September 10, 2017 22:44

Clean up code and formatting a little.

8bc783f

Add braces for code style reasons.

66648dd

Give internal helper function a local name that does not resemble (ex…

85fb3ae

…isting) API names. Suggested by Serhiy Storchaka.

Change nice interface of "find_name_in_mro()" to evil interface eatin…

4efde8e

…g lookup exceptions, just like "_PyType_Lookup()" did previously.

scoder force-pushed the _fast_pytype_lookup branch from c52e47a to 4efde8e Compare September 13, 2017 13:23

serhiy-storchaka reviewed Sep 13, 2017

Mention amount of speedup in News entry.

f5bce2a

scoder mentioned this pull request Sep 13, 2017

bpo-31455: avoid calling "PyObject_GetAttrString()" with a live exception set #3545

Merged

Stefan Behnel added 2 commits September 14, 2017 10:12

Revert "Change nice interface of "find_name_in_mro()" to evil interfa…

02bfef0

…ce eating lookup exceptions, just like "_PyType_Lookup()" did previously." This reverts commit 4efde8e.

Guard against external live exceptions when calling find_name_in_mro(…

2497858

…) from functions that swallow live exceptions.

serhiy-storchaka approved these changes Sep 15, 2017

bedevere-bot added awaiting merge and removed awaiting changes labels Sep 15, 2017

serhiy-storchaka closed this Sep 15, 2017

serhiy-storchaka reopened this Sep 15, 2017

vstinner reviewed Sep 15, 2017

scoder mentioned this pull request Sep 16, 2017

gh-75646: allow _PyType_Lookup() to raise exceptions #3616

Open

scoder commented Sep 16, 2017

vstinner reviewed Sep 26, 2017

vstinner approved these changes Sep 26, 2017

serhiy-storchaka merged commit 2102c78 into python:master Oct 1, 2017

Mariatta removed the awaiting merge label Oct 8, 2017
