-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use PyDict_GetItemWithError() instead of PyDict_GetItem() #79640
Comments
There is an issue with using PyDict_GetItem(). Since it silences all exceptions, it can return incorrect result when an exception like MemoryError or KeyboardInterrupt was raised in the user __hash__() and __eq__(). In addition PyDict_GetItemString() and _PyDict_GetItemId() swallow a MemoryError raised when fail to allocate a temporary string object. In addition, PyDict_GetItemWithError() is a tiny bit faster than PyDict_GetItem(), because it avoids checking the exception state in successful case. The proposed PR replaces most calls of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() with calls of PyDict_GetItemWithError(), _PyDict_GetItemStringWithError() and _PyDict_GetItemIdWithError(). |
My previous attempt: bpo-20615. |
Opened bpo-35461 for documenting flaws of PyDict_GetItem(). |
Most of changes are straightforward. Just replaced PyDict_GetItem*() with PyDict_GetItem*WithError() and added the check for PyErr_Occurred(). PyDict_GetItemString() with constant argument was replaced with _PyDict_GetItemIdWithError() for performance. Some code was left unchanged. This was mostly in files where errors are very and error checking is not performed or errors are silenced in any case (Python/compile.c, Python/symtable.c, Objects/structseq.c, etc). These cases needed separate issues. The most non-trivial change is in Objects/typeobject.c. The check for duplicated descriptors (in add_methods(), add_members() and add_getset()) was moved after creating the descriptor object. This improves performance by avoiding to create a temporary string objects. Duplicate descriptor names is a very uncommon case -- there were only two cases in the stdlib (besides tests), and one of them already is fixed (PR 11053). |
Eric requested to run the benchmark suite. Here are results. I do not know how to interpret them. Likely all differences are random. |
Thanks, Serhiy. While the benchmark suite is our best tool available for measuring performance, I'm not sure what slowdown is significant in those results. @victor, any thoughts? |
It seems like the change introduced a regression: bpo-36110. |
While the test has been fixed, IMHO we need to better document this subtle behavior change since it can be surprising. The test failed because PyObject_GetAttr() no longer ignores exceptions when getting the attribute from the type dictionary. It's a significant change compared to Python 3.7. Right now, the change has not even a NEWS entry, whereas it's a backward incompatible change! Don't get my wrong: the change is correct, we don't want to ignore arbitrary exceptions, it's just a matter of documentation. |
Serhiy, can this issue be closed? |
There are few occurrences of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() in cases where they are unlikely failed. These cases will be considered in separate issues. |
I stumbled upon this issue while reading Python 3.8 and this made me curious. I've tried writing some Python code to reproduce this bug, but I'm unable to -- I should be missing something. Is there a simple snippet showing the issue? Also, the changelog states that "The CPython interpreter can swallow exceptions in some circumstances". Are there other documented cases of those circumstances? |
Any pointer would also be welcome, and a piece of code showing the bug would help me a lot. Thank you. |
Note that this is a bug from long ago. Why this bug had lived long is it can not happen in regular cases. So it is difficult to reproduce. See PR 11112. _csv module is changed to use PyDict_GetItemWithError. Python 3.7.6 (default, Dec 30 2019, 19:38:28)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class S(str):
... def __hash__(self):
... raise MemoryError
...
>>> import _csv
>>> _csv.Dialect(S("excel"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
_csv.Error: unknown dialect You can see the MemoryError is suppressed. Let's try it on Python 3.8. $ python3
Python 3.8.1 (default, Jan 6 2020, 16:02:33)
(snip)
>>> _csv.Dialect(S("excel"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in __hash__
MemoryError You can see the MemoryError is not suppressed. |
Thank you very much! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: