-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make dict.setdefault() atomic #57730
Comments
A dialog on python-dev suggests that dict.setdefault() was intended to be atomic and that a number of experienced developers have deployed code relying on its atomicity: http://mail.python.org/pipermail/python-3000/2007-July/008693.html The actual implementation shows a second call to PyDict_Setitem() which can call PyObject_Hash() which can call arbitrary Python code. http://hg.python.org/cpython/file/2.7/Objects/dictobject.c#l1937 This fix is straight-forward, use the results of the initial lookup to insert the default object. This will make the operation atomic and it will make it faster. |
+1. |
Atomic and faster... +1. |
+1 for atomic and more robust |
maniram: We try to keep bug reports focused on one thing, and we don’t generally collect votes (we prefer that people vote with patches, tests and messages with content—for example, saying exactly what “more robust” should be). Here I think Antoine and Jesús said +1 to agree that this should be changed even in stable versions. Raymond: I think I could make the patch, but I’m not sure about testing the new behavior. |
Eric, overload "__hash__()" and check that is only called once, while now would be called twice. |
I have written a patch and a test, but since it's changing C code, I am far from being sure if it's achieve the expected behavior in the right way. There are also tests and running whole test suite didn't bring any errors. |
Also: I'll be happy to work further on this patch, if I get some comments and advice. |
If _PyDict_SetItemUsingHash is module-private, it should be declared "static". Also, better if it follows the usual naming of static functions inside that C file (i.e. "dict_some_lowercase_name"). |
Done. |
The patch looks ok to me. I'll let Raymond make the final call. |
Thanks for the idea Jesús, even though I didn’t get the change to use it :) Filip: Is there a reasons for using a nonlocal count rather than e.g. self.count? Otherwise the test looks good. |
Eric, please, could we stop such pointless nitpicking about one's |
I haven't given it much thought, when I was making the choice of using nonlocal rather than self.count. I was rather excited to see, if the change will work as I wanted it to. If you believe it would be better to use an attribute, I'll be happy to change it. |
Well, if this is going to be applied to 2.7, "nonlocal" is invalid syntax there. I guess that being able to use the same test in 2.7 and 3.2/3.3 would be nice. Style, but justified. |
Ah, good point. |
I'll send a patch, when I get home from work. |
Let's just start with a working 2.7 patch and go from there. |
I have tried to port patch to python2.7, but apparently I must be doing something wrong, because while most tests pass, several fail (including tests for unittest itself). I would like to continue working on this test, but incoming week is pretty intensive for me. Is it ok if I returned to working on this during the weekend? |
Of course. There's no rush. |
Thanks, I'll look at it over the next few days. |
Patch for 2.7. |
Bump! There was no activity here for two weeks. Is my patch for 2.7 ok or should I do something more about it? |
Looking at the patch again, I think this isn't enough. |
I understand you are talking about this call: mp->ma_lookup(mp, key, hash); I haven't noticed that earlier. I'll try to provide a better fix (for 2.7 first, after we agree it's good enough, I will provide one for 3.3). Do you have any idea, how I might this part? I mean how to check, that this is called only once? |
Checking that __eq__ is called only once should be a good proxy. |
Thanks, I'll try that. Like before I will probably find time next weekend. |
No more double lookup. |
Your patch doesn't check hashed2.eq_count. A nit: be careful not to use tabs in C files. |
Fixed both issues. |
New changeset 5c52e7c6d868 by Antoine Pitrou in branch '2.7': |
New changeset 90572ccda12c by Antoine Pitrou in branch '3.2': New changeset 3dfa98cf2e26 by Antoine Pitrou in branch 'default': |
Thank you Filip! I've now committed the patch. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: