-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Deques to adopt the standard clearing procedure for mutable objects #69322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The clear method for deques is possibly reentrant. Use the safer technique used in listobject.c, setobject.c, and dictobject.c. |
I left comments on Rietveld. Perhaps you missed Rietveld messages in the Spam folder. |
How does newblock() mutate the deque? Does PyMem_Malloc() do anything other than call malloc()? We hold the GIL so there doesn't seem to be a need to use PyMem_RawMalloc(). |
"Does PyMem_Malloc() do anything other than call malloc()? We hold the GIL so there doesn't seem to be a need to use PyMem_RawMalloc()." Well, the difference between PyMem_Malloc() and PyMem_RawMalloc() is subtle. Right now, it should only impact debug tools hooking on the memory allocators. But one day I would like to try to modify PyMem_Malloc() to reuse PyObject_Malloc(). I don't see why PyMem_Malloc() shouldn't be optimized for small applications. But it's hard to benchmark such change, and it depends on the platform. I already had to use different config for memory allocators depending on the platform :-/ The implementation of tracemalloc uses a different overallocation factor on Windows and on other platforms. |
Here is a patch that doesn't need newblock(). |
Serhiy contends that calling newblock() can mutate the deque (its only external call is PyMem_Malloc()). I'm trying understand how that could be possible if it is just a thin wrapper around malloc. Before gumming-up the code in an effort to avoid calling newblock(), I want to understand whether there is an actual issue here and if so whether PyMem_RawMalloc() would solve the problem directly. |
I perhaps were overcautious. There is a difference between the case of deque and the case of lru_cache and OrderedDict. The latter creates Python object (PyDict_New) that can trigger garbage collecting, and the former calls only PyMem_Malloc. The latter can cause a crash in common use, and the former only with special memory allocation hooks (Victor, perhaps we should disable GC for executing hooks). But my version of the patch solves not only this issue. It eliminates the use of possibly re-entrant alternate method. |
Victor, I was thinking of switching to PyMem_RawMalloc instead of PyMem_Malloc. The advantages are that there are guaranteed to be no side-effects, the GIL doesn't need to be held, and there is less overhead. Are there any disadvantage I should know about? Serhiy, I'm still considering your patch. I have a strong aversion to putting a big block on the stack and don't like having to do a full block memcpy every time a deque is cleared on deallocation. If you don't see an actual bug in patch 2, I'm inclined to use that approach (especially since the popping fallback approach will likely never be called in practice). |
"Victor, I was thinking of switching to PyMem_RawMalloc instead of PyMem_Malloc. The advantages are that there are guaranteed to be no side-effects, the GIL doesn't need to be held, and there is less overhead. Are there any disadvantage I should know about?" For me, PyMem_Malloc() can be faster than PyMem_RawMalloc() because the GIL is held. In practice... both function are very thin wrapper to malloc(), so it's exactly the same :-) In Objects/obmalloc.c, you have: #define PYRAW_FUNCS _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree
...
#define PYMEM_FUNCS PYRAW_FUNCS So PyMem_Malloc() and PyMem_RawMalloc() are *currently* exactly the same. I'm not sure that you understand what you are trying to do. If you expect better performances, you should try to use PyObject_Malloc() instead, to benefit from the fast Python allocator for small allocations (512 bytes and less). |
Here is a patch that copies only used part of the block. I don't see an actual bug in patch 2 besides that it fallbacks to current popping behavior. |
The goal is to avoid possible re-entrancy both now and in the future. Since PyMem_RawMalloc is thread-safe and doesn't require the GIL to be held, it is guaranteed that there won't be a callback into regular Python that could mutate any of the data structures. |
The only way to be certain you're never going to face re-entrancy issues in the future is to call malloc() directly - and hope nobody redefines that too with some goofy macro ;-) In the meantime, stick to PyMem_Malloc(). That's the intended way for code holding the GIL to do lowest-level memory allocation. Uses of PyMem_RawMalloc() should be extremely rare, typically only in contexts where Python internals _know_ the GIL isn't being held, and can't reasonably try to acquire the GIL. It's already being used in a few contexts where it probably shouldn't be, and each such use needlessly complicates future changes. If PyMem_Malloc() does grow re-entrancy issues in the future, deques will be the least of our problems. I'd strongly oppose it before then. It's _intended_ to be as low level as possible (it was created to begin with just to worm around cross-platform insanity when called with a 0 argument). |
Thanks Timmy! |
New changeset 005590a4a005 by Raymond Hettinger in branch '3.5': |
New changeset fc6d62db8d42 by Raymond Hettinger in branch '2.7': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: