Debug memory allocators: remove useless "serialno" field to reduce memory footprint #80792
When the PYTHONMALLOC=debug environment variable or the -X dev command line option is used, Python installs debug hooks on memory allocators which add 2 size_t before and 2 size_t after each memory block: this adds 32 bytes to every memory allocation (4 size_t of 8 bytes each on a 64-bit platform).

I have been debugging crashes and memory leaks in CPython for 10 years, and I have never had to use "serialno". So I propose the attached pull request to remove it and reduce the memory footprint: I measured a reduction of around 5% (ex: 1.2 MiB out of 33.0 MiB when running test_asyncio). A smaller memory footprint makes it possible to use this feature on devices with little memory, like embedded devices.

The change also fixes a race condition in the debug memory allocators: bpo-31473, "Debug hooks on memory allocators are not thread safe (serialno variable)".

Using tracemalloc, it is already possible (since Python 3.6) to find where a memory block was allocated, and so to decide where to put a breakpoint when debugging.

If someone cares about the "serialno" field, maybe we can keep the code behind a compilation flag, like a C #define.

"serialno" is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out." But again, I never used it...

--

Some examples of the *peak* memory usage without => with the change:
Command used to measure the memory consumption:

$ ./python -i -X tracemalloc -c pass
>>> import tracemalloc; print("%.1f kB" % (tracemalloc.get_traced_memory()[1] / 1024.))

With the patch:

diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c
index c5d5671032..e010c2ef84 100644
--- a/Modules/_tracemalloc.c
+++ b/Modules/_tracemalloc.c
@@ -582,6 +582,8 @@ tracemalloc_add_trace(unsigned int domain, uintptr_t ptr,
     _Py_hashtable_entry_t* entry;
     int res;
+    size += 4 * sizeof(size_t);
+
     assert(_Py_tracemalloc_config.tracing);
     traceback = traceback_new();

Replace 4 with 3 to measure memory used with the change.

--

Since Python 3.6, when the debug memory allocator detects a bug (ex: buffer overflow), it now also displays the Python traceback where the memory block was allocated, if tracemalloc is tracing Python memory allocations.

Example with buffer_overflow.py:

import _testcapi
def func():
    _testcapi.pymem_buffer_overflow()

def main():
    func()

if __name__ == "__main__":
    main()

Output:

$ ./python -X tracemalloc=10 -X dev bug.py
Debug memory block at address p=0x7f45e85c3270: API 'm'
Memory block allocated at (most recent call first):
Fatal Python error: bad trailing pad byte
Current thread 0x00007f45f5660740 (most recent call first):

The interesting part is "Memory block allocated at (most recent call first):". Traceback reconstructed manually:

You can see exactly where the memory block has been allocated.

Note: Internally, the _PyTraceMalloc_GetTraceback() function is used to get the traceback where a memory block has been allocated.

--

Extract of _PyMem_DebugRawAlloc() in Objects/obmalloc.c:

/* Let S = sizeof(size_t). The debug malloc asks for 4*S extra bytes and p[0: S]
/* Layout: [SSSS IFFF CCCC...CCCC FFFF NNNN]
The last size_t written at the end of each memory block is "serialno". It is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out."
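
As a rough sketch of the arithmetic behind this proposal (an illustration only, not CPython code): the debug hooks add 4 size_t-sized fields around each block, and removing "serialno" would leave 3. The helper name debug_overhead is hypothetical, and the size of size_t is read from ctypes for the platform running the script.

```python
import ctypes

# Size of the C size_t type on this platform (8 on typical 64-bit builds).
S = ctypes.sizeof(ctypes.c_size_t)


def debug_overhead(words):
    """Extra bytes added per allocation by `words` size_t-sized debug fields.

    Hypothetical helper for illustration; it is not part of CPython.
    """
    return words * S


with_serialno = debug_overhead(4)     # current layout: 2 size_t before + 2 after
without_serialno = debug_overhead(3)  # proposed layout: "serialno" removed

print("with serialno:   ", with_serialno, "bytes per block")
print("without serialno:", without_serialno, "bytes per block")
print("saved:           ", with_serialno - without_serialno, "bytes per block")
```

The saving is one size_t per allocation, so the relative gain depends on how many small blocks a workload allocates.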
While testing my changes, I found a bug in test_sys:

./python -X tracemalloc -m test test_sys -v -m test_getallocatedblocks
======================================================================
Traceback (most recent call last):
  File "/home/vstinner/prog/python/master/Lib/test/test_sys.py", line 770, in test_getallocatedblocks
    alloc_name = _testcapi.pymem_getallocatorsname()
RuntimeError: cannot get allocators name

Attached PR 12797 fixes it.
I never used the serialno either.
This issue is related to a python-dev thread that discusses disabling Py_TRACE_REFS by default (bpo-36465) to reduce the memory footprint in debug mode.
The serialno was added at the same time as the debug hooks on Python memory allocators, by Tim Peters in 2002, 17 years ago: commit ddea208
We decided to only disable the code by default, but the code stays until we are sure that nobody uses it.