Memory leak in Python in version 3.15.0+ AND in 4.21.1 #10088

Closed
Atheuz opened this issue May 31, 2022 · 9 comments

@Atheuz

Atheuz commented May 31, 2022

What version of protobuf and what language are you using?
Version: v3.14.0 / v3.15.0 / v4.21.1
Language: Python

What operating system (Linux, Windows, ...) and version?
Ubuntu Linux

What runtime / compiler are you using (e.g., Python version or gcc version)?
Python v3.9.5
Built the protobuf schema with protoc 3.14.0, and also with protoc 3.21.1.

What did you do?
Steps to reproduce the behavior:

  1. Go to https://github.com/Atheuz/proto_leak and clone it with: git clone https://github.com/Atheuz/proto_leak.git.
  2. Go to the repository with cd proto_leak.
  3. Create a virtual environment with virtualenv .venv --python=3.9 and activate it with source .venv/bin/activate.
  4. Run: pip install -r requirements.txt. (installs protobuf==3.14.0)
  5. Run python leak.py (a rough sketch of what this script does follows these steps).
  6. Observe:
     Memory consumed at the beginning 14.50390625
     Memory consumed after parsed 735.421875
     Memory consumed after deallocating 16.15625
  7. Run: pip install protobuf==3.15.0
  8. Run python leak.py
  9. Observe:
     Memory consumed at the beginning 17.1953125
     Memory consumed after parsed 636.6875
     Memory consumed after deallocating 536.734375
  10. Run: pip install protobuf==4.21.1
  11. Replace from schema_pb2 import value_test_topic with from schema_4_21_1_pb2 import value_test_topic on line 4 in leak.py
  12. Run python leak.py
  13. Observe:
      Memory consumed at the beginning 14.54296875
      Memory consumed after parsed 1118.91015625
      Memory consumed after deallocating 1002.75390625
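The full repro lives in the linked repository; below is only a rough sketch of the kind of measurement a script like leak.py performs. The message type comes from the generated schema_pb2 module referenced above, while psutil, the placeholder payload, and the message count are assumptions for illustration.

import gc
import os

import psutil  # assumed here for reading the process RSS
from schema_pb2 import value_test_topic  # generated from the repo's schema


def rss_mib():
    # Resident set size of the current process, in MiB.
    return psutil.Process(os.getpid()).memory_info().rss / (1024 * 1024)


def main():
    print("Memory consumed at the beginning", rss_mib())

    # Parse many messages and keep them all alive in a list.
    payload = value_test_topic().SerializeToString()  # placeholder payload
    parsed = []
    for _ in range(1_000_000):  # count chosen only for illustration
        msg = value_test_topic()
        msg.ParseFromString(payload)
        parsed.append(msg)
    print("Memory consumed after parsed", rss_mib())

    # Drop every reference and collect; ideally RSS returns near the baseline.
    del parsed
    gc.collect()
    print("Memory consumed after deallocating", rss_mib())


if __name__ == "__main__":
    main()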

What did you expect to see?
In 3.15.0, I expected it to free basically all memory used for the list of pb items.
In 4.21.1, I expected it to free basically all memory used for the list of pb items.

What did you see instead?
In 3.15.0, it retains 500+ MB of memory that is not retained in 3.14.0.
In 4.21.1, it retains 1000+ MB of memory that is not retained in 3.14.0.

This was previously reported in #9917, which was closed as to be fixed in 4.21.1, but it doesn't seem like it actually was.

@haberman
Member

Thanks for the minimal repro. I've spent some time trying to debug this, but it's tricky because the relationship between malloc()/free() calls and RSS is not straightforward.

I would still like to get to a root cause and fix here. But it will probably take some time.

@lurk26

lurk26 commented Jun 9, 2023

Hi, is there any update on this? We are also seeing this problem and are in a bit of a bind: our system cannot scale to any reasonable capacity because memory leaks rapidly under high-volume use, causing crashes and restarts.

@Atheuz
Author

Atheuz commented Jun 10, 2023

Hi, is there any update on this? We are also seeing this problem and are in a bit of a bind: our system cannot scale to any reasonable capacity because memory leaks rapidly under high-volume use, causing crashes and restarts.

@lurk26:

You could use the Python implementation by setting the environment variable:
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

It might be too slow for your system, but it doesn't have this memory issue.
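A minimal sketch of that workaround, for anyone who wants to confirm which backend is active (the variable must be set before google.protobuf is first imported; api_implementation.Type() is the same check that shows up in the outputs later in this thread):

import os

# Must be set before google.protobuf is imported anywhere in the process.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

from google.protobuf.internal import api_implementation

print(api_implementation.Type())  # prints "python" when the pure-Python backend is active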

@haberman
Member

After much debugging, I've gotten to a root cause for this.

The problem is that the glibc free() function doesn't actually return memory to the OS. It returns it to the heap, for possible use by another malloc() call, but it doesn't give it back to the OS with munmap() or madvise(MADV_DONTNEED).

The pymalloc allocator, on the other hand, returns memory to the OS as quickly as possible, using munmap().

When you use the pure-Python protobuf library, all of the proto objects are allocated using pymalloc, so when they are collected pymalloc immediately returns the memory to the OS. But when you use an extension module, the proto objects are allocated at the C layer using malloc() and free(). Even though the extension module is properly freeing all of the memory it allocates, the RSS does not go down because of glibc's behavior.

There are two possible solutions to this problem. I have verified that both of these solutions successfully make the RSS go down in the test case attached to this issue:

  1. Make the native extension allocate message data using the Python memory allocation functions PyMem_Malloc(), PyMem_Realloc(), and PyMem_Free(). This will make the extension module use the same allocator as regular Python objects, causing the memory to be returned to the OS eagerly.
  2. Add explicit calls to the glibc function malloc_trim() so that glibc will return memory to the OS.

There are pros and cons to both of these. It's not immediately clear which one is better. (1) has the benefit of being more portable, since it only relies on Python APIs that will always be available, whereas (2) introduces a conditional dependency on a glibc-specific function. On the other hand, the pymalloc functions for (1) require that the GIL is held, and it may be difficult to guarantee that if we are sharing messages between Python and another language.

For example, if you want to call a C or C++ API that takes a proto, you may want to pass the Python proto to C or C++ without a copy. At that point it may be difficult to ensure that C or C++ will not hold a reference that is freed at an unpredictable time. But maybe this is not a concern, as it could be unsafe anyway for C or C++ to hold a reference, since Python message objects are mutable and this could introduce data races.

There is also the question of performance of pymalloc vs glibc malloc. It is worth benchmarking which one of these is faster in a microbenchmark, just to see if there is a significant difference.
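As a rough illustration of option (2) from the caller's side, freed heap memory can also be handed back to the OS from Python via ctypes. This is only a sketch that assumes Linux with glibc, not anything protobuf ships:

import ctypes
import ctypes.util

# Locate the C library; malloc_trim() only exists on glibc-based systems.
libc_name = ctypes.util.find_library("c")
if libc_name is not None:
    libc = ctypes.CDLL(libc_name)
    if hasattr(libc, "malloc_trim"):
        # Ask glibc to release free heap pages back to the kernel.
        libc.malloc_trim(0)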

@dkull

dkull commented Aug 16, 2023

Also got hit by this: the latest 3.x versions didn't have this leak, but the 4.x versions all seem to leak memory.
Setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python plugged the leak, but it will struggle under load.

@zhangskz zhangskz added the 24.x label Aug 29, 2023
@zhangskz zhangskz assigned ericsalo and unassigned anandolee Aug 30, 2023
copybara-service bot pushed a commit that referenced this issue Sep 6, 2023
…en memory is freed

This partially fixes #10088.  The test case from that bug significantly improves with this change.  However we still have a global map that does not shrink, which can still create the appearance of leaking memory, as it will not be freed until the module is unloaded.

PiperOrigin-RevId: 562976233
@haberman
Member

haberman commented Sep 6, 2023

#13865 uses malloc_trim() to address most of the issue. With that PR, the output of the repro is:

Memory consumed at the beginning 16.5
Memory consumed after parsed 1099.99609375
Memory consumed after deallocating 119.0078125

This isn't perfect (i.e., it doesn't get back below 20 MB) because the global object cache used internally by upb is still using memory, which will not be released until process shutdown. We could address that separately in a follow-up change. But #13865 should address the majority of the issue here.

@tvalentyn

tvalentyn commented Sep 9, 2023

@ericsalo @haberman were the fixes here patched into 24.x? I am still seeing a leak in protobuf==4.24.3, but prior experiments on https://github.com/protocolbuffers/upb/suites/15656441565/artifacts/893743418 from https://github.com/protocolbuffers/upb/actions/runs/6028136812 showed the leak was fixed.

@chleech

chleech commented Sep 11, 2023

I'm observing the same behavior as #10088 (comment).

@haberman
Member

I just ran tests for both of the memory leak repros, and the results indicate that both of the reported fixes are present in 4.24.3.

For the repro attached to this bug, I get:

$ /tmp/venv/bin/pip install protobuf==4.24.2
$ /tmp/venv/bin/python leak.py 
Memory consumed at the beginning 16.5
Memory consumed after parsed 1101.09765625
Memory consumed after deallocating 989.21875
$ /tmp/venv/bin/pip install protobuf==4.24.3
$ /tmp/venv/bin/python leak.py 
Memory consumed at the beginning 16.5
Memory consumed after parsed 1102.61328125
Memory consumed after deallocating 121.6875

For the repro attached to https://github.com/protocolbuffers/upb/issues/1243, I get:

$ /tmp/venv/bin/pip install protobuf==4.24.2
$ /tmp/venv/bin/python leak2.py 
RssAnon:           85248 kB
RssAnon:          163584 kB
pip package 'protobuf' version: 4.24.2
api_implementation.Type(): upb
$ /tmp/venv/bin/pip install protobuf==4.24.3
$ /tmp/venv/bin/python leak2.py
RssAnon:            6912 kB
RssAnon:            6912 kB
pip package 'protobuf' version: 4.24.3
api_implementation.Type(): upb

If you are seeing memory leaks with 4.24.3, it must be another issue. If you have a repro, please open another issue and we will fix it.

dawidcha pushed a commit to dawidcha/protobuf that referenced this issue Sep 27, 2023
…en memory is freed

This partially fixes protocolbuffers#10088.  The test case from that bug significantly improves with this change.  However we still have a global map that does not shrink, which can still create the appearance of leaking memory, as it will not be freed until the module is unloaded.

PiperOrigin-RevId: 563124724