
Python: memory leak in protobuf.reflection.ParseMessage() #156

Closed
wbenfold opened this issue Jan 7, 2015 · 5 comments

wbenfold commented Jan 7, 2015

Copied from https://code.google.com/p/protobuf/issues/detail?id=661 as the GitHub bug tracker seems to be more active...

What steps will reproduce the problem?

import google.protobuf.descriptor_pb2
import google.protobuf.reflection

descriptor = google.protobuf.descriptor_pb2.DescriptorProto.DESCRIPTOR

# Repeatedly parse an empty message; memory usage should stay constant.
while True:
    google.protobuf.reflection.ParseMessage(descriptor, '')

What is the expected output? What do you see instead?

Expected: infinite loop with constant memory usage
Observed: infinite loop with rapidly growing memory usage

What version of the product are you using? On what operating system?

protobuf 2.5.0
python 2.7.3
Scientific Linux 6.5 (x86_64)

Please provide any additional information below.

The code above is a minimal standalone example (I don't really want to construct an endless series of DescriptorProto objects). In reality I'm calling ParseMessage with a descriptor which was read from the input stream, but the outcome is the same: memory usage grows linearly with the number of messages parsed.

I can't find anything in the documentation to suggest that ParseMessage requires the user to take special action to release resources.
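For anyone reproducing this, the growth can be quantified from within the process itself. Below is a minimal sketch, assuming Linux (where ru_maxrss is reported in kilobytes; on macOS it is in bytes), with a hypothetical leaky_step() standing in for the ParseMessage call:

```python
import resource

def rss_kb():
    # Peak resident set size of this process (kB on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

leaked = []

def leaky_step():
    # Hypothetical stand-in for ParseMessage(descriptor, ''):
    # retains a small allocation on every call, like the reported leak.
    leaked.append(bytearray(1024))

before = rss_kb()
for _ in range(10000):
    leaky_step()
after = rss_kb()

# With a genuine leak, peak RSS climbs steadily with the iteration count;
# with correct behaviour it should level off after the first few iterations.
```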


daid commented Feb 11, 2015

Are you sure it's not just that the Python garbage collector hasn't run? You might want to call gc.collect() in your loop to verify.

wbenfold (Author) commented

Afraid not: I tried calling gc.collect() every 10000 iterations and it makes no difference. Shouldn't the GC be running automatically anyway?

I also tried protobuf 2.6.1; it makes no difference either.
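For reference, the periodic-collection check I ran looks roughly like this (a sketch; step() is a hypothetical stand-in for the ParseMessage call, and the object counts come from gc.get_objects):

```python
import gc

def step():
    # Hypothetical stand-in for reflection.ParseMessage(descriptor, '').
    # Returns an unreferenced object, so nothing should accumulate.
    return object()

counts = []
for i in range(30000):
    step()
    if i % 10000 == 0:
        gc.collect()                       # force a full collection
        counts.append(len(gc.get_objects()))

# If the growth were merely deferred garbage collection, the tracked-object
# count would stay roughly flat after each collect; in the reported leak
# it keeps climbing regardless.
```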


daid commented Feb 20, 2015

It should, but in some conditions it doesn't get enough time and will let memory use grow instead of freeing it, to reduce runtime, as long as there is more than enough memory available from the OS. (I've seen this happen especially with Java applications.)

xfxyjwf (Contributor) commented Mar 10, 2015

Will your code example cause a memory exception eventually?

wbenfold (Author) commented

I believe it will run out of memory, but I expect the consequences will be platform-dependent. On my machine I think it will run until the OOM killer gives it a SIGKILL. If the platform limits per-process memory allocation or virtual address space (e.g. when running under a 32-bit OS), then you'll probably see the Python interpreter raise an exception.

(My actual application does get killed due to running out of memory. I believe this cut-down example will eventually do the same, but I think it's dealing with smaller messages so will take a very long time to get there. I haven't yet had the patience to sit it out.)

Note that as more memory is allocated, the process slows down; I think this is a combination of (a) hitting swap once physical memory is exhausted and (b) the Python heap manager having to do more work to track a very large number of objects.
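The second effect can be seen in isolation by timing a full collection as the number of tracked container objects grows. A rough sketch (exact timings vary by machine, so treat the comparison as indicative only):

```python
import gc
import time

def time_collect():
    # Wall-clock duration of one full garbage collection pass.
    t0 = time.perf_counter()
    gc.collect()
    return time.perf_counter() - t0

baseline = time_collect()

# Retain a large number of container objects the collector must traverse.
retained = [[i] for i in range(200000)]

loaded = time_collect()
# loaded is typically noticeably larger than baseline, since each full
# collection now walks hundreds of thousands of tracked containers.
```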
