Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Leak using & operator #20

Closed
stephenengland opened this issue May 15, 2017 · 6 comments
Closed

Memory Leak using & operator #20

stephenengland opened this issue May 15, 2017 · 6 comments

Comments

@stephenengland
Copy link

stephenengland commented May 15, 2017

I'm using your library in an architecture where I am having to AND over many, many iterations. I've found that the library leaks memory over time.

I haven't dug deep into the code yet, but I did write an example script that can show the issue.

from pyroaring import BitMap
import sys

def main():
    loop_one_count = 100000
    loop_two_count = 300000
    for i in range(0, loop_one_count):
        bitmap_one = BitMap([0,1,2,3,4,12,13,14,15,16,17,187,123,235235,234])
        for j in range(0, loop_two_count):
            bitmap_two = BitMap([2,3,4,5,7,8,9,90])
            bitmap_three = bitmap_one & bitmap_two
        #sys.stdout.write("i:" + str(i) + " \r")
        #sys.stdout.flush()

if __name__ == '__main__':
    main()

This will eventually fill your memory, even if you do a gc.collect() inside the for loop.

@lemire
Copy link
Contributor

lemire commented May 16, 2017

@thealah

I did not write the code, but I have checked that the Python code does free its memory as I think it should. Then I launched your program and let it run for several minutes, monitoring the process' memory usage, and it was absolutely flat (and tiny), with no increase.

That does prove that there is no leak, of course.

@Ezibenroc
Copy link
Owner

Thank you for reporting the issue.
I can reproduce the huge memory consumption with your program (I stopped it when it reached 5GB), so indeed there is a memory leak somewhere.
@lemire are you sure that you are using the latest version of pyroaring?

I will investigate.

@Ezibenroc
Copy link
Owner

The issue comes from function create_from_ptr. I thought that a call to BitMap.__new__ would not call BitMap.__cinit__ (as in Python a call to __new__ does not call __init__), but I was wrong.

Fixed in db725fc

There remains a memory leak with this code:

def main():
    loop_one_count = 100000
    loop_two_count = 300000
    for i in range(0, loop_one_count):
        bitmap_one = BitMap()
        for j in range(0, loop_two_count):
            bitmap_two = BitMap()
            bitmap_three = BitMap.union(bitmap_one, bitmap_two, bitmap_two)

But it seems that here the issue comes from CRoaring (I made a pull request: RoaringBitmap/CRoaring#104).

I will create a new release and close this issue when the leak on CRoaring's side will also be fixed.

@stephenengland
Copy link
Author

Awesome, I really appreciate it! It looks like the CRoaring PR was merged and CRoaring releases fairly frequently. I was expecting I'd have to get my hands dirty- and I haven't touched C in years. :)

@Ezibenroc
Copy link
Owner

Yes, CRoaring PR merged! I made a new Pyroaring release.

@lemire
Copy link
Contributor

lemire commented May 16, 2017

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants