Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid unnecessary container allocations #105

Open
lemire opened this issue Sep 12, 2017 · 6 comments
Open

Avoid unnecessary container allocations #105

lemire opened this issue Sep 12, 2017 · 6 comments

Comments

@lemire
Copy link
Member

lemire commented Sep 12, 2017

In many instances, especially when computing intersections, we generate empty containers that are immediately garbage collected.

There are clever tricks we could use to start scanning the input containers, advance up to the point where there might be values to be intersected, and bail out without generating an empty container if the intersection is empty.

@lemire
Copy link
Member Author

lemire commented Mar 30, 2018

Example:

c1 := rb.highlowcontainer.getWritableContainerAtIndex(pos1)
c2 := x2.highlowcontainer.getContainerAtIndex(pos2)
diff := c1.iand(c2)
if diff.getCardinality() > 0 {
	rb.highlowcontainer.replaceKeyAndContainerAtIndex(intersectionsize, s1, diff, false)
	intersectionsize++
}

@fjorgemota
Copy link

hmmm When I read this issue I was thinking about reusing containers, which would may help here too.

I mean, when roaring decides to automatically optimize a container (like on iAddReturnMinimized method, for example), it discards the old container without any chance to reuse it in the future, so on a dynamic environment, if someone calls Remove on that same bitmap the recently created container is, again, downgraded to the old container, which is allocated again without reusing the old container and increasing GC pressure.

Personally, I think reusing a container could help here too. Yeah, the algorithm you suggested could probably be better for the intersection case, but in cases where adding/removing integers to the bitmap causes a bunch of containers to be allocated and deallocated it could be interesting to reuse containers directly in the library, internally.

So what I suggest is: arrayContainerPool for arrayContainer, bitmapContainerPool for bitmapContainer and, of course, runContainerPool for runContainers. And then, instead of just returning the new container and discarding the old container, we just clear() the old container, put in the pool and, when allocating the container, just get the correspondent container of the pool. So it's a more generic option to improve performance while avoiding container allocations when adding and removing elements from the bitmap.

What do you think?

@lemire
Copy link
Member Author

lemire commented Jan 11, 2019

@fjorgemota Yes, this is sensible, but one would like to see actual performance gains in realistic scenarios.

@fjorgemota
Copy link

Oh, for sure. Only benchmarks to test these sort of things. And for sure no-allocation (like in the algorithm for intersections) is always better than one allocation here and there using pools. :)

@damnever
Copy link
Contributor

I have observed high CPU usage caused by the memory allocations when using Bitmap.ReadFrom and bitmap.Or/Clone frequently. By implementing memory reuse, we should be able to effectively resolve the majority of this problem.

@damnever
Copy link
Contributor

I have submitted a pull request to resolve the issue with ReadFrom #395

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants