Reduce Peak Memory Consumption #16

Open
7 of 8 tasks
william-silversmith opened this issue Jun 10, 2019 · 5 comments
Labels
performance: Lower memory or faster computation.

Comments


william-silversmith commented Jun 10, 2019

This package is currently several times more memory intensive than SciPy. There are a few avenues for reducing memory consumption:

  • Strip out union-by-size from union-find. Not sure what the theoretical justification is (it's supposed to be faster!), but dropping it seems to be both faster and lower memory in practice. It's possible that union-by-size is more useful for arbitrary graphs than for the kind of graph structure implied by images.
  • Allow use of uint16_t or uint8_t for output images. It's pretty rare for typical images to use more than 65,535 labels. We would need a good way to estimate when this is okay, or allow users to specify uint16.
  • As in feat: support arbitrary labels #11, we could use std::unordered_map in union-find. For images that use the union-find array only sparsely this would yield large memory reductions, though for images that use it densely it would use more. It would also support labels larger than the maximum index of the image. However, it is slower than the array implementation, so we should let the user choose which implementation is right for them (see the sketch after this list). (Whenever I try this it's embarrassingly slow; this one would be too slow.)
  • Allocate the output array in the pyx file and pass it by reference to avoid a copy.
  • Is it possible to do this in-place? Might be restricted to data types uint16 or bigger. (No, you need to be able to check the original labels.)
  • Allow binary images as input and represent them as bit-packed using std::vector<bool>.
  • Limit memory used for binary images based on the maximum possible number of prospective labels.
  • Estimate the number of provisional labels before allocating.
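
A rough sketch (not the actual cc3d code) of how the first and third items could combine: union-find with path compression only (union-by-size stripped out) and a std::unordered_map parent store, so memory scales with the labels actually touched rather than the maximum label value. Class and method names are invented for illustration.

```cpp
// Hypothetical sketch, not cc3d's implementation: path compression only,
// no union-by-size, map-backed parent store. Swapping `parent` for a
// preallocated std::vector<uint32_t> gives the dense-array variant
// (faster per operation, but O(max label) memory regardless of how many
// labels the image actually uses).
#include <cstdint>
#include <unordered_map>

class SparseDisjointSet {
public:
  // find() with path compression; unseen labels become their own root.
  uint32_t find(uint32_t x) {
    if (parent.find(x) == parent.end()) {
      parent[x] = x;
      return x;
    }
    uint32_t root = x;
    while (parent[root] != root) {
      root = parent[root];
    }
    // Second pass: point every node on the path directly at the root.
    while (parent[x] != root) {
      uint32_t next = parent[x];
      parent[x] = root;
      x = next;
    }
    return root;
  }

  // unify() without union-by-size: the second root simply adopts the first.
  void unify(uint32_t x, uint32_t y) {
    uint32_t root_x = find(x);
    uint32_t root_y = find(y);
    if (root_x != root_y) {
      parent[root_y] = root_x;
    }
  }

private:
  std::unordered_map<uint32_t, uint32_t> parent;
};
```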
william-silversmith added the performance label Jun 10, 2019
william-silversmith added a commit that referenced this issue Jun 11, 2019
Instead, create the initial memory segment as a numpy array
and simply return it when finished.

Addressed #16
@william-silversmith

With regards to allowing binary images represented as std::vector<bool>: this is simple to do through the C++ interface, but may require copy-pasting a lot of code (ugly), and it's not so simple to do through the Cython interface. If there's demand for this, let me know, but it should be pretty easy to just change some declarations to const std::vector<bool> &in_labels in a few places if that's what you need.
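
To make that concrete, the change amounts to something like the following; the function and parameter names here are placeholders for illustration, not cc3d's actual headers.

```cpp
// Hypothetical declarations only; names do not match cc3d's real headers.
#include <cstdint>
#include <vector>

// Existing style: labels arrive as a flat array of some integer type.
uint32_t* connected_components(const uint32_t* in_labels, int sx, int sy, int sz);

// Binary variant: std::vector<bool> is specialized to store one bit per
// element, so callers can pass a bit-packed mask and cut input memory
// by roughly 8x (vs. uint8) to 32x (vs. uint32).
uint32_t* connected_components(const std::vector<bool> &in_labels, int sx, int sy, int sz);
```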

@william-silversmith

Previously, playing around with bit packing was sure to reduce performance and not really significantly reduce memory because of the output image and union-find. However, after the zeroth pass was implemented, bit packing can save 20% (for uint32) to 40% (for uint16) of peak memory. Might be worth looking into.

@william-silversmith

Looks like std::bitset and std::vector<bool> would each be annoying to work with in different respects. Will have to write my own bit accessor class that accepts a byte string with zero parsing.
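
Roughly what I have in mind, as a sketch; bit order is assumed MSB-first within each byte (np.packbits' default), and the class name is invented.

```cpp
// Hypothetical zero-parse bit accessor: wraps an existing byte buffer
// (e.g. the bytes behind an array packed with np.packbits) without
// copying or decoding it. Bits are read MSB-first within each byte,
// matching np.packbits' default bit order; flip the shift if not.
#include <cstddef>
#include <cstdint>

class BitView {
public:
  BitView(const uint8_t* data, size_t num_bits)
    : data_(data), num_bits_(num_bits) {}

  // Value of bit i; the caller is responsible for i < num_bits().
  bool operator[](size_t i) const {
    return (data_[i >> 3] >> (7 - (i & 7))) & 1;
  }

  size_t num_bits() const { return num_bits_; }

private:
  const uint8_t* data_;  // borrowed, not owned: zero copies, zero parsing
  size_t num_bits_;
};
```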

@william-silversmith

Might be able to do something with np.packbits.
