dot product implementation for overlaps breaks xagg for high res grids #31

kerriegeil · 2022-03-15T03:19:01Z

I'm finding that the implementation of the dot product for computing weighted averages in core.py/aggregrate eats up way too much memory for high res grids. It's the wm.overlap_da that requires way too much memory. I am unable to allocate enough memory to make it through core.py/aggregate for many of the datasets I'm processing on an HPC system. I had no issue with the previous aggregate function before commit 4c5cc65. Looks like the dot product method is a lot cleaner in the code, but is there another benefit?

The text was updated successfully, but these errors were encountered:

ks905383 · 2022-03-15T16:45:12Z

The original hope was a performance boost by switching from the for-loop to the dot product implementation - though my own (limited) benchmarking so far has shown ambiguous results? @jsadler2 Do you have a take on that?

Depending on where we are with this, we could return behavior to what it was before, but with an option to use the dot-product setup manually for situations where it could provide the largest speed boost, or if there are fewer memory allocation issues.

jsadler2 · 2022-03-15T17:06:26Z

When I first implemented it, I wasn't thinking about using any kind of weights. I think the benefits would be for really large datasets and when you can use a dask cluster to do the dot product operation in a distributed way.

With the memory issue, I think it makes sense to rollback to the previous behavior for now. Maybe there is a way to implement the dot product approach that avoids the memory issues, but that can be explored offline.

ks905383 · 2022-04-01T18:35:49Z

#32 restores the default behavior, but keeps the dot product implementation as impl='dot_product' in pixel_overlaps() and aggregate(). Hopefully that covers all use cases for now.

Med-term it would be good to get some more detailed benchmarking under which circumstances one of the two approaches is 'better' (speed vs. memory), and decide on next steps from there (this fix is also a bit of a cludge with a few redundancies, so I can clean those up at that point as well)

ks905383 closed this as completed Apr 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dot product implementation for overlaps breaks xagg for high res grids #31

dot product implementation for overlaps breaks xagg for high res grids #31

kerriegeil commented Mar 15, 2022

ks905383 commented Mar 15, 2022

jsadler2 commented Mar 15, 2022 •

edited

ks905383 commented Apr 1, 2022 •

edited

dot product implementation for overlaps breaks xagg for high res grids #31

dot product implementation for overlaps breaks xagg for high res grids #31

Comments

kerriegeil commented Mar 15, 2022

ks905383 commented Mar 15, 2022

jsadler2 commented Mar 15, 2022 • edited

ks905383 commented Apr 1, 2022 • edited

jsadler2 commented Mar 15, 2022 •

edited

ks905383 commented Apr 1, 2022 •

edited