
Smarter area weight grid saves memory in grdfilter #7192

Merged
merged 1 commit into master on Jan 5, 2023

Conversation

PaulWessel (Member)

Instead of precomputing an area weight grid for all input nodes (and hence doubling the memory requirement of the grid), we now only do so for 3 nodes along each row (west, middle, east), since all columns away from the ends have the same weight along a row and the weights only vary with latitude. The west and east columns may differ depending on boundary conditions, periodicity (360), and gridline vs pixel registration. This also speeds up the precalculation itself, but mostly it saves memory for large grids to be filtered in grdfilter. The case that triggered this was a 16 GB in-memory grid that resulted in another 16 GB weight grid being allocated - that now drops to about 1 MB.

I have run all tests and the results look unchanged. Happy to have someone do a few filter operations to see if they notice anything, but the new logic looks solid to me. I had to add a silly API bool variable to bypass a boundary-condition (BC) check that would fail for this strangely shaped weight grid (say 3 columns and 43600 rows) - the flag simply skips that check.

Commit message: Instead of precomputing a weight grid for all input nodes, we only do so for 3 nodes along each row (west, middle, east) since all the columns away from the ends have the same weight along a row. It also speeds up the precalculation but mostly saves memory for large grids.
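To make the idea concrete, here is a minimal sketch (not the actual GMT code) of the 3-weights-per-row scheme described above: each row stores only a west, an interior, and an east area weight, and a lookup maps any column to one of the three. All names here (cell_area, build_row_weights, get_weight) are illustrative, and cell_area is only a toy stand-in for the real area computation.

```c
/* Sketch of the 3-weights-per-row table: n_rows x 3 doubles instead of a
 * full n_rows x n_columns weight grid.  Illustrative only, not GMT code. */

#include <math.h>
#include <stdlib.h>

enum { W_WEST = 0, W_INTERIOR = 1, W_EAST = 2 };

/* Toy stand-in for the real per-node area computation: interior nodes get a
 * latitude-dependent weight, the two edge columns get half of it, mimicking
 * gridline registration where the edge cells are half-width. */
static double cell_area(unsigned row, unsigned col, unsigned n_rows, unsigned n_columns) {
    double lat = -90.0 + 180.0 * (double)row / (double)(n_rows - 1);
    double w = cos(lat * acos(-1.0) / 180.0);           /* area ~ cos(latitude) */
    if (col == 0 || col == n_columns - 1) w *= 0.5;     /* edge columns differ */
    return w;
}

/* Build the compact table: only 3 weights per row are stored. */
double *build_row_weights(unsigned n_rows, unsigned n_columns) {
    double *w = malloc((size_t)3 * n_rows * sizeof(double));
    if (w == NULL) return NULL;
    for (unsigned row = 0; row < n_rows; row++) {
        w[3*row + W_WEST]     = cell_area(row, 0,             n_rows, n_columns);
        w[3*row + W_INTERIOR] = cell_area(row, 1,             n_rows, n_columns);
        w[3*row + W_EAST]     = cell_area(row, n_columns - 1, n_rows, n_columns);
    }
    return w;
}

/* Weight lookup for any (row, col): the edge columns hit the stored edge
 * values, every other column shares the interior value for that row. */
static inline double get_weight(const double *w, unsigned row, unsigned col, unsigned n_columns) {
    if (col == 0)             return w[3*row + W_WEST];
    if (col == n_columns - 1) return w[3*row + W_EAST];
    return w[3*row + W_INTERIOR];
}
```

For the grid in the description, this replaces a weight grid the same size as the data (the second 16 GB allocation) with 3 doubles per row, which is how the footprint drops to the roughly 1 MB mentioned above.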
PaulWessel added the maintenance label (Boring but important stuff for the core devs) on Jan 5, 2023
PaulWessel added this to the 6.5.0 milestone on Jan 5, 2023
PaulWessel self-assigned this on Jan 5, 2023
PaulWessel merged commit b463d15 into master on Jan 5, 2023
PaulWessel deleted the reduce-mem-grdfilter branch on January 5, 2023 13:40
joa-quim (Member) commented Jan 5, 2023

Another thought. Why does the second array (the 14 GB one) need to be allocated at all? Why not reuse the original array to store the filtered data? The filtered array is never larger, so we are safe on that side. If the grid was read from a disk file we are sure not to destroy anything. The problem only arises if the grid was transmitted via a memory location, i.e. via the wrappers. Then yes, a new array is needed, but not in the other case.

PaulWessel (Member, Author)

Sorry, no good. When we are at output node IJ we need all input nodes within the radius. If you placed the filtered result back into the input array at node IJ, then at IJ+1 we would suddenly be averaging a mix of original input values and a previously written output value. Only in very special cases will this not happen, and I think it must always happen at the beginning. Just think of that convolution circle slowly moving across the grid - it cannot be allowed to pick up already-filtered values.
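A tiny, self-contained illustration of the point (not GMT code): a 3-point moving average written to a separate output array versus the same average done in place. As soon as the window moves, the in-place version averages a value that has already been replaced by a filtered result.

```c
/* Why in-place filtering corrupts the result: 1-D 3-point moving average. */
#include <stdio.h>

int main(void) {
    double in_place[5] = {0, 0, 9, 0, 0};
    double separate[5] = {0, 0, 9, 0, 0};
    double out[5];

    /* Correct: read from the untouched input, write to a separate array. */
    for (int i = 1; i < 4; i++)
        out[i] = (separate[i-1] + separate[i] + separate[i+1]) / 3.0;

    /* Wrong: overwrite the input as we go; node i+1 then averages a value
     * that has already been replaced by a filtered result. */
    for (int i = 1; i < 4; i++)
        in_place[i] = (in_place[i-1] + in_place[i] + in_place[i+1]) / 3.0;

    printf("separate output: %.2f %.2f %.2f\n", out[1], out[2], out[3]);
    printf("in-place output: %.2f %.2f %.2f\n", in_place[1], in_place[2], in_place[3]);
    return 0;
}
```

The separate-output version prints 3.00 3.00 3.00, while the in-place version prints 3.00 4.00 1.33: nodes 2 and 3 have averaged in already-filtered neighbors, which is exactly the contamination described above.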

joa-quim (Member) commented Jan 5, 2023

Yes, that's right. I was already imagining a complex scheme where ... but it is too complex.
