Improve performance of rio-merge #507
Comments
There are always going to be tradeoffs between the in-memory and on-disk approaches. It would be great to make this configurable so the user could decide based on their knowledge of the problem (available memory, file formats, disk speed, etc.). As a documentation task, we could develop a narrative explaining when each would be appropriate. The other issue is that, even if we window the src reads, we'd still have to store |
Here's an implementation of It's optimized for internally tiled RGBA rasters. The README explains how this differs from the current rio merge, why it was implemented as a separate plugin and the challenges to applying the same approach to rio merge. TLDR; the use of masked reads ( |
Just curious, any movement on the windowed reads into merge from https://github.com/mapbox/rasterio/blob/c2df12979a5e07f96f108b0be8329e79fe950532/rasterio/merge.py#L142-L146? We're currently attempting to use rasterio.merge within dask, but the input files are compressed TIFFs that can reach 7 GB in memory (each), whereas our output window is relatively small. |
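For the small-output-window case described here, the core step is working out which pixels of each source actually intersect the requested bounds, so that only those need to be read. A minimal sketch of that arithmetic, assuming a north-up raster with square pixels; `overlap_window` is a hypothetical helper written for illustration, not part of rasterio:

```python
import math


def overlap_window(src_bounds, res, out_bounds):
    """Return (row_off, col_off, height, width) of the source pixels that
    intersect out_bounds, or None if the extents do not overlap.

    Bounds are (left, bottom, right, top) in map units; res is the pixel
    size. Assumes a north-up raster with square pixels.
    """
    s_left, s_bottom, s_right, s_top = src_bounds
    o_left, o_bottom, o_right, o_top = out_bounds

    # Intersect the two extents in map coordinates.
    left, right = max(s_left, o_left), min(s_right, o_right)
    bottom, top = max(s_bottom, o_bottom), min(s_top, o_top)
    if left >= right or bottom >= top:
        return None

    # Convert to pixel offsets, snapping outward to whole pixels.
    col_off = math.floor((left - s_left) / res)
    row_off = math.floor((s_top - top) / res)
    col_end = math.ceil((right - s_left) / res)
    row_end = math.ceil((s_top - bottom) / res)
    return row_off, col_off, row_end - row_off, col_end - col_off
```

In rasterio itself, `rasterio.windows.from_bounds` does a comparable computation from a dataset's transform, and the resulting window can be passed to `read(window=...)` so that only the overlapping region is decoded.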
@wckoeppen no movement. I would have closed this issue if it were resolved or you would see other issues pointing here if there was any discussion. |
As it says at https://github.com/mapbox/rasterio/blob/master/rasterio/tools/merge.py#L123-L125, the current approach uses the maximum amount of memory to solve the problem. We could trade more I/O for reduced memory by operating on windows of the dataset.
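To make the trade concrete, here is a rough sketch of merging one strip of output rows at a time, so that only a strip-sized buffer is held in memory. It is not the rio merge implementation: in-memory NumPy arrays stand in for on-disk datasets, `merge_in_strips` and its arguments are hypothetical, every source is assumed to lie fully inside the output grid, and the "first source wins, later sources fill nodata" rule is an assumption about the desired behavior.

```python
import numpy as np


def merge_in_strips(sources, out_shape, strip_rows, nodata=0):
    """Yield (row_start, strip) pairs covering an output grid of out_shape.

    sources: list of (array, row_off, col_off), each a 2D array and its
    position in the output grid (a stand-in for real datasets).
    Earlier sources win; later ones only fill pixels still equal to nodata.
    """
    out_rows, out_cols = out_shape
    dtype = sources[0][0].dtype
    for r0 in range(0, out_rows, strip_rows):
        r1 = min(r0 + strip_rows, out_rows)
        strip = np.full((r1 - r0, out_cols), nodata, dtype=dtype)
        for arr, row_off, col_off in sources:
            # Rows of this source that fall inside the current strip.
            top = max(r0, row_off)
            bottom = min(r1, row_off + arr.shape[0])
            if top >= bottom:
                continue
            # With real files, only this slice would be read from disk.
            part = arr[top - row_off:bottom - row_off, :]
            dest = strip[top - r0:bottom - r0, col_off:col_off + arr.shape[1]]
            fill = dest == nodata
            dest[fill] = part[fill]  # dest is a view, so strip is updated
        yield r0, strip
```

Each source is touched once per strip it overlaps, which is the extra I/O paid for the smaller memory footprint; a larger `strip_rows` moves back toward the current all-in-memory behavior.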