Describe the bug
clip_polygon() calls rasterize() without passing chunks, so the mask is always built as a full numpy array regardless of the input raster's backend. For a dask-backed input, this materializes the entire mask in RAM before wrapping it back into a dask array.
The code at polygon_clip.py:205-211 sets like=raster but never extracts the chunk structure. The rasterize() dispatch (rasterize.py:2158-2178) only uses _run_dask_numpy when chunks is explicitly provided, so the call falls through to _run_numpy and allocates a dense array.
For a 30TB raster this means the mask alone would need hundreds of GB of RAM.
Expected behavior
When the input raster is dask-backed, clip_polygon() should pass the raster's chunk sizes to rasterize() so the mask stays lazy. The same should hold for dask+cupy inputs.
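A minimal sketch of how the chunk structure could be extracted before the rasterize() call. The helper name mask_chunks_from is hypothetical, and it assumes the raster is an xarray DataArray, whose .chunks property is None for non-dask data and a tuple of per-dimension block lengths otherwise. Whether rasterize() accepts chunks in exactly this form is also an assumption.

```python
def mask_chunks_from(raster):
    """Return the chunk layout of the trailing (y, x) dims, or None.

    Hypothetical helper: assumes `raster.chunks` follows the xarray
    convention (None when the data is not dask-backed, otherwise a
    tuple of per-dimension block-length tuples).
    """
    chunks = getattr(raster, "chunks", None)
    if chunks is None:
        # numpy / cupy input: keep the eager code path
        return None
    # Drop any leading band/time dimension; the mask is 2D (y, x).
    return tuple(chunks[-2:])
```

clip_polygon() could then forward the result, for example rasterize(..., like=raster, chunks=mask_chunks_from(raster)), so that a None value keeps the current eager path and a chunk layout selects the lazy one.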
Affected code
polygon_clip.py:205-211