This is a spin-off issue from the conversation in #180 so we don't lose track of it and also don't distract the discussion in that PR.

Original suggestion from @knaaptime:

Categoricals are important, for example, to interpolate rasters (e.g., land use), and having the functionality out in the wild would help it get tested. It would be useful to see whether this can provide a boost to the existing functionality we have for vectorizing rasters.

And response from @darribas:

It's slightly different. We could think of a way of vectorizing pixels and doing a spatial dissolve with Dask. I don't know if that'd be faster (it'd be at least parallel/out-of-core), but it's definitely different code (though a similar philosophy), so I'd be tempted to leave that for a different PR and perhaps create an issue to record this option in case we have the bandwidth (or need) to explore it in the future.
In the case suggested above, a strategy to use Dask would be (a rough sketch of the code follows the list):

1. Read in the raster with rioxarray
2. Extract pixel centroids with to_pandas (there might be a way to go directly into a dask.DataFrame)
3. Turn that into a dask_geopandas.GeoDataFrame
4. Build pixels as vectors with buffer(xxx, cap_style=3)
5. Dissolve the vector pixels by value
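A minimal, untested sketch of what those five steps could look like, assuming a single-band categorical raster; the file name, chunk size, number of partitions, and the "value" column name are placeholders, and to_dataframe is used in place of to_pandas to get a long table of pixels:

```python
# Hypothetical sketch only: "land_use.tif", the chunk size, and npartitions
# are placeholder assumptions, not part of the original discussion.
import dask_geopandas
import geopandas as gpd
import rioxarray

# 1. Read the raster with rioxarray (chunks keep the xarray side lazy)
da = rioxarray.open_rasterio("land_use.tif", chunks={"x": 2048, "y": 2048})
da = da.squeeze("band", drop=True)

# 2. Flatten pixels into a long (x, y, value) table; this goes through
#    pandas in memory, although a direct route into dask.dataframe may exist
df = da.to_dataframe(name="value").reset_index()

# 3. Turn the pixel centroids into a dask_geopandas.GeoDataFrame
gdf = gpd.GeoDataFrame(
    df[["value"]],
    geometry=gpd.points_from_xy(df["x"], df["y"]),
    crs=da.rio.crs,
)
pixels = dask_geopandas.from_geopandas(gdf, npartitions=16)

# 4. Square buffers (cap_style=3) of half the pixel resolution turn each
#    centroid back into its pixel footprint
half_res = abs(da.rio.resolution()[0]) / 2
pixels = pixels.set_geometry(pixels.geometry.buffer(half_res, cap_style=3))

# 5. Dissolve the vector pixels by raster value (still lazy at this point)
dissolved = pixels.dissolve(by="value")
```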
Once we enter a Dask data structure, all computations are lazy and only run, in parallel, when .compute() is called, which is where the scalability comes from. But I'm not sure whether that will make it faster than rasterio's vectorisation, which I imagine relies on GEOS? It might, since the dissolve should be fast: all the polygons to dissolve are four-point squares. Worth a shot for sure.
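For reference, a hedged sketch of the rasterio baseline this would be compared against (rasterio.features.shapes polygonises contiguous regions of equal value), plus the .compute() call that actually materialises the Dask result from the sketch above; the file name is the same placeholder:

```python
import geopandas as gpd
import rasterio
from rasterio import features
from shapely.geometry import shape

# Baseline: vectorise the raster directly with rasterio's shapes()
with rasterio.open("land_use.tif") as src:
    band = src.read(1)
    crs = src.crs
    records = [
        {"value": value, "geometry": shape(geom)}
        for geom, value in features.shapes(band, transform=src.transform)
    ]
baseline = gpd.GeoDataFrame(records, geometry="geometry", crs=crs)

# The Dask pipeline above is lazy; this is where the work actually happens,
# so it is the call to time against the baseline
result = dissolved.compute()
```

Timing the two on a categorical raster such as land use would be the quickest way to settle the question.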