Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zonal statistics on Vector and NetCDF data #175

Closed
rylanlee opened this issue Nov 5, 2018 · 2 comments
Closed

zonal statistics on Vector and NetCDF data #175

rylanlee opened this issue Nov 5, 2018 · 2 comments

Comments

@rylanlee
Copy link

rylanlee commented Nov 5, 2018

Since zonal statistics on Vector and \Raster data is possible as Integrating with GeoPandas and Numpy. Is zonal statistics on Vector and NetCDF data possible? Moreover, is calculation in each zone (each region in vector data) possible based ?

For example, I have 1980-2010 globla temperature, How can I get the 1980-2010 mean temperature series in each vectors? Can anyone help me figure it out?

@perrygeo
Copy link
Owner

perrygeo commented Nov 6, 2018

rasterstats is designed around summarizing a single raster band by multiple vector shapes. It sounds like what you want to do is summarize multiple raster bands (temperature for each year) by multiple shapes?

There is an open issue about multiband support: see #73

For now, you can iterate over each band and combine the data structure with a little post processing.
The following example demonstrates the approach I would take:

from rasterstats import zonal_stats
import json
import rasterio

multiband_raster = "rgba.tif"
vect = "shapes.geojson"

# Use rasterio to determine the band_indexes
with rasterio.open(multiband_raster) as src:
    band_indexes = src.indexes

all_band_summaries = []

for b in band_indexes:
    band_summary = zonal_stats(
        vect, multiband_raster, prefix=f"band{b}_", band=b, stats="min")  # note the prefix
    all_band_summaries.append(band_summary)

# all_band_summaries is now a 2D list of lists
# first dimension: bands
# second dimension: shapes
# We can flip the dimensions using zip
shape_summaries = list(zip(*all_band_summaries))
# then aggregate into a single list (dimension: shapes) containing bands
final = [{k: v for d in s for k, v in d.items()}
         for s in shape_summaries]

print(json.dumps(final, indent=2))

Which will print a list of dictionaries, one for each vector shape, containing the minimum value for each band:

[
  {
    "band1_min": 214.0,
    "band2_min": 220.0,
    "band3_min": 255.0,
    "band4_min": 255.0
  },
  {
    "band1_min": 133.0,
    "band2_min": 155.0,
    "band3_min": 171.0,
    "band4_min": 255.0
  },
  {
    "band1_min": 65.0,
    "band2_min": 69.0,
    "band3_min": 80.0,
    "band4_min": 255.0
  }
]

Something like this could be adapted to NetCDF files and annual temperature grids as bands, and of course all the vector formats as described in the docs.

Please give this a shot and report back if you have any questions.

@perrygeo
Copy link
Owner

Closing issue due to inactivity. The above code is the recommended approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants