# Read HRRR Forecast model data in cloud-friendly format (Zarr)
This notebook demonstrates how Pangeo tools can access HRRR data in Zarr format, perform computations in parallel, and visualize the results interactively.

The Zarr format data was obtained by converting HRRR best time series data from Unidata's Jetstream THREDDS server using this code:
```
import xarray as xr
url = 'http://thredds-jetstream.unidata.ucar.edu/thredds/dodsC/grib/NCEP/HRRR/CONUS_2p5km/Best'
ds = xr.open_dataset(url)

ds = ds[['Temperature_height_above_ground',
        'u-component_of_wind_height_above_ground',
        'v-component_of_wind_height_above_ground',
        'LambertConformal_Projection']]
        
ds = ds.chunk(chunks={'time':10})
ds.to_zarr('hrrr_zarr', consolidated=True)
```
The result was then stored on S3 at `s3://esip-pangeo-uswest2/pangeo/EPIC/hrrr_zarr`.

In [None]:
from dask.distributed import Client, progress
from dask_kubernetes import KubeCluster
import numpy as np
import xarray as xr
import fsspec
import metpy
import hvplot.xarray
import geoviews as gv

Create a small Dask cluster on Kubernetes with 5 workers

In [None]:
cluster = KubeCluster()
cluster.scale(5);
cluster

In [None]:
client = Client(cluster)

Open Zarr dataset from S3 (no data is actually loaded at this step)

In [None]:
# The store was written with consolidated metadata, so read it the same way
ds = xr.open_zarr(fsspec.get_mapper('s3://esip-pangeo-uswest2/pangeo/EPIC/hrrr_zarr'), consolidated=True)

Examine the dataset. It looks the same as it would if it had been read from a local NetCDF file.

In [None]:
ds

Use Unidata's MetPy package to parse units and projection information from the dataset's CF metadata

In [None]:
u  = ds.metpy.parse_cf('u-component_of_wind_height_above_ground')
v  = ds.metpy.parse_cf('v-component_of_wind_height_above_ground')

crs = u.metpy.cartopy_crs

From the many tile-source basemap options in GeoViews, choose OpenStreetMap (OSM)

In [None]:
base_map = gv.tile_sources.OSM

Derive wind speed (still no data loaded!)

In [None]:
windspeed = np.sqrt(u**2 + v**2)

In [None]:
windspeed
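
As a quick sanity check on the formula, the following sketch applies the same expression to small NumPy arrays. The sample values are made up for illustration and are not HRRR data; `np.hypot` is included only as an equivalent way to compute the same magnitude.

```python
import numpy as np

# Made-up wind components in m/s (illustrative values, not HRRR data)
u_sample = np.array([3.0, 0.0, -6.0])
v_sample = np.array([4.0, 5.0, 8.0])

# Same expression as used for the full dataset above
speed = np.sqrt(u_sample**2 + v_sample**2)

# np.hypot computes the same magnitude element-wise
speed_hypot = np.hypot(u_sample, v_sample)

print(speed)
print(np.allclose(speed, speed_hypot))
```

The same expression works unchanged on the lazy xarray objects because NumPy functions dispatch to the underlying Dask arrays.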

Visualize the wind speed. Data is read on demand from Zarr, and only for the frame being displayed:

In [None]:
mesh = windspeed.hvplot(x='x', y='y', rasterize=True, cmap='viridis', crs=crs, width=700)
base_map * mesh.opts(alpha=0.7)

How many GB of wind speed data are we going to crunch?

In [None]:
windspeed.nbytes/1e9
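
The `nbytes` attribute is the number of elements multiplied by the size of each element, so it can be estimated from the array shape and dtype alone, without reading any data. A minimal sketch, assuming a hypothetical `(time, y, x)` shape and a 4-byte float dtype; the real values are whatever `windspeed.shape` and `windspeed.dtype` report.

```python
import math
import numpy as np

# Hypothetical (time, y, x) shape and dtype, for illustration only
shape = (100, 1000, 1500)
dtype = np.dtype("float32")

# Total element count times bytes per element, converted to GB
n_elements = math.prod(shape)
size_gb = n_elements * dtype.itemsize / 1e9

print(size_gb)
```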

Find the maximum wind speed over the time dimension and persist the result on the workers in case we need it again. Unlike the plot above, this step reads the full wind data from Zarr:

In [None]:
wind_max = windspeed.max(dim='time').persist()
progress(wind_max)
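
A maximum over time parallelizes well because it can be computed block by block: each worker takes the maximum of its own block of time steps, and the partial results are then combined. The sketch below illustrates the idea with plain NumPy on random data, using blocks of 10 time steps to mirror the chunking chosen when the Zarr store was written. It is a simplified illustration of the principle, not a description of how Dask schedules the work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in for wind speed with dimensions (time, y, x)
data = rng.random((35, 4, 5))

chunk = 10  # time steps per block, mirroring chunk(chunks={'time': 10})

# Maximum over time within each block (the last block may be shorter)
partial = [data[i:i + chunk].max(axis=0) for i in range(0, data.shape[0], chunk)]

# Combine the per-block maxima into the final result
combined = np.max(np.stack(partial), axis=0)

# Check against the maximum computed over the whole time axis at once
print(np.array_equal(combined, data.max(axis=0)))
```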

Visualize the maximum wind speed.  The track of Humberto is evident!

In [None]:
mesh = wind_max.hvplot(x='x', y='y', rasterize=True, cmap='viridis', crs=crs, width=700)

(base_map * mesh.opts(alpha=0.7)).opts(active_tools=['wheel_zoom', 'pan'])