<img src="https://xarray.dev/dataset-diagram-logo.png"
     align="right"
     width="30%"/>

# Geospatial Large

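This notebook works with the NOAA National Water Model (NWM) retrospective archive, published through the Registry of Open Data on AWS: https://registry.opendata.aws/nwm-archive/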

## Load NWM data

In [None]:
import xarray as xr

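# Open the retrospective Zarr store lazily; no array data is read yet
ds = xr.open_zarr(
    "s3://noaa-nwm-retrospective-2-1-zarr-pds/rtout.zarr",
    consolidated=True,  # read the store's metadata from its consolidated copy
).drop_encoding()  # discard on-disk encoding so later rechunking and writing start clean
ds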

## Set up cluster

In [None]:
import coiled

cluster = coiled.Cluster(
    n_workers=100,
    region="us-east-1",  # AWS region in which the workers are launched
)
client = cluster.get_client()  # Dask client connected to the cluster

## Compute average over space

In [None]:
# Select the first three months of 2001 for a single variable
subset = ds.zwattablrt.sel(time=slice("2001-01-01", "2001-03-31"))
subset

In [None]:
# Average over both spatial dimensions, leaving one value per time step
avg = subset.mean(dim=["x", "y"]).compute()
avg.plot()
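
For intuition, the reduction above collapses the two spatial axes and keeps time. Here is a minimal NumPy sketch on a tiny made-up array; the shape, the values, and the names `toy` and `toy_avg` are illustrative assumptions, not NWM data. It uses `np.nanmean` because xarray's `mean` skips NaN values by default for floating-point data.

```python
import numpy as np

# Hypothetical toy cube with axes (time, y, x)
toy = np.arange(24, dtype=float).reshape(2, 3, 4)
toy[0, 0, 0] = np.nan  # pretend one grid cell is missing

# Collapse both spatial axes, ignoring NaNs; one value per time step remains
toy_avg = np.nanmean(toy, axis=(1, 2))
print(toy_avg)
```

The result has the same length as the time axis, which is why `avg` can be plotted directly as a time series.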

## Rechunk

In [None]:
# -1 keeps all of x in a single chunk; "auto" lets Dask pick chunk sizes for time and y
result = subset.chunk({"time": "auto", "x": -1, "y": "auto"})
result
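
One way to see what keeping `x` in a single chunk buys is to count how many chunks a read along `x` has to touch. The sketch below is plain Python arithmetic with hypothetical sizes (`nx = 1000` and a tile width of 250 are assumptions for illustration, not the real NWM grid or its on-disk chunking), and `chunks_along_x` is a helper invented for this example.

```python
from math import ceil


def chunks_along_x(nx, chunk_x):
    """Number of chunks crossed when reading one full line along x."""
    return ceil(nx / chunk_x)


# Hypothetical sizes, chosen only for illustration
nx = 1000
tiled = chunks_along_x(nx, chunk_x=250)   # x split into tiles
x_whole = chunks_along_x(nx, chunk_x=nx)  # x kept whole, like {"x": -1}
print(tiled, x_whole)
```

With `x` kept whole, a full-width read along `x` touches a single chunk per `(time, y)` block, at the cost of smaller chunks along the other dimensions.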

In [None]:
%%time

# mode="w" overwrites any existing store at this path
result.to_zarr("s3://oss-scratch-space/nwm-x-optimized.zarr", mode="w")

## Clean up (optional)

Shutting down the cluster explicitly is optional; it will eventually be cleaned up automatically.

In [None]:
cluster.shutdown()