<img width="50" src="https://carbonplan-assets.s3.amazonaws.com/monogram/dark-small.png" style="margin-left:0px;margin-top:20px"/>

# gridMET to Zarr

_by Joe Hamman (CarbonPlan), June 29, 2020_

This notebook converts the raw gridMET dataset to Zarr format.

**Inputs:**

- intake catalog: `climate.gridmet_opendap`

**Outputs:**

- Cloud copy of gridMET, written as a Zarr store to `carbonplan-data/raw/gridmet/4km/raster.zarr` on Google Cloud Storage

**Notes:**

- No reprojection or processing of the data is done in this notebook.


In [None]:
import gcsfs
import intake
import xarray as xr
import zarr
from numcodecs.zlib import Zlib

fs = gcsfs.GCSFileSystem(
    project="carbonplan",
    token="/Users/jhamman/.config/gcloud/legacy_credentials/joe@carbonplan.org/adc.json",
)

In [None]:
years = list(range(1979, 2021))  # 1979 through 2020 (range end is exclusive)
variables = [
    "pr",
    "tmmn",
    "tmmx",
    "rmax",
    "rmin",
    "sph",
    "srad",
    "th",
    "vs",
    "bi",
    "fm100",
    "fm1000",
    "erc",
    "pdsi",
    "etr",
    "pet",
    "vpd",
]

In [None]:
source_pattern = (
    "https://www.northwestknowledge.net/metdata/data/{var}_{year}.nc"
)
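The template above has two placeholders, `{var}` and `{year}`. The notebook does not show how it is consumed, so the following is only a minimal sketch, assuming the template is filled once per (variable, year) pair with `str.format`. It does not check whether each resulting URL actually exists on the server.

```python
from itertools import product

# Same template, years, and variables as defined in the cells above.
source_pattern = "https://www.northwestknowledge.net/metdata/data/{var}_{year}.nc"
years = list(range(1979, 2021))
variables = [
    "pr", "tmmn", "tmmx", "rmax", "rmin", "sph", "srad", "th", "vs",
    "bi", "fm100", "fm1000", "erc", "pdsi", "etr", "pet", "vpd",
]

# Build one URL per (variable, year) pair; no network access happens here.
urls = [
    source_pattern.format(var=v, year=y) for v, y in product(variables, years)
]

print(len(urls))  # 17 variables x 42 years = 714
print(urls[0])    # https://www.northwestknowledge.net/metdata/data/pr_1979.nc
```

The count gives a sense of scale: the loop that follows opens one remote dataset per pair, so several hundred lazy datasets are created before anything is written.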

In [None]:
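# NOTE: `source` is not defined anywhere in this notebook. It is presumably the
# intake catalog entry `climate.gridmet_opendap` listed under Inputs, called with
# `variable` and `year` parameters; it must be loaded before this cell runs.
ds_list = []
for v in variables:
    print(v)
    # Open one lazy dataset per year and concatenate them along the "day" dimension.
    ds_list.append(
        xr.concat(
            [source(variable=v, year=y).to_dask() for y in years], dim="day"
        )
    )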

In [None]:
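# compat="override" skips equality checks on variables that appear in more than
# one dataset and keeps the copy from the first dataset.
ds = xr.merge(ds_list, compat="override")
# Explicitly take the `crs` variable from the first variable's dataset.
ds["crs"] = ds_list[0]["crs"]
ds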

In [None]:
ds.nbytes / 1e9  # uncompressed size of the merged dataset in GB

In [None]:
mapper = fs.get_mapper("carbonplan-data/raw/gridmet/4km/raster.zarr")

In [None]:
ds = ds.chunk({"day": 1000, "lat": 256, "lon": 256})
ds
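As a rough check on the chunk shape chosen above, the uncompressed size of one chunk can be worked out by hand. This is a sketch under two assumptions that the notebook does not state: that the data variables are stored as 4-byte floats, and that the record covers every day from 1979 through 2020.

```python
from datetime import date
from math import ceil, prod

# Chunk shape used in the cell above.
chunks = {"day": 1000, "lat": 256, "lon": 256}

# Assumption: 4 bytes per value (float32). Use 8 for float64.
itemsize = 4

# Uncompressed bytes in one full chunk.
chunk_bytes = prod(chunks.values()) * itemsize
print(chunk_bytes / 1e6)  # 262.144 (MB)

# Assumption: a complete daily record from 1979-01-01 through 2020-12-31.
n_days = (date(2021, 1, 1) - date(1979, 1, 1)).days
day_chunks = ceil(n_days / chunks["day"])
print(n_days, day_chunks)  # 15341 16
```

Each chunk is compressed before upload, so the stored objects should be smaller than this figure. Long chunks along `day` favor time-series reads at a point or small region over reads of a single day across the full grid.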

In [None]:
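# Apply zlib compression (level 4) to every variable; `ds.variables` includes
# coordinate variables as well as data variables.
encoding = {v: {"compressor": Zlib(4)} for v in ds.variables}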
encoding
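`Zlib(4)` asks for DEFLATE compression at level 4, a middle setting between speed and compression ratio. The sketch below illustrates the lossless round trip at that level using Python's standard-library `zlib` module as a stand-in for the numcodecs codec. The input values are hypothetical and are not gridMET data, so the ratio it prints says nothing about the ratio achieved on the real dataset.

```python
import array
import zlib

# Hypothetical stand-in for one chunk: a repetitive series of 4-byte floats.
values = array.array("f", [float(i % 50) for i in range(100_000)])
raw = values.tobytes()

# Compress at level 4 (the same level passed to Zlib in the cell above).
compressed = zlib.compress(raw, 4)

# Decompression restores the original bytes exactly.
restored = zlib.decompress(compressed)

ratio = len(raw) / len(compressed)
print(f"{len(raw)} -> {len(compressed)} bytes (ratio {ratio:.1f})")
print(restored == raw)
```

Higher levels trade more CPU time for smaller output. Since the write is parallelized over chunks, a moderate level keeps compression from dominating the upload time.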

In [None]:
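# With compute=False, to_zarr returns a dask Delayed object; nothing is
# transferred until it is computed in the next cell.
future = ds.to_zarr(mapper, mode="w", encoding=encoding, compute=False)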

In [None]:
from dask.diagnostics import ProgressBar

with ProgressBar():
    future.compute(scheduler="threading")