# Full conversion of ECHAM-6.3 / LandControl and LandOrbit to `zarr`

This notebook outlines how to convert the output of one model (ECHAM-6.3) for two experiments (LandControl and LandOrbit) to `zarr` for upload to `pangeo-data`.

Author: Charles Blackmon-Luca

# Getting started

Import the necessary packages:

In [1]:
import xarray as xr
import zarr

print(xr.__version__)
print(zarr.__version__)

0.11.1+64.g612d390
2.2.1.dev140


Start `dask` client:

In [2]:
from dask.distributed import Client

client = Client("tcp://127.0.0.1:44388")
client

Client
  Scheduler: tcp://127.0.0.1:44388
  Dashboard: http://127.0.0.1:8787/status
Cluster
  Workers: 4
  Cores: 16
  Memory: 135.44 GB


Define a compressor. Pangeo typically uses `zstd`:

In [3]:
# zstd at compression level 3; shuffle=2 selects Blosc's bit shuffle (Blosc.BITSHUFFLE)
compressor = zarr.Blosc(cname='zstd', clevel=3, shuffle=2)
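The `shuffle` argument asks Blosc to rearrange the data before compression so that similar bytes or bits sit next to each other. The following is only a rough illustration of that idea, not what Blosc does internally: it uses a byte shuffle (rather than the bit shuffle selected above) and the standard-library `zlib` as a stand-in for `zstd`. The helper names and the sample values are made up for this sketch.

```python
import struct
import zlib


def byte_shuffle(raw: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1 of every element, and so on."""
    return b"".join(raw[k::itemsize] for k in range(itemsize))


def byte_unshuffle(shuffled: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle by re-interleaving the byte planes."""
    n = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for k in range(itemsize):
        out[k::itemsize] = shuffled[k * n:(k + 1) * n]
    return bytes(out)


# Made-up, slowly varying float32 values standing in for a model field.
values = [280.0 + 0.01 * i for i in range(4096)]
raw = struct.pack("<%df" % len(values), *values)

shuffled = byte_shuffle(raw, itemsize=4)
assert byte_unshuffle(shuffled, itemsize=4) == raw  # the shuffle is lossless

# Compare compressed sizes with and without the shuffle (zlib, not zstd).
print("plain   :", len(zlib.compress(raw, 3)))
print("shuffled:", len(zlib.compress(shuffled, 3)))
```

Whether shuffling helps, and by how much, depends on the data, so it is worth comparing sizes on a real variable before settling on the settings.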

Load the monthly data for LandOrbit and LandControl and write it to `zarr`:

In [5]:
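for s in ['LandOrbit', 'LandControl']:
    # Open all monthly NetCDF files lazily as one dataset, chunked along time
    monthly = xr.open_mfdataset('/data2/tracmip/ECHAM-6.3/%s/Amon/*.nc' % s,
                                chunks={'time': 'auto'}, parallel=True)
    # Write a consolidated zarr store, compressing every data variable;
    # mode='w' overwrites any existing store at this path
    monthly.to_zarr('/data2/tracmip/tracmip/%s/ECHAM-6.3/monthly' % s,
                    encoding={var: {'compressor': compressor} for var in monthly.data_vars},
                    consolidated=True,
                    mode='w')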

Daily data:

In [6]:
for s in ['LandOrbit', 'LandControl']:
    daily = xr.open_mfdataset('/data2/tracmip/ECHAM-6.3/%s/Aday/*.nc' % s,
                              chunks={'time': 'auto'}, parallel=True)
    daily.to_zarr('/data2/tracmip/tracmip/%s/ECHAM-6.3/daily' % s,
                  encoding={var: {'compressor': compressor} for var in daily.data_vars},
                  consolidated=True,
                  mode='w')

3-hourly data:

In [8]:
for s in ['LandOrbit', 'LandControl']:
    three_hourly = xr.open_mfdataset('/data2/tracmip/ECHAM-6.3/%s/A3hr/*.nc' % s,
                                     chunks={'time': 'auto'}, parallel=True)
    three_hourly.to_zarr('/data2/tracmip/tracmip/%s/ECHAM-6.3/3hourly' % s,
                         encoding={var: {'compressor': compressor} for var in three_hourly.data_vars},
                         consolidated=True,
                         mode='w')

<xarray.backends.zarr.ZarrStore at 0x7fab818a9860>
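The three conversion cells above differ only in the source table directory (`Amon`, `Aday`, `A3hr`) and the name of the output store. As a minimal sketch, the source glob and target store for every experiment and frequency could be generated in one place. The names `SOURCE_ROOT`, `TARGET_ROOT`, `TABLES`, `EXPERIMENTS` and `conversion_jobs` are hypothetical and introduced only for this illustration; the paths are the ones used in the cells above.

```python
SOURCE_ROOT = "/data2/tracmip/ECHAM-6.3"
TARGET_ROOT = "/data2/tracmip/tracmip"

# Source table directory -> name of the zarr store written for it
TABLES = {"Amon": "monthly", "Aday": "daily", "A3hr": "3hourly"}
EXPERIMENTS = ["LandOrbit", "LandControl"]


def conversion_jobs(experiments=EXPERIMENTS, tables=TABLES):
    """Return (source_glob, target_store) pairs for every experiment and table."""
    jobs = []
    for exp in experiments:
        for table, name in tables.items():
            source_glob = "%s/%s/%s/*.nc" % (SOURCE_ROOT, exp, table)
            target_store = "%s/%s/ECHAM-6.3/%s" % (TARGET_ROOT, exp, name)
            jobs.append((source_glob, target_store))
    return jobs


for source_glob, target_store in conversion_jobs():
    print(source_glob, "->", target_store)
```

Each pair could then be passed to the same `xr.open_mfdataset(...)` and `to_zarr(...)` calls used above, which keeps the compressor and chunking settings identical across all six stores.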

Once the data has been written in `zarr` format, we can upload it to the Google Cloud bucket at `gs://pangeo-data/`.
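One possible way to do the upload is a recursive, parallel `gsutil` copy of each store directory. The sketch below only builds and prints the command; it does not run it. The helper `gsutil_copy_command` is hypothetical, and the destination prefix inside the bucket is a placeholder, since the final layout under `gs://pangeo-data/` is not specified here.

```python
import shlex


def gsutil_copy_command(local_store, destination):
    """Return the argument list for a recursive (-r), parallel (-m) gsutil copy."""
    return ["gsutil", "-m", "cp", "-r", local_store, destination]


# Placeholder destination: replace <destination-prefix> with the real layout.
cmd = gsutil_copy_command(
    "/data2/tracmip/tracmip/LandOrbit/ECHAM-6.3/monthly",
    "gs://pangeo-data/<destination-prefix>/",
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

Once the destination is settled, the list could be handed to `subprocess.run(cmd, check=True)`, assuming `gsutil` is installed and authenticated on the machine holding the stores.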