# Building Intake-ESM datastores of ACCESS model output

**Aims**: This tutorial demonstrates how users can build Intake-ESM datastores for their ACCESS model runs using the `access-nri-intake` Python package.

Builder classes for creating Intake-ESM datastores for different ACCESS model outputs are available in the `source.builders` submodule of the `access_nri_intake` Python package. There are currently Builders for outputs from ACCESS-OM2, ACCESS-CM2 and ACCESS-ESM1.5.

In this tutorial, we'll build an Intake-ESM datastore for an ACCESS-CM2 model run with output at:

`/g/data/p73/archive/non-CMIP/ACCESS-CM2/by578`.

Because we're looking at ACCESS-CM2 output, we'll use the `AccessCm2Builder` class.

In [None]:
import os

from access_nri_intake.source.builders import AccessCm2Builder

# Building the datastore

Building the Intake-ESM datastore should be as simple as passing the model run's base output directory to the Builder and calling `.build()`. The build is parallelized, so it will run faster if you give it more resources. The following was run using an XX-Large `normalbw` ARE instance (28 CPUs).
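Since build time depends on the resources available, it can be handy to check how many CPUs the current session actually has before starting. The following is a minimal sketch using only the standard library; the helper name `available_cpus` is hypothetical, and how the Builder chooses its number of workers is not covered here.

```python
import os


def available_cpus():
    """Return the number of CPUs this process may use (falls back to the machine total)."""
    if hasattr(os, "sched_getaffinity"):
        # Linux only: respects the CPU set allotted to the job or session
        return len(os.sched_getaffinity(0))
    # Other platforms: total CPU count, or 1 if it cannot be determined
    return os.cpu_count() or 1


print(f"CPUs available to this session: {available_cpus()}")
```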

In [None]:
%%time

builder = AccessCm2Builder(
    path="/g/data/p73/archive/non-CMIP/ACCESS-CM2/by578",
    ensemble=False,  # We could use this to pass multiple paths for different ensemble members
).build()

The previous cell builds the Intake-ESM datastore in memory. We'll want to save it somewhere so we can reuse and share it. The following cell will create two new files (a `.json` file and a `.csv` file) in your current working directory. These files are how Intake-ESM datastores are stored on disk.

In [None]:
builder.save(
    name="mydatastore", 
    description="An example datastore for ACCESS-CM2 by578",
)
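To confirm what was written, one option is to list the files that share the datastore name and peek at the description stored in the JSON file. The sketch below uses only the standard library. The helper name `summarise_datastore` is hypothetical, and it assumes the JSON file is named `<name>.json` and carries a top-level `description` field; if either assumption does not hold, the corresponding result is simply empty or `None`.

```python
import json
from pathlib import Path


def summarise_datastore(name, directory="."):
    """Return the file names matching <name>.* and the JSON description, if present."""
    directory = Path(directory)
    # Every file in `directory` whose name starts with "<name>."
    files = sorted(p.name for p in directory.glob(f"{name}.*"))

    description = None
    json_path = directory / f"{name}.json"
    if json_path.exists():
        with open(json_path) as f:
            # .get() because the exact metadata layout is an assumption here
            description = json.load(f).get("description")
    return files, description


files, description = summarise_datastore("mydatastore")
print(files)
print(description)
```

If the save step succeeded, the printed list should contain the two files described above.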

# Using your datastore

Now we can use our Intake-ESM datastore to query and load the model data. We saw the basics of how to do this in the previous tutorial; see also the [Intake-ESM documentation](https://intake-esm.readthedocs.io/en/stable/index.html).

We can load the datastore directly using `intake`.

In [None]:
import intake

esm_datastore = intake.open_esm_datastore(
    "./mydatastore.json", 
    columns_with_iterables=["variable"],  # Important: parse the "variable" column as lists so searches can match individual variables
)

esm_datastore
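To see why `columns_with_iterables` matters, it helps to look at how a list-valued column is stored in a CSV file. The sketch below is an illustration of the idea using only the standard library, not Intake-ESM's actual implementation; the two-row catalog and its file names are hypothetical. It assumes each row (file) lists several variables, written to the CSV as the string form of a Python list.

```python
import ast
import csv
import io

# Hypothetical two-row catalog: each file holds one or more variables,
# stored in the CSV as the string representation of a Python list.
csv_text = """path,variable
ocean_month.nc,"['temp', 'salt']"
ice_month.nc,"['aice']"
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Read naively, the column is a plain string, so an exact match on "temp" finds nothing.
matches_as_strings = [r["path"] for r in rows if r["variable"] == "temp"]

# Parsed back into real lists, a per-variable membership test works.
matches_as_lists = [
    r["path"] for r in rows if "temp" in ast.literal_eval(r["variable"])
]

print(matches_as_strings)  # []
print(matches_as_lists)    # ['ocean_month.nc']
```

Declaring the column as iterable when opening the datastore lets searches such as `variable="temp"` match files that contain that variable alongside others.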

It's easy to search the datastore for datasets containing a particular variable and load them as xarray Datasets. (Note: when analysing large datasets, you may want to start a Dask cluster first.)

In [None]:
ds = esm_datastore.search(variable="temp").to_dask()

In [None]:
ds["temp"].isel(time=-1, st_ocean=0).plot()

If you think your ACCESS model data is worth sharing more widely, it might be a good idea to include it in the ACCESS-NRI Intake catalog. We're still working on requirements for data to be included in the catalog, but please feel free to open an issue here to discuss: https://github.com/ACCESS-NRI/access-nri-intake-catalog/issues/new/choose