### **Global Ocean Sea-Ice Models:**

In this tutorial, we demonstrate how to construct a simple `NEMODataTree` from two example global NEMO ocean sea-ice configurations.

---

**1. `AGRIF_DEMO`**

Let's start by creating a `NEMODataTree` using example outputs from the global `AGRIF_DEMO` NEMO reference configuration.

`AGRIF_DEMO` is based on the `ORCA2_ICE_PISCES` reference configuration with the inclusion of 3 online nested domains.

Here, we will only consider the 2° global parent domain.

Further information on this reference configuration can be found [**here**](https://sites.nemo-ocean.io/user-guide/cfgs.html#agrif-demo).

In [1]:
import xarray as xr
import nemo_cookbook as nc
from nemo_cookbook import NEMODataTree

xr.set_options(display_style="text")


**NEMO Cookbook** includes a selection of example NEMO model output datasets accessible via cloud object storage.

`nemo_cookbook.examples.get_filepaths()` is a convenience function used to download and generate local filepaths for an available NEMO reference configuration.

Below we download and collect the filepaths for the `AGRIF_DEMO` configuration described above:

In [2]:
filepaths = nc.examples.get_filepaths("AGRIF_DEMO")
filepaths

{'domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/domain_cfg.nc',
 '2_domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/2_domain_cfg.nc',
 '3_domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/3_domain_cfg.nc',
 'ORCA2_5d_00010101_00010110_grid_T.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_T.nc',
 'ORCA2_5d_00010101_00010110_grid_U.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_U.nc',
 'ORCA2_5d_00010101_00010110_grid_V.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_V.nc',
 'ORCA2_5d_00010101_00010110_grid_W.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_W.nc',
 'ORCA2_5d_00010101_00010110_icemod.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_icemod.nc',
 '2_Nordic_5d_00010101_00010110_grid_T.nc': '/User

Next, we need to define the `paths` dictionary, which contains the filepaths corresponding to our global parent domain.

We populate the `parent` dictionary with the filepaths to the `domain_cfg`, `gridT/U/V/W` and `icemod` netCDF files produced for the `AGRIF_DEMO` (ORCA2) parent domain.

In [3]:
paths = {"parent": {
         "domain": filepaths["domain_cfg.nc"],
         "gridT": filepaths["ORCA2_5d_00010101_00010110_grid_T.nc"],
         "gridU": filepaths["ORCA2_5d_00010101_00010110_grid_U.nc"],
         "gridV": filepaths["ORCA2_5d_00010101_00010110_grid_V.nc"],
         "gridW": filepaths["ORCA2_5d_00010101_00010110_grid_W.nc"],
         "icemod": filepaths["ORCA2_5d_00010101_00010110_icemod.nc"]
        },
        }
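Typing every filename by hand becomes repetitive as the number of domains grows. As a minimal sketch, the helper below builds one domain's entry by matching filename suffixes. `collect_grid_paths` and `SUFFIX_TO_KEY` are hypothetical names, not part of NEMO Cookbook, and the example paths are placeholders that only follow the naming pattern shown in the `filepaths` output above.

```python
# Hypothetical helper (not part of nemo_cookbook): builds one domain's entry of
# the `paths` dictionary by matching NEMO output filename suffixes.
SUFFIX_TO_KEY = {
    "_grid_T.nc": "gridT",
    "_grid_U.nc": "gridU",
    "_grid_V.nc": "gridV",
    "_grid_W.nc": "gridW",
    "_icemod.nc": "icemod",
}


def collect_grid_paths(filepaths, domain_file, prefix):
    """Return {"domain": ..., "gridT": ..., ...} for files whose name starts with `prefix`."""
    entry = {"domain": filepaths[domain_file]}
    for name, path in filepaths.items():
        if not name.startswith(prefix):
            continue
        for suffix, key in SUFFIX_TO_KEY.items():
            if name.endswith(suffix):
                entry[key] = path
    return entry


# Placeholder paths standing in for the dictionary returned by get_filepaths():
example_filepaths = {
    "domain_cfg.nc": "/cache/AGRIF_DEMO/domain_cfg.nc",
    "2_domain_cfg.nc": "/cache/AGRIF_DEMO/2_domain_cfg.nc",
    "ORCA2_5d_00010101_00010110_grid_T.nc": "/cache/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_T.nc",
    "ORCA2_5d_00010101_00010110_icemod.nc": "/cache/AGRIF_DEMO/ORCA2_5d_00010101_00010110_icemod.nc",
    "2_Nordic_5d_00010101_00010110_grid_T.nc": "/cache/AGRIF_DEMO/2_Nordic_5d_00010101_00010110_grid_T.nc",
}

example_paths = {"parent": collect_grid_paths(example_filepaths, "domain_cfg.nc", "ORCA2_")}
print(example_paths)
```

Matching on a filename prefix keeps files from different domains apart, since each domain's outputs share a distinct prefix (here `ORCA2_`).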

Finally, we can construct a new `NEMODataTree` called `nemo` using the `.from_paths()` constructor.

Note that we also need to specify that our global parent domain is zonally periodic (`iperio=True`) and north-folding on T-points (`nftype="T"`), rather than a closed (regional) domain.

In [4]:
nemo = NEMODataTree.from_paths(paths, iperio=True, nftype="T")
nemo

--- 

**2. `NOC Near-Present Day eORCA1`**

Next, we'll consider monthly-mean outputs from the National Oceanography Centre Near-Present-Day global eORCA1 configuration of NEMO, forced using JRA55-do from 1976 to 2024.

For more details on this model configuration and the available outputs, users can explore the Near-Present-Day documentation [**here**](https://noc-msm.github.io/NOC_Near_Present_Day/).

The eORCA1 JRA55v1 NPD data are publicly accessible as remote Zarr v2 stores via [JASMIN Object Store](https://help.jasmin.ac.uk/docs/short-term-project-storage/using-the-jasmin-object-store/), so we will use the NEMODataTree `.from_datasets()` constructor. 

In [5]:
# Define JASMIN Object Store base URL for eORCA1 JRA55v1 NPD data:
base_url = "https://noc-msm-o.s3-ext.jc.rl.ac.uk/npd-eorca1-jra55v1"

# Opening domain_cfg:
ds_domain = (xr.open_zarr(f"{base_url}/domain/domain_cfg", consolidated=True, chunks={})
             .squeeze(drop=True)
             .rename({"z": "nav_lev"})
             )

# Opening eORCA1 JRA55v1 gridT dataset, including sea surface temperature (°C) and salinity (g kg-1):
ds_gridT = xr.merge([xr.open_zarr(f"{base_url}/T1m/{var}", consolidated=True, chunks={})[var] for var in ['tos_con', 'sos_abs']], compat="override")

Next, let's create a `NEMODataTree` from a dictionary of eORCA1 JRA55v1 `xarray.Datasets`, specifying that our global domain is zonally periodic (`iperio=True`) and north-folding on F-points (`nftype="F"`).

In [6]:
datasets = {"parent": {"domain": ds_domain, "gridT": ds_gridT}}

nemo = NEMODataTree.from_datasets(datasets=datasets, iperio=True, nftype="F")
nemo

### **Regional Ocean Models:**

---

**`AMM12`**

We can also construct a `NEMODataTree` using outputs from regional NEMO ocean model simulations.

Here, we will consider example outputs from the regional `AMM12` NEMO reference configuration.

The Atlantic Margins Model (AMM) is a regional model covering the Northwest European Shelf domain on a regular lat-lon grid at approximately 12 km horizontal resolution. `AMM12` uses the vertical s-coordinate system, the GLS turbulence scheme, and tidal lateral boundary conditions using a Flather scheme.

Further information on this reference configuration can be found [**here**](https://sites.nemo-ocean.io/user-guide/cfgs.html#amm12).

In [7]:
filepaths = nc.examples.get_filepaths("AMM12")
filepaths

{'domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AMM12/domain_cfg.nc',
 'AMM12_1d_20120102_20120110_grid_T.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AMM12/AMM12_1d_20120102_20120110_grid_T.nc',
 'AMM12_1d_20120102_20120110_grid_U.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AMM12/AMM12_1d_20120102_20120110_grid_U.nc',
 'AMM12_1d_20120102_20120110_grid_V.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AMM12/AMM12_1d_20120102_20120110_grid_V.nc'}

As we showed in the `AGRIF_DEMO` example, we need to populate the `paths` dictionary with the `domain_cfg` and `gridT/U/V` filepaths corresponding to our regional model domain.

In [8]:
paths = {"parent": {
         "domain": filepaths["domain_cfg.nc"],
         "gridT": filepaths["AMM12_1d_20120102_20120110_grid_T.nc"],
         "gridU": filepaths["AMM12_1d_20120102_20120110_grid_U.nc"],
         "gridV": filepaths["AMM12_1d_20120102_20120110_grid_V.nc"],
        },
        }

Next, we can construct a new `NEMODataTree` called `nemo` using the `.from_paths()` constructor.

Note that we do not strictly need to specify that our regional domain is not zonally periodic in this case, since `iperio=False` is the default.

In [9]:
nemo = NEMODataTree.from_paths(paths, iperio=False)
nemo

### **Nested Global Ocean Sea-Ice Models:**

---

**`AGRIF_DEMO`**

Returning to our `AGRIF_DEMO` NEMO reference configuration, we can also construct a more complex `NEMODataTree` to store the outputs of the global parent and its child domains in a single data structure.

We will make use of the two successively nested domains located in the Nordic Seas, with the finest grid (1/6°) spanning the Denmark Strait. This grandchild domain also benefits from “vertical nesting”, meaning that it has 75 geopotential z-coordinate levels, compared with 31 levels in its parent domain.

Let's start by defining the `paths` dictionary for the ORCA2 global parent domain and its child and grandchild domains. Note that for the `child` and `grandchild` domains, we must also specify a unique domain number, given that we could include further child or grandchild nests.

In [10]:
filepaths = nc.examples.get_filepaths("AGRIF_DEMO")
filepaths

{'domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/domain_cfg.nc',
 '2_domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/2_domain_cfg.nc',
 '3_domain_cfg.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/3_domain_cfg.nc',
 'ORCA2_5d_00010101_00010110_grid_T.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_T.nc',
 'ORCA2_5d_00010101_00010110_grid_U.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_U.nc',
 'ORCA2_5d_00010101_00010110_grid_V.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_V.nc',
 'ORCA2_5d_00010101_00010110_grid_W.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_W.nc',
 'ORCA2_5d_00010101_00010110_icemod.nc': '/Users/otooth/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_icemod.nc',
 '2_Nordic_5d_00010101_00010110_grid_T.nc': '/User

In [11]:
paths = {"parent": {
        "domain": filepaths["domain_cfg.nc"],
        "gridT": filepaths["ORCA2_5d_00010101_00010110_grid_T.nc"],
        "gridU": filepaths["ORCA2_5d_00010101_00010110_grid_U.nc"],
        "gridV": filepaths["ORCA2_5d_00010101_00010110_grid_V.nc"],
        "gridW": filepaths["ORCA2_5d_00010101_00010110_grid_W.nc"],
        "icemod": filepaths["ORCA2_5d_00010101_00010110_icemod.nc"]
        },
        "child": {
        "1":{
        "domain": filepaths["2_domain_cfg.nc"],
        "gridT": filepaths["2_Nordic_5d_00010101_00010110_grid_T.nc"],
        "gridU": filepaths["2_Nordic_5d_00010101_00010110_grid_U.nc"],
        "gridV": filepaths["2_Nordic_5d_00010101_00010110_grid_V.nc"],
        "gridW": filepaths["2_Nordic_5d_00010101_00010110_grid_W.nc"],
        "icemod": filepaths["2_Nordic_5d_00010101_00010110_icemod.nc"]
        }},
        "grandchild": {
        "2":{
        "domain": filepaths["3_domain_cfg.nc"],
        "gridT": filepaths["3_Nordic_5d_00010101_00010110_grid_T.nc"],
        "gridU": filepaths["3_Nordic_5d_00010101_00010110_grid_U.nc"],
        "gridV": filepaths["3_Nordic_5d_00010101_00010110_grid_V.nc"],
        "gridW": filepaths["3_Nordic_5d_00010101_00010110_grid_W.nc"],
        "icemod": filepaths["3_Nordic_5d_00010101_00010110_icemod.nc"]
        }},
        }

Next, we need to construct a `nests` dictionary which contains the properties which define each nested domain. These include:

- Unique domain number (mapping properties to entries in our `paths` dictionary).
- Parent domain (the unique domain to which this nest belongs).
- Zonal periodicity of child / grandchild domain (`iperio`).
- Horizontal grid refinement factors (`rx`, `ry`).
- Start (`imin`, `jmin`) and end (`imax`, `jmax`) grid indices in both directions (**i**, **j**) of the parent grid.

The latter information should be copied directly from the `AGRIF_FixedGrids.in` ancillary file used to define nested domains in NEMO.
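Since these properties are entered by hand, a quick sanity check can catch typos before the tree is built. The sketch below is illustrative only: `check_nests` and `REQUIRED_KEYS` are hypothetical names, not part of NEMO Cookbook, and the rules (start index smaller than end index, refinement factors of at least 1) are assumptions about what a sensible nest definition looks like. The example values mirror the `AGRIF_DEMO` nests used in this tutorial.

```python
# Hypothetical validator (not part of nemo_cookbook) for a nests-style dictionary.
REQUIRED_KEYS = {"parent", "rx", "ry", "imin", "imax", "jmin", "jmax", "iperio"}


def check_nests(nests):
    """Return a list of problems found in a nests-style dictionary (empty if none)."""
    problems = []
    for number, props in nests.items():
        missing = REQUIRED_KEYS - set(props)
        if missing:
            problems.append(f"nest {number}: missing keys {sorted(missing)}")
            continue
        # The parent must be the root ("/") or another nest defined in the dictionary.
        if props["parent"] != "/" and props["parent"] not in nests:
            problems.append(f"nest {number}: unknown parent {props['parent']!r}")
        if props["imin"] < 1 or props["jmin"] < 1:
            problems.append(f"nest {number}: indices must be 1-based (>= 1)")
        if props["imin"] >= props["imax"] or props["jmin"] >= props["jmax"]:
            problems.append(f"nest {number}: start index must be smaller than end index")
        if props["rx"] < 1 or props["ry"] < 1:
            problems.append(f"nest {number}: refinement factors must be >= 1")
    return problems


good_nests = {
    "1": {"parent": "/", "rx": 4, "ry": 4, "imin": 121, "imax": 146,
          "jmin": 113, "jmax": 133, "iperio": False},
    "2": {"parent": "1", "rx": 3, "ry": 3, "imin": 20, "imax": 60,
          "jmin": 27, "jmax": 60, "iperio": False},
}
# Same as above, but nest "2" points at a parent that is not defined.
bad_nests = {**good_nests, "2": {**good_nests["2"], "parent": "7"}}

print(check_nests(good_nests))
print(check_nests(bad_nests))
```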

***
`Example AGRIF_FixedGrids.in`

**1** ------------------------> (Number of nested domains - parent).

**121 146 113 133 4 4 4** ----> (imin, imax, jmin, jmax, rx, ry, rt)

**1** ------------------------> (Number of nested domains - child)

**20 60 27 60 3 3 3** --------> (imin, imax, jmin, jmax, rx, ry, rt)

**0** ------------------------> (Number of nested domains - grandchild)

***

**Important: we must specify the start and end grid indices using Fortran (1-based) indexes rather than Python (0-based) indexes.**
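To make the difference between the two conventions concrete, the sketch below converts an inclusive 1-based index range into the equivalent Python slice. `fortran_to_slice` is a hypothetical helper, and the length of the stand-in index list is illustrative. This only demonstrates the indexing convention; it makes no claim about how `NEMODataTree` handles nest extents internally.

```python
def fortran_to_slice(start, end):
    """Convert an inclusive 1-based (Fortran) index range to a Python slice."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Python slices are 0-based and exclude the end, so only the start shifts by one.
    return slice(start - 1, end)


# i-range of the first AGRIF_DEMO nest, expressed on the parent grid:
i_slice = fortran_to_slice(121, 146)

# Stand-in for the 1-based i-indices of a parent grid (length is illustrative):
parent_i = list(range(1, 181))
selected = parent_i[i_slice]

print(i_slice, selected[0], selected[-1], len(selected))
```

The `nests` dictionary itself should still hold the unconverted 1-based values, exactly as they appear in `AGRIF_FixedGrids.in`.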

In [12]:
nests = {
    "1": {
    "parent": "/",
    "rx": 4,
    "ry": 4,
    "imin": 121,
    "imax": 146,
    "jmin": 113,
    "jmax": 133,
    "iperio": False
    },
    "2": {
    "parent": "1",
    "rx": 3,
    "ry": 3,
    "imin": 20,
    "imax": 60,
    "jmin": 27,
    "jmax": 60,
    "iperio": False
    }
    }
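Rather than copying the values by hand, a dictionary like the one above could be generated from the text of `AGRIF_FixedGrids.in`. The sketch below is a rough illustration under several assumptions: the file contains only whitespace-separated integers, each grid's block lists its nest count followed by one line per nest, and the blocks of those nests follow in the same order. `parse_fixed_grids` is a hypothetical function, the sequential domain numbering is a choice made here, and `iperio` is set to `False` because the file does not carry that information.

```python
def parse_fixed_grids(text):
    """Parse AGRIF_FixedGrids.in-style text into a nests-style dictionary."""
    # Keep non-empty lines only and split each one into integers.
    rows = [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    nests = {}
    pos = 0      # index of the next row to read
    counter = 0  # running domain number (an assumption of this sketch)

    def read_block(parent):
        nonlocal pos, counter
        n_children = rows[pos][0]
        pos += 1
        children = []
        for _ in range(n_children):
            imin, imax, jmin, jmax, rx, ry, _rt = rows[pos]
            pos += 1
            counter += 1
            name = str(counter)
            nests[name] = {
                "parent": parent, "rx": rx, "ry": ry,
                "imin": imin, "imax": imax, "jmin": jmin, "jmax": jmax,
                "iperio": False,  # assumed: not stored in the file
            }
            children.append(name)
        # Each nest's own block is assumed to follow, in the same order.
        for name in children:
            read_block(name)

    read_block("/")
    return nests


example_text = """
1
121 146 113 133 4 4 4
1
20 60 27 60 3 3 3
0
"""
parsed = parse_fixed_grids(example_text)
print(parsed)
```

The temporal refinement factor (`rt`) is read but discarded, since it is not among the properties listed for the `nests` dictionary.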

Finally, we can construct a new `NEMODataTree` called `nemo` using the `.from_paths()` constructor.

Again, we also need to specify that our global parent domain is zonally periodic (`iperio=True`) and north folding on T-points (`nftype = "T"`) rather than a closed (regional) domain.

We can also include additional keyword arguments to pass on to `xarray.open_dataset` or `xarray.open_mfdataset` when opening NEMO model output files.

In [13]:
nemo = NEMODataTree.from_paths(paths=paths, nests=nests, iperio=True, nftype="T", engine="netcdf4")
nemo

### **Coupled Climate Models:**

**`UKESM1-0-LL`**

In addition to ocean-only and ocean sea-ice hindcast simulations (prescribing surface atmospheric forcing), NEMO models are also used as the ocean components in many coupled climate models, including the UK Earth System Model (UKESM) developed jointly by the UK Met Office and Natural Environment Research Council (NERC).

Here, we show how to construct a `NEMODataTree` from the 1° global ocean sea-ice component of [**UKESM1-0-LL**](https://doi.org/10.1029/2019MS001739) included in the sixth Coupled Model Intercomparison Project ([**CMIP6**](https://wcrp-cmip.org/cmip-phases/cmip6/)) using outputs accessible via the [**CEDA Archive**](https://help.ceda.ac.uk/article/4801-cmip6-data).

Since CMIP6 outputs are processed and formatted using the Climate Model Output Rewriter (CMOR) software, we will need to include a few additional pre-processing steps to reformat our NEMO model outputs in order to construct a `NEMODataTree`.

**Important: only CMIP model output variables stored on their original NEMO ocean model grid (i.e., `gn`) can be used to construct a `NEMODataTree`.**
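One way to check this before opening any data is to read the grid label from the filename. CMIP6 filenames follow the pattern `<variable>_<table>_<source>_<experiment>_<member>_<grid>[_<time range>].nc`, so the grid label is the sixth underscore-separated field. The sketch below relies on that pattern; `cmip6_grid_label` is a hypothetical helper and the filenames, including their time ranges, are illustrative.

```python
def cmip6_grid_label(filename):
    """Extract the grid label (e.g. 'gn' or 'gr') from a CMIP6-style filename."""
    stem = filename[:-3] if filename.endswith(".nc") else filename
    parts = stem.split("_")
    if len(parts) < 6:
        raise ValueError(f"not a CMIP6-style filename: {filename!r}")
    # Fields: variable, table, source, experiment, member, grid[, time range]
    return parts[5]


# Illustrative filenames (the time ranges are placeholders):
native = "thetao_Omon_UKESM1-0-LL_historical_r4i1p1f2_gn_185001-189912.nc"
regridded = "thetao_Omon_UKESM1-0-LL_historical_r4i1p1f2_gr_185001-189912.nc"

for name in (native, regridded):
    label = cmip6_grid_label(name)
    status = "usable" if label == "gn" else "not on the native grid"
    print(f"{name}: grid label '{label}' ({status})")
```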

In [14]:
# Open UKESM1-0-LL domain_cfg:
ds_domain_cfg = xr.open_dataset("/path/to/MOHC/Ofx/domain_cfg_Ofx_UKESM1.nc")

# Define time decoder to handle CMIP6 time units:
time_decoder = xr.coders.CFDatetimeCoder(use_cftime=True)

# Define the Omon directory shared by the UKESM1-0-LL historical (r4i1p1f2) ocean variables:
base_filepath = "/badc/cmip6/data/CMIP6/CMIP/MOHC/UKESM1-0-LL/historical/r4i1p1f2/Omon"

# Open UKESM1-0-LL thetao dataset, including potential temperature (°C):
ds_ukesm1_gridT = xr.open_mfdataset(f"{base_filepath}/thetao/gn/latest/thetao_Omon_UKESM1-0-LL_historical_r4i1p1f2_gn_*.nc",
                                    data_vars='all',
                                    decode_times=time_decoder
                                   )

# Adding UKESM1-0-LL mlotst dataset, including mixed layer depth (m), from its own variable directory:
ds_ukesm1_gridT['mlotst'] = xr.open_mfdataset(f"{base_filepath}/mlotst/gn/latest/mlotst_Omon_UKESM1-0-LL_historical_r4i1p1f2_gn_*.nc",
                                              data_vars='all',
                                              decode_times=time_decoder
                                              )['mlotst']


Now that we have defined our `domain` and `gridT` datasets, let's define a `datasets` dictionary, ensuring that we rename the CMORised dimensions to be consistent with standard NEMO model outputs.

We can then define a `NEMODataTree` using the `.from_datasets()` constructor, specifying that our global parent domain is zonally periodic and north-folding on F-points.

In [None]:
datasets = {"parent": {
                "domain": ds_domain_cfg.rename({'z':'nav_lev'}),
                "gridT": ds_ukesm1_gridT.rename({'time':'time_counter', 'i':'x', 'j':'y', 'lev':'deptht'}),
                }}

nemo = NEMODataTree.from_datasets(datasets=datasets, iperio=True, nftype="F")
nemo
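The dimension renaming used above can be wrapped in a small helper so that it also works for datasets lacking a vertical dimension, since `Dataset.rename` raises an error when asked to rename a name that is not present. This is a minimal sketch: `cmor_to_nemo_names` is a hypothetical function, and the depth names for grids other than T (`depthu`, `depthv`, `depthw`) are assumed by analogy with `deptht`.

```python
def cmor_to_nemo_names(dims, grid="T"):
    """Build a rename mapping from CMORised dimension names to NEMO-style names.

    Only names present in `dims` are included in the returned mapping.
    """
    mapping = {
        "time": "time_counter",
        "i": "x",
        "j": "y",
        "lev": f"depth{grid.lower()}",  # e.g. "deptht" for the T grid
    }
    return {old: new for old, new in mapping.items() if old in dims}


# Dimension names of a 3-D variable such as thetao:
print(cmor_to_nemo_names(("time", "lev", "j", "i")))

# Dimension names of a 2-D variable such as mlotst (no vertical dimension):
print(cmor_to_nemo_names(("time", "j", "i")))

# Intended use with an xarray Dataset `ds` (not executed here):
#   ds = ds.rename(cmor_to_nemo_names(ds.dims, grid="T"))
```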