# Accessing CESM2-LENS on the cloud

This notebook is adapted from the CESM2-LE on AWS tutorial, located here:  
`https://ncar.github.io/cesm2-le-aws/kay_et_al_lens2.html`

````{admonition} To-do
**Option 1 (recommended, much faster)**: Open the notebook in Google Colab (https://colab.research.google.com/).  

**Option 2**: Run locally, which requires installing the following packages:
```bash
conda install -c conda-forge intake intake-esm s3fs
```
````

## Check if we're on Google Colab

In [None]:
## are we using colab (true/false)
USING_COLAB = "google.colab" in str(get_ipython())
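The check above relies on `get_ipython()`, which is only defined inside IPython/Jupyter sessions. As a minimal sketch of an alternative that also works in a plain Python script, one could look for `google.colab` among the already-imported modules. The helper name `running_in_colab` is hypothetical, and the approach assumes Colab imports `google.colab` when the kernel starts.

```python
import sys


def running_in_colab(modules=None):
    """Return True if the google.colab module is among the loaded modules.

    `modules` defaults to sys.modules; passing a mapping makes the check testable.
    """
    if modules is None:
        modules = sys.modules
    return "google.colab" in modules


USING_COLAB = running_in_colab()
```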

## Imports

In [None]:
## install packages in google colab
if USING_COLAB:
    !pip install zarr cftime intake intake-esm numcodecs==0.15.1 s3fs

## packages
import intake
import time
import xarray as xr
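When running locally, it can help to confirm that the required packages are installed before the imports fail partway through. The following is a sketch using the standard library's `importlib.metadata`; the helper name `missing_packages` is hypothetical, and the list of names is an assumption based on the install commands and imports above.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# names assumed from the install commands and imports in this notebook
required = ["intake", "intake-esm", "s3fs", "xarray", "zarr"]
print("Missing packages:", missing_packages(required) or "none")
```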

## CESM2-LE surface temperature data
(Strictly speaking, we'll look at $T_{2m}$, the temperature 2 m above the surface, stored in the catalog as the variable `TREFHT`.)

### Open (but don't load) the data

In [None]:
## get catalog of available data
catalog = intake.open_esm_datastore(
    "https://raw.githubusercontent.com/NCAR/cesm2-le-aws/main/intake-catalogs/aws-cesm2-le.json"
)

## subset for temperature data
catalog_subset = catalog.search(
    variable="TREFHT", frequency="monthly", experiment="historical"
)

## open_dataset kwargs
xr_kwargs = dict(
    engine="zarr",
    decode_timedelta=True,
    chunks=dict(member=1, time=1980, nlat=None, nlon=None),
)

## kwargs for opening data
kwargs = dict(
    aggregate=True,
    xarray_open_kwargs=xr_kwargs,
    zarr_kwargs={"consolidated": True},
    storage_options={"anon": True},
)

## open data lazily (nothing is read into memory yet)
dsets = catalog_subset.to_dataset_dict(**kwargs)
data = dsets["atm.historical.monthly.cmip6"]
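The key used to pick a dataset out of `dsets` looks like the search attributes joined by dots. The sketch below spells that out, assuming the layout is `component.experiment.frequency.forcing_variant`; this order is inferred from the keys used in this notebook rather than confirmed against the catalog definition, and the helper name `dataset_key` is hypothetical.

```python
def dataset_key(component, experiment, frequency, forcing_variant):
    """Join catalog attributes into a dot-separated dataset key (assumed layout)."""
    return ".".join([component, experiment, frequency, forcing_variant])


key = dataset_key("atm", "historical", "monthly", "cmip6")
print(key)  # atm.historical.monthly.cmip6
```

Printing `list(dsets)` shows the keys that were actually returned, which is the safer way to confirm a name before indexing.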

### Load the data into memory

In [None]:
## only load multiple members if running on colab (slow otherwise)
if USING_COLAB:
    member_idx = dict(member_id=slice(None, 10))
else:
    member_idx = dict(member_id=0)

## specify lat/lon range
lonlat_vals = dict(lon=slice(285, 295), lat=slice(35, 45))

## load data into memory
t0 = time.time()
data_loaded = data["TREFHT"].isel(member_idx).sel(lonlat_vals).compute()
t1 = time.time()
print(f"Elapsed time: {t1-t0:.2f} seconds.")
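The `lon=slice(285, 295)` selection above uses longitudes measured from 0° to 360° east. When a region is known in the -180° to 180° convention instead, a small conversion avoids selecting an empty range. This is a sketch only; the helper name `to_0_360` is hypothetical.

```python
def to_0_360(lon_deg):
    """Convert a longitude in degrees east from [-180, 180] to [0, 360)."""
    return lon_deg % 360


# 75W and 65W expressed in the 0-360 convention
west, east = to_0_360(-75), to_0_360(-65)
print(west, east)  # 285 295
lonlat_vals = dict(lon=slice(west, east), lat=slice(35, 45))
```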

## (optional, much slower) SST data

In [None]:
## get catalog of available data
catalog = intake.open_esm_datastore(
    "https://raw.githubusercontent.com/NCAR/cesm2-le-aws/main/intake-catalogs/aws-cesm2-le.json"
)

## subset for temperature data and grid data
catalog_subset = catalog.search(
    variable="TEMP", frequency="monthly", experiment="historical"
)
grid_subset = catalog.search(
    component="ocn",
    frequency="static",
    experiment="historical",
    forcing_variant="cmip6",
)

## open ocean temperature lazily and keep the top level (SST)
dsets = catalog_subset.to_dataset_dict(**kwargs)
data = dsets["ocn.historical.monthly.cmip6"].isel(z_t=0)

## load grid and subset for top level
_, grid = grid_subset.to_dataset_dict(**kwargs).popitem()
grid = grid.isel(z_t=0)

## merge grid information with SST data
sst = xr.merge([data.rename({"TEMP": "sst"}), grid[["TAREA", "TLONG", "TLAT"]]])
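One likely reason to carry `TAREA` alongside the temperature is area weighting: ocean grid cells differ in size, so a plain average over the grid would over-count small cells. The sketch below shows the arithmetic on plain Python lists with made-up numbers. The function name `area_weighted_mean` is hypothetical, and skipping NaN cells is an assumption about how land points would be handled.

```python
import math


def area_weighted_mean(values, areas):
    """Mean of `values` weighted by `areas`, ignoring NaN values."""
    pairs = [(v, a) for v, a in zip(values, areas) if not math.isnan(v)]
    total_area = math.fsum(a for _, a in pairs)
    return math.fsum(v * a for v, a in pairs) / total_area


# made-up temperatures (degC) and cell areas; the NaN stands in for a land cell
temps = [10.0, float("nan"), 20.0]
areas = [1.0, 5.0, 3.0]
print(area_weighted_mean(temps, areas))  # 17.5
```

On the merged dataset, xarray's `DataArray.weighted` offers the same operation, for example `sst["sst"].weighted(sst["TAREA"]).mean(("nlat", "nlon"))`, assuming `TAREA` contains no missing values.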