# Calculate Surface Ocean Heat using CESM2 LENS data 

---

## Overview
- This notebook is an adaptation of a [workflow](https://gallery.pangeo.io/repos/NCAR/notebook-gallery/notebooks/Run-Anywhere/Ocean-Heat-Content/OHC_tutorial.html) in the NCAR gallery of the Pangeo collection.
- It illustrates how to compute surface ocean heat content using potential temperature data from the [CESM2 Large Ensemble Dataset](https://www.cesm.ucar.edu/community-projects/lens2) (Community Earth System Model 2), hosted on NCAR's Research Data Archive (RDA).
- The data are open access and are streamed from NCAR's RDA via the Open Science Data Federation (OSDF) using the PelicanFS package.
- Please refer to the first chapter of this cookbook to learn more about OSDF, Pelican, or PelicanFS.
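
As a point of reference for the quantity this notebook works toward, heat content per unit area over a stack of ocean layers is commonly approximated as $\rho \, c_p \sum_k T_k \, \Delta z_k$. The sketch below is a minimal, pure-Python illustration of that sum for a single water column. The density and specific heat values, the function name, and the sample numbers are assumptions chosen for illustration; they are not taken from this notebook.

```python
# Minimal sketch: heat content per unit area of one water column.
# RHO_SEAWATER and CP_SEAWATER are assumed, typical seawater values.
RHO_SEAWATER = 1026.0  # kg m^-3 (assumption)
CP_SEAWATER = 3990.0   # J kg^-1 K^-1 (assumption)


def column_heat_content(temps_K, layer_thickness_m,
                        rho=RHO_SEAWATER, cp=CP_SEAWATER):
    """Return rho * cp * sum(T_k * dz_k) in J m^-2 (hypothetical helper)."""
    if len(temps_K) != len(layer_thickness_m):
        raise ValueError("temps_K and layer_thickness_m must have equal length")
    return rho * cp * sum(t * dz for t, dz in zip(temps_K, layer_thickness_m))


# Illustrative values only: three 10 m layers near the surface.
temps_K = [290.0, 289.0, 288.0]
layer_thickness_m = [10.0, 10.0, 10.0]

ohc = column_heat_content(temps_K, layer_thickness_m)
print(f"{ohc:.3e} J m^-2")
```

Restricting the sum to the uppermost layers is what makes this a "surface" heat content; with gridded data the same sum would be taken along the depth dimension.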


## Prerequisites
| Concepts | Importance | Notes |
| --- | --- | --- |
| [Intro to Intake-ESM](https://foundations.projectpythia.org/core/cartopy/cartopy) | Necessary | Used for searching the CESM2 LENS catalog |
| [Understanding of Zarr](https://zarr.dev/) | Helpful | Familiarity with metadata structure |
| [Matplotlib](https://foundations.projectpythia.org/core/matplotlib/) | Helpful | Package used for plotting |
| [PelicanFS](https://projectpythia.org/osdf-cookbook/notebooks/pelicanfs/) | Necessary | The Python package used to stream data in this notebook |
| OSDF | Helpful | OSDF is used to stream data in this notebook |

- **Time to learn**: 20 mins

## Table of Contents
- [Set up local dask cluster](#Set-up-local-dask-cluster)
- [Data Loading](#Data-Loading) 
- [Ocean heat computation](#Ocean-Heat-Computation)

---

## Imports

In [8]:
import intake
import numpy as np
import pandas as pd
import xarray as xr
import seaborn as sns
import re
import matplotlib.pyplot as plt
import dask
from dask.distributed import LocalCluster
import pelicanfs 
#import cf_units as cf

In [14]:
# Load Catalog URL
cat_url = 'https://stratus.rda.ucar.edu/d010092/catalogs/d010092-osdf-zarr-gdex.json'

## Set up local dask cluster 

Before doing any computation, let us first set up a local cluster using Dask.

In [9]:
cluster = LocalCluster()          
client = cluster.get_client()

In [10]:
# Scale the cluster
n_workers = 5
cluster.scale(n_workers)
cluster

Cluster:

| Item | Value |
| --- | --- |
| Dashboard | http://127.0.0.1:8787/status |
| Workers | 4 |
| Total threads | 16 |
| Total memory | 15.16 GiB |
| Status | running |
| Using processes | True |

Scheduler:

| Item | Value |
| --- | --- |
| Comm | tcp://127.0.0.1:34739 |
| Dashboard | http://127.0.0.1:8787/status |
| Workers | 0 |
| Total threads | 0 |
| Total memory | 0 B |
| Started | Just now |

Workers:

| Comm | Total threads | Memory | Dashboard | Nanny | Local directory |
| --- | --- | --- | --- | --- | --- |
| tcp://127.0.0.1:35601 | 4 | 3.79 GiB | http://127.0.0.1:35247/status | tcp://127.0.0.1:42283 | /tmp/dask-scratch-space/worker-qs84mjgg |
| tcp://127.0.0.1:45293 | 4 | 3.79 GiB | http://127.0.0.1:35159/status | tcp://127.0.0.1:43655 | /tmp/dask-scratch-space/worker-5jnqkfw_ |
| tcp://127.0.0.1:43421 | 4 | 3.79 GiB | http://127.0.0.1:37665/status | tcp://127.0.0.1:44047 | /tmp/dask-scratch-space/worker-3h4uellj |
| tcp://127.0.0.1:40839 | 4 | 3.79 GiB | http://127.0.0.1:44215/status | tcp://127.0.0.1:37269 | /tmp/dask-scratch-space/worker-rmdyj0ji |


## Data Loading
### Load CESM2 LENS data from NCAR's RDA
- Load CESM2 LENS Zarr data from RDA using an intake-ESM catalog that has OSDF links.
- For more details about the dataset, see https://rda.ucar.edu/datasets/d010092/#

In [15]:
col = intake.open_esm_datastore(cat_url)
col

| | unique |
| --- | --- |
| Unnamed: 0 | 322 |
| variable | 53 |
| long_name | 51 |
| component | 4 |
| experiment | 2 |
| forcing_variant | 2 |
| frequency | 3 |
| vertical_levels | 3 |
| spatial_domain | 3 |
| units | 20 |


In [23]:
# Uncomment this line to see all the variables in the catalog
# col.df['variable'].values

In [24]:
cesm_temp = col.search(variable='TEMP', frequency='monthly')
cesm_temp

| | unique |
| --- | --- |
| Unnamed: 0 | 3 |
| variable | 1 |
| long_name | 1 |
| component | 1 |
| experiment | 2 |
| forcing_variant | 2 |
| frequency | 1 |
| vertical_levels | 1 |
| spatial_domain | 1 |
| units | 1 |
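
Conceptually, the search keeps only the catalog rows whose columns match every keyword given. The following pure-Python sketch mimics that filtering on a tiny stand-in catalog; it is not how intake-ESM is implemented. The three `TEMP` rows mirror the entries found in this notebook, while the `SALT` row and the `search` helper are hypothetical and exist only to show a non-matching case.

```python
# Stand-in catalog rows. The SALT row is hypothetical, for illustration only.
catalog_rows = [
    {"variable": "TEMP", "frequency": "monthly",
     "experiment": "historical", "forcing_variant": "cmip6"},
    {"variable": "TEMP", "frequency": "monthly",
     "experiment": "ssp370", "forcing_variant": "cmip6"},
    {"variable": "TEMP", "frequency": "monthly",
     "experiment": "ssp370", "forcing_variant": "smbb"},
    {"variable": "SALT", "frequency": "monthly",
     "experiment": "historical", "forcing_variant": "cmip6"},
]


def search(rows, **criteria):
    """Keep rows whose values equal every keyword criterion (hypothetical helper)."""
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in criteria.items())
    ]


matches = search(catalog_rows, variable="TEMP", frequency="monthly")

# Count distinct values per column, similar to the summary table above.
n_unique = {column: len({row[column] for row in matches}) for column in matches[0]}
print(len(matches), n_unique)
```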


In [25]:
cesm_temp.df['path'].values

array(['https://data-osdf.rda.ucar.edu/ncar-rda/d010092/ocn/monthly/cesm2LE-historical-cmip6-TEMP.zarr',
       'https://data-osdf.rda.ucar.edu/ncar-rda/d010092/ocn/monthly/cesm2LE-ssp370-cmip6-TEMP.zarr',
       'https://data-osdf.rda.ucar.edu/ncar-rda/d010092/ocn/monthly/cesm2LE-ssp370-smbb-TEMP.zarr'],
      dtype=object)
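
A quick way to confirm where each store is served from is to parse the paths with the standard library. The sketch below hard-codes the three paths printed above and checks their host; the variable names are illustrative, and no network access is involved.

```python
from urllib.parse import urlparse

# The three Zarr store paths returned by the catalog search above.
paths = [
    "https://data-osdf.rda.ucar.edu/ncar-rda/d010092/ocn/monthly/cesm2LE-historical-cmip6-TEMP.zarr",
    "https://data-osdf.rda.ucar.edu/ncar-rda/d010092/ocn/monthly/cesm2LE-ssp370-cmip6-TEMP.zarr",
    "https://data-osdf.rda.ucar.edu/ncar-rda/d010092/ocn/monthly/cesm2LE-ssp370-smbb-TEMP.zarr",
]

OSDF_HOST = "data-osdf.rda.ucar.edu"

# Host and final path component (the Zarr store name) of each URL.
hosts = [urlparse(path).netloc for path in paths]
stores = [urlparse(path).path.rsplit("/", 1)[-1] for path in paths]

served_via_osdf = all(host == OSDF_HOST for host in hosts)
print(served_via_osdf)
for store in stores:
    print(store)
```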

:::{note}
All of the file paths start with https://data-osdf.rda.ucar.edu, indicating that the data will be streamed via OSDF.
:::

In [26]:
dsets_cesm = cesm_temp.to_dataset_dict()


--> The keys in the returned dictionary of datasets are constructed as follows:
	'component.experiment.frequency.forcing_variant'


In [27]:
dsets_cesm.keys()

dict_keys(['ocn.ssp370.monthly.smbb', 'ocn.ssp370.monthly.cmip6', 'ocn.historical.monthly.cmip6'])
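
The key pattern reported above can be reproduced and taken apart with plain string operations, which is handy when selecting datasets programmatically. This is a small sketch; `build_key` and `parse_key` are hypothetical helpers, not part of intake-ESM.

```python
# Field order taken from the pattern printed by to_dataset_dict().
KEY_FIELDS = ("component", "experiment", "frequency", "forcing_variant")


def build_key(entry):
    """Join the key fields of a catalog entry with dots (hypothetical helper)."""
    return ".".join(entry[field] for field in KEY_FIELDS)


def parse_key(key):
    """Split a dataset key back into its named fields (hypothetical helper)."""
    return dict(zip(KEY_FIELDS, key.split(".")))


key = build_key({
    "component": "ocn",
    "experiment": "historical",
    "frequency": "monthly",
    "forcing_variant": "cmip6",
})
print(key)
print(parse_key("ocn.ssp370.monthly.smbb"))
```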

In [28]:
historical       = dsets_cesm['ocn.historical.monthly.cmip6']
future_smbb      = dsets_cesm['ocn.ssp370.monthly.smbb']
future_cmip6     = dsets_cesm['ocn.ssp370.monthly.cmip6']

### Change units

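In [29]:
# cf_units is required for the unit conversions below
# (its import in the Imports cell is commented out, which raises a NameError here)
import cf_units as cf

orig_units = cf.Unit(historical.z_t.attrs['units'])
orig_units

In [30]:
def change_units(ds, variable_str, variable_bounds_str, target_unit_str):
    """Return ds[variable_bounds_str] converted to target_unit_str.

    The source units are read from the 'units' attribute of ds[variable_str].
    """
    orig_units = cf.Unit(ds[variable_str].attrs['units'])
    target_units = cf.Unit(target_unit_str)
    # Apply the conversion chunk by chunk so dask-backed arrays stay lazy
    variable_in_new_units = xr.apply_ufunc(
        orig_units.convert,
        ds[variable_bounds_str],
        target_units,
        dask='parallelized',
        output_dtypes=[ds[variable_bounds_str].dtype],
    )
    return variable_in_new_units

In [31]:
# Convert depth levels to meters and potential temperature to kelvin
depth_levels_in_m = change_units(historical, 'z_t', 'z_t', 'm')
hist_temp_in_degK = change_units(historical, 'TEMP', 'TEMP', 'degK')
fut_cmip6_temp_in_degK = change_units(future_cmip6, 'TEMP', 'TEMP', 'degK')
fut_smbb_temp_in_degK = change_units(future_smbb, 'TEMP', 'TEMP', 'degK')
# Replace the depth coordinate with the converted values.
# Use depth_levels_in_m.data: indexing ['z_t'] would return the original, unconverted coordinate.
hist_temp_in_degK = hist_temp_in_degK.assign_coords(z_t=("z_t", depth_levels_in_m.data))
hist_temp_in_degK["z_t"].attrs["units"] = "m"
hist_temp_in_degK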
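
The arithmetic behind these two conversions can also be sketched in plain Python, which may help when `cf_units` is not installed. This sketch assumes that `z_t` is stored in centimeters and `TEMP` in degrees Celsius, which is conventional for CESM ocean output but is not confirmed by the outputs shown in this notebook. The function names and sample values are hypothetical.

```python
# Minimal sketch of the two conversions without cf_units.
# Assumptions (not confirmed above): z_t is in centimeters, TEMP is in degC.
CM_PER_M = 100.0
KELVIN_OFFSET = 273.15


def centimeters_to_meters(values):
    """Convert an iterable of depths from centimeters to meters."""
    return [value / CM_PER_M for value in values]


def celsius_to_kelvin(values):
    """Convert an iterable of temperatures from degrees Celsius to kelvin."""
    return [value + KELVIN_OFFSET for value in values]


# Illustrative values only.
depths_cm = [500.0, 1500.0, 2500.0]
temps_degC = [20.0, 18.5, 17.0]

depths_m = centimeters_to_meters(depths_cm)
temps_K = celsius_to_kelvin(temps_degC)
print(depths_m)
print(temps_K)
```

Checking the `units` attributes of `z_t` and `TEMP` before relying on fixed factors like these is advisable, since a units-aware library reads them from the data instead.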
