# Analysis and Visualization Tutorials
```{article-info}
:avatar: https://secure.gravatar.com/avatar/709ea66dc102e6bc4547032f85ff6c95 
:avatar-link: mailto:paul.gierz@awi.de 
:avatar-outline: muted
:author: Paul Gierz 
:date: November 2022
:read-time: "15 min read"
:class-container: sd-p-2 sd-outline-muted sd-rounded-1
```

Here is a selection of tutorials for analyzing your simulation.

```{tableofcontents}
```

Each notebook contains at least a minimal interface you can adjust for your own
simulation, as well as a jumping-off point for further customization of a
particular plot. In some cases, recommended background information is included,
along with explanations of how and why the plot commands are set up the way they are.

## Interactive Tutorials
> **Summary**: Use the rocket icon if it's available to launch an interactive copy of a page.

Some of the tutorials are directly interactive. 

You can click on the little rocket in the upper corner, and select "Launch in
Jupyterhub." This will open up an interactive copy of the analysis notebook
running in the [Jupyterhub of AWI's HPC](http://paleosrv3.dmawi.de:9999) system
(`ollie0` or `ollie1`). 

```{figure} interactive_screenshot.png 
---
height: 300px
name: directive-fig
---
Example of an interactive notebook.
```

```{figure} interactive_screenshot_with_rocket.png 
---
height: 300px
name: directive-fig
---
How to start an interactive notebook.
```

In fact, this notebook is interactive! You can use it to play around with `bash` from within the notebook, and the example provided will explain how to generate graphics from a collection of notebooks.

## Batch Processing of Interactive Notebooks

> **Summary**: You can also batch process the interactive notebooks by passing in the common parameters needed (normally displayed in the very first few cells). This is accomplished with the [papermill](https://papermill.readthedocs.io/en/latest/) system.

Batch processing of notebooks allows you to repeat certain analyses against a different experiment, a later time in the experiment, or with different plot settings. Here's a basic example of how this can be accomplished using [papermill](https://papermill.readthedocs.io/en/latest/). I'll motivate this by showing the "non-batch" case, and then showing how a batch processing method can speed things up for you.

First, we want to see if we have git available, and which version we have, just as a sanity check. Whether or not you take the batch approach, you'll use this step to get a collection of standard plotting notebooks:

In [1]:
which git
git --version

/usr/local/bin/git
git version 2.39.0


Next, we will clone the collection of analysis and visualization notebooks. In my case, I will put these into the `/tmp` folder, but you can put them in a useful location, for example `/albedo/work/user/pgierz/viz_notebook_collection`. You can customize these notebooks however you want!

````{margin}
```{hint}
If you just want to stick with the default set, you can find it pre-downloaded in the `viz_notebook_collection` folder under `/isibhv/projects/paleo_work/`.
```
````
You can either enter these next commands into the terminal by hand or, if you launched this notebook in interactive mode as described above, just press <kbd>Shift ⇧</kbd> + <kbd>Enter ↩</kbd>.

In [29]:
DESTINATION=/tmp/analysis_notebooks # TODO: Put the path where you want to store the notebook templates here!
if [ -d "$DESTINATION" ]; then
    # Remove old clone if it exists
    rm -rf "$DESTINATION"
fi
git clone https://github.com/AWI-ESM/analysis_notebooks "${DESTINATION}"

Cloning into './analysis_notebooks'...
remote: Enumerating objects: 31, done.
remote: Counting objects: 100% (31/31), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 31 (delta 17), reused 28 (delta 17), pack-reused 0
Receiving objects: 100% (31/31), 12.17 MiB | 2.46 MiB/s, done.
Resolving deltas: 100% (17/17), done.


Let's have a look at what we got:

In [30]:
ls $DESTINATION

[34m [1mfesom[0m  [33m [1;4mREADME.md[0m


In [31]:
ls $DESTINATION/fesom

 fesom_amoc.ipynb              fesom_sea_ice_area_climmean.ipynb
 fesom_evap_climmean.ipynb     fesom_ssh_climmean.ipynb
 fesom_mesh_comparison.ipynb   fesom_sss_climmean.ipynb
 fesom_precip_climmean.ipynb   fesom_sst_climmean.ipynb
 fesom_runoff_climmean.ipynb   fesom_velocity_climmean.ipynb


A quick look into the `fesom` subfolder shows that we already have a few notebooks there for getting a feeling for the basic state of a simulation. You could now copy these notebooks to your experiment, customize them, and use them as a jumping-off point for a completely custom plot. If I wanted to do this for each notebook there, I would need to modify 10 different notebooks to match my settings, open up JupyterLab each time, and run through each notebook from start to finish, making sure I have the right Jupyter kernel selected and that I haven't accidentally left an old experiment path somewhere. Yuck! There needs to be a nicer way to run many of these notebooks at once.

After all, sometimes you just want to get a quick overview using the defaults, or, after you've customized some notebooks, you want to run exactly the same set of notebooks against a different experiment. To solve this problem, we can turn to [`papermill`](https://papermill.readthedocs.io/en/latest/). It does exactly what we need: it lets you modify certain parameters in a notebook from the outside and then executes the notebook automatically. We can first check that it is installed:

In [32]:
which -a papermill

/Users/pgierz/.local/bin/papermill
/Users/pgierz/.local/bin/papermill


Since I'm writing this notebook on my laptop, it'll pop up in my Mac's `$HOME` directory. Until a module is available, you can install it by running `pip install --user papermill`.
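
If you'd rather do this check from Python, here is a minimal sketch using only the standard library. The helper name `papermill_status` is my own invention for illustration. Note that the command-line tool and the importable module can live in different environments (for example, when the CLI was installed with `pipx`), so the sketch reports both separately.

```python
import importlib.util
import shutil


def papermill_status():
    """Report whether papermill is reachable as a CLI and as a Python module.

    `papermill_status` is a hypothetical helper, not part of papermill.
    """
    return {
        # Path of the first `papermill` executable on PATH, or None if absent
        "cli_path": shutil.which("papermill"),
        # True when `import papermill` would succeed in this interpreter
        "importable": importlib.util.find_spec("papermill") is not None,
    }


status = papermill_status()
print(status)
if status["cli_path"] is None:
    print("papermill CLI not found; try: pip install --user papermill")
```

If `cli_path` is set but `importable` is `False`, the Bash approach will still work, but the Python approach needs `papermill` installed into the environment you are running Python from.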

:::{admonition} {material-regular}`engineering;1.5rem;sd-mr-1` Currently Under Construction
:class: no-icon
`papermill` will be available as:
```console
$ module load papermill
```
in the future.
:::

We can inspect the notebooks with papermill to see what parameters can be set. I'll do this for two of them:

In [33]:
papermill --help-notebook $DESTINATION/fesom/fesom_amoc.ipynb

Usage: papermill [OPTIONS] NOTEBOOK_PATH [OUTPUT_PATH]

Parameters inferred for notebook './analysis_notebooks/fesom/fesom_amoc.ipynb':
  experiment_path: Unknown type (default "/some/path/to/experiment")
  newest_amoc_files: Unknown type (default 30)
  add_contour_lines: Unknown type (default False)
  add_contour_labels: Unknown type (default True)


In [34]:
papermill --help-notebook $DESTINATION/fesom/fesom_sst_climmean.ipynb

Usage: papermill [OPTIONS] NOTEBOOK_PATH [OUTPUT_PATH]

Parameters inferred for notebook './analysis_notebooks/fesom/fesom_sst_climmean.ipynb':
  experiment_path: Unknown type (default "/work/ollie/gknorr/STAR1/pico-fesom/experiments/production/pm20")
  newest_files: Unknown type (default 30)


You can see that we get a usage statement for `papermill`, along with the parameters it thinks can be set. In the first case, we see that it accepts four, namely `experiment_path`, `newest_amoc_files`, `add_contour_lines`, and `add_contour_labels`. The second notebook only knows about two parameters, `experiment_path` and `newest_files`.
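
In case you are wondering where these parameters come from: `papermill` looks for the code cell that carries the tag `parameters` and treats the assignments in it as defaults. The sketch below shows how you could peek at that cell yourself with nothing but the standard library. The helper names and the tiny `demo_notebook` are hypothetical stand-ins I made up for illustration; for a real notebook you would pass the result of `load_notebook(path)` instead.

```python
import json


def load_notebook(path):
    """Read an .ipynb file (which is plain JSON) into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def parameter_cell_sources(notebook):
    """Return the source text of every code cell tagged `parameters`."""
    sources = []
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and "parameters" in tags:
            source = cell.get("source", "")
            # The notebook format stores source as one string or a list of lines
            sources.append(source if isinstance(source, str) else "".join(source))
    return sources


# Hand-written stand-in for a real notebook (hypothetical content)
demo_notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# AMOC"]},
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": [
                'experiment_path = "/some/path/to/experiment"\n',
                "newest_amoc_files = 30\n",
            ],
        },
    ]
}

for source in parameter_cell_sources(demo_notebook):
    print(source)
```

If one of your own notebooks shows no parameters in `--help-notebook`, a missing `parameters` tag on the first settings cell is a likely culprit.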

Let's now say you wanted to run these two notebooks with your own choices for `experiment_path` and `newest_files`. I'll show you how to do this in a few different ways, and you can pick whatever feels easier for you.

### The Bash Way

First, we'll make a YAML file to hold our choices:

```yaml
# Batch Processing for AMOC and SST of my favorite run
experiment_path: /isibhv/projects/paleo_work/example_runs/awiesm-2.1/pre-industrial
newest_files: 10
newest_amoc_files: 50
```

As you can see, we added settings from both `fesom_amoc.ipynb` and `fesom_sst_climmean.ipynb` to the same YAML file. You can add comments, but the actual parameters need to sit at the top level of the file and always take the form `key: value`. Open up your favorite editor and make a file for that, or just use `echo`:

In [17]:
mkdir -p work/handbook_examples/batch_processing
cd work/handbook_examples/batch_processing
echo "# Batch Processing for AMOC and SST of My Favorite Run" > pre-industrial-demo.papermill.yaml
echo "experiment_path: /isibhv/projects/paleo_work/example_runs/awiesm-2.1/pre-industrial" >> pre-industrial-demo.papermill.yaml
echo "newest_files: 10" >> pre-industrial-demo.papermill.yaml
echo "newest_amoc_files: 50" >> pre-industrial-demo.papermill.yaml

work
work/handbook_examples
work/handbook_examples/batch_processing
/Users/pgierz/Code/github.com/AWIESM/docs/docs/work/handbook_examples/batch_processing
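
If you generate many of these parameter files, writing them from a script may be more convenient than a series of `echo` commands. Here is a rough sketch under a few assumptions: `write_flat_yaml` is a hypothetical helper of mine, it only handles simple values (strings without special YAML characters, numbers, booleans), and it writes into a temporary directory so that running it has no side effects. Point it at your working directory for real use.

```python
import tempfile
from pathlib import Path


def write_flat_yaml(path, parameters, comment=None):
    """Write a flat mapping as top-level `key: value` lines.

    Hypothetical helper: no quoting or escaping is done, so keep values simple.
    """
    lines = []
    if comment:
        lines.append(f"# {comment}")
    for key, value in parameters.items():
        if isinstance(value, (dict, list, tuple, set)):
            raise ValueError(f"{key!r} is nested; keep parameters at the top level")
        if isinstance(value, bool):
            # YAML booleans are conventionally written in lowercase
            value = "true" if value else "false"
        lines.append(f"{key}: {value}")
    path = Path(path)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


parameters = {
    "experiment_path": "/isibhv/projects/paleo_work/example_runs/awiesm-2.1/pre-industrial",
    "newest_files": 10,
    "newest_amoc_files": 50,
}

with tempfile.TemporaryDirectory() as tmp:
    written = write_flat_yaml(
        Path(tmp) / "pre-industrial-demo.papermill.yaml",
        parameters,
        comment="Batch Processing for AMOC and SST of my favorite run",
    )
    text = written.read_text(encoding="utf-8")

print(text)
```

The nested-value check mirrors the rule above: everything has to stay at the top level of the file.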


Now we can use a small loop to run `papermill` for all the notebooks we are interested in. Here's the first attempt, and it will also demonstrate one of the things that can happen when this goes wrong:

````{margin}
```{note}
In the loop body, we use the command `papermill` where the first argument is the template notebook, and the second argument is the output notebook. The `-f` flag has the path to our `yaml` file containing our parameter choices.
```
````

In [41]:
for notebook in "$DESTINATION"/fesom/fesom_amoc.ipynb "$DESTINATION"/fesom/fesom_sst_climmean.ipynb; do
    echo "Batch processing $notebook"
    notebook_name=$(basename "$notebook")
    # Strip the .ipynb extension and append a suffix for the output notebook
    result_name=${notebook_name%.*}_processed.ipynb
    echo "Saving result to $result_name"
    papermill "$notebook" "$result_name" -f pre-industrial-demo.papermill.yaml
done

Batch processing ./analysis_notebooks/fesom/fesom_amoc.ipynb
Saving result to fesom_amoc_processed.ipynb
Input Notebook:  ./analysis_notebooks/fesom/fesom_amoc.ipynb
Output Notebook: fesom_amoc_processed.ipynb
Black is not installed, parameters wont be formatted
Executing:   0%|                                       | 0/20 [00:00<?, ?cell/s]Executing notebook with kernel: python3
Executing:  20%|██████▏                        | 4/20 [00:01<00:05,  3.01cell/s]
Traceback (most recent call last):
  File "/Users/pgierz/.local/bin/papermill", line 8, in <module>
    sys.exit(papermill())
  File "/Users/pgierz/.local/share/pipx/venvs/papermill/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/pgierz/.local/share/pipx/venvs/papermill/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/pgierz/.local/share/pipx/venvs/papermill/lib/python3.10/site-packages/click/core.py", l

: 1

You'll see that both notebooks failed to execute, because the `pyfesom2` module was not installed. This is because, by default, `papermill` uses the default `python3` kernel that it finds. You can change this by tweaking the command slightly. On my laptop, I have a conda environment named `main-toolbox` that contains all of my plotting tools, so I tell `papermill` to use that one:

````{margin}
```{note}
Here, we add the `--kernel <NAME>` flag. `<NAME>` is replaced by the kernel name you want to use.
```
````

In [44]:
for notebook in "$DESTINATION"/fesom/fesom_amoc.ipynb "$DESTINATION"/fesom/fesom_sst_climmean.ipynb; do
    echo "Batch processing $notebook"
    notebook_name=$(basename "$notebook")
    # Strip the .ipynb extension and append a suffix for the output notebook
    result_name=${notebook_name%.*}_processed.ipynb
    echo "Saving result to $result_name"
    papermill "$notebook" "$result_name" -f pre-industrial-demo.papermill.yaml --kernel main-toolbox
done

Batch processing ./analysis_notebooks/fesom/fesom_amoc.ipynb
Saving result to fesom_amoc_processed.ipynb
Input Notebook:  ./analysis_notebooks/fesom/fesom_amoc.ipynb
Output Notebook: fesom_amoc_processed.ipynb
Black is not installed, parameters wont be formatted
Executing:   0%|                                       | 0/20 [00:00<?, ?cell/s]Executing notebook with kernel: main-toolbox
Executing:  25%|███████▊                       | 5/20 [00:06<00:19,  1.29s/cell]
Traceback (most recent call last):
  File "/Users/pgierz/.local/bin/papermill", line 8, in <module>
    sys.exit(papermill())
  File "/Users/pgierz/.local/share/pipx/venvs/papermill/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/pgierz/.local/share/pipx/venvs/papermill/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/pgierz/.local/share/pipx/venvs/papermill/lib/python3.10/site-packages/click/core.p

: 1

```{hint}
On several of the HPC systems, a shared kernel may be available! Ask your admins!
```
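
To find out which kernel names you can pass to `--kernel`, you can ask Jupyter directly with `jupyter kernelspec list`. The sketch below wraps that command in a small function of my own (`list_kernel_names` is a hypothetical name) and simply returns an empty list if `jupyter` is missing or the call fails, so treat an empty result as "could not tell" and not as "no kernels exist".

```python
import json
import shutil
import subprocess


def list_kernel_names():
    """Return the kernel names Jupyter reports, or [] if they cannot be listed."""
    if shutil.which("jupyter") is None:
        return []
    try:
        result = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True,
            text=True,
            check=False,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    try:
        # The JSON output maps kernel names to their specs under "kernelspecs"
        specs = json.loads(result.stdout).get("kernelspecs", {})
    except (json.JSONDecodeError, AttributeError):
        return []
    return sorted(specs)


kernel_names = list_kernel_names()
print(kernel_names)
```

On my laptop I would expect `main-toolbox` to show up in that list; on an HPC system, this is also a quick way to see whether a shared kernel has been registered.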

### The Python Way

In [3]:
import papermill as pm

In [4]:
pm.inspect_notebook("/tmp/analysis_notebooks/fesom/fesom_amoc.ipynb")

{'experiment_path': {'name': 'experiment_path',
  'inferred_type_name': 'None',
  'default': '"/some/path/to/experiment"',
  'help': ''},
 'newest_amoc_files': {'name': 'newest_amoc_files',
  'inferred_type_name': 'None',
  'default': '30',
  'help': ''},
 'add_contour_lines': {'name': 'add_contour_lines',
  'inferred_type_name': 'None',
  'default': 'False',
  'help': ''},
 'add_contour_labels': {'name': 'add_contour_labels',
  'inferred_type_name': 'None',
  'default': 'True',
  'help': ''}}
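
From here, a Python version of the batch loop could look roughly like the following. This is only a sketch: `plan_batch` and `run_batch` are hypothetical helpers I made up, while `pm.execute_notebook` with its `parameters` and `kernel_name` keyword arguments comes from the `papermill` Python API. The actual execution is left commented out, since it needs `papermill`, the cloned notebooks, and a kernel that provides the plotting tools.

```python
from pathlib import Path


def plan_batch(notebooks, output_dir="."):
    """Pair each template notebook with its `<name>_processed.ipynb` output path."""
    output_dir = Path(output_dir)
    return [
        (str(notebook), str(output_dir / f"{Path(notebook).stem}_processed.ipynb"))
        for notebook in notebooks
    ]


def run_batch(notebooks, parameters, kernel_name=None, output_dir="."):
    """Execute every notebook with the same parameters via papermill."""
    # Imported lazily so that planning works even without papermill installed
    import papermill as pm

    for template, result in plan_batch(notebooks, output_dir):
        print(f"Batch processing {template}")
        print(f"Saving result to {result}")
        pm.execute_notebook(
            template,
            result,
            parameters=parameters,
            kernel_name=kernel_name,
        )


templates = [
    "/tmp/analysis_notebooks/fesom/fesom_amoc.ipynb",
    "/tmp/analysis_notebooks/fesom/fesom_sst_climmean.ipynb",
]
parameters = {
    "experiment_path": "/isibhv/projects/paleo_work/example_runs/awiesm-2.1/pre-industrial",
    "newest_files": 10,
    "newest_amoc_files": 50,
}

# Dry run: only show which output notebook each template would produce
for template, result in plan_batch(templates):
    print(template, "->", result)

# Uncomment to execute for real:
# run_batch(templates, parameters, kernel_name="main-toolbox")
```

Keeping the planning step separate from the execution step makes it easy to double-check the output names before committing to a long run.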

### The ESM-Tools Way

:::{admonition} {material-regular}`engineering;1.5rem;sd-mr-1` Currently Under Construction
:class: no-icon
In the near future, you will be able to embed this into your runscript and use:
```console
$ esm_runscripts <RUNSCRIPT>.yaml -t viz
```
to get a default collection.
:::