# IceNet: Pipeline usage

## Context

### Purpose
The first notebook demonstrated the use of the high-level command-line interface (CLI) commands of the IceNet library to download, process, train and predict from end to end.

Now that you have gone through the basic steps of running the IceNet model via the CLI, you may wish to establish a framework to run the model automatically for end-to-end runs. This is often called a Pipeline. A Pipeline can schedule ongoing model runs or run multiple model variations simultaneously.

This notebook illustrates the use of helper scripts from the IceNet pipeline repository for testing and producing operational forecasts.

Please do go through the first notebook before proceeding with this, as the data download exists outside of the pipeline.

### Highlights
The key features of an end to end run are:

 0. [Introduction](#0-Introduction)
 1. [Setup](#1-Setup)
 1. [Process](#2-Process)
 1. [Train](#3-Train)
 1. [Predict](#4-Predict)
 1. [Visualise](#5-Visualise)

### Contributions
#### Notebook

James Byrne (author)

Matthew Gascoyne

Bryn Noel Ubald

__Please raise issues [in this repository](https://github.com/icenet-ai/icenet-notebooks/issues) to suggest updates to this notebook!__ 

Contact me at _jambyr \<at\> bas.ac.uk_ for anything else...

#### Modelling codebase
James Byrne (code author), Tom Andersson (science author)

#### Modelling publications
Andersson, T.R., Hosking, J.S., Pérez-Ortiz, M. et al. Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nat Commun 12, 5124 (2021). https://doi.org/10.1038/s41467-021-25257-4

#### Involved organisations
The Alan Turing Institute and British Antarctic Survey

# 0. Introduction

## CLI vs Library vs Pipeline usage

The IceNet package is designed to support automated runs from end to end by exposing the CLI operations demonstrated in the first notebook. These are simple wrappers around the library itself, and __any__ step of this can be undertaken manually or programmatically by inspecting the relevant endpoints. 

IceNet can be run in a number of ways: from the command line, through the Python interface, or as a pipeline.

The rule of thumb to follow: 

* Use the [pipeline repository](https://github.com/icenet-ai/icenet-pipeline) if you want to run the end to end IceNet processing out of the box.
* Adapt or customise this process using `icenet_*` commands described in this notebook and in the scripts contained in [the pipeline repo](https://github.com/icenet-ai/icenet-pipeline).
* For ultimate customisation, you can interact with the IceNet repository programmatically (which is how the CLI commands operate). For more information look at the [IceNet CLI implementations](https://github.com/JimCircadian/icenet2/blob/main/setup.py#L32) and the [library notebook](04.library_usage.ipynb), along with the [library documentation](#TODO).

## Using the Pipeline

Now that you have gone through the basic steps of running the IceNet model via the high-level CLI commands, you may wish to establish a framework to run the model automatically for end-to-end runs. This is often called a Pipeline. A Pipeline can schedule ongoing model runs or run multiple model variations simultaneously. The pipeline is driven by a series of bash scripts and an `ENVS` environment configuration file.

![Diagram of IceNet and its pipeline](./pipeline_diagram3.png "IceNet pipeline diagram displaying process blocks and data being processed from input on the left to output on the right, through the pipeline")

To automatically produce daily IceNet forecasts we train multiple variations of the model, each with different starting conditions. We call this ensemble training. Then we run predictions for each model variation, producing a mean and error across the whole model ensemble. This captures some of the model uncertainty.
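As a rough illustration of the ensemble statistics described above, the following sketch computes a per-cell mean and standard deviation across members using only the standard library. The function name `ensemble_stats`, the member values and the use of the population standard deviation are all assumptions made for illustration; the pipeline's own calculation may differ.

```python
from statistics import fmean, pstdev


def ensemble_stats(members):
    """Return per-cell (means, stddevs) across ensemble members.

    `members` is a list of equally sized flat grids (lists of floats),
    one per ensemble member. Using the population standard deviation
    is an assumption of this sketch.
    """
    cells = list(zip(*members))  # regroup the values cell by cell
    means = [fmean(cell) for cell in cells]
    stddevs = [pstdev(cell) for cell in cells]
    return means, stddevs


# Two hypothetical members on a three-cell grid of sea ice concentration
member_a = [0.90, 0.50, 0.00]
member_b = [0.80, 0.70, 0.00]

means, stddevs = ensemble_stats([member_a, member_b])
print(means)
print(stddevs)
```

Cells where the members agree have zero spread, while cells where they disagree carry a larger standard deviation, which is the sense in which the ensemble captures some model uncertainty.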

### Data

This assumes that you have a data store in a `data/` folder (this can be the same `data/` directory generated when running through the first notebook).

If you are following this series of notebooks, run through the `1. Download` section of the first notebook to download the relevant data before proceeding.

### Ensemble Running

To do this, the [icenet-pipeline](https://www.github.com/icenet-ai/icenet-pipeline) repository is available. It offers the `run_train_ensemble.sh` and `run_predict_ensemble.sh` scripts, which operate similarly to the `icenet_train` and `icenet_predict` CLI commands from the IceNet library demonstrated in the first notebook.

# 1. Setup

## Get the IceNet Pipeline

Before progressing you will need to clone the icenet-pipeline repository. Assuming you have followed the directory structure from the first notebook:


```bash
git clone https://www.github.com/icenet-ai/icenet-pipeline.git green
ln -s green notebook-pipeline
cd icenet-notebooks
```

We clone a 'fresh' pipeline repository into a directory called 'green' (as an arbitrary way of identifying the fresh pipeline) and then symbolically link to it. This allows us to symbolically swap to another pipeline later if we want to.

```
my-icenet-project/       <--- we're in here!
├── data/
├── icenet-notebooks/
├── green/               <--- Clone of icenet-pipeline
└── notebook-pipeline@   <--- Symlink to the green/ directory
```
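The symlink swap mentioned above can be sketched in Python as follows. This is only an illustration run in a throwaway directory: the helper `point_symlink` and the second clone named `blue` are hypothetical, and the same result can be achieved with `ln -sfn` in the shell.

```python
import os
import tempfile


def point_symlink(link_path, target):
    """Point `link_path` at `target`, replacing any existing symlink."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)     # relative target, like `ln -s green notebook-pipeline`
    os.replace(tmp_link, link_path)  # rename over the old link in one step


# Throwaway project directory; "blue" is a hypothetical second pipeline clone
project = tempfile.mkdtemp()
os.mkdir(os.path.join(project, "green"))
os.mkdir(os.path.join(project, "blue"))
link = os.path.join(project, "notebook-pipeline")

point_symlink(link, "green")
print(os.readlink(link))

point_symlink(link, "blue")
print(os.readlink(link))
```

Because everything else refers to `notebook-pipeline`, swapping the link changes which clone is used without editing any paths.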

In [1]:
# Viewing symbolically linked files.
!find .. -maxdepth 1 -type l -ls

317836352179    0 lrwxrwxrwx   1 bryald   ailab           5 Mar  7 11:12 ../notebook-pipeline -> green


## Configure the Pipeline

Move into the `notebook-pipeline` directory.

In [2]:
import os
os.chdir("../notebook-pipeline")
!pwd

/data/hpcdata/users/bryald/git/icenet/green


The pipeline is driven by environment variables that are defined within an `ENVS` file.

There is an example ENVS file (`ENVS.example`) which is symbolically linked to by `ENVS`.

Before running through this notebook, please update the following variables in the ENVS file to point to your icenet conda environment (if different to the default):

<pre>
export ICENET_HOME=${ICENET_HOME:-${HOME}/icenet/${ICENET_ENVIRONMENT}}
export ICENET_CONDA=${ICENET_CONDA:-${HOME}/conda-envs/icenet}
</pre>
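The `${NAME:-default}` form used in these exports means "use the existing value of `NAME` if it is set and non-empty, otherwise fall back to the default". A minimal Python sketch of that resolution is shown below; the helper `resolve` and the example paths are hypothetical and only mimic the shell behaviour.

```python
import os


def resolve(name, default, env=None):
    """Mimic the shell's ${NAME:-default}: fall back when NAME is unset or empty."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    return value if value else default


# Hypothetical environment in which only ICENET_CONDA has been overridden
env = {
    "HOME": "/home/user",
    "ICENET_ENVIRONMENT": "notebook-pipeline",
    "ICENET_CONDA": "/opt/conda/envs/icenet",
}

icenet_home = resolve(
    "ICENET_HOME", f"{env['HOME']}/icenet/{env['ICENET_ENVIRONMENT']}", env
)
icenet_conda = resolve("ICENET_CONDA", f"{env['HOME']}/conda-envs/icenet", env)

print(icenet_home)
print(icenet_conda)
```

In other words, exporting `ICENET_CONDA` in your shell before running the pipeline scripts takes precedence over the default written in the ENVS file.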

In [3]:
# Looking at the symlinked files in the `notebook-pipeline` directory
!find . -maxdepth 1 -type l -ls

225493182502    0 lrwxrwxrwx   1 bryald   ailab           7 Mar  7 12:50 ./data -> ../data
225493182496    0 lrwxrwxrwx   1 bryald   ailab          12 Mar  7 18:15 ./ENVS -> ENVS.example


# 2. Process

The following command processes the downloaded data for the dates defined in the ENVS file.

This is equivalent to running `icenet_process_era5`, `icenet_process_ora5`, `icenet_process_sic`, `icenet_process_metadata` commands from the IceNet library (as demonstrated in the first notebook).

The arguments passed to these commands are obtained from the `PROC_ARGS_*` variables in the ENVS file.

The dates that are processed are defined by the following variables in the ENVS file:
* `TRAIN_START_*`
* `TRAIN_END_*`
* `VAL_START_*`
* `VAL_END_*`
* `TEST_START_*`
* `TEST_END_*`

This only needs to be run once unless the above variables need to be changed.
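To see how start and end variables translate into the per-category date counts reported during processing, here is a small sketch that expands inclusive date ranges. The date values are hypothetical stand-ins for the `TRAIN_*`, `VAL_*` and `TEST_*` variables, and treating each pair as a single inclusive range is an assumption; check your own ENVS file for the real values and format.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Return every date from `start` to `end` inclusive (ISO strings in)."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


# Hypothetical values standing in for the *_START_* / *_END_* variables
splits = {
    "train": ("2020-01-01", "2020-03-31"),
    "val": ("2020-04-03", "2020-04-20"),
    "test": ("2020-04-01", "2020-04-02"),
}

counts = {name: len(date_range(start, end)) for name, (start, end) in splits.items()}
for name, count in counts.items():
    print(f"Got {count} dates for {name}")
```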

In [4]:
!./run_data.sh south


CondaError: Run 'conda init' before 'conda activate'

[08-03-24 10:16:27 :INFO    ] - Got 91 dates for train
[08-03-24 10:16:27 :INFO    ] - Got 18 dates for val
[08-03-24 10:16:27 :INFO    ] - Got 2 dates for test
[08-03-24 10:16:27 :INFO    ] - Creating path: ./processed/demo_pipeline_south/era5
[08-03-24 10:16:27 :DEBUG   ] - Setting range for linear trend steps based on 7
[08-03-24 10:16:27 :INFO    ] - Processing 91 dates for train category
[08-03-24 10:16:27 :INFO    ] - Including lag of 1 days
[08-03-24 10:16:27 :INFO    ] - Including lead of 93 days
[08-03-24 10:16:27 :DEBUG   ] - Globbing train from ./data/era5/south/**/[12]*.nc
[08-03-24 10:16:27 :DEBUG   ] - Globbed 5 files
[08-03-24 10:16:27 :DEBUG   ] - Create structure of 5 files
[08-03-24 10:16:27 :INFO    ] - No data found for 2019-12-31, outside data boundary perhaps?
[08-03-24 10:16:27 :INFO    ] - Processing 18 dates for val category
[08-03-24 10:16:27 :INFO    ] - Including lag of 1 days
[08-03-24 10:16:27 :INFO   

# 3. Train



To produce forecasts in the pipeline described above, we actually run a set of models using the [model-ensembler](https://github.com/JimCircadian/model-ensembler) tool, and there are convenience scripts for doing this as part of the end-to-end run.

This requires the [model-ensembler](https://pypi.org/project/model-ensembler/) module to be installed (`pip install model-ensembler`).

Note that the model-ensembler submits jobs. To configure the job scripts, edit the `.yaml` templates used to generate them, which live in the `ensemble/` folder of your clone of the `icenet-pipeline` repository (in particular [`train.tmpl.yaml`](https://github.com/icenet-ai/icenet-pipeline/blob/main/ensemble/train.tmpl.yaml) for the training ensemble jobs).

Many of the arguments for the following command are equivalent to those of the `icenet_train` command. However, the filters factor that `icenet_train` takes as `-n` is passed as `-f` in this example (note that because I'm running on a cluster I've doubled this), and there are additional arguments: `-n` for the node to run on, `-p` for the pre_run script to use and `-j` for the number of simultaneous runs to execute on the SLURM cluster we use at BAS. These arguments are not necessarily required for other clusters, nor is the model-ensembler limited to running on SLURM (at present it can also run locally).
The `-e` flag defines the number of epochs to run.

The pipeline repository shell scripts that provide this functionality are easily adaptable, as is the ensemble configuration itself, which is stored in the pipeline repository under `ensemble/`.

_Please review the `-h` help option for the script to gain further insight into the options available._
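If you want to launch training from Python rather than a shell cell, the invocation used in this section can be assembled as an argument list. This is a sketch only: the helper `build_train_command` and its parameter names are hypothetical, it uses only the flags that appear in this notebook, and it builds the command without running it.

```python
import shlex


def build_train_command(loader, dataset, network, epochs, filter_factor,
                        memory, queue, max_jobs):
    """Assemble the run_train_ensemble.sh argument list without executing it."""
    return [
        "./run_train_ensemble.sh",
        "-e", str(epochs),
        "-f", str(filter_factor),
        "-m", memory,
        "-q", str(queue),
        "-j", str(max_jobs),
        loader, dataset, network,
    ]


cmd = build_train_command(
    "demo_pipeline_south",
    "demo_pipeline_south",
    "demo_pipeline_south_ensemble",
    epochs=10, filter_factor=0.6, memory="64gb", queue=4, max_jobs=5,
)
print(shlex.join(cmd))
# To execute from the pipeline directory: subprocess.run(cmd, check=True)
```

Keeping the arguments as a list avoids shell quoting issues and makes it straightforward to loop over several configurations.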

In [5]:
# Positional Arguments
# argument 1: The loader json file:          loader.demo_pipeline_south.json
# argument 2: The dataset json file:         dataset_config.demo_pipeline_south.json
# argument 3: The trained network name:      demo_pipeline_south_ensemble
!./run_train_ensemble.sh -e 10 -f 0.6 -m 64gb -q 4 -j 5 demo_pipeline_south demo_pipeline_south demo_pipeline_south_ensemble

ARGS: -e 10 -f 0.6 -m 64gb -q 4 -j 5 demo_pipeline_south demo_pipeline_south demo_pipeline_south_ensemble
ARGS = -x arg_epochs=10 arg_filter_factor=0.6 mem=64gb arg_queue=4 , Leftovers: demo_pipeline_south demo_pipeline_south demo_pipeline_south_ensemble
No. of ensemble members:  2
Ensemble members:  42,46
Running model_ensemble ./tmp.oAHsq79Fm2.train slurm -x arg_epochs=10 arg_filter_factor=0.6 mem=64gb arg_queue=4 


[08-03-24 10:18:08    :INFO    ] - Model Ensemble Runner
[08-03-24 10:18:08    :INFO    ] - Validated configuration file ./tmp.oAHsq79Fm2.train successfully
[08-03-24 10:18:08    :INFO    ] - Importing model_ensembler.cluster.slurm
[08-03-24 10:18:08    :INFO    ] - Running batcher
[08-03-24 10:18:08    :INFO    ] - Running command: mkdir -p ./results/networks
[08-03-24 10:18:08    :INFO    ] - Start batch: 2024-03-08 10:18:08.118741
[08-03-24 10:18:08    :INFO    ] - Running cycle 1
[08-03-24 10:18:08    :INFO    ] - Start run demo_pipeline_south_ensemble-0 at 2024-03-08 10:18:08.123177
[08-03-24 10:18:08    :INFO    ] - rsync -aXE ../template/ /data/hpcdata/users/bryald/git/icenet/green/ensemble/demo_pipeline_south_ensemble/demo_pipeline_south_ensemble-0/
[08-03-24 10:18:08    :INFO    ] - Start run demo_pipeline_south_ensemble-1 at 2024-03-08 10:18:08.138182
[08-03-24 10:18:08    :INFO    ] - rsync -aXE ../template/ /data/hpcdata/users/bryald/git/icenet/green/ensemble/demo_pipeline_

# 4. Predict

In a similar manner to the training script, the `run_predict_ensemble` script will submit jobs to the HPC. The template corresponding to the prediction run is [`predict.tmpl.yaml`](https://github.com/icenet-ai/icenet-pipeline/blob/main/ensemble/predict.tmpl.yaml) found in the `icenet-pipeline` repo.

For the ensemble prediction, we define the dates we want to predict for in a CSV file. This can be generated automatically from the dataset as follows.

In [6]:
!./loader_test_dates.sh demo_pipeline_south | tee testdates.csv

2020-04-01
2020-04-02
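The resulting file simply holds one ISO-formatted date per line, so it is easy to read back or validate before launching a prediction run. A minimal sketch, with the hypothetical helper `read_date_file` and an in-memory stand-in for `testdates.csv`:

```python
import io
from datetime import date


def read_date_file(handle):
    """Parse one ISO date per line, skipping blank lines."""
    return [date.fromisoformat(line.strip()) for line in handle if line.strip()]


# Stand-in for open("testdates.csv"), using the two dates listed above
dates = read_date_file(io.StringIO("2020-04-01\n2020-04-02\n"))
print(dates)
```

`date.fromisoformat` raises a `ValueError` on a malformed line, which makes this a cheap sanity check for hand-edited date files.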


First look at the required input arguments for running the prediction ensemble.

In [7]:
!./run_predict_ensemble.sh --help

Usage ./run_predict_ensemble.sh NETWORK DATASET NAME DATEFILE [LOADER]


So to predict from an ensemble training run, we use:  

| Argument               | Value |
|-----------------------:|:------|
| *neural network name*  | demo_pipeline_south_ensemble |
| *dataset*              | demo_pipeline_south |
| *name of ensemble run* | example_south_ensemble_forecast |
| *which dates to run*   | testdates.csv |

In [8]:
# -b: batch size
# -f: n_filters_factor
# -p: prep bash script
!./run_predict_ensemble.sh -f 0.6 -b 1 -p bashpc.sh demo_pipeline_south_ensemble demo_pipeline_south example_south_ensemble_forecast testdates.csv

ARGS: -f 0.6 -b 1 -p bashpc.sh demo_pipeline_south_ensemble demo_pipeline_south example_south_ensemble_forecast testdates.csv
ARGS = -x arg_filter_factor=0.6 arg_batch=1 arg_prep=bashpc.sh , Leftovers: demo_pipeline_south_ensemble demo_pipeline_south example_south_ensemble_forecast testdates.csv
No. of ensemble members:  2
Ensemble members:  42,46
Running model_ensemble ./tmp.Q5mXUXN6Zg.predict slurm -x arg_filter_factor=0.6 arg_batch=1 arg_prep=bashpc.sh 
[08-03-24 10:26:46    :INFO    ] - Model Ensemble Runner
[08-03-24 10:26:46    :INFO    ] - Validated configuration file ./tmp.Q5mXUXN6Zg.predict successfully
[08-03-24 10:26:46    :INFO    ] - Importing model_ensembler.cluster.slurm
[08-03-24 10:26:46    :INFO    ] - Running batcher
[08-03-24 10:26:46    :INFO    ] - Start batch: 2024-03-08 10:26:46.067519
[08-03-24 10:26:46    :INFO    ] - Running cycle 1
[08-03-24 10:26:46    :INFO    ] - Running command: /usr/bin/ln -s ../../data
[08-03-24 10:26:46    :INFO    ] - Start run examp

As with the previous example, the individual numpy outputs, samples and sample weights are deposited into `/results/predict` for each ensemble member. However, the ensemble also runs `icenet_output` to generate __a CF-compliant NetCDF containing the forecasts requested__ which can then be post-processed or [deposited to an external location](#Uploading-to-Azure) (which is the platform for the [wider IceNet forecasting infrastructure](https://github.com/alan-turing-institute/IceNet-Project)). 

In [9]:
!ls ./results/predict/example_south_ensemble_forecast

demo_pipeline_south_ensemble.42  demo_pipeline_south_ensemble.46
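Judging by the listing above, each ensemble member gets its own directory named after the network followed by a numeric suffix. Assuming that naming holds (it is inferred from this output, not from documentation), the member identifiers can be recovered as in the sketch below; `member_ids` is a hypothetical helper.

```python
def member_ids(entries, network):
    """Extract numeric member suffixes from names of the form `<network>.<id>`."""
    prefix = network + "."
    ids = []
    for entry in entries:
        suffix = entry[len(prefix):] if entry.startswith(prefix) else ""
        if suffix.isdigit():
            ids.append(int(suffix))
    return sorted(ids)


# Directory names as listed above; in practice they could come from os.listdir()
entries = ["demo_pipeline_south_ensemble.42", "demo_pipeline_south_ensemble.46"]
ids = member_ids(entries, "demo_pipeline_south_ensemble")
print(ids)
```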


# 5. Visualise

## View the forecast output from the pipeline

In [10]:
from icenet.plotting.video import xarray_to_video as xvid
from icenet.data.sic.mask import Masks
from IPython.display import HTML
import xarray as xr, pandas as pd, datetime as dt

ds = xr.open_dataset("results/predict/example_south_ensemble_forecast.nc")
land_mask = Masks(south=True, north=False).get_land_mask()
ds.info()

xarray.Dataset {
dimensions:
	time = 2 ;
	yc = 432 ;
	xc = 432 ;
	leadtime = 7 ;

variables:
	int32 Lambert_Azimuthal_Grid() ;
		Lambert_Azimuthal_Grid:grid_mapping_name = lambert_azimuthal_equal_area ;
		Lambert_Azimuthal_Grid:longitude_of_projection_origin = 0.0 ;
		Lambert_Azimuthal_Grid:latitude_of_projection_origin = -90.0 ;
		Lambert_Azimuthal_Grid:false_easting = 0.0 ;
		Lambert_Azimuthal_Grid:false_northing = 0.0 ;
		Lambert_Azimuthal_Grid:semi_major_axis = 6378137.0 ;
		Lambert_Azimuthal_Grid:inverse_flattening = 298.257223563 ;
		Lambert_Azimuthal_Grid:proj4_string = +proj=laea +lon_0=0 +datum=WGS84 +ellps=WGS84 +lat_0=-90.0 ;
	float32 sic_mean(time, yc, xc, leadtime) ;
		sic_mean:long_name = mean sea ice area fraction across ensemble runs of icenet model ;
		sic_mean:standard_name = sea_ice_area_fraction ;
		sic_mean:short_name = sic ;
		sic_mean:valid_min = 0 ;
		sic_mean:valid_max = 1 ;
		sic_mean:ancillary_variables = sic_stddev ;
		sic_mean:grid_mapping = Lambert_Azimuth

In [11]:
forecast_date = ds.time.values[0]
print(forecast_date)

2020-04-01T00:00:00.000000000


In [12]:
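# Select the ensemble mean for the first forecast date, then relabel the
# lead-time axis as "time" so each lead time becomes an animation frame.
fc = ds.sic_mean.isel(time=0).drop_vars("time").rename(dict(leadtime="time"))
# Replace each lead time (in days) with the calendar date it refers to.
fc['time'] = [pd.to_datetime(forecast_date) \
              + dt.timedelta(days=int(e)) for e in fc.time.values]

# Animate the forecast with the land mask applied and display it inline.
anim = xvid(fc, 15, figsize=4, mask=land_mask)
HTML(anim.to_jshtml())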

## Other Pipeline Considerations

### A bit more information on ensemble runs

#### Cleaning up runs

Ensemble runs take place under `/ensemble/` in the pipeline folder and are NOT deleted once they finish, to allow for debugging. Ensemble configurations will commonly contain a delete task to remove the extraneous run folders, but this is not yet the case. __In the meantime this should be done manually__ after running `run_train_ensemble` or `run_predict_ensemble`.

The only exception to this is the use of `run_daily.sh` (see below) which does clean up prior to rerunning. 
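A manual clean-up could look something like the sketch below, which is demonstrated on a throwaway directory tree rather than a real pipeline. It assumes run folders are named `<ensemble name>-<index>` inside `ensemble/<ensemble name>/`, as the training log output suggests; the helpers `find_run_dirs` and `clean_runs` are hypothetical, and it is worth running with `dry_run=True` first.

```python
import re
import shutil
import tempfile
from pathlib import Path


def find_run_dirs(ensemble_root, name):
    """List run directories named `<name>-<index>` under `<ensemble_root>/<name>/`."""
    pattern = re.compile(re.escape(name) + r"-\d+")
    base = Path(ensemble_root) / name
    if not base.is_dir():
        return []
    return sorted(
        p for p in base.iterdir() if p.is_dir() and pattern.fullmatch(p.name)
    )


def clean_runs(ensemble_root, name, dry_run=True):
    """Remove run directories; with dry_run=True only report what would go."""
    run_dirs = find_run_dirs(ensemble_root, name)
    for run_dir in run_dirs:
        if not dry_run:
            shutil.rmtree(run_dir)
    return run_dirs


# Throwaway tree imitating the assumed layout, plus a folder that must survive
root = Path(tempfile.mkdtemp()) / "ensemble"
for sub in ("demo_pipeline_south_ensemble-0", "demo_pipeline_south_ensemble-1"):
    (root / "demo_pipeline_south_ensemble" / sub).mkdir(parents=True)
(root / "template").mkdir()

removed = clean_runs(root, "demo_pipeline_south_ensemble", dry_run=False)
print([p.name for p in removed])
```

Matching on the full folder name keeps templates and configuration files in `ensemble/` untouched.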

### Daily execution

Daily execution is facilitated in the pipeline by using [`run_daily.sh`](https://github.com/antarctica/IceNet-Pipeline/blob/main/run_daily.sh). This wraps all the necessary steps to perform the following sequence for producing forecasts from yesterday for the next 93 days, for both northern and southern hemispheres. 

* Removes any old ensemble runs
* Downloads [HRES forecast data from the ECMWF MARS API](https://www.ecmwf.int/en/forecasts/datasets/catalogue-ecmwf-real-time-products)
* Processes the HRES and necessary training metadata to produce a data loader
* Creates a dataset configuration for it
* Runs a [prediction ensemble](#4-Predict) to produce a NetCDF
* Uploads to the necessary endpoint
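The sequence above is a simple ordered workflow. As a loose illustration of how such a sequence might be orchestrated from Python, the sketch below runs named steps in order and stops at the first failure. The step functions are placeholders and the stop-on-failure behaviour is an assumption of this sketch; it does not describe how `run_daily.sh` itself handles errors.

```python
def run_steps(steps):
    """Run (name, callable) pairs in order; stop at the first failure."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as error:
            print(f"{name} failed: {error}")
            break
        completed.append(name)
    return completed


def succeed():
    """Placeholder for a step that completes normally."""


def fail_download():
    """Placeholder simulating a download that is not yet available."""
    raise RuntimeError("forecast data not available yet")


steps = [
    ("remove old ensemble runs", succeed),
    ("download HRES data", fail_download),
    ("process data", succeed),
    ("create dataset configuration", succeed),
    ("run prediction ensemble", succeed),
    ("upload", succeed),
]

completed = run_steps(steps)
print(completed)
```

Stopping early avoids uploading a forecast built from stale or missing inputs.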

#### Automation

With the above shell script it's trivial to automate using cron. Of course this is simply for demonstration, with more complex workflow managers offering far greater flexibility, especially when considering analysis of the produced forecasts.

```bash
# We assume your environment is configured appropriately to run conda from cron,
# for example by adding the following two lines at the top of your crontab,
# with conda initialisation placed in ~/.bashrc_env:
#
# SHELL=/bin/bash
# BASH_ENV=~/.bashrc_env
#
25 9 * * * conda activate icenet; cd $HOME/hpc/icenet/pipeline && bash run_daily.sh >$HOME/daily.log 2>&1; conda deactivate
```
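The schedule fields `25 9 * * *` mean minute 25 of hour 9, every day, so the job fires daily at 09:25. A small sketch for working out when such an entry next fires, useful when checking a schedule against data availability (`next_daily_run` is a hypothetical helper that handles only this simple daily form):

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour=9, minute=25):
    """Return the next datetime matching a daily `minute hour * * *` cron entry."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return candidate


after_slot = next_daily_run(datetime(2024, 3, 8, 10, 16))
before_slot = next_daily_run(datetime(2024, 3, 8, 8, 0))
print(after_slot)
print(before_slot)
```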

TODO: more information on the usage of this command.

## Summary

Within this notebook we've attempted to give a full crash course in running the CLI tools both __manually__ and using the __pipeline helper scripts__. This is the second of five (currently) notebooks; the following notebooks cover further information: 

* [Data structure and analysis](03.data_analysis.ipynb): understand the structure of the data stores and products created by these workflows and what tools currently exist in IceNet to look over them.
* [Library usage](04.library_usage.ipynb): understand how to programmatically perform an end to end run.
* [Library extension](05.library_extension.ipynb): understand why and how to extend the IceNet library.

## Version
- IceNet Codebase: v0.2.7

___

### Configure the Pipeline

The `run_predict_ensemble` script will submit jobs to the HPC, and those jobs are generated using `.yaml` templates.  
These templates live in the `ensemble/` folder of the cloned pipeline repository and are used to configure the jobs. In particular, [`predict.tmpl.yaml`](https://github.com/icenet-ai/icenet-pipeline/blob/main/ensemble/predict.tmpl.yaml) is used for the prediction ensemble jobs.

Move into the `notebook-pipeline` directory.

In [None]:
os.chdir("../notebook-pipeline")
!pwd

Before running the prediction ensemble you must check that the pipeline `ENVS` configuration is correct.

In [None]:
!find . -maxdepth 1 -type l -ls

You can see that the `ENVS` config points to `ENVS.example`. Within this file, make sure the second and third 'export' commands refer to the correct location of your cloned icenet repository and your conda environment respectively.
<pre>
export ICENET_HOME=${ICENET_HOME:-${HOME}/icenet/${ICENET_ENVIRONMENT}}
export ICENET_CONDA=${ICENET_CONDA:-${HOME}/conda-envs/icenet}
</pre>

### Run the Pipeline
First look at the required input arguments for running the prediction ensemble.

In [None]:
!./run_predict_ensemble.sh --help

So to run the ensemble we use:  

| Argument               | Value |
|-----------------------:|:------|
| *neural network name*  | notebook_ensemble |
| *dataset*              | notebook_data |
| *name of ensemble run* | example_south_ensemble_forecast |
| *which dates to run*   | testdates.csv |


***... and now there are more arguments/switches that aren't documented in the above help***

In [None]:
!./loader_test_dates.sh notebook_data | tee testdates.csv

In [None]:
!./run_predict_ensemble.sh demo_pipeline_south_ensemble demo_pipeline_south example_south_ensemble_forecast testdates.csv -b 1 -f 0.6 -p bashpc.sh

In [None]:
!../icenet/run_predict_ensemble.sh \
    notebook_ensemble notebook_data example_south_ensemble_forecast testdates.csv \
    -b 1 -f 0.6 -p bashpc.sh

***Here I want to highlight what the outcome of the ensemble was and show something graphical.***

As with the previous example, the individual numpy outputs, samples and sample weights are deposited into `/results/predict` for each ensemble member. However, the ensemble also runs `icenet_output` to generate __a CF-compliant NetCDF containing the forecasts requested__ which can then be post-processed or [deposited to an external location](#Uploading-to-Azure) (which is the platform for the [wider IceNet forecasting infrastructure](https://github.com/alan-turing-institute/IceNet-Project)). 

**Move back into the `icenet-notebooks` directory before generating the pipeline output.**

In [None]:
os.chdir("../icenet-notebooks-review")
!pwd

In [None]:
!icenet_output -o results/predict example_south_forecast notebook_data testdates.csv

Note that the ensemble run automatically handles this generation of output for the ensemble. For a single run this is relatively meaningless as there is only a single model making predictions, giving no uncertainty quantification, __so this is provided as an example only__. _Please review the `-h` help option for the script to gain further insight into the options available._


### View the forecast output from the pipeline

In [None]:
from icenet.plotting.video import xarray_to_video as xvid
from icenet.data.sic.mask import Masks
from IPython.display import HTML
import xarray as xr, pandas as pd, datetime as dt

ds = xr.open_dataset("results/predict/example_south_forecast.nc")
land_mask = Masks(south=True, north=False).get_land_mask()
ds.info()

In [None]:
forecast_date = ds.time.values[0]
print(forecast_date)

In [None]:
fc = ds.sic_mean.isel(time=0).drop_vars("time").rename(dict(leadtime="time"))
fc['time'] = [pd.to_datetime(forecast_date) \
              + dt.timedelta(days=int(e)) for e in fc.time.values]

anim = xvid(fc, 15, figsize=4, mask=land_mask)
HTML(anim.to_jshtml())

### Uploading to Azure

The following command uploads a specific date from the dataset produced by `icenet_output` to an Azure Blob Storage account.  

In [None]:
!icenet_upload_azure results/predict/example_south_forecast.nc 2020-04-30
