In [1]:
# Quick hack to put us in the root of the repository/pipeline directory
import os
if os.path.exists("01.cli_demonstration.ipynb"):
    os.chdir("..")
print("Running in {}".format(os.getcwd()))

Running in /data/hpcdata/users/jambyr/icenet/notebook-test


# IceNet CLI Usage

## Context

### Purpose
The IceNet library provides the ability to download, process, train and predict from end to end via a set of command-line interfaces. 

This notebook illustrates the CLI utilities that are available natively from the library and in conjunction with helper scripts from the pipeline repository, for testing and producing operational forecasts.

### Modelling approach
This modelling approach allows users to immediately utilise the library for producing sea ice concentration forecasts.

### Highlights
The key features of an end to end run are: 
* [Setup](#Setup)
* [Download](#Download) 
* [Process](#Process)
* [Train](#Train)
* [Predict](#Predict)

### Contributions
#### Notebook
James Byrne (author)

__Please raise issues [in this repository](https://github.com/antarctica/IceNet-Pipeline) to suggest updates to this notebook!__ 

Contact me at _jambyr \<at\> bas.ac.uk_ for anything else...

#### Modelling codebase
James Byrne (code author), Tom Andersson (science author)

#### Modelling publications
Andersson, T.R., Hosking, J.S., Pérez-Ortiz, M. et al. Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nat Commun 12, 5124 (2021). https://doi.org/10.1038/s41467-021-25257-4

#### Involved organisations
The Alan Turing Institute and British Antarctic Survey

## Setup

### Prerequisites

In order to undertake the following, I'm assuming you have the following at your disposal:

* A host to run this on
* A working conda installation on that host
* Either a SLURM cluster to submit jobs to or the ability to run locally
* Wherever you run, you want GPUs for training (predictions run fine without)
* Git, python and shell knowledge to a basic degree :-)
* There are numerous external facilities that we interface with, which it's assumed you're set up to use (otherwise check the options as they can be disabled/overlooked)
  * Data sources under [Climate and Sea Ice Data](#Climate-and-Sea-Ice-Data)
  * Wandb (Weights and Biases) - can be disabled when using `icenet_train`
  * Azure - we demonstrate native uploading which can be skipped if required

The important thing to follow this notebook is to clone the [IceNet-Pipeline repository](https://github.com/antarctica/IceNet-Pipeline). The cloned directory __will become your working directory for the rest of your work in the notebooks unless otherwise specified__. 

It's worth noting that this is already done as we're using the notebook here!

```bash
git clone git@github.com:antarctica/IceNet-Pipeline.git <targetFolder>
cd <targetFolder>
```

I've called my folder green as this was derived from the blue-green infrastructure used for operational forecasting at the moment in BAS.

__Generally I run these commands in a screen or tmux session, so that they can be picked up again later.__

### Environment Configuration

___TODO: update, as at time of writing the icenet package is not publicly available for installation and instead should be installed from source...___

Note that this is not run and stored in this notebook but is provided for reference. __The notebook assumes you're already running within the pipeline repository__ and using a suitable kernel.

```bash
./install_env.sh notebook-test
git clone git@github.com:JimCircadian/icenet2.git ../icenet2
pip install ../icenet2

# Do this every time you restart work in your shell OR change the kernel for the notebook appropriately
conda activate notebook-test
```

Of course if you want to edit the library in place and contribute, you'll need to fork the repo, amend your `git clone` address and add the `-e` flag to `pip install` to edit in place.

```
# Log from an install-env, it can take a while!

$ tail -f logs/install_env.log 
Environment notebook-test install started 2022-01-10 17:17:32
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... b'By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html\n'
b'By downloading and using the cuDNN conda packages, you accept the terms and conditions of the NVIDIA cuDNN EULA -\n  https://docs.nvidia.com/deeplearning/cudnn/sla/index.html\n'
done
Environment notebook-test install finished 2022-01-10 21:39:17
```

#### Commands

Once the icenet library is installed, you'll be able to access all commands made available by the library. Some are utilities that won't be covered, but using `icenet_<TAB>`-complete you should be able to see a list that includes (but _is not limited to_):

* icenet_data_cmip
* icenet_data_era5
* icenet_data_hres
* icenet_data_masks
* icenet_data_sic
* icenet_dataset_create
* icenet_output
* icenet_predict
* icenet_process_cmip
* icenet_process_era5
* icenet_process_hres
* icenet_process_metadata
* icenet_process_sic
* icenet_train
* icenet_upload_azure

All of these commands are either directly or indirectly (through pipeline shell scripts) used in this notebook...

All commands accept options such as `-v` for turning on verbose logging and `-h` for obtaining help about what options they offer. ___As is best practice for all commands in *nix land, use `-h` to obtain information about options___.
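
If tab-completion isn't available (for example from within a notebook kernel), a quick way to confirm the install worked is to look the commands up on the `PATH`. The following is a minimal sketch using only the Python standard library; the helper names `find_commands` and `missing_commands` are mine for illustration and are not part of IceNet, and the list is just a subset of the commands above.

```python
import shutil

# Subset of the command names listed above; extend as needed.
EXPECTED_COMMANDS = [
    "icenet_data_masks",
    "icenet_data_era5",
    "icenet_data_sic",
    "icenet_process_era5",
    "icenet_process_sic",
    "icenet_process_metadata",
    "icenet_dataset_create",
    "icenet_train",
    "icenet_predict",
    "icenet_output",
]


def find_commands(names):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_commands(names):
    """Return the names that could not be found on PATH, in input order."""
    return [name for name, path in find_commands(names).items() if path is None]


if __name__ == "__main__":
    for name in missing_commands(EXPECTED_COMMANDS):
        print("Not found on PATH: {}".format(name))
```

An empty result suggests the environment is activated and the package is installed; anything listed usually means the wrong conda environment or kernel is in use.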

### The idea behind end to end runs

The IceNet package is designed to support automated runs from end to end by exposing the above CLI operations. These are simple wrappers around the library itself, and __any__ step of this can be undertaken manually or programmatically by inspecting the relevant endpoints. 

___TL;DR: for those of you just wanting to skip straight to an end to end example in shell, [please look at the daily execution script](#Daily-execution)...___

The end to end execution methodology is illustrated by this diagram:

![Full IceNet operational workflow...](https://raw.githubusercontent.com/wiki/alan-turing-institute/IceNet-Project/Pipeline%20Layout.png)

The portion of this you're really interested in understanding is in the green box, however, with the IceNet-Pipeline directory (e.g. `green`) corresponding to the green box and thus being, essentially, an ephemeral environment. 

#### A tip behind source data

You'll see that `Source Data Store` is located outside the green box. Because of the expense and time required to interface with external sources, _we recommend the following step so that source data can be shared between environments_...

```bash
# Make a source data store outside our ephemeral environment
# For the sake of brevity the rest of the notebooks use a fresh environment
mkdir ../data
ln -s ../data
```
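
The same step can be scripted when environments are created programmatically. This is a sketch only: `link_shared_store` is a hypothetical helper of mine, and unlike the shell snippet it links to the resolved absolute path of the store rather than a relative one. It is demonstrated in a temporary directory so that nothing in the working directory is touched.

```python
import tempfile
from pathlib import Path


def link_shared_store(env_dir, store_dir, link_name="data"):
    """Create store_dir if needed and symlink it into env_dir as link_name."""
    env_dir = Path(env_dir)
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)

    link = env_dir / link_name
    if link.is_symlink() or link.exists():
        # Leave anything already present untouched rather than overwrite it
        return link
    link.symlink_to(store_dir.resolve(), target_is_directory=True)
    return link


# Demonstration in a throwaway directory: "green" is the ephemeral
# environment and "data" the shared store sitting alongside it.
with tempfile.TemporaryDirectory() as tmp:
    env = Path(tmp) / "green"
    env.mkdir()
    link = link_shared_store(env, Path(tmp) / "data")
    linked_ok = link.is_symlink() and link.resolve() == (Path(tmp) / "data").resolve()
    print("data ->", link.resolve())
```

Returning early when the link already exists keeps the helper safe to call every time an environment is set up.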

### Pipeline versus CLI versus Library usage

Though this notebook is tailored around use of the [IceNet-Pipeline repository](https://github.com/antarctica/IceNet-Pipeline) there is no dependency on this repository for using the `icenet_*` commands. The pipeline repository just offers helper scripts written in [`bash`](https://tldp.org/LDP/abs/html) for running an end to end pipeline out of the box. 

You are welcome to use any arbitrary directory to run the CLI scripts below. However, when it comes to the sections on [training](#Train) and [prediction](#Predict), as well as [running daily predictions](#Daily-execution), you'll notice that we leverage scripts from the pipeline repository. This is because these scripts interact with the [model ensembling tool from BAS](https://github.com/JimCircadian/model-ensembler) to train and predict across multiple model instances. 

The rule of thumb to follow: 

* Use the pipeline repository if you want to run the end to end IceNet processing out of the box.
* Adapt or customise this process using `icenet_*` commands described in this notebook and in the scripts contained in the pipeline repo.
* For ultimate customisation, you can interact with the IceNet repository programmatically (which is how the CLI commands operate.) For more information look at the [IceNet CLI implementations](https://github.com/JimCircadian/icenet2/blob/main/setup.py#L32) and the [library notebook](03.library_usage.ipynb), along with the [library documentation](#TODO). 

## Download

### Mask data

IceNet relies on some generated masks for training/prediction, which can be automatically generated very easily using `icenet_data_masks {north,south}`. Once performed, this does not need to be rerun under the pipeline directory...

In [3]:
!icenet_data_masks south

INFO:root:Creating path: ./data/masks
INFO:root:Creating hemisphere path: ./data/masks/sh
INFO:root:Creating var path: ./data/masks/sh/masks
INFO:root:Creating var path: ./data/masks/sh/siconca
--2022-01-11 23:32:06--  ftp://osisaf.met.no/reprocessed/ice/conc/v2p0/2000/01/ice_conc_sh_ease2-250_cdr-v2p0_200001021200.nc
           => ‘./data/masks/sh/siconca/2000/01/.listing’
Resolving osisaf.met.no (osisaf.met.no)... 157.249.75.10
Connecting to osisaf.met.no (osisaf.met.no)|157.249.75.10|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /reprocessed/ice/conc/v2p0/2000/01 ... done.
==> PASV ... done.    ==> LIST ... done.

    [ <=>                                   ] 6,567       --.-K/s   in 0s      

2022-01-11 23:32:07 (329 MB/s) - ‘./data/masks/sh/siconca/2000/01/.listing’ saved [6567]

--2022-01-11 23:32:07--  ftp://osisaf.met.no/reprocessed/ice/conc/v2p0/2000/01/ice_conc_sh_ease2-250_cdr-v2p0_200001021

### Climate and Sea Ice Data

Obtaining and preparing data is achieved using the `icenet_data_*` commands, which share the common arguments `hemisphere`, `start_date` and `end_date`. There are also implementation-specific options worth reviewing under `--help`. For example, ERA5 reanalysis data for the period from 28 December 2019 to 30 April 2020 can be downloaded thus:

In [4]:
!icenet_data_era5 south -c cdsapi 2019-12-28 2020-4-30

INFO:root:ERA5 Data Downloading
INFO:root:Creating path: ./data/era5
INFO:root:Building request(s), downloading and daily averaging from ERA5 API
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating hemisphere path: ./data/era5/sh
INFO:root:Creating var path: ./data/era5/sh/tas
INFO:root:Creating var path: ./data/era5/sh/tas
INFO:root:Creating var path: ./data/era5/sh/tas
INFO:root:Creating var path: ./data/era5/sh/tas
INFO:root:Creating var path: ./data/era5/sh/tas
INFO:root:Creating var path: ./data/era5/sh/ta500
INFO:root:Creating var path: ./data/era5/sh/ta500
INFO:root:Creating var path: ./data/era5/sh/ta500
INFO:root:Processing 4 dates
INFO:root:Downloading data 

For this relatively small dataset I've used the CDS API method directly (with `-c cdsapi`), though the native toolbox implementation can offer much better transfer efficiency for larger datasets.

In [7]:
!icenet_data_sic south 2019-12-28 2020-4-30

INFO:root:OSASIF-SIC Data Downloading
INFO:root:Creating path: ./data/osisaf
INFO:root:Downloading SIC datafiles to .temp intermediates...
  if el in self._invalid_dates:
INFO:root:Creating hemisphere path: ./data/osisaf/sh
INFO:root:Creating var path: ./data/osisaf/sh/siconca
INFO:root:FTP opening
INFO:root:Writing ./data/osisaf/sh/siconca/2019/2019_12_28.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2019/2019_12_29.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2019/2019_12_30.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2019/2019_12_31.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_01.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_02.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_03.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_04.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_05.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_06.nc
INFO:root:Writing ./data/osisaf/sh/siconca/2020/2020_01_07.nc
INFO:root:Writing 

By default, the IceNet commands regrid and rotate data as required to align with the OSISAF SIC data, which is used as the output for the dataset. Programmatic usage allows you to avoid this ([see notebook 03](03.library_usage.ipynb)).

At time of writing there are the following downloaders: 

* `icenet_data_era5` - downloads [ERA5 reanalysis](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset&keywords=((%20%22Product%20type:%20Reanalysis%22%20) data using either the CDS Toolbox or direct API
* `icenet_data_cmip` - downloads the prescribed experiments from [CMIP6](https://esgf-node.llnl.gov/search/cmip6/) for the original IceNet paper runs
* `icenet_data_hres` - downloads up to date [forecast generated data from the ECMWF MARS API](https://www.ecmwf.int/en/forecasts/datasets/catalogue-ecmwf-real-time-products)
* `icenet_data_sic` - downloads [OSISAF sea-ice concentration (SIC) data](https://osisaf-hl.met.no/v2p1-sea-ice-index)
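
Because the downloaders used above take the same positional arguments, it's easy to assemble the calls from a script rather than typing them out. The sketch below only builds the argument lists for the two commands run in this notebook; `download_command` is a hypothetical helper of mine, and it assumes nothing beyond the `hemisphere`, `start_date`, `end_date` ordering and the `-c cdsapi` option shown earlier.

```python
import shlex


def download_command(source, hemisphere, start_date, end_date, extra_args=()):
    """Assemble the argv list for an icenet_data_<source> call."""
    if hemisphere not in ("north", "south"):
        raise ValueError("hemisphere must be 'north' or 'south'")
    return ["icenet_data_{}".format(source), hemisphere,
            *extra_args, start_date, end_date]


# The two downloads run in the cells above
era5 = download_command("era5", "south", "2019-12-28", "2020-4-30",
                        extra_args=("-c", "cdsapi"))
sic = download_command("sic", "south", "2019-12-28", "2020-4-30")

print(shlex.join(era5))
print(shlex.join(sic))
```

Each list can be handed to `subprocess.run(cmd, check=True)` to execute the download; check `-h` for each command before adding options beyond those shown here.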

## Process

Processing takes the data made available through the source data store and undertakes the necessary normalisation for use as input channels to the UNet architecture. This intermediary step means that the original source data can be reused numerous times with varying training, validation and test date setups.

### Command example

In [11]:
!icenet_process_era5 notebook_data south -ns 2020-1-1 -ne 2020-3-31 -vs 2020-4-1 -ve 2020-4-20 -ts 2020-4-21 -te 2020-4-30 -l 3
!icenet_process_sic notebook_data south -ns 2020-1-1 -ne 2020-3-31 -vs 2020-4-1 -ve 2020-4-20 -ts 2020-4-21 -te 2020-4-30 -l 3
!icenet_process_metadata notebook_data south

2022-01-12 11:19:42.448470: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
INFO:root:Generated 91 dates for train
INFO:root:After reduction we have 91 train dates
INFO:root:Generated 20 dates for val
INFO:root:After reduction we have 20 val dates
INFO:root:Generated 10 dates for test
INFO:root:After reduction we have 10 test dates
INFO:root:Creating path: ./processed/notebook_data/era5
INFO:root:Processing 91 dates for train category
INFO:root:Including lag of 3 days
INFO:root:Processing 20 dates for val category
INFO:root:Including lag of 3 days
INFO:root:Processing 10 dates for test category
INFO:root:Including lag of 3 days
INFO:root:Got 124 files for hus1000
INFO:root:Got 124 files for psl
INFO:root:Got 124 files for rlds
INFO:root:Got 124 files for rsds
INFO:root:Got 124 files for ta500
INFO:root:Got 124 files for tas
INFO:root:Got 124 files for tos
INFO:root:Got 124 files for uas
INFO:root:Got 124 files for va

Consulting the command options will make the above more obvious (as well as further options) but a few things we can note that are helpful: 

* Options `-ns`, `-ne`, `-vs`, `-ve`, `-ts`, `-te`, which correspond to the start and end dates of the training, validation and test sets, allow ranges to be comma-delimited. The above example uses a single range per set, but supplying two comma-delimited start dates and two end dates produces a split training set spanning two periods, for example 2000-2009 and 2011-2019.
* These date ranges can be randomised and subsampled using `-d`, __though this is still a bit experimental__
* The `-l` option (which is for `--lag`) specifies the number of days back we look at input data variables for the output in question.

There are plenty of other options available for preprocessing the data, but it should be noted that whilst this is not strongly coupled to dataset creation, options like the lag specified here might influence the creation of datasets in the next step. 

These commands, especially with decadal ranges, can take a long time (12+ hours) to complete depending on the hosts/storage in use.
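
To sanity-check a set of date options before committing to a long processing run, the ranges can be expanded in plain Python. This is a sketch rather than IceNet's own implementation: `expand_ranges` is a hypothetical helper, and it assumes that the n-th comma-delimited start date pairs with the n-th end date.

```python
from datetime import datetime, timedelta


def expand_ranges(starts, ends, fmt="%Y-%m-%d"):
    """Expand comma-delimited start/end strings into a sorted list of dates.

    Assumption: the n-th start pairs with the n-th end.
    """
    start_list = starts.split(",")
    end_list = ends.split(",")
    if len(start_list) != len(end_list):
        raise ValueError("need the same number of start and end dates")

    dates = set()
    for start, end in zip(start_list, end_list):
        first = datetime.strptime(start, fmt).date()
        last = datetime.strptime(end, fmt).date()
        for offset in range((last - first).days + 1):
            dates.add(first + timedelta(days=offset))
    return sorted(dates)


# The ranges used in the command example above
train = expand_ranges("2020-1-1", "2020-3-31")
val = expand_ranges("2020-4-1", "2020-4-20")
test = expand_ranges("2020-4-21", "2020-4-30")
print(len(train), len(val), len(test))  # 91 20 10, as reported in the log

# A split training set spanning two periods
split_train = expand_ranges("2000-1-1,2011-1-1", "2009-12-31,2019-12-31")
print(split_train[0], split_train[-1])
```

The counts for the notebook's ranges agree with the `Generated ... dates` lines in the processing log, which is a useful check that the options mean what you think they mean.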

### Dataset creation

Once the above preprocessing is taken care of, datasets can easily be created as follows. This operation _creates a cached dataset_ in the filesystem that can be fed in for training runs. 

In [12]:
!icenet_dataset_create -l 3 -ob 2 -w 32 notebook_data south

2022-01-12 12:32:36.940937: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
INFO:root:Creating path: ./network_datasets/notebook_data
INFO:root:Loading configuration loader.notebook_data.json
INFO:root:Creating hemisphere path: ./network_datasets/notebook_data/sh
INFO:root:Creating var path: ./network_datasets/notebook_data/sh/train
INFO:root:91 train dates in total, generating cache data.
INFO:root:46 tasks submitted
INFO:root:Creating var path: ./network_datasets/notebook_data/sh/val
INFO:root:20 val dates in total, generating cache data.
INFO:root:56 tasks submitted
INFO:root:Creating var path: ./network_datasets/notebook_data/sh/test
INFO:root:10 test dates in total, generating cache data.
INFO:root:61 tasks submitted
INFO:root:Finished output ./network_datasets/notebook_data/sh/train/00000010.tfrecord
INFO:root:Finished output ./network_datasets/notebook_data/sh/train/00000000.tfrecord
INFO:root:Finished output 

The common options used here: 

* `-l` as in the preprocessing stage. If experimenting and using full date ranges, creating a dataset with a different lag can save having to reprocess everything.
* `-ob` is the output batch size for the tfrecords. It is advisable to keep this smaller except where there are seriously large numbers of sets, preferably near to the expected size being used for training.
* `-w` specifies the number of worker subprocesses to use for producing the output. Probably advisable to keep this below the number of cores on your host! :) 
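
Two of these choices are easy to reason about numerically. The sketch below is my own reading of the log output rather than anything taken from the library: the number of tfrecord files per split appears to be the date count divided by `-ob`, rounded up, and `capped_workers` is a hypothetical helper for keeping `-w` under the core count.

```python
import math
import os


def tfrecord_count(n_dates, output_batch):
    """Number of files needed when each holds up to output_batch samples."""
    return math.ceil(n_dates / output_batch)


def capped_workers(requested, reserve=1):
    """Limit worker count to the available cores, keeping `reserve` free."""
    cores = os.cpu_count() or 1
    return max(1, min(requested, cores - reserve))


# Date counts from the dataset creation log above, with -ob 2
for split, n_dates in (("train", 91), ("val", 20), ("test", 10)):
    print(split, tfrecord_count(n_dates, output_batch=2))

print("workers:", capped_workers(32))
```

With `-ob 2` this gives 46, 10 and 5 files, which lines up with the cumulative `46`, `56` and `61 tasks submitted` messages in the log.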

#### Config-only operation / Prediction datasets

Datasets used to predict don't benefit from caching, so adding the `-c` option and dropping `-w` and `-ob` will create a configuration for the dataset without writing sets to disk. You can also use this option to create a dataset that is fed directly from the preprocessed data, though bear in mind, depending on your infrastructure, that this requires the batches to be created on the fly and can have a significant impact on performance. By specifying `-fn` we ensure the dataset is given a different name to the previously cached one above (though this is more commonly used for prediction datasets where caching isn't necessary...) 

In [13]:
!icenet_dataset_create -l 3 -c -fn notebook_raw_dataset notebook_data south

2022-01-12 15:37:13.618655: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
INFO:root:Creating path: ./network_datasets/notebook_raw_dataset
INFO:root:Loading configuration loader.notebook_data.json
INFO:root:Writing dataset configuration without data generation
INFO:root:91 train dates in total, NOT generating cache data.
INFO:root:20 val dates in total, NOT generating cache data.
INFO:root:10 test dates in total, NOT generating cache data.
INFO:root:Writing configuration to ./dataset_config.notebook_raw_dataset.json


## Train

Once the dataset is prepared, running a network is then as simple as using `icenet_train` with the appropriate parameters. Some key parameters are illustrated in the following commands:
 

In [24]:
!icenet_train notebook_data notebook_testrun 42 -b 4 -e 5 -m -qs 4 -w 4 -n 0.6

2022-01-13 15:10:54.890474: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
[34m[1mwandb[0m: Currently logged in as: [33mjambyr[0m (use `wandb login --relogin` to force relogin)
[34m[1mwandb[0m: Tracking run with wandb version 0.12.9
[34m[1mwandb[0m: Syncing run [33mnotebook_testrun[0m
[34m[1mwandb[0m:  View project at [34m[4mhttps://wandb.ai/jambyr/icenet2[0m
[34m[1mwandb[0m:  View run at [34m[4mhttps://wandb.ai/jambyr/icenet2/runs/1o4w181u[0m
[34m[1mwandb[0m: Run data is saved locally in /data/hpcdata/users/jambyr/icenet/notebook-test/wandb/run-20220113_151100-1o4w181u
[34m[1mwandb[0m: Run `wandb offline` to turn off syncing.

INFO:root:Hyperparameters: {'seed': 42, 'learning_rate': 0.0001, 'filter_size': 3, 'n_filters_factor': 0.6, 'lr_10e_decay_fac': 1.0, 'lr_decay': -0.0, 'lr_decay_start': 10, 'lr_decay_end': 30, 'batch_size': 4}
INFO:root:Loading configuration ./dataset_config.not

In [25]:
!icenet_train notebook_data notebook_testrun 42 -b 4 -e 5 -m -qs 4 -w 4 -n 0.6 \
    -p ./results/networks/notebook_testrun/notebook_testrun.network_notebook_data.42.h5 

2022-01-13 15:14:13.305089: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
[34m[1mwandb[0m: Currently logged in as: [33mjambyr[0m (use `wandb login --relogin` to force relogin)
[34m[1mwandb[0m: Tracking run with wandb version 0.12.9
[34m[1mwandb[0m: Syncing run [33mnotebook_testrun[0m
[34m[1mwandb[0m:  View project at [34m[4mhttps://wandb.ai/jambyr/icenet2[0m
[34m[1mwandb[0m:  View run at [34m[4mhttps://wandb.ai/jambyr/icenet2/runs/6hxwu35o[0m
[34m[1mwandb[0m: Run data is saved locally in /data/hpcdata/users/jambyr/icenet/notebook-test/wandb/run-20220113_151417-6hxwu35o
[34m[1mwandb[0m: Run `wandb offline` to turn off syncing.

INFO:root:Hyperparameters: {'seed': 42, 'learning_rate': 0.0001, 'filter_size': 3, 'n_filters_factor': 0.6, 'lr_10e_decay_fac': 1.0, 'lr_decay': -0.0, 'lr_decay_start': 10, 'lr_decay_end': 30, 'batch_size': 4}
INFO:root:Loading configuration ./dataset_config.not

These runs demonstrate using the aforementioned dataset in batches of four (`-b`) for a run of five epochs (`-e`). Using `-m` for multiprocessing, we enable up to four process workers (`-w`) to load data at a time into a data queue of length four (`-qs`). We could also specify a ratio (`-r`) so that only a fraction of the files from the dataset, say 0.2x, is used (_useful when testing on a low power machine with a large dataset, but unnecessary with our example here_). Finally, `-n` supplies a UNet built with 0.6x the normal number of filters. 

With the second command we `-p` pickup the output weights from the previous run to continue training.

There are a few things to note about the `icenet_train` and `icenet_predict` (see [the prediction section below](#Predict)) commands and the switches they provide: 

* Common switches such as `-n` should be applied consistently between training and prediction. 
* These commands work with __individual network runs__ (see the next section).
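
One way to keep switches such as `-n` consistent is to derive both command lines from a single set of run settings. The following is only a sketch: the `RUN` dictionary and the two helper functions are hypothetical, and they use nothing but the arguments that appear in the `icenet_train` and `icenet_predict` calls in this notebook.

```python
# Settings shared between training and prediction in this notebook
RUN = {
    "dataset": "notebook_data",
    "network": "notebook_testrun",
    "seed": 42,
    "filters_factor": 0.6,
}


def train_command(run, batch=4, epochs=5, workers=4, queue_size=4):
    """argv for the icenet_train call shown above."""
    return ["icenet_train", run["dataset"], run["network"], str(run["seed"]),
            "-b", str(batch), "-e", str(epochs), "-m",
            "-qs", str(queue_size), "-w", str(workers),
            "-n", str(run["filters_factor"])]


def predict_command(run, forecast_name, date_file, cached_test_data=True):
    """argv for an icenet_predict call against the same network."""
    cmd = ["icenet_predict", "-n", str(run["filters_factor"])]
    if cached_test_data:
        # -t sources the samples from the cached test data
        cmd.append("-t")
    return cmd + [run["dataset"], run["network"], forecast_name,
                  str(run["seed"]), date_file]


print(" ".join(train_command(RUN)))
print(" ".join(predict_command(RUN, "example_south_forecast", "testdates")))
```

Because the filters factor is read from one place, a change made for training cannot silently diverge from what is used at prediction time.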

### Ensemble running

For producing forecasts in the described pipeline we actually run a set of models using the [model-ensembler](https://github.com/JimCircadian/model-ensembler) tool and as such there are convenience scripts for doing this as part of the end to end run. 

In [27]:
!./run_train_ensemble.sh \
    -b 4 -e 10 -f 0.6 -n node022 -p bashpc.sh -q 4 -j 3 \
    notebook_data notebook_data notebook_ensemble

ARGS: -b 4 -e 10 -f 0.6 -n node022 -p bashpc.sh -q 4 -j 3 notebook_data notebook_data notebook_ensemble
ARGS = -x arg_batch=4 arg_epochs=10 arg_filter_factor=0.6 nodelist=node022 arg_prep=bashpc.sh arg_queue=4 , Leftovers: notebook_data notebook_data notebook_ensemble
Running model_ensemble ./tmp.Np62Pm6P51.train slurm -x arg_batch=4 arg_epochs=10 arg_filter_factor=0.6 nodelist=node022 arg_prep=bashpc.sh arg_queue=4 
[13-01-22 15:17:22    :INFO    ] - Model Ensemble Runner
[13-01-22 15:17:22    :INFO    ] - Validated configuration file ./tmp.Np62Pm6P51.train successfully
[13-01-22 15:17:22    :INFO    ] - Importing model_ensembler.cluster.slurm
[13-01-22 15:17:22    :INFO    ] - Running batcher
[13-01-22 15:17:22    :INFO    ] - Running command: mkdir -p ./results/networks
[13-01-22 15:17:22    :INFO    ] - Start batch: 2022-01-13 15:17:22.835865
[13-01-22 15:17:22    :INFO    ] - Start run notebook_ensemble-0 at 2022-01-13 15:17:22.837979
[13-01-22 15:17:22    :INFO    ] - rsync -aXE 

Many of the arguments are equivalent to those of the `icenet_train` command above. However, the filters factor (`-n` for `icenet_train`) is `-f` here and, because I'm running on a cluster, I've doubled the number of epochs (`-e`) to 10. There are additional arguments: `-n` for the node to run on, `-p` for the pre_run script to use and `-j` for the number of simultaneous runs to execute on the SLURM cluster we use at BAS. These arguments are not necessarily required for other clusters, nor is the model-ensembler limited to running on SLURM (it can, at present, also run locally).  

The pipeline repository shell scripts that provide this functionality are easily adaptable, as well as the ensemble itself which is stored in the pipeline repository under `/ensemble/`.

_Please review the `-h` help option for the script to gain further insight into the options available._

## Predict

Once the network is trained it is possible to run any suitable sets through the network for prediction. __This is the purpose of configuration-only datasets__, which are used by the `run_predict_ensemble` script to run predictions through all of the ensemble members, similarly to the training process. 

Running individual sets from the test dataset we produced earlier through the trained network is easily achieved. The first step is to create a date file, which can be produced from the configuration created by the `icenet_process_*` commands in the [processing section](#Process). This date file can then be supplied to the `icenet_predict` command to produce files using either cached data (useful for test data prepared at the same time as the training and validation sets) or directly from the normalised data (as is the case for nearly all data that isn't part of the training run).

In [28]:
!./loader_test_dates.sh notebook_data | tee testdates

2020-04-21
2020-04-22
2020-04-23
2020-04-24
2020-04-25
2020-04-26
2020-04-27
2020-04-28
2020-04-29
2020-04-30


In [29]:
!icenet_predict -n 0.6 -t \
    notebook_data notebook_testrun example_south_forecast 42 testdates

2022-01-13 15:27:49.844082: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
INFO:root:Loading configuration ./dataset_config.notebook_data.json
INFO:root:Loading configuration loader.notebook_data.json
INFO:root:Datasets: 46 train, 10 val and 5 test filenames
2022-01-13 15:27:51.327371: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-01-13 15:27:51.328808: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-01-13 15:27:51.347520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:3b:00.0 name: Quadro P4000 computeCapability: 6.1
coreClock: 1.48GHz coreCount: 14 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 226.62GiB/s
2022-01-13 15:27:51.347571: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dyna

The example uses the cached test data from the training run, but the process is the same for any other processed data with only the need to _omit the `-t` option, which specifies to source from cached test data_.
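
For dates that aren't in the loader configuration, a date file can also be written by hand. This sketch assumes the format is exactly what `loader_test_dates.sh` printed above (one ISO date per line); `write_date_file` is a hypothetical helper, demonstrated against a temporary directory so that the existing `testdates` file is left alone.

```python
import tempfile
from datetime import date, timedelta
from pathlib import Path


def write_date_file(path, first, last):
    """Write every date from first to last inclusive, one ISO date per line."""
    days = (last - first).days
    lines = [(first + timedelta(days=n)).isoformat() for n in range(days + 1)]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Reproduce the test dates listed above in a throwaway location
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "testdates"
    written = write_date_file(target, date(2020, 4, 21), date(2020, 4, 30))
    content = target.read_text()

print(content)
```

Remember that any date written this way still needs processed input data behind it, otherwise the prediction has nothing to load.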

### Outputs

In the above example, there are three outputs: 

* __forecast__: the ___predicted___ forecast data from the model output layer
* __outputs__: the outputs from the data loader which would be used for training
* __weights__: the generated sample weights from the data loader for the training sample

The outputs are initially stored as NumPy arrays under the `results` directory as follows: 

```
results/predict/example_south_forecast/south_run.42/2010_09_01.npy
results/predict/example_south_forecast/south_run.42/loader/outputs/2010_09_01.npy
results/predict/example_south_forecast/south_run.42/loader/weights/2010_09_01.npy
```

It should be noted that when predicting the __first__ of these files is what we're really interested in, as the generated data uses a linear trend forecast to produce the data sample. 
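
When scripting against these outputs it helps to build the three paths from the run details rather than hard-coding them. The layout below is inferred purely from the listing above, so treat it as an assumption; `prediction_paths` is a hypothetical helper and only constructs paths without reading anything.

```python
from pathlib import PurePosixPath


def prediction_paths(forecast_name, network_name, seed, date_str,
                     root="results/predict"):
    """Return forecast, loader-output and loader-weight paths for one date."""
    run_dir = (PurePosixPath(root) / forecast_name
               / "{}.{}".format(network_name, seed))
    filename = "{}.npy".format(date_str)
    return {
        "forecast": run_dir / filename,
        "outputs": run_dir / "loader" / "outputs" / filename,
        "weights": run_dir / "loader" / "weights" / filename,
    }


# The values from the listing above
paths = prediction_paths("example_south_forecast", "south_run", 42, "2010_09_01")
for kind, path in paths.items():
    print(kind, path)
```

Each path can then be passed to `numpy.load` to obtain the array for that date.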

### Ensemble running

When producing daily forecasts for IceNet we train an ensemble of models and also run predictions across them, producing a mean and error across that model ensemble. To do this the pipeline repository offers the `run_predict_ensemble` script, which operates similarly to the above training script. An example of running the ensemble: 

In [31]:
!./run_predict_ensemble.sh \
    -b 1 -f 0.6 -p bashpc.sh \
    notebook_ensemble notebook_data example_south_ensemble_forecast testdates

ARGS: -b 1 -f 0.6 -p bashpc.sh notebook_ensemble notebook_data example_south_ensemble_forecast testdates
ARGS = -x arg_batch=1 arg_filter_factor=0.6 arg_prep=bashpc.sh , Leftovers: notebook_ensemble notebook_data example_south_ensemble_forecast testdates
Running model_ensemble ./tmp.uOCA9LbimM.predict slurm -x arg_batch=1 arg_filter_factor=0.6 arg_prep=bashpc.sh 
[13-01-22 15:29:58    :INFO    ] - Model Ensemble Runner
[13-01-22 15:29:58    :INFO    ] - Validated configuration file ./tmp.uOCA9LbimM.predict successfully
[13-01-22 15:29:58    :INFO    ] - Importing model_ensembler.cluster.slurm
[13-01-22 15:29:58    :INFO    ] - Running batcher
[13-01-22 15:29:58    :INFO    ] - Start batch: 2022-01-13 15:29:58.383837
[13-01-22 15:29:58    :INFO    ] - Start run example_south_ensemble_forecast-0 at 2022-01-13 15:29:58.385446
[13-01-22 15:29:58    :INFO    ] - rsync -aXE ../template/ /data/hpcdata/users/jambyr/icenet/notebook-test/ensemble/example_south_ensemble_forecast/example_south_ens

As with the previous example, the individual NumPy outputs, samples and sample weights are deposited into `/results/predict` for each ensemble member. However, the ensemble also runs `icenet_output` to generate __a CF-compliant NetCDF containing the forecasts requested__ which can then be post-processed or [deposited to an external location](#Uploading-to-Azure) (which is the platform for the [wider IceNet forecasting infrastructure](https://github.com/alan-turing-institute/IceNet-Project)). 

In [32]:
!icenet_output -o results/predict example_south_forecast notebook_data testdates

2022-01-13 15:33:21.868186: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
INFO:root:Loading configuration ./dataset_config.notebook_data.json
INFO:root:Post-processing 2020-04-21
INFO:root:Post-processing 2020-04-22
INFO:root:Post-processing 2020-04-23
INFO:root:Post-processing 2020-04-24
INFO:root:Post-processing 2020-04-25
INFO:root:Post-processing 2020-04-26
INFO:root:Post-processing 2020-04-27
INFO:root:Post-processing 2020-04-28
INFO:root:Post-processing 2020-04-29
INFO:root:Post-processing 2020-04-30
INFO:root:Dataset arr shape: (10, 432, 432, 93, 2)
INFO:root:Saving to results/predict/example_south_forecast.nc


Note that the ensemble run automatically handles this generation of output for the ensemble. For a single run this is relatively meaningless as there is only a single model making predictions, giving no uncertainty quantification, __so this is provided as an example only__. _Please review the `-h` help option for the script to gain further insight into the options available._
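
To make the "mean and error across the ensemble" idea concrete, here is a toy sketch using only the standard library. It is not how `icenet_output` is implemented: the use of a population standard deviation as the error measure is my assumption, and the member values are made up purely to illustrate the calculation.

```python
from statistics import fmean, pstdev


def ensemble_summary(member_forecasts):
    """Per-cell mean and spread across members.

    member_forecasts: list of equal-length lists, one per ensemble member,
    each holding SIC values for the same flattened set of grid cells.
    """
    means, spreads = [], []
    for cell_values in zip(*member_forecasts):
        means.append(fmean(cell_values))
        spreads.append(pstdev(cell_values))
    return means, spreads


# Three illustrative members forecasting three grid cells (invented values)
members = [
    [0.90, 0.10, 0.50],
    [0.80, 0.20, 0.50],
    [1.00, 0.00, 0.50],
]
means, spreads = ensemble_summary(members)
for mean, spread in zip(means, spreads):
    print("mean={:.3f} spread={:.3f}".format(mean, spread))
```

With a single member every spread is zero, which is why the single-run output above carries no useful uncertainty information.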

### Uploading to Azure

The following command uploads a specific date from the dataset produced by `icenet_output` to an Azure blob storage account.  

In [33]:
!icenet_upload_azure results/predict/example_south_ensemble_forecast.nc 2020-04-30

INFO:root:Azure upload facility
INFO:root:Uploading ./tmp2kkkrbkb/example_south_ensemble_forecast.30042020.nc
INFO:root:Connecting client
INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'https://sticenetetldata.blob.core.windows.net/input/example_south_ensemble_forecast.30042020.nc?comp=REDACTED&blockid=REDACTED&sv=REDACTED&ss=REDACTED&srt=REDACTED&sp=REDACTED&se=REDACTED&st=REDACTED&spr=REDACTED&sig=REDACTED'/nRequest method: 'PUT'/nRequest headers:/n    'Content-Length': '4194304'/n    'x-ms-version': 'REDACTED'/n    'Content-Type': 'application/octet-stream'/n    'Accept': 'application/xml'/n    'User-Agent': 'azsdk-python-storage-blob/12.9.0 Python/3.8.0 (Linux-3.10.0-1160.49.1.el7.x86_64-x86_64-with-glibc2.10)'/n    'x-ms-date': 'REDACTED'/n    'x-ms-client-request-id': '3de3c292-7486-11ec-b2dc-246e96a1b912'/nA body is sent with the request
INFO:azure.core.pipeline.policies.http_logging_policy:Response status: 201/nResponse headers:/n    'Content-Length': '0'/n 

## Other Pipeline Considerations

### A bit more information on ensemble runs

#### Cleaning up runs

Ensemble runs take place under `/ensemble/` in the pipeline folder and ARE NOT deleted after they've happened, to allow for debugging. Commonly, the ensemble configurations will contain a delete task to remove the extraneous run folders. __In the meantime this should be done manually__ after running `run_train_ensemble` or `run_predict_ensemble`.

The only exception to this is the use of `run_daily.sh` (see below) which does clean up prior to rerunning. 

### Daily execution

Daily execution is facilitated in the pipeline by using [`run_daily.sh`](https://github.com/antarctica/IceNet-Pipeline/blob/main/run_daily.sh). This wraps all the necessary steps to perform the following sequence for producing forecasts from yesterday for the next 93 days, for both northern and southern hemispheres. 

* Removes any old ensemble runs
* Downloads [HRES forecast data from the ECMWF MARS API](https://www.ecmwf.int/en/forecasts/datasets/catalogue-ecmwf-real-time-products)
* Processes the HRES and necessary training metadata to produce a data loader
* Creates a dataset configuration for it
* Runs a [prediction ensemble](#Predict) to produce a NetCDF
* Uploads to the necessary endpoint
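
The date arithmetic behind "from yesterday for the next 93 days" can be sketched as follows. This is illustrative only and not taken from `run_daily.sh`: `forecast_window` is a hypothetical helper, and it assumes the first lead day is the day after the initialisation date.

```python
from datetime import date, timedelta

FORECAST_DAYS = 93  # number of forecast days stated above


def forecast_window(today, n_days=FORECAST_DAYS):
    """Return (initialisation date, list of forecast target dates)."""
    init = today - timedelta(days=1)  # forecasts start from yesterday
    # Assumption: lead day 1 is the day after initialisation
    targets = [init + timedelta(days=lead) for lead in range(1, n_days + 1)]
    return init, targets


init, targets = forecast_window(date(2022, 1, 13))
print(init, targets[0], targets[-1], len(targets))
```

For a run on 13 January 2022 this gives an initialisation date of 12 January and a final target date of 15 April.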

#### Automation

With the above shell script it's trivial to automate using cron. Of course this is simply for demonstration, with more complex workflow managers offering far greater flexibility, especially when considering analysis of the produced forecasts.

```bash
# We assume your environment is configured appropriately to run conda from cron files, for example by adding...
#
# SHELL=/bin/bash
# BASH_ENV=~/.bashrc_env
#
# With conda initialisation in bashrc_env at the top of your crontab
25 9 * * * conda activate icenet; cd $HOME/hpc/icenet/pipeline && bash run_daily.sh >$HOME/daily.log 2>&1; conda deactivate
```

TODO: more information on the usage of this command.

## Summary

Within this notebook we've attempted to give a full crash course to running the CLI tools both __manually__ and using the __pipeline helper scripts__. This is the first of four (currently) notebooks contained within the pipeline repository, covering further information: 

* [Data structure and analysis](02.data_analysis.ipynb): understand the structure of the data stores and products created by these workflows and what tools currently exist in IceNet to look over them.
* [Library usage](03.library_usage.ipynb): understand how to programmatically perform an end to end run.
* [Library extension](04.library_extension.ipynb): understand why and how to extend the IceNet library.

## Version
- IceNet Codebase: v0.1.0