# MONAI Auto3Dseg Reference Python APIs

This notebook breaks the Auto3Dseg pipeline down into its modules and introduces the Python API calls and CLI commands for each of them. In particular, if you have used the `AutoRunner` class, it maps the `AutoRunner` commands and configurations to the corresponding Auto3Dseg module APIs.

![workflow](../figures/workflow.png)

## 1 Set up environment, imports and datasets

If you have already set up MONAI and run the AutoRunner notebooks on the simulated and real-world datasets, you may skip this step.

### 1.1 Set up Environment

In [None]:
!python -c "import monai" || pip install -q "monai-weekly[nibabel]"

### 1.2 Set up imports

In [None]:
# Copyright (c) MONAI Consortium
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#     http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

from monai.apps import download_and_extract
from monai.apps.auto3dseg import (
    DataAnalyzer,
    BundleGen,
    AlgoEnsembleBestN,
    AlgoEnsembleBuilder,
    algo_to_pickle,
    export_bundle_algo_history,
    import_bundle_algo_history,
)
from monai.bundle.config_parser import ConfigParser

from pprint import pprint

### 1.3 Download public datasets

In [None]:
root = "./"
work_dir = os.path.join(root, 'auto3dseg_work_dir')
if not os.path.isdir(work_dir):
    os.makedirs(work_dir)

msd_task = "Task05_Prostate"
dataroot = os.path.join(root, msd_task)
datalist = "../tasks/msd/Task05_Prostate/msd_task05_prostate_folds.json"

resource = "https://msd-for-monai.s3-us-west-2.amazonaws.com/" + msd_task + ".tar"
compressed_file = os.path.join(root, msd_task + ".tar")
if not os.path.exists(dataroot):  # skip the download when the dataset folder already exists
    download_and_extract(resource, compressed_file, root)

### 1.4 Prepare an input YAML configuration

In [None]:
data_src_cfg = {
    "name": "Task05_Prostate",
    "task": "segmentation",
    "modality": "MRI",
    "datalist": datalist,
    "dataroot": dataroot,
}
input = os.path.join(root, 'input.yaml')
ConfigParser.export_config_file(data_src_cfg, input)
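
Before handing the configuration to the pipeline, it can help to check that the paths in it actually exist. The following is a minimal sketch in plain Python, not part of the Auto3Dseg API: the helper `check_data_src_cfg` and the `EXPECTED_KEYS` tuple are hypothetical names, and treating all five keys from the cell above as required is an assumption.

```python
import os
import tempfile

# The five keys used in the cell above; treating all of them as required is an assumption.
EXPECTED_KEYS = ("name", "task", "modality", "datalist", "dataroot")


def check_data_src_cfg(cfg):
    """Return a list of human-readable problems found in a data source config."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in cfg]
    if "datalist" in cfg and not os.path.isfile(cfg["datalist"]):
        problems.append(f"datalist file not found: {cfg['datalist']}")
    if "dataroot" in cfg and not os.path.isdir(cfg["dataroot"]):
        problems.append(f"dataroot folder not found: {cfg['dataroot']}")
    return problems


# Demo on a throwaway folder so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo_datalist = os.path.join(tmp, "folds.json")
    with open(demo_datalist, "w") as f:
        f.write("{}")
    good_cfg = {
        "name": "demo",
        "task": "segmentation",
        "modality": "MRI",
        "datalist": demo_datalist,
        "dataroot": tmp,
    }
    bad_cfg = {"name": "demo", "datalist": os.path.join(tmp, "absent.json")}
    good_problems = check_data_src_cfg(good_cfg)
    bad_problems = check_data_src_cfg(bad_cfg)

print(good_problems)
print(bad_problems)
```

In the notebook, calling `check_data_src_cfg(data_src_cfg)` should return an empty list once the dataset has been downloaded and the datalist path is correct.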

## 2 Breaking down the AutoRunner

Below is the typical usage of `AutoRunner`:
```python
from monai.apps.auto3dseg import AutoRunner

runner = AutoRunner(input=input)
runner.run()
```

These two lines cover the typical settings in Auto3Dseg. The following sections walk through the internal API calls behind them.

### 2.1 Data Analysis

When the `analyze` flag is set to `True`, `AutoRunner` calls `DataAnalyzer` to analyze the dataset and generate a statistical report in YAML. Below are the equivalent Python API calls of `DataAnalyzer`:


In [None]:
datastats_file = os.path.join(work_dir, 'data_stats.yaml')
analyser = DataAnalyzer(datalist, dataroot, output_path=datastats_file)
datastat = analyser.get_all_case_stats()
pprint(datastat)

Besides the Python API, users can also run this step from the command line interface (CLI). One example is the following bash command:

```bash
python -m monai.apps.auto3dseg DataAnalyzer get_all_case_stats --datalist="../tasks/msd/Task05_Prostate/msd_task05_prostate_folds.json" --dataroot="./Task05_Prostate" --output_path="./auto3dseg_work_dir/data_stats.yaml"
```
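
When the CLI is launched from a script, the command can be composed from the same values that the Python cells use, so the two stay in sync. This is a minimal sketch that only mirrors the command format shown above; the helper `build_auto3dseg_cli` is a hypothetical name, not a MONAI function.

```python
import shlex


def build_auto3dseg_cli(component, method, **options):
    """Compose the argument list for `python -m monai.apps.auto3dseg <component> <method>`."""
    args = ["python", "-m", "monai.apps.auto3dseg", component, method]
    # Each keyword argument becomes one `--name=value` option, as in the bash example.
    args += [f"--{name}={value}" for name, value in options.items()]
    return args


cli_args = build_auto3dseg_cli(
    "DataAnalyzer",
    "get_all_case_stats",
    datalist="../tasks/msd/Task05_Prostate/msd_task05_prostate_folds.json",
    dataroot="./Task05_Prostate",
    output_path="./auto3dseg_work_dir/data_stats.yaml",
)
print(shlex.join(cli_args))  # shell-ready string for logging or copy-paste
```

To execute the command rather than print it, the list can be passed to `subprocess.run(cli_args, check=True)`.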

### 2.2 Algorithm Generation (algo_gen)

When the `algo_gen` flag is set to `True`, `AutoRunner` uses `BundleGen` to generate MONAI bundles from templated algorithms in the working directory.

The templated algorithms are customized for the dataset when the `generate` method is called. In detail, the `generate` method fills the templates using information from the data_stats report. It also copies the necessary scripts (train.py/infer.py) to the algorithm folder. Finally, it creates an algo_object.pkl to save the `Algo` so that it can be instantiated on the local or a remote machine. Cross validation is used by default, and `num_fold` can be set to 1 if cross validation is not wanted.

Below are the equivalent Python API calls of `BundleGen`:

In [None]:
bundle_generator = BundleGen(
    algo_path=work_dir,
    data_stats_filename=datastats_file,
    data_src_cfg_name=input,
)

bundle_generator.generate(work_dir, num_fold=5)

Besides the Python API, users can also run this step from the command line interface (CLI). One example is the following bash command:

```bash
python -m monai.apps.auto3dseg BundleGen generate \
    --algo_path="./auto3dseg_work_dir/" --data_stats_filename="./auto3dseg_work_dir/data_stats.yaml" --data_src_cfg_name="./input.yaml"
```
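
To make the `num_fold` setting concrete, the sketch below shows how a cross-validation fold could be carved out of a datalist. It assumes that each entry under the `training` key carries an integer `fold` field, as in the `msd_task05_prostate_folds.json` file referenced earlier; the helper `split_by_fold` and the demo file names are hypothetical, and the bundles generated by `BundleGen` may implement the split differently.

```python
def split_by_fold(datalist, fold):
    """Split the training cases into (train, validation) lists for one fold."""
    cases = datalist["training"]
    train = [case for case in cases if case.get("fold") != fold]
    validation = [case for case in cases if case.get("fold") == fold]
    return train, validation


# Hypothetical 10-case datalist with 5 folds (2 cases per fold).
demo_datalist = {
    "training": [
        {"image": f"img_{i}.nii.gz", "label": f"lbl_{i}.nii.gz", "fold": i % 5}
        for i in range(10)
    ],
    "testing": [{"image": "img_test.nii.gz"}],
}

train_cases, val_cases = split_by_fold(demo_datalist, fold=0)
print(len(train_cases), len(val_cases))
```

With five folds, each algorithm template is trained once per fold, every time holding out a different fifth of the training cases for validation.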

#### 2.2.1 Getting and saving the history to hard drive

If the users continue to train the algorithms on the local system, the history of the algorithm generation can be fetched via the `get_history` method of the `BundleGen` object. There are also scenarios in which users need to stop the Python process after `algo_gen`, for example to transfer the files to a remote cluster before starting the training. Auto3Dseg offers the utility function `export_bundle_algo_history` to dump the history to the hard drive, and `import_bundle_algo_history` to load it back.

If the files are copied to a remote system, please make sure the algorithm templates are also copied there. Some functions require the template path to instantiate the algorithm class properly.

In [None]:
history = bundle_generator.get_history()
export_bundle_algo_history(history)  # save Algo objects
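
Before copying the working directory to a remote system, it may be useful to confirm which algorithm folders contain a saved `Algo`. The sketch below is plain Python and only inspects the file system; it assumes that every algorithm folder sits directly under the working directory and holds a file named algo_object.pkl, and the helper `find_algo_pickles` is a hypothetical name.

```python
import tempfile
from pathlib import Path


def find_algo_pickles(work_dir, filename="algo_object.pkl"):
    """Return the saved Algo files found one level below `work_dir`, sorted by path."""
    return sorted(Path(work_dir).glob(f"*/{filename}"))


# Demo on a throwaway folder that mimics a working directory with two algorithms.
with tempfile.TemporaryDirectory() as tmp:
    for folder in ("segresnet_0", "segresnet_1", "empty_folder"):
        (Path(tmp) / folder).mkdir()
    (Path(tmp) / "segresnet_0" / "algo_object.pkl").write_bytes(b"")
    (Path(tmp) / "segresnet_1" / "algo_object.pkl").write_bytes(b"")
    found_names = [path.parent.name for path in find_algo_pickles(tmp)]

print(found_names)
```

In the notebook, `find_algo_pickles(work_dir)` would list the files that `import_bundle_algo_history` is expected to pick up later.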

### 2.3 Training

#### 2.3.1 Training the neural network sequentially

The algo_gen history contains `Algo` objects, each of which has methods such as `train` and `predict`. We can use these APIs to trigger neural network training. By default, `AutoRunner` starts the training on a single node (single or multiple GPUs) in a sequential manner.

Calling `algo_to_pickle` is optional; it updates the dumped `Algo` objects with the accuracy information.

In [None]:
history = import_bundle_algo_history(work_dir, only_trained=False)
for task in history:
    for _, algo in task.items():
        algo.train()
        acc = algo.get_score()
        algo_to_pickle(algo, template_path=algo.template_path, best_metrics=acc)

#### 2.3.2 Train with Hyper-parameter Optimization (HPO)

Another way to handle the neural network training is to perform HPO (training and searching). This is made possible by the NNI or Optuna packages, which are installed in the MONAI development environment. `AutoRunner` uses NNI as the backend via `NNIGen`, but Optuna HPO can also be chosen via the `OptunaGen` class in the Auto3Dseg pipeline.

To start an NNI experiment, users need to prepare a config file `nni_config.yaml` and run the following command in bash:

```bash
nnictl create --config nni_config.yaml
```

Below is an example of the config, written as a Python dictionary:
```python
default_nni_config = {
    "experimentName": name,
    "search_space": search_space,
    "trialCommand": cmd,
    "trialCodeDirectory": ".",
    "trialGpuNumber": torch.cuda.device_count(),
    "trialConcurrency": 1,
    "maxTrialNumber": 10,
    "maxExperimentDuration": "1h",
    "tuner": {"name": "GridSearch"},
    "trainingService": {"platform": "local", "useActiveGpu": True},
}
```

Example of the search space:
```python
search_space = {"learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}}
```

Example of the search command for `segresnet_0`:
```python
cmd = "python -m monai.apps.auto3dseg NNIGen run_algo " + "./auto3dseg/segresnet_0/algo_object.pkl" + " ./auto3dseg"
```
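
To illustrate what the `GridSearch` tuner does with a `choice` search space, the sketch below enumerates every combination of the listed values. It is plain Python and does not call NNI; the parameter name `learning_rate` and the helper `enumerate_grid` are assumptions made for illustration.

```python
from itertools import product

# Hypothetical search space: one `choice` parameter with four candidate values.
search_space = {"learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}}


def enumerate_grid(space):
    """List every parameter combination of a search space made of `choice` entries."""
    for name, spec in space.items():
        if spec["_type"] != "choice":
            raise ValueError(f"only 'choice' is handled in this sketch, got {spec['_type']!r} for {name}")
    names = list(space)
    value_lists = [space[name]["_value"] for name in names]
    # The Cartesian product yields one dict per trial.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


trials = enumerate_grid(search_space)
print(len(trials), trials[0])
```

With the example config above, the four combinations fit within `maxTrialNumber` (10), and `trialConcurrency` of 1 means they would run one after another.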

### 2.4 Ensemble

Finally, after the neural networks are trained, `AutoRunner` will apply the ensemble methods in Auto3Dseg to improve the overall performance. 

Here we use the utility function `import_bundle_algo_history` to load the trained `Algo` objects into the ensemble. With the history loaded, we build an ensemble method and use it to run inference on all testing data. By default, `AutoRunner` uses `AlgoEnsembleBestN` to find the best N models and ensembles their prediction maps by taking the mean.

Note: Because the predictions need to be obtained in Python, no CLI command is suggested for this step.

In [None]:
history = import_bundle_algo_history(work_dir, only_trained=True)
builder = AlgoEnsembleBuilder(history, input)
builder.set_ensemble_method(AlgoEnsembleBestN(n_best=5))
ensembler = builder.get_ensemble()
preds = ensembler()
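
The idea behind a best-N mean ensemble can be sketched in a few lines of plain Python. This is an illustration of the technique under simplifying assumptions (each prediction is a flat list of per-voxel probabilities, and a higher score is better), not the `AlgoEnsembleBestN` implementation; the helper `best_n_mean_ensemble` is a hypothetical name.

```python
from statistics import fmean


def best_n_mean_ensemble(scored_predictions, n_best):
    """Average the predictions of the `n_best` highest-scoring models.

    `scored_predictions` is a list of (score, prediction) pairs, where each
    prediction is a flat list of probabilities of equal length.
    """
    ranked = sorted(scored_predictions, key=lambda pair: pair[0], reverse=True)
    selected = [prediction for _, prediction in ranked[:n_best]]
    # zip(*selected) groups the values that belong to the same voxel.
    return [fmean(voxel_values) for voxel_values in zip(*selected)]


# Three hypothetical models with validation scores and two-voxel predictions.
demo = [
    (0.9, [1.0, 0.0]),
    (0.5, [0.0, 0.0]),
    (0.8, [0.0, 1.0]),
]
ensemble_prediction = best_n_mean_ensemble(demo, n_best=2)
print(ensemble_prediction)
```

Here the model with score 0.5 is dropped, and the two remaining predictions are averaged voxel by voxel.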