# MONAI Auto3Dseg AutoRunner

This notebook introduces `AutoRunner`, the interface for running the Auto3Dseg pipeline with minimal user input.

## 1. Set up environment, imports and datasets
### 1.1 Set up Environment

In [1]:
!python -c "import monai" || pip install -q "monai-weekly[nibabel]"

### 1.2 Set up imports

In [2]:
import os
import torch

from monai.bundle.config_parser import ConfigParser
from monai.apps import download_and_extract

from monai.apps.auto3dseg import AutoRunner
from monai.auto3dseg import datafold_read



### 1.3 Download public datasets

In [3]:
root = "./"
msd_task = "Task05_Prostate"
resource = "https://msd-for-monai.s3-us-west-2.amazonaws.com/" + msd_task + ".tar"
compressed_file = os.path.join(root, msd_task + ".tar")
if os.path.exists(root):
    download_and_extract(resource, compressed_file, root)

dataroot = os.path.join(root, msd_task)
datalist = "../tasks/msd/Task05_Prostate/msd_task05_prostate_folds.json"

Task05_Prostate.tar: 229MB [00:18, 12.9MB/s]                               


2022-09-21 05:37:43,880 - INFO - Downloaded: Task05_Prostate.tar
2022-09-21 05:37:43,880 - INFO - Expected md5 is None, skip md5 check for file Task05_Prostate.tar.
2022-09-21 05:37:43,881 - INFO - Non-empty folder exists in Task05_Prostate, skipped extracting.
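
The `datalist` file referenced above is a JSON fold definition. As a rough sketch of how such a file can be consumed, the snippet below assumes a layout with a `training` list whose entries carry `image`, `label` and an integer `fold`, plus a `testing` list. The helper `split_by_fold` and all file names are hypothetical; they only illustrate the idea that one fold is held out for validation while the remaining cases are used for training.

```python
# Hypothetical miniature of a folds-style datalist (file names are made up).
example_datalist = {
    "training": [
        {"fold": 0, "image": "imagesTr/case_00.nii.gz", "label": "labelsTr/case_00.nii.gz"},
        {"fold": 1, "image": "imagesTr/case_01.nii.gz", "label": "labelsTr/case_01.nii.gz"},
        {"fold": 2, "image": "imagesTr/case_02.nii.gz", "label": "labelsTr/case_02.nii.gz"},
        {"fold": 0, "image": "imagesTr/case_03.nii.gz", "label": "labelsTr/case_03.nii.gz"},
    ],
    "testing": [{"image": "imagesTs/case_10.nii.gz"}],
}


def split_by_fold(datalist, fold, key="training"):
    """Return (train, val): cases tagged with `fold` are held out for validation."""
    train, val = [], []
    for case in datalist[key]:
        (val if case.get("fold") == fold else train).append(case)
    return train, val


train_cases, val_cases = split_by_fold(example_datalist, fold=0)
print(len(train_cases), len(val_cases))  # 2 2
```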


### 1.4 Prepare an input YAML configuration

In [4]:
data_src_cfg = {
    "name": "Task05_Prostate",
    "task": "segmentation",
    "modality": "MRI",
    "datalist": datalist,
    "dataroot": dataroot,
}
input = './input.yaml'
ConfigParser.export_config_file(data_src_cfg, input)
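
Because the pipeline can run for a long time, it may be worth checking that the paths in the data source config exist before starting. The snippet below is a minimal sketch of such a check. `missing_paths` is a hypothetical helper that is not part of MONAI, and the example config uses a made-up datalist path.

```python
import os


def missing_paths(cfg, keys=("datalist", "dataroot")):
    """Return the keys that are absent from cfg or whose path does not exist on disk."""
    return [k for k in keys if k not in cfg or not os.path.exists(cfg[k])]


# Hypothetical config: "." exists, the datalist path does not.
example_cfg = {"datalist": "./__no_such_datalist__.json", "dataroot": "."}
print(missing_paths(example_cfg))  # ['datalist']
```

An empty list means both paths were found; calling `missing_paths(data_src_cfg)` applies the same check to the config defined above.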

## 2. Run the Auto3Dseg pipeline in a few lines of code

Below is the typical usage of `AutoRunner`:
```python
runner = AutoRunner(input=input)
runner.run()
```

The `run` command takes a long time because it trains each algorithm over many iterations.

To perform a full training in this tutorial, uncomment the `runner.run()` line at the end of each code block.

### 2.1 Use the default setting

In [5]:
runner = AutoRunner(input=input)
# runner.run()

2022-09-21 05:37:51,714 - INFO - ./work_dir does not exists. Creating...
2022-09-21 05:37:51,717 - INFO - ./work_dir created to save all results
2022-09-21 05:37:51,719 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:37:51,729 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions
2022-09-21 05:37:51,731 - INFO - Directory /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output is created to save ensemble predictions


### 2.2 Use the dictionary instead of a YAML file as the input

In [6]:
runner = AutoRunner(input=data_src_cfg)
# runner.run()

2022-09-21 05:37:53,192 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:37:53,196 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions


## 3. Customize and configure Auto3Dseg
### 3.1 Set your working directory

In [7]:
runner = AutoRunner(work_dir='./my_workspace', input=input)
# runner.run()

2022-09-21 05:37:54,640 - INFO - ./my_workspace does not exists. Creating...
2022-09-21 05:37:54,646 - INFO - ./my_workspace created to save all results
2022-09-21 05:37:54,649 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/my_workspace/input.yaml
2022-09-21 05:37:54,655 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/my_workspace/ensemble_output will be used to save ensemble predictions
2022-09-21 05:37:54,657 - INFO - Directory /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/my_workspace/ensemble_output is created to save ensemble predictions


### 3.2 Use cached result to save computation time

AutoRunner saves intermediate results by default. The user can choose whether to reuse the cached results or restart from scratch.

To start from scratch, set `not_use_cache` to `True`.

In [8]:
# This will restart from scratch and not use any cached results
runner = AutoRunner(input=input, not_use_cache=True)
# runner.run()

# The call below would skip data analysis.
# Because data analysis has NOT been completed and cached yet, AutoRunner will throw an error.

# runner = AutoRunner(input=input, analyze=False)  # This will throw an error

2022-09-21 05:37:55,878 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:37:55,880 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:37:55,882 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions


### 3.3 Output Ensemble Result

AutoRunner performs inference on the testing data specified by the `datalist` in the data source config input. By default, the inference results are written as `nii.gz` files to the `ensemble_output` folder under the working directory. The user can change the output format by passing additional keyword arguments to `AutoRunner`. The available arguments are listed in the [MONAI transforms documentation](https://docs.monai.io/en/stable/transforms.html#saveimage).

In [9]:
runner = AutoRunner(input=input, output_dir='./output_dir')
# runner.run()

2022-09-21 05:37:56,914 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:37:56,916 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:37:56,923 - INFO - Directory ./output_dir is created to save ensemble predictions


## 4. Set Auto3Dseg internal parameters
### 4.1 Change the number of folds for cross-validation

In [10]:
runner = AutoRunner(input=input)
runner.set_num_fold(num_fold=2)
# runner.run()

2022-09-21 05:37:58,125 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:37:58,126 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:37:58,129 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions


### 4.2 Customize training parameters by overriding the default values

In [11]:
runner = AutoRunner(input=input)
# Note: among the provided bundles, most networks (segresnet is the exception) use "num_iterations" to control the number of training iterations
train_param = {"num_iterations": 8}
runner.set_training_params(params=train_param)
# runner.run()

2022-09-21 05:37:59,788 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:37:59,793 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:37:59,805 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions


#### 4.2.1 A common set of training parameters for all algorithm templates

Note: This is for demonstration purposes only. The user does not need to specify these training parameters.

**Auto3DSeg** uses bundle templates to perform training, validation, and inference. The number of training epochs or iterations is specified by the config files in each template. These values can be overridden, but note that some bundle templates use "num_iterations" while others use "num_epochs". The code block below converts an epoch count into an iteration count and overrides all algorithms with the same training parameters on a 1-GPU or 2-GPU machine.

In [12]:
max_epochs = 2000

# Fall back to 1 so that n_iter below is not divided by zero on a CPU-only machine
num_gpus = 1 if "multigpu" in data_src_cfg and not data_src_cfg["multigpu"] else max(1, torch.cuda.device_count())

num_epoch = max_epochs
num_images_per_batch = 2
files_train_fold0, _ = datafold_read(datalist, "", 0)
n_data = len(files_train_fold0)
n_iter = int(num_epoch * n_data / num_images_per_batch / num_gpus)
n_iter_val = int(n_iter / 2)

train_param = {
    "num_iterations": n_iter,
    "num_iterations_per_validation": n_iter_val,
    "num_images_per_batch": num_images_per_batch,
    "num_epochs": num_epoch,
    "num_warmup_iterations": n_iter_val,
}
runner.set_training_params(params=train_param)
# runner.run()
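
To make the epoch-to-iteration conversion concrete, here is a small worked example of the same arithmetic. The function name `epochs_to_iterations` and the dataset size of 26 training cases are hypothetical, and the `max(1, ...)` guard is a choice made for this sketch so that a machine without GPUs counts as a single worker.

```python
def epochs_to_iterations(num_epoch, n_data, num_images_per_batch, num_gpus):
    """Number of training steps equivalent to `num_epoch` passes over `n_data` cases."""
    num_gpus = max(1, num_gpus)  # treat a CPU-only machine as a single worker
    return int(num_epoch * n_data / num_images_per_batch / num_gpus)


# Hypothetical numbers: 2000 epochs, 26 training cases, 2 images per batch.
n_iter_1gpu = epochs_to_iterations(2000, 26, 2, 1)
n_iter_2gpu = epochs_to_iterations(2000, 26, 2, 2)
print(n_iter_1gpu, n_iter_2gpu)  # 26000 13000
```

Doubling the number of GPUs halves the iteration count because each iteration then processes twice as many images.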


### 4.3 Customize the ensemble method (mean vs. majority voting)

In [13]:
runner = AutoRunner(input=input)
runner.set_ensemble_method(ensemble_method_name="AlgoEnsembleBestByFold")
# runner.run()

2022-09-21 05:38:03,548 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:38:03,553 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:38:03,564 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions


### 4.4 Customize the inference parameters by overriding the default values

In [14]:
# set the prediction (inference) parameters
pred_params = {
    'files_slices': slice(0, 2),  # only infer the first two files in the testing data
    'mode': "vote",              # use majority vote instead of mean to ensemble the predictions
    'sigmoid': True,             # whether to use sigmoid to binarize the prediction and output the label
}
runner = AutoRunner(input=input)
runner.set_prediction_params(params=pred_params)
# runner.run()

2022-09-21 05:38:04,464 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:38:04,469 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:38:04,479 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions
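
The difference between mean and majority-vote ensembling can be illustrated on a toy example. The snippet below is a plain-Python sketch of the two general techniques, not MONAI's implementation. The function names, the 0.5 threshold and the probability values are assumptions made for illustration; each inner list holds one model's foreground probabilities for four voxels.

```python
def mean_ensemble(prob_maps, threshold=0.5):
    """Average per-voxel probabilities across models, then binarize."""
    n_models = len(prob_maps)
    return [int(sum(voxel) / n_models > threshold) for voxel in zip(*prob_maps)]


def vote_ensemble(prob_maps, threshold=0.5):
    """Binarize each model first, then keep voxels that a strict majority marks as foreground."""
    n_models = len(prob_maps)
    return [int(sum(p > threshold for p in voxel) * 2 > n_models) for voxel in zip(*prob_maps)]


# Three hypothetical models, four voxels each.
prob_maps = [
    [0.9, 0.40, 0.6, 0.1],
    [0.2, 0.45, 0.7, 0.2],
    [0.3, 0.90, 0.1, 0.3],
]
print(mean_ensemble(prob_maps))  # [0, 1, 0, 0]
print(vote_ensemble(prob_maps))  # [0, 0, 1, 0]
```

The two methods disagree on the second and third voxels: a single very confident model can dominate the mean, while voting only counts how many models agree.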


## 5. Train models with HPO (NNI grid search)
### 5.1 Apply HPO to search hyperparameters in Auto3Dseg

Note: Auto3Dseg supports hyperparameter optimization (HPO) via the NNI and Optuna backends. Notebooks showing how to use these modules can be found in this directory.
AutoRunner supports the NNI backend with a grid search method by automatically generating the NNI config and running `nnictl` commands in a subprocess.
Note: to run HPO, the development environment must have the `nni` package installed. Please refer to the [MONAI Installation Guide](https://docs.monai.io/en/stable/installation.html#installing-the-recommended-dependencies) for how to install the recommended dependencies.

In [15]:
runner = AutoRunner(input=input, hpo=True)
search_space = {"learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}}
runner.set_nni_search_space(search_space)
# runner.run()

2022-09-21 05:38:05,959 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:38:05,961 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:38:05,965 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions
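
A grid search evaluates every combination of the listed choices. The sketch below expands a search space written in the same `_type`/`_value` style into explicit trials, which helps estimate how many training runs a given space implies. `enumerate_grid` is a hypothetical helper, not an NNI or MONAI function, and the `batch_size` entry is made up purely to show how the grid grows.

```python
from itertools import product


def enumerate_grid(search_space):
    """Expand a search space made of `choice` entries into a list of trial dicts."""
    names = list(search_space)
    for name in names:
        if search_space[name]["_type"] != "choice":
            raise ValueError(f"only 'choice' is handled in this sketch, got {name!r}")
    value_lists = [search_space[name]["_value"] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


search_space = {"learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}}
trials = enumerate_grid(search_space)
print(len(trials))  # 4

# Adding a second, hypothetical parameter multiplies the number of trials.
bigger_space = dict(search_space, batch_size={"_type": "choice", "_value": [1, 2, 4]})
print(len(enumerate_grid(bigger_space)))  # 12
```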


### 5.2 Override the templated values

AutoRunner uses the following NNI config in its HPO module:
```python
default_nni_config = {
    "trialCodeDirectory": ".",
    "trialGpuNumber": torch.cuda.device_count(),
    "trialConcurrency": 1,
    "maxTrialNumber": 10,
    "maxExperimentDuration": "1h",
    "tuner": {"name": "GridSearch"},
    "trainingService": {"platform": "local", "useActiveGpu": True},
}
```

It can be overridden by setting the HPO parameters:

In [16]:
runner = AutoRunner(input=input, hpo=True)
hpo_params = {"maxTrialNumber": 20}
search_space = {"learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]}}
runner.set_hpo_params(params=hpo_params)
runner.set_nni_search_space(search_space)
# runner.run()

2022-09-21 05:38:07,627 - INFO - Work directory ./work_dir is used to save all results
2022-09-21 05:38:07,630 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml
2022-09-21 05:38:07,640 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions
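
One way to picture the override is as a dictionary merge in which user-supplied keys replace the defaults and all other keys are kept. The snippet below is only a sketch under that assumption; it does not show how AutoRunner merges the values internally, and `trialGpuNumber` is set to a placeholder of 1 so that the example runs without `torch`.

```python
default_nni_config = {
    "trialCodeDirectory": ".",
    "trialGpuNumber": 1,  # placeholder; the config above uses torch.cuda.device_count()
    "trialConcurrency": 1,
    "maxTrialNumber": 10,
    "maxExperimentDuration": "1h",
    "tuner": {"name": "GridSearch"},
    "trainingService": {"platform": "local", "useActiveGpu": True},
}
hpo_params = {"maxTrialNumber": 20}

# Assumed semantics: user-supplied keys replace defaults, other keys are kept.
effective_config = {**default_nni_config, **hpo_params}
print(effective_config["maxTrialNumber"])    # 20
print(effective_config["trialConcurrency"])  # 1
```

With a shallow merge like this one, overriding a nested entry such as `tuner` replaces the whole nested dictionary rather than individual fields inside it.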


## 6. Conclusion

This notebook demonstrated how to use the AutoRunner APIs to customize your **Auto3DSeg** pipeline with minimal inputs. Remember that none of the settings take effect until you execute the `run` command to start the training.

```python
runner.run()
```