Copyright (c) MONAI Consortium  
Licensed under the Apache License, Version 2.0 (the "License");  
you may not use this file except in compliance with the License.  
You may obtain a copy of the License at  
&nbsp;&nbsp;&nbsp;&nbsp;http://www.apache.org/licenses/LICENSE-2.0  
Unless required by applicable law or agreed to in writing, software  
distributed under the License is distributed on an "AS IS" BASIS,  
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  
See the License for the specific language governing permissions and  
limitations under the License.

## Setup environment

In [None]:
!python -c "import monai" || pip install -q "monai-weekly[ignite,pyyaml]"

## Setup imports

In [None]:
from monai.config import print_config

print_config()

# MedNIST Classification Bundle

In this tutorial we'll describe how to create a bundle for a classification network, including how to train and apply the network on the command line. MedNIST will be used as the dataset, with the bundle based on the [MONAI 101 notebook](https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb).

The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license. If you use the MedNIST dataset, please acknowledge the source of the MedNIST dataset: the repository https://github.com/Project-MONAI/MONAI/ or the MedNIST tutorial for image classification https://github.com/Project-MONAI/MONAI/blob/master/examples/notebooks/mednist_tutorial.ipynb.

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/.

First we'll consider a condensed version of the code from that notebook and go step by step through how best to represent it as a bundle:

```python
import os

import monai.transforms as mt
import torch
from monai.apps import MedNISTDataset
from monai.data import DataLoader
from monai.engines import SupervisedTrainer
from monai.inferers import SimpleInferer
from monai.networks import eval_mode
from monai.networks.nets import densenet121

root_dir = os.environ.get("ROOTDIR", ".")

max_epochs = 25
device = torch.device("cuda:0")
net = densenet121(spatial_dims=2, in_channels=1, out_channels=6).to(device)

transform = mt.Compose([
        mt.LoadImaged(keys="image", image_only=True),
        mt.EnsureChannelFirstd(keys="image"),
        mt.ScaleIntensityd(keys="image"),
])

dataset = MedNISTDataset(root_dir=root_dir, transform=transform, section="training", download=True)

train_dl = DataLoader(dataset, batch_size=512, shuffle=True, num_workers=4)

trainer = SupervisedTrainer(
    device=device,
    max_epochs=max_epochs,
    train_data_loader=train_dl,
    network=net,
    optimizer=torch.optim.Adam(net.parameters(), lr=1e-5),
    loss_function=torch.nn.CrossEntropyLoss(),
    inferer=SimpleInferer(),
)

trainer.run()

torch.jit.script(net).save("mednist.ts")

class_names = ("AbdomenCT", "BreastMRI", "CXR", "ChestCT", "Hand", "HeadCT")
testdata = MedNISTDataset(root_dir=root_dir, transform=transform, section="test", runtime_cache=True)

max_items_to_print = 10
eval_dl = DataLoader(testdata[:max_items_to_print], batch_size=1, num_workers=0)
with eval_mode(net):
    for item in eval_dl:
        result = net(item["image"].to(device))
        prob = result.detach().to("cpu")[0]
        pred = class_names[prob.argmax()]
        gt = item["class_name"][0]
        print(f"Prediction: {pred}. Ground-truth: {gt}")
```

You can run this cell or save it to a file and run it on the command line. A `DenseNet`-based network will be trained to classify MedNIST images into one of six categories. Most of this script uses Ignite-based classes such as `SupervisedTrainer`, which makes it well suited to conversion into a bundle. Let's start by initialising a bundle directory structure:

In [1]:
%%bash

python -m monai.bundle init_bundle MedNISTClassifier
which tree && tree MedNISTClassifier || true

/usr/bin/tree
MedNISTClassifier
├── configs
│   ├── inference.json
│   └── metadata.json
├── docs
│   └── README.md
├── LICENSE
└── models

3 directories, 4 files


## Metadata

We'll first replace the `metadata.json` file with our description of what the network will do:

In [2]:
%%writefile MedNISTClassifier/configs/metadata.json

{
    "version": "0.0.1",
    "changelog": {
        "0.0.1": "Initial version"
    },
    "monai_version": "1.2.0",
    "pytorch_version": "2.0.0",
    "numpy_version": "1.23.5",
    "optional_packages_version": {},
    "name": "MedNISTClassifier",
    "task": "MedNIST Classification Network",
    "description": "This is a demo network for classifying MedNIST images by type/modality",
    "authors": "Your Name Here",
    "copyright": "Copyright (c) Your Name Here",
    "data_source": "MedNIST dataset kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic)",
    "data_type": "jpeg",
    "intended_use": "This is suitable for demonstration only",
    "network_data_format": {
        "inputs": {
            "image": {
                "type": "image",
                "format": "magnitude",
                "modality": "any",
                "num_channels": 1,
                "spatial_shape": [64, 64],
                "dtype": "float32",
                "value_range": [0, 1],
                "is_patch_data": false,
                "channel_def": {
                    "0": "image"
                }
            }
        },
        "outputs": {
            "pred": {
                "type": "probabilities",
                "format": "classes",
                "num_channels": 6,
                "spatial_shape": [6],
                "dtype": "float32",
                "value_range": [0, 1],
                "is_patch_data": false,
                "channel_def": {
                    "0": "AbdomenCT",
                    "1": "BreastMRI",
                    "2": "CXR",
                    "3": "ChestCT",
                    "4": "Hand",
                    "5": "HeadCT"
                }
            }
        }
    }
}

Overwriting MedNISTClassifier/configs/metadata.json


This contains more information than the previous tutorial's file. For network inputs, a tensor "image" is given as a single-channel image of size 64x64. This is one of the MedNIST images, whose modality varies but whose values will lie in the range `[0, 1]` after rescaling in the transform pipeline. The channel definition states the meaning of each channel; this input has only one, which is the greyscale image itself. There is only one network output, "pred", representing the prediction of the network as a tensor of size 6. Each of the six values is the prediction for one class, as described in `channel_def`.
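
As a quick sanity check on the format described above, the sketch below parses a trimmed copy of the `network_data_format` section and confirms that each tensor's `num_channels` agrees with the number of entries in its `channel_def`. It then uses the output `channel_def` to turn the index of the highest score into a class name. The helper `find_channel_mismatches` and the `scores` list are hypothetical and exist only for illustration; nothing here is part of MONAI.

```python
import json

# A trimmed copy of the "network_data_format" section from metadata.json above.
metadata_text = """
{
  "network_data_format": {
    "inputs": {
      "image": {"num_channels": 1, "spatial_shape": [64, 64], "channel_def": {"0": "image"}}
    },
    "outputs": {
      "pred": {
        "num_channels": 6,
        "spatial_shape": [6],
        "channel_def": {
          "0": "AbdomenCT", "1": "BreastMRI", "2": "CXR",
          "3": "ChestCT", "4": "Hand", "5": "HeadCT"
        }
      }
    }
  }
}
"""

data_format = json.loads(metadata_text)["network_data_format"]


def find_channel_mismatches(fmt):
    """Return names of tensors whose num_channels disagrees with channel_def."""
    mismatches = []
    for group in ("inputs", "outputs"):
        for name, spec in fmt[group].items():
            if spec["num_channels"] != len(spec["channel_def"]):
                mismatches.append(f"{group}.{name}")
    return mismatches


mismatches = find_channel_mismatches(data_format)

# Hypothetical network output for one image: six scores, one per class.
scores = [0.01, 0.02, 0.90, 0.03, 0.02, 0.02]
best_index = max(range(len(scores)), key=scores.__getitem__)

# JSON object keys are strings, so the integer index is converted before lookup.
predicted = data_format["outputs"]["pred"]["channel_def"][str(best_index)]

print(mismatches)  # []
print(predicted)   # CXR
```

Keeping `channel_def` in step with the `class_names` used at inference time means a consumer of the bundle can interpret the output from the metadata alone.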

## Common Definitions

What we'll now do is construct the bundle configuration scripts to implement training, testing, and inference based off the original script file given above. Common definitions should be placed in a common file used with other scripts to reduce duplication. In our original script, the network definition and transform sequence will be used in multiple places so should go in this common file:

In [3]:
%%writefile MedNISTClassifier/configs/common.yaml
# only need to import torch right now
imports: 
- $import torch

# define a default root directory value, this can be overridden on the command line
root_dir: "."

# define a device for the network
device: '$torch.device(''cuda:0'')'

# store the class names for inference later
class_names: [AbdomenCT, BreastMRI, CXR, ChestCT, Hand, HeadCT]

# define the network separately, don't need to refer to MONAI types by name or import MONAI
network_def:
  _target_: densenet121
  spatial_dims: 2
  in_channels: 1
  out_channels: 6

# define the network to be the given definition moved to the device
net: '$@network_def.to(@device)'

# define a transform sequence by instantiating a Compose instance with a transform sequence
transform:
  _target_: Compose
  transforms:
  - _target_: LoadImaged
    keys: 'image'
    image_only: true
  - _target_: EnsureChannelFirstd
    keys: 'image'
  - _target_: ScaleIntensityd
    keys: 'image'

Writing MedNISTClassifier/configs/common.yaml


Although this YAML looks very different from the Python code, it defines essentially the same objects. Whether in YAML or JSON, a bundle script defines an object instantiation as a dictionary containing the key `_target_`, which declares the type to create, with the other keys treated as arguments. A Python statement like `obj = ObjType(arg1=val1, arg2=val2)` is thus equivalent to:

```yaml
obj:
  _target_: ObjType
  arg1: val1
  arg2: val2
```

Note here that MONAI will import all its own symbols, so an explicit import statement is not needed, nor is referring to types by fully qualified name, i.e. `Compose` is adequate instead of `monai.transforms.Compose`. Definitions found in other packages, or those in scripts associated with the bundle, need to be referred to by the name they are imported as, e.g. `torch.device` as shown above.
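
To make the `_target_` convention concrete, here is a toy resolver written with the standard library only. It is a sketch of the idea, not MONAI's implementation: the function `instantiate` is hypothetical, it handles only flat dictionaries, and unlike MONAI it requires a fully qualified name in `_target_`. The standard-library type `fractions.Fraction` stands in for `ObjType`.

```python
import importlib
from fractions import Fraction


def instantiate(config):
    """Toy resolver: build the object described by a `_target_` dictionary."""
    # Split "package.module.TypeName" into the module path and the type name.
    module_name, _, type_name = config["_target_"].rpartition(".")
    obj_type = getattr(importlib.import_module(module_name), type_name)
    # Every key other than `_target_` becomes a keyword argument.
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    return obj_type(**kwargs)


# Dictionary form, as it would appear once YAML or JSON has been parsed.
config = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate(config)

# The equivalent direct Python statement.
direct = Fraction(numerator=3, denominator=4)

print(obj == direct)  # True
```

The dictionary form carries the same information as the constructor call, which is why a config file can stand in for the object definitions of the original script.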

## Training

For training we need a dataset, dataloader, and trainer object which will be used in the running "program":

In [4]:
%%writefile MedNISTClassifier/configs/train.yaml

max_epochs: 25

dataset:
  _target_: MedNISTDataset
  root_dir: '@root_dir'
  transform: '@transform'
  section: training
  download: true

train_dl:
  _target_: DataLoader
  dataset: '@dataset'
  batch_size: 512
  shuffle: true
  num_workers: 4

trainer:
  _target_: SupervisedTrainer
  device: '@device'
  max_epochs: '@max_epochs'
  train_data_loader: '@train_dl'
  network: '@net'
  optimizer: 
    _target_: torch.optim.Adam
    params: '$@net.parameters()'
    lr: 0.00001  # learning rate set slow so that you can see network improvement over epochs
  loss_function: 
    _target_: torch.nn.CrossEntropyLoss
  inferer: 
    _target_: SimpleInferer

train:
- '$@trainer.run()'
- '$torch.jit.script(@net).save(''model.ts'')'

Writing MedNISTClassifier/configs/train.yaml


There is a lot going on here but hopefully you see how this replicates the object definitions in the original source file. A few specific points:
* References are made to objects defined in `common.yaml`, such as `@root_dir`, so that file needs to be used in conjunction with this one.
* A `max_epochs` hyperparameter is provided whose value you can change on the command line, eg. `--max_epochs 5`.
* Definitions for the `optimizer`, `loss_function`, and `inferer` arguments of `trainer` are provided inline but it would be better practice to define these separately.
* The learning rate is hard-coded as `1e-5`. It would again be better practice to define a separate `lr` hyperparameter, although the value can be changed on the command line with `'--trainer#optimizer#lr' 0.001`.
* The trained network is saved using PyTorch's `jit` module directly. Better practice would be to provide a handler, such as `CheckpointSaver`, to the trainer or to an evaluator object; see other tutorial examples on how to do this. This was kept here to match the original example.
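
The override ids mentioned in these points, such as `max_epochs` and `trainer#optimizer#lr`, address values in the nested configuration, with `#` separating the keys at each level. The following is a minimal sketch of that addressing scheme on a plain dictionary. The helper `set_by_id` is hypothetical and only illustrates the idea; it is not how MONAI applies overrides internally.

```python
def set_by_id(config, item_id, value):
    """Set a nested config value addressed by a '#'-separated id."""
    *parents, last = item_id.split("#")
    node = config
    # Walk down through each parent key to reach the innermost dictionary.
    for key in parents:
        node = node[key]
    node[last] = value


# A cut-down version of the parsed train.yaml structure.
config = {
    "max_epochs": 25,
    "trainer": {
        "optimizer": {"_target_": "torch.optim.Adam", "lr": 0.00001},
    },
}

set_by_id(config, "max_epochs", 5)                # like --max_epochs 5
set_by_id(config, "trainer#optimizer#lr", 0.001)  # like '--trainer#optimizer#lr' 0.001

print(config["max_epochs"])                # 5
print(config["trainer"]["optimizer"]["lr"])  # 0.001
```

A top-level `lr` entry would shorten the second override to a single key, which is one reason separate hyperparameter definitions are the better practice.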

Now the network can be trained by running the bundle:

In [6]:
%%bash

BUNDLE="./MedNISTClassifier"

# run the bundle with epochs set to 2 for speed during testing, change this to get a better result
python -m monai.bundle run train \
    --meta_file "$BUNDLE/configs/metadata.json" \
    --config_file "['$BUNDLE/configs/common.yaml','$BUNDLE/configs/train.yaml']" \
    --max_epochs 2

# we'll use the trained network as the model object for this bundle
mv model.ts $BUNDLE/models/model.ts

# generate the saved dictionary file as well
cd "$BUNDLE/models"
python -c 'import torch; obj = torch.jit.load("model.ts"); torch.save(obj.state_dict(), "model.pt")'

workflow_name None
config_file ['./MedNISTClassifier/configs/common.yaml', './MedNISTClassifier/configs/train.yaml']
meta_file ./MedNISTClassifier/configs/metadata.json
logging_file None
init_id None
run_id train
final_id None
tracking None
max_epochs 2
2023-09-11 16:19:49,915 - INFO - --- input summary of monai.bundle.scripts.run ---
2023-09-11 16:19:49,915 - INFO - > config_file: ['./MedNISTClassifier/configs/common.yaml',
 './MedNISTClassifier/configs/train.yaml']
2023-09-11 16:19:49,915 - INFO - > meta_file: './MedNISTClassifier/configs/metadata.json'
2023-09-11 16:19:49,915 - INFO - > run_id: 'train'
2023-09-11 16:19:49,915 - INFO - > max_epochs: 2
2023-09-11 16:19:49,915 - INFO - ---






2023-09-11 16:19:50,055 - INFO - Verified 'MedNIST.tar.gz', md5: 0bc7306e7427e00ad1c5526a6677552d.
2023-09-11 16:19:50,055 - INFO - File exists: MedNIST.tar.gz, skipped downloading.
2023-09-11 16:19:50,055 - INFO - Non-empty folder exists in MedNIST, skipped extracting.


Loading dataset: 100%|██████████| 47164/47164 [00:41<00:00, 1145.05it/s]


As shown here, the TorchScript object produced by training is moved into the `models` directory of the bundle. The saved weight file is then produced by loading that file again and saving its state dictionary. Once again, best practice would be to instead use `CheckpointSaver` to save weights in an output location before the final file is chosen for the bundle.

## Evaluation

To replicate the original example's code we'll need to put the evaluation loop code into a separate function and call it. The best practice would be to use an `Evaluator` class to do this with metric classes for assessing performance. Instead we'll stick close to the original code and demonstrate how to integrate your own code into a bundle.

The first thing to do is put the evaluation loop into a function and store it in the `scripts` module within the bundle:

In [7]:
!mkdir MedNISTClassifier/scripts

In [8]:
%%writefile MedNISTClassifier/scripts/__init__.py

from monai.networks.utils import eval_mode

def evaluate(net, dataloader, class_names, device):
    with eval_mode(net):
        for item in dataloader:
            result = net(item["image"].to(device))
            prob = result.detach().to("cpu")[0]
            pred = class_names[prob.argmax()]
            gt = item["class_name"][0]
            print(f"Prediction: {pred}. Ground-truth: {gt}")


Writing MedNISTClassifier/scripts/__init__.py


The `scripts` directory has to be a valid Python module, so it needs an `__init__.py` file. You can include other files and import them separately, or import their members into this file. Here we defined `evaluate` to enclose the loop from the original script. This can then be called as part of an expression sequence "program":

In [9]:
%%writefile MedNISTClassifier/configs/evaluate.yaml

imports: 
- $import scripts

max_items_to_print: 10

ckpt_file: ""

testdata:
  _target_: MedNISTDataset
  root_dir: '@root_dir'
  transform: '@transform'
  section: test
  download: false
  runtime_cache: true

eval_dl:
  _target_: DataLoader
  dataset: '$@testdata[:@max_items_to_print]'
  batch_size: 1
  num_workers: 0

# loads the weights from the given file (which needs to be set on the command line) then calls "evaluate"
evaluate:
- '$@net.load_state_dict(torch.load(@ckpt_file))'
- '$scripts.evaluate(@net, @eval_dl, @class_names, @device)'


Writing MedNISTClassifier/configs/evaluate.yaml
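
The `$import scripts` line in this config can only succeed if the directory containing the `scripts` package is on Python's import path, which is why the bundle directory is added to `PYTHONPATH` when the bundle is run. The sketch below reproduces that mechanism with the standard library alone. The directory `DemoBundle` and the package `demo_scripts` are hypothetical stand-ins created in a temporary folder, named differently from the real bundle to avoid clashing with anything already installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway "bundle" directory containing a one-file package.
    bundle_dir = Path(tmp) / "DemoBundle"
    package_dir = bundle_dir / "demo_scripts"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(
        "def evaluate():\n    return 'evaluate called'\n"
    )

    # Before the bundle directory is on the import path, the import fails.
    try:
        importlib.import_module("demo_scripts")
        importable_before = True
    except ModuleNotFoundError:
        importable_before = False

    # Adding the directory to sys.path has the same effect as PYTHONPATH.
    sys.path.insert(0, str(bundle_dir))
    importlib.invalidate_caches()
    module = importlib.import_module("demo_scripts")
    result = module.evaluate()
    sys.path.remove(str(bundle_dir))

print(importable_before)  # False
print(result)             # evaluate called
```

Without the path entry the import raises `ModuleNotFoundError`, which is the failure to expect if the `PYTHONPATH` export is left out of the evaluation command.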


Evaluation is then run on the command line, using "evaluate" as the program to run and providing a path to the model weights with the `ckpt_file` variable:

In [10]:
%%bash

BUNDLE="./MedNISTClassifier"
export PYTHONPATH="$BUNDLE"

python -m monai.bundle run evaluate \
    --meta_file "$BUNDLE/configs/metadata.json" \
    --config_file "['$BUNDLE/configs/common.yaml','$BUNDLE/configs/evaluate.yaml']" \
    --ckpt_file "$BUNDLE/models/model.pt"

workflow_name None
config_file ['./MedNISTClassifier/configs/common.yaml', './MedNISTClassifier/configs/evaluate.yaml']
meta_file ./MedNISTClassifier/configs/metadata.json
logging_file None
init_id None
run_id evaluate
final_id None
tracking None
ckpt_file ./MedNISTClassifier/models/model.pt
2023-09-11 16:22:56,379 - INFO - --- input summary of monai.bundle.scripts.run ---
2023-09-11 16:22:56,379 - INFO - > config_file: ['./MedNISTClassifier/configs/common.yaml',
 './MedNISTClassifier/configs/evaluate.yaml']
2023-09-11 16:22:56,379 - INFO - > meta_file: './MedNISTClassifier/configs/metadata.json'
2023-09-11 16:22:56,379 - INFO - > run_id: 'evaluate'
2023-09-11 16:22:56,379 - INFO - > ckpt_file: './MedNISTClassifier/models/model.pt'
2023-09-11 16:22:56,379 - INFO - ---






Prediction: AbdomenCT. Ground-truth: AbdomenCT
Prediction: BreastMRI. Ground-truth: BreastMRI
Prediction: ChestCT. Ground-truth: ChestCT
Prediction: CXR. Ground-truth: CXR
Prediction: Hand. Ground-truth: Hand
Prediction: HeadCT. Ground-truth: HeadCT
Prediction: HeadCT. Ground-truth: HeadCT
Prediction: CXR. Ground-truth: CXR
Prediction: ChestCT. Ground-truth: ChestCT
Prediction: BreastMRI. Ground-truth: BreastMRI
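
The printed pairs above can be reduced to a single accuracy figure, which is the kind of summary a metric class attached to an evaluator would report. As a rough sketch using only the standard library, the following parses the ten lines from this run and counts the matches. It is an illustration of the calculation, not a replacement for MONAI's metric classes.

```python
import re

# The ten lines printed by the evaluation run above, copied verbatim.
output = """\
Prediction: AbdomenCT. Ground-truth: AbdomenCT
Prediction: BreastMRI. Ground-truth: BreastMRI
Prediction: ChestCT. Ground-truth: ChestCT
Prediction: CXR. Ground-truth: CXR
Prediction: Hand. Ground-truth: Hand
Prediction: HeadCT. Ground-truth: HeadCT
Prediction: HeadCT. Ground-truth: HeadCT
Prediction: CXR. Ground-truth: CXR
Prediction: ChestCT. Ground-truth: ChestCT
Prediction: BreastMRI. Ground-truth: BreastMRI
"""

# Each match yields a (prediction, ground truth) tuple.
pattern = re.compile(r"Prediction: (\w+)\. Ground-truth: (\w+)")
pairs = pattern.findall(output)

correct = sum(pred == gt for pred, gt in pairs)
accuracy = correct / len(pairs)

print(f"{correct}/{len(pairs)} correct, accuracy = {accuracy:.2f}")
# 10/10 correct, accuracy = 1.00
```

Ten items is far too small a sample to characterise the network, so a real assessment would run over the whole test section rather than the first `max_items_to_print` entries.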


## Summary and Next

This tutorial has covered:
* Creating full training scripts in bundles
* Training a network then evaluating its performance with scripts

That covers creating a bundle to match an existing script. It was mentioned in a number of places that best practice wasn't followed in order to stick to the original script's structure, so further tutorials will cover this in greater detail.