In [None]:
%load_ext autoreload
%aimport

In [2]:
%autoreload
import sys, os
sys.path.append(os.path.dirname(os.getcwd())) # Include ../SSD in path

# Introduction to SSD

This code is considerably more complex than your previous assignment, so we recommend spending some time getting familiar with its structure.
The "complex" structure is there to simplify deep learning experimentation and to help you organize what would otherwise become the usual "spaghetti" deep learning code.


All scripts in this code require a configuration file. To **start training**, you can type:
```
python train.py configs/ssd300.py
```

## Configuration files
The key difference from previous starter codes is the use of configuration files. They let us change small parts of an experiment without editing hard-coded values (e.g. the learning rate in previous assignments).

If you take a look in [`configs/ssd300.py`](../configs/ssd300.py) you will notice a large set of objects describing model architecture (`backbone` and `model`), the optimizer, dataset, data loading, and hyperparameters.

To load the config we can write the following:

In [3]:
from ssd.utils import load_config
cfg = load_config("../configs/ssd300.py")

It looks like you are currently working on the server, with a 'working directory' in: /work/snotra/mariueng
Saving all SSD outputs to: /work/snotra/mariueng/ssd_outputs
Found dataset directory in: /work/datasets/mnist_object_detection/train
Found dataset directory in: /work/datasets/mnist_object_detection/val


`cfg` supports attribute access syntax: every object defined in `configs/ssd300.py` is accessible by its name, e.g. `cfg.model`.



In [4]:
print(cfg.model)

{'feature_extractor': {'output_channels': [128, 256, 128, 128, 64, 64], 'image_channels': '${train.image_channels}', 'output_feature_sizes': '${anchors.feature_sizes}', '_target_': <class 'ssd.modeling.backbones.basic.BasicModel'>}, 'anchors': {'feature_sizes': [[38, 38], [19, 19], [10, 10], [5, 5], [3, 3], [1, 1]], 'strides': [[8, 8], [16, 16], [32, 32], [64, 64], [100, 100], [300, 300]], 'min_sizes': [[30, 30], [60, 60], [111, 111], [162, 162], [213, 213], [264, 264], [315, 315]], 'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]], 'image_shape': '${train.imshape}', 'scale_center_variance': 0.1, 'scale_size_variance': 0.2, '_target_': <class 'ssd.modeling.anchor_boxes.AnchorBoxes'>}, 'loss_objective': <class 'ssd.modeling.ssd_multibox_loss.SSDMultiboxLoss'>, 'num_classes': 11, '_target_': <class 'ssd.modeling.ssd.SSD300'>}


Notice that printing `cfg.model` shows a dictionary and not the model object itself (which is `SSD300` from [`ssd/modeling/ssd.py`](../ssd/modeling/ssd.py)). This is because the model is defined "lazily" (wrapped with a `LazyCall`): the config records which class to build and with which arguments, but nothing is constructed yet.
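
The lazy pattern can be illustrated with a small pure-Python sketch. This is a toy stand-in and not the actual `LazyCall`/`instantiate` code from `tops.config`; the names `lazy_call`, `build`, `Backbone`, and `Detector` are hypothetical and exist only for this illustration. It mirrors the `_target_` key visible in the printed dictionary above.

```python
# Toy stand-in for the lazy pattern; NOT the tops.config implementation.
def lazy_call(target):
    """Return a function that records `target` and its kwargs instead of calling it."""
    def record(**kwargs):
        return {"_target_": target, **kwargs}
    return record


def build(node):
    """Recursively construct objects from recorded calls."""
    if isinstance(node, dict) and "_target_" in node:
        kwargs = {k: build(v) for k, v in node.items() if k != "_target_"}
        return node["_target_"](**kwargs)
    return node


class Backbone:  # hypothetical class, for illustration only
    def __init__(self, output_channels):
        self.output_channels = output_channels


class Detector:  # hypothetical class, for illustration only
    def __init__(self, feature_extractor, num_classes):
        self.feature_extractor = feature_extractor
        self.num_classes = num_classes


lazy_model = lazy_call(Detector)(
    feature_extractor=lazy_call(Backbone)(output_channels=[128, 256]),
    num_classes=11,
)
print(type(lazy_model).__name__)  # still a plain dict, nothing is built yet
lazy_model["num_classes"] = 5     # cheap to edit before construction
toy_model = build(lazy_model)
print(type(toy_model).__name__, toy_model.num_classes)
```

Because the description is plain data until it is built, any argument can be changed before the (possibly expensive) object is created.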

To create the model, we have to instantiate it:

In [5]:
from tops.config import instantiate
model = instantiate(cfg.model)
print(model)

SSD300(
  (feature_extractor): BasicModel(
    (feature_map_zero): Sequential(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (2): ReLU()
      (3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (5): ReLU()
      (6): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (7): ReLU()
      (8): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    )
    (feature_map_one): Sequential(
      (0): ReLU()
      (1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (2): ReLU()
      (3): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    )
    (feature_map_two): Sequential(
      (0): ReLU()
      (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (2): ReL

As another example, we can load the first batch of the dataset and run a forward pass through the model:

In [6]:
dataloader_train = instantiate(cfg.data_train.dataloader)
print(type(dataloader_train))
batch = next(iter(dataloader_train))
for key, item in batch.items():
    print(key, "has the shape:", item.shape)
gpu_transform = instantiate(cfg.data_train.gpu_transform)
batch = gpu_transform(batch)
bbox_delta, confidences = model(batch["image"])
print(f"The model predicted anchors with  bbox delta: {bbox_delta.shape} and confidences: {confidences.shape}")

Downloading train-images-idx3-ubyte.gz...
Downloading t10k-images-idx3-ubyte.gz...
Downloading train-labels-idx1-ubyte.gz...
Downloading t10k-labels-idx1-ubyte.gz...
(47040000,)
(7840000,)
<class 'torch.utils.data.dataloader.DataLoader'>




image has the shape: torch.Size([32, 3, 300, 300])
boxes has the shape: torch.Size([32, 8732, 4])
labels has the shape: torch.Size([32, 8732])
The model predicted anchors with  bbox delta: torch.Size([32, 4, 8732]) and confidences: torch.Size([32, 11, 8732])
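
The 8732 in these shapes is the total number of anchor boxes. It can be recovered from the anchor settings in the printed `cfg.model` above. The sketch below assumes the usual SSD convention of two square boxes per feature map location plus two boxes (ratio `r` and `1/r`) for each listed aspect ratio; that convention is an assumption here and is not read from `AnchorBoxes` itself.

```python
# Values copied from the printed cfg.model["anchors"] above.
feature_sizes = [[38, 38], [19, 19], [10, 10], [5, 5], [3, 3], [1, 1]]
aspect_ratios = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]

total_anchors = 0
for (height, width), ratios in zip(feature_sizes, aspect_ratios):
    # Assumption: 2 square boxes + 2 boxes (r and 1/r) per aspect ratio.
    boxes_per_location = 2 + 2 * len(ratios)
    total_anchors += height * width * boxes_per_location

print(total_anchors)  # 8732
```

Under this assumption the six feature maps contribute 5776, 2166, 600, 150, 36, and 4 boxes, which matches the last dimension of `bbox_delta` and `confidences`.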


You might ask yourself: why? At first sight, this seems much more complicated than plain old hard-coded values.

The reason is easy manipulation of experiments. If I want to run the same experiment with a different batch size, I can change it as follows:

In [7]:
# Let's print the batch size of the original data loader:
print("Original batch size:", dataloader_train.batch_size)
cfg.train.batch_size = 2 # Setting the batch size to 2
dataloader_train = instantiate(cfg.data_train.dataloader)
print("New batch size:", dataloader_train.batch_size)

Original batch size: 32
New batch size: 2
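
A change to `cfg.train` can reach the data loader because config values may refer to other values instead of copying them; the printed `cfg.model` shows such references as strings like `'${train.image_channels}'`. The sketch below is a toy resolver over a hypothetical miniature config (`toy_cfg`, `lookup`, and `resolve` are made-up names). It only illustrates the idea and is not the mechanism the config library actually uses.

```python
import re

# Hypothetical miniature config; only the "${...}" style mirrors the printed cfg.model.
toy_cfg = {
    "train": {"batch_size": 32, "image_channels": 3},
    "dataloader": {"batch_size": "${train.batch_size}"},
}


def lookup(root, dotted_path):
    """Follow a dotted path such as 'train.batch_size' through nested dicts."""
    node = root
    for key in dotted_path.split("."):
        node = node[key]
    return node


def resolve(root, value):
    """Replace a whole-string "${a.b}" reference with the value it points to."""
    if isinstance(value, str):
        match = re.fullmatch(r"\$\{([\w.]+)\}", value)
        if match:
            return lookup(root, match.group(1))
    return value


print(resolve(toy_cfg, toy_cfg["dataloader"]["batch_size"]))  # 32
toy_cfg["train"]["batch_size"] = 2
print(resolve(toy_cfg, toy_cfg["dataloader"]["batch_size"]))  # 2
```

Since the reference is resolved only when the value is needed, editing the single source value is enough; every place that points to it picks up the change.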


Another reason is **configuration inheritance**. For example, in the last task you are going to train VGG on the VOC dataset.
This requires changing the backbone and the dataset while keeping all other parameters the same (e.g. the general settings in `cfg.train`, anchors, schedulers, and optimizers).

Take a look in [`configs/voc_vgg.py`](../configs/voc_vgg.py) and notice that we inherit from the original config:
```
from .ssd300 import train, anchors, optimizer, schedulers, model, data_train, data_val
```
The only changes are to the backbone, the dataset, and the dataset transforms.
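
The inheritance pattern itself can be sketched with plain dictionaries. Everything in this example is hypothetical (the keys, the values, and the names `base_cfg` and `derived_cfg`); the real config files import lazy objects rather than raw dicts. The point is only that a derived config starts from the base and overrides what differs.

```python
import copy

# Hypothetical base config with toy values, standing in for a base config file.
base_cfg = {
    "train": {"batch_size": 32},
    "backbone": {"name": "basic"},
    "dataset": {"name": "mnist"},
}

# The derived config reuses everything and overrides only what changes.
derived_cfg = copy.deepcopy(base_cfg)  # deep copy so the base stays untouched
derived_cfg["backbone"] = {"name": "vgg"}
derived_cfg["dataset"] = {"name": "voc"}

print(derived_cfg["train"] == base_cfg["train"])  # shared settings are unchanged
print(base_cfg["backbone"]["name"], derived_cfg["backbone"]["name"])
```

Keeping shared settings in one place means a later change to the base (for example a new batch size) applies to every config that inherits from it.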

In [8]:
cfg = load_config("../configs/voc_vgg.py")
model = instantiate(cfg.model)
print(model)

It looks like you are currently working on the server, with a 'working directory' in: /work/snotra/mariueng
Saving all SSD outputs to: /work/snotra/mariueng/ssd_outputs
Found dataset directory in: /work/datasets/mnist_object_detection/train
Found dataset directory in: /work/datasets/mnist_object_detection/val
Found dataset directory in: /work/datasets/VOCdevkit/VOC2007
Found dataset directory in: /work/datasets/VOCdevkit/VOC2012
Found dataset directory in: /work/datasets/VOCdevkit/VOC2007


Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /home/mariueng/.cache/torch/hub/checkpoints/vgg16-397923af.pth


ImportError: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html

# Useful commands:


#### Training and evaluation
To start training:
```
python train.py configs/ssd300.py
```

To start training VGG on VOC:
```
python train.py configs/voc_vgg.py
```

To only run evaluation:
```
python train.py configs/ssd300.py --evaluate-only
```

#### Demo.py
For VOC:
```
python demo.py configs/voc_vgg.py demo/voc demo/voc_output
```

For MNIST:
```
python demo.py configs/ssd300.py demo/mnist demo/mnist_output
```


#### Runtime analysis:
```
python3 runtime_analysis.py configs/ssd300.py
```