<div align="center">

# Case Study: Package Detection with Synthetic Data

<a href="https://www.zumolabs.ai/?utm_source=github.com&utm_medium=referral&utm_campaign=zpy"><img src="https://github.com/ZumoLabs/zpy/raw/main/docs/assets/zl_tile_logo.png" width="100px"/></a>

<p align="center">
  <a href="https://discord.gg/nXvXweHtG8"><img alt="Discord" title="Discord" src="https://img.shields.io/badge/-ZPY Devs-grey?style=for-the-badge&logo=discord&logoColor=white"/></a>
  <a href="https://twitter.com/ZumoLabs"><img alt="Twitter" title="Twitter" src="https://img.shields.io/badge/-@ZumoLabs-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white"/></a>
  <a href="https://www.youtube.com/channel/UCcU2Z8ArljfDzfq7SOz-ytQ"><img alt="Youtube" title="Youtube" src="https://img.shields.io/badge/-ZumoLabs-red?style=for-the-badge&logo=youtube&logoColor=white"/></a>
</p>

<img alt="Example synthetic package images." src="https://github.com/ZumoLabs/zpy/raw/main/docs/assets/package_sim_boxes.png" width="600px"/>

<img alt="Results from model trained on package sim dataset." src="https://github.com/ZumoLabs/zpy/raw/main/docs/assets/package_sim_results.png" width="600px"/>

</div>


In this example, we train a detection model that predicts bounding boxes for cardboard packages and parcels in images. We fine-tune a pre-trained model on small synthetic datasets, each generated with a different type of domain randomization.

# Installation

## Running Locally with Docker

You can connect this notebook to the provided Docker container to train locally with your GPU. To build the Docker image:

```
docker build \
     -t "zumolabs/package" \
     -f Dockerfile.gpu .
```

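To run the Docker image (replace `/home/tren/data` with the host directory that holds your datasets; it is mounted at `/data` inside the container):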

```
docker run \
    --gpus all \
    -p 8888:8888 \
    -p 6006:6006 \
    -v /tmp:/tmp \
    -v /home/tren/data:/data \
    zumolabs/package
```

In [None]:
# What GPU is currently connected?
!nvidia-smi

## Log into Zumo Labs

To generate data on the Zumo Labs cloud you will first need to [create an account](https://www.zumolabs.ai/).

In [None]:
import zpy
from zpy import client

# This is your temporary authtoken. It can be found by visiting:
#     https://app.zumolabs.ai/settings/auth-token
#
# The auth token will expire when you log out of the web app
auth_token = "..."

zpy.client.init(project_uuid='...', auth_token=auth_token)
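
Because the auth token is temporary and personal, you may prefer not to paste it into the notebook. The sketch below reads both values from environment variables instead. The variable names `ZPY_AUTH_TOKEN` and `ZPY_PROJECT_UUID` are hypothetical (zpy does not define them); only the `zpy.client.init` call shown in the usage comment comes from the cell above.

```python
import os

# Hypothetical variable names; pick whatever fits your environment.
TOKEN_ENV_VAR = "ZPY_AUTH_TOKEN"
PROJECT_ENV_VAR = "ZPY_PROJECT_UUID"


def read_credentials(env=None):
    """Return (project_uuid, auth_token) from the environment, or raise."""
    env = os.environ if env is None else env
    missing = [name for name in (PROJECT_ENV_VAR, TOKEN_ENV_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError(f"Set these environment variables first: {', '.join(missing)}")
    return env[PROJECT_ENV_VAR], env[TOKEN_ENV_VAR]


# Usage, with the same init call as in the cell above:
#   project_uuid, auth_token = read_credentials()
#   zpy.client.init(project_uuid=project_uuid, auth_token=auth_token)
```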

# Generate Training (Synthetic) Data

We will create four separate synthetic datasets for training, each with a different type of *domain randomization*: a technique commonly used with synthetic data to increase the variance of a dataset's distribution.

- `package_sim_DR_lights` - Synthetic dataset of 512 images. Domain randomization is applied to lighting only. The position and intensity of a sun light object in the scene are randomized within a range for every image.
- `package_sim_DR_mats` - Synthetic dataset of 512 images. Domain randomization is applied to materials only. The material of each individual package is created anew in each image, starting from a randomly chosen texture from a library of thousands of textures scraped from the internet. Several properties of the material, such as specular, metallic, and roughness, are then jittered within a broad range.
- `package_sim_DR_bg` - Synthetic dataset of 512 images. Domain randomization is applied to the background only. Each image is rendered with a different HDRI, chosen from a library of hundreds of HDRIs scraped from the internet.
- `package_sim_DR_all` - Synthetic dataset of 512 images. Domain randomization is applied to lighting, background, and materials.


In [None]:
# Raw strings (r'...') keep the backslash in the key without triggering
# Python's invalid-escape-sequence warning.

# Lighting randomization only
_config = zpy.client.DatasetConfig('package_v3')
_config.set(r'run\.domain_randomize_lights', True)
_config.set(r'run\.domain_randomize_materials', False)
_config.set(r'run\.domain_randomize_background', False)
zpy.client.generate('package_sim_DR_lights', _config, num_datapoints=256, materialize=True)

# Material randomization only
_config = zpy.client.DatasetConfig('package_v3')
_config.set(r'run\.domain_randomize_lights', False)
_config.set(r'run\.domain_randomize_materials', True)
_config.set(r'run\.domain_randomize_background', False)
zpy.client.generate('package_sim_DR_mats', _config, num_datapoints=256, materialize=True)

# Background randomization only
_config = zpy.client.DatasetConfig('package_v3')
_config.set(r'run\.domain_randomize_lights', False)
_config.set(r'run\.domain_randomize_materials', False)
_config.set(r'run\.domain_randomize_background', True)
zpy.client.generate('package_sim_DR_bg', _config, num_datapoints=256, materialize=True)

# All three randomizations
_config = zpy.client.DatasetConfig('package_v3')
_config.set(r'run\.domain_randomize_lights', True)
_config.set(r'run\.domain_randomize_materials', True)
_config.set(r'run\.domain_randomize_background', True)
zpy.client.generate('package_sim_DR_all', _config, num_datapoints=256, materialize=True)
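
The four blocks above differ only in which randomization flags are switched on, so the ablation can also be written as a small table. The sketch below is one possible way to do that, not part of zpy: `DR_KEYS`, `ABLATIONS`, `dr_settings`, and `apply_settings` are illustrative names, and the config keys are copied from the cell above. It only assumes the config object has a `set(key, value)` method, as `DatasetConfig` is used above.

```python
# Config keys copied from the cell above; raw strings keep the backslash
# that the cell passes to DatasetConfig.set.
DR_KEYS = {
    "lights": r"run\.domain_randomize_lights",
    "materials": r"run\.domain_randomize_materials",
    "background": r"run\.domain_randomize_background",
}

# Which randomizations each dataset enables.
ABLATIONS = {
    "package_sim_DR_lights": {"lights"},
    "package_sim_DR_mats": {"materials"},
    "package_sim_DR_bg": {"background"},
    "package_sim_DR_all": {"lights", "materials", "background"},
}


def dr_settings(enabled):
    """Map every config key to True or False for one ablation."""
    return {key: (kind in enabled) for kind, key in DR_KEYS.items()}


def apply_settings(config, settings):
    """Call config.set(key, value) for each entry and return the config."""
    for key, value in settings.items():
        config.set(key, value)
    return config


# Usage with the same zpy calls as in the cell above:
#   for name, enabled in ABLATIONS.items():
#       config = apply_settings(zpy.client.DatasetConfig('package_v3'), dr_settings(enabled))
#       zpy.client.generate(name, config, num_datapoints=256, materialize=True)
```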

In [None]:
from collections import namedtuple

# Let's make a small helper object to keep track of the location of
# the images and the annotation files for each dataset
Dataset = namedtuple('Dataset', ['name', 'image_directory_path', 'annotation_file_path'])

package_sim_DR_lights = Dataset(
    name='package_sim_DR_lights',
    image_directory_path='/data/package_ablation/package-sim-dr-lights',
    annotation_file_path='/data/package_ablation/package-sim-dr-lights/_annotations.coco.json',
)

package_sim_DR_mats = Dataset(
    name='package_sim_DR_mats',
    image_directory_path='/data/package_ablation/package-sim-dr-mats',
    annotation_file_path='/data/package_ablation/package-sim-dr-mats/_annotations.coco.json',
)

package_sim_DR_bg = Dataset(
    name='package_sim_DR_bg',
    image_directory_path='/data/package_ablation/package-sim-dr-bg',
    annotation_file_path='/data/package_ablation/package-sim-dr-bg/_annotations.coco.json',
)

package_sim_DR_all = Dataset(
    name='package_sim_DR_all',
    image_directory_path='/data/package_ablation/package-sim-dr-all',
    annotation_file_path='/data/package_ablation/package-sim-dr-all/_annotations.coco.json',
)

package_real_test = Dataset(
    name='package_real_test',
    image_directory_path='/data/package_ablation/package-real-test',
    annotation_file_path='/data/package_ablation/package-real-test/_annotations.coco.json',
)
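
Each directory above is the dataset name lower-cased with underscores replaced by hyphens, under a shared root. Assuming that convention holds for your copy of the data, a sketch like the following can build the `Dataset` tuples and report missing paths before training starts. `make_dataset` and `missing_paths` are hypothetical helpers, not part of zpy.

```python
from collections import namedtuple
from pathlib import Path

Dataset = namedtuple('Dataset', ['name', 'image_directory_path', 'annotation_file_path'])


def make_dataset(name, root='/data/package_ablation'):
    """Build a Dataset whose directory is the lower-cased, hyphenated name."""
    image_dir = Path(root) / name.lower().replace('_', '-')
    return Dataset(
        name=name,
        image_directory_path=image_dir.as_posix(),
        annotation_file_path=(image_dir / '_annotations.coco.json').as_posix(),
    )


def missing_paths(dataset):
    """Return the dataset paths that do not exist on disk."""
    paths = (dataset.image_directory_path, dataset.annotation_file_path)
    return [p for p in paths if not Path(p).exists()]


# Example: fail early if the test set is not mounted in the container.
#   problems = missing_paths(make_dataset('package_real_test'))
#   if problems:
#       raise FileNotFoundError(problems)
```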


# Visualize Data

Look through sample images from each synthetic dataset to compare the different kinds of domain randomization.


In [None]:
import zpy.image
import zpy.viz
from pathlib import Path

def visualize_dataset(dataset : Dataset, search_string : str = '**/*.rgb.png'):
    """ Plot some sample images of a dataset"""
    opened_images = [zpy.image.open_image(i) for i in Path(dataset.image_directory_path).glob(search_string)]
    print(f'Found {len(opened_images)} images for {dataset.name}')
    zpy.viz.image_grid_plot(images=opened_images, output_path='/tmp', show=True)

# Synthetic data for training
print('\n\n Synthetic Dataset with Material Domain Randomization \n\n')
visualize_dataset(package_sim_DR_mats)
print('\n\n Synthetic Dataset with Background Domain Randomization \n\n')
visualize_dataset(package_sim_DR_bg)
print('\n\n Synthetic Dataset with Lighting Domain Randomization \n\n')
visualize_dataset(package_sim_DR_lights)
print('\n\n Synthetic Dataset with All Domain Randomization \n\n')
visualize_dataset(package_sim_DR_all)


# Real data for testing
print('\n\n Real Images \n\n')
visualize_dataset(package_real_test, '**/*.jpg')

# Train

The training code (`train_code.py`) fine-tunes a model on synthetic data, then evaluates it on the real data and writes out prediction images.

The pre-trained models come from the [Detectron2 Model Zoo](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md). Some background on these models:

- Includes basic data augmentation: scale jittering and horizontal flipping.

- 3 different backbone combinations:
    - **FPN**: Use a ResNet+FPN backbone with standard conv and FC heads for mask and box prediction, respectively. It obtains the best speed/accuracy tradeoff, but the other two are still useful for research.

    - **C4**: Use a ResNet conv4 backbone with conv5 head. The original baseline in the Faster R-CNN paper.

    - **DC5** (Dilated-C5): Use a ResNet conv5 backbone with dilations in conv5, and standard conv and FC heads for mask and box prediction, respectively. This is used by the Deformable ConvNet paper.

- Trained with the 3x schedule (~37 COCO epochs). Although 1x models are heavily under-trained, we provide some ResNet-50 models with the 1x (~12 COCO epochs) training schedule for comparison when doing quick research iteration.
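
The model names used later in this notebook encode the ResNet depth, the backbone variant, and the training schedule described above. As an illustration, the sketch below splits a name into those parts. The regular expression only covers the `faster_rcnn_R_*` names that appear in this notebook, and `parse_model_name` is a hypothetical helper, not a Detectron2 function. The epoch counts are the approximate figures quoted in the list above.

```python
import re

# Matches names such as 'faster_rcnn_R_50_FPN_3x'.
MODEL_NAME = re.compile(
    r'^faster_rcnn_R_(?P<depth>\d+)_(?P<backbone>FPN|C4|DC5)_(?P<schedule>[13]x)$'
)

# Approximate COCO epochs per schedule, as quoted in the text above.
COCO_EPOCHS = {'1x': 12, '3x': 37}


def parse_model_name(name):
    """Split a model name into ResNet depth, backbone, and schedule."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f'Unrecognized model name: {name}')
    schedule = match['schedule']
    return {
        'depth': int(match['depth']),
        'backbone': match['backbone'],
        'schedule': schedule,
        'approx_coco_epochs': COCO_EPOCHS[schedule],
    }


print(parse_model_name('faster_rcnn_R_101_FPN_3x'))
```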


In [None]:
# Does the GPU work?
import torch
torch.cuda.is_available()

In [None]:
import itertools
import random

# Grid search over iterations, learning rate, models, training dataset
train_dataset = [
    package_sim_DR_lights,
    package_sim_DR_mats,
    package_sim_DR_bg,
    package_sim_DR_all,
]
models = [
    # 'faster_rcnn_R_50_C4_1x',
    'faster_rcnn_R_50_C4_3x',
    # 'faster_rcnn_R_101_C4_3x',
    # 'faster_rcnn_R_50_DC5_1x',
    'faster_rcnn_R_50_DC5_3x',
    # 'faster_rcnn_R_101_DC5_3x',
    'faster_rcnn_R_50_FPN_1x',
    'faster_rcnn_R_50_FPN_3x',
    'faster_rcnn_R_101_FPN_3x',
]
iterations = [2, 8, 32, 64, 128, 256, 512, 1024]
learning_rate = [0.001, 0.0001, 0.00001]
grid_search = list(itertools.product(train_dataset, iterations, learning_rate, models))
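
Before launching the search it helps to know how large it is. The sketch below rebuilds the same Cartesian product with plain strings in place of the `Dataset` tuples (so it runs without any data on disk) and counts the runs and the total number of training iterations. With the lists above, that comes to 4 × 8 × 3 × 5 = 480 runs.

```python
import itertools

# Same values as the cell above, with dataset names standing in for the tuples.
dataset_names = [
    'package_sim_DR_lights',
    'package_sim_DR_mats',
    'package_sim_DR_bg',
    'package_sim_DR_all',
]
model_names = [
    'faster_rcnn_R_50_C4_3x',
    'faster_rcnn_R_50_DC5_3x',
    'faster_rcnn_R_50_FPN_1x',
    'faster_rcnn_R_50_FPN_3x',
    'faster_rcnn_R_101_FPN_3x',
]
iteration_counts = [2, 8, 32, 64, 128, 256, 512, 1024]
learning_rates = [0.001, 0.0001, 0.00001]

grid = list(itertools.product(dataset_names, iteration_counts, learning_rates, model_names))

# Every run trains from the pre-trained weights, so the iteration counts add up.
total_iterations = sum(iters for _, iters, _, _ in grid)

print(f'{len(grid)} runs, {total_iterations} training iterations in total')
```

If that is more than you want to wait for, trimming `model_names` or `iteration_counts` shrinks the search multiplicatively.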

In [None]:
# Remove any existing training logs
!rm -rf /tmp/package_logs3

In [None]:
from train_code import train
from pathlib import Path

# random.shuffle(grid_search)
for dataset, iters, lr, model in grid_search:
    train(
        # Directory for output of training logs and prediction images
        output_dir=Path('/tmp/package_logs3') / dataset.name / model / str(iters) / str(lr),
        model=model,
        test_thresh=0.9,  # How confident do predictions have to be to be evaluated? 0.9 = 90% confidence
        iters=iters,  # Number of training iterations (one epoch = dataset size / batch size)
        lr=lr,
        batch_size=2,  # Depends on your GPU memory and the size of the model and images
        class_dict={0: 'box'},  # Cardboard package box detection task
        # Fine-tune on synthetic data
        train_name=dataset.name,
        train_image_dir_path=dataset.image_directory_path,
        train_annotation_file_path=dataset.annotation_file_path,
        # Test on real data
        test_name=package_real_test.name,
        test_image_dir_path=package_real_test.image_directory_path,
        test_annotation_file_path=package_real_test.annotation_file_path,
    )
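
Two details of the loop above are easy to misread: how an iteration count relates to passes over the data, and what the output directories end up being called. The sketch below works through both. `epochs_seen` and `run_output_dir` are hypothetical helpers, and the dataset size is a parameter you supply rather than something read from `train_code.py`.

```python
from pathlib import Path


def epochs_seen(iters, batch_size, dataset_size):
    """Number of passes over the dataset after `iters` training iterations."""
    return iters * batch_size / dataset_size


def run_output_dir(root, dataset_name, model, iters, lr):
    """Mirror the directory layout used for output_dir in the loop above."""
    return Path(root) / dataset_name / model / str(iters) / str(lr)


# With a 512-image dataset and batch_size=2, 1024 iterations is 4 passes,
# while 2 iterations sees only 4 of the 512 images.
print(epochs_seen(1024, batch_size=2, dataset_size=512))
print(epochs_seen(2, batch_size=2, dataset_size=512))

# str() of a small float switches to scientific notation, so the
# lr=0.00001 runs land in a directory named '1e-05'.
print(run_output_dir('/tmp/package_logs3', 'package_sim_DR_all',
                     'faster_rcnn_R_50_FPN_3x', 256, 0.00001))
```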

# Results



## Training Logs

You can view the training logs in TensorBoard by navigating to [http://127.0.0.1:6006](http://127.0.0.1:6006).

In [None]:
%load_ext tensorboard
%tensorboard --logdir /tmp/package_logs3

## Predictions

Visualize the top performing runs for each type of domain randomization.

In [None]:
import zpy.image
import zpy.viz
from pathlib import Path

def visualize_predictions(prediction_image_dir_path : str):
    """ Plot some sample predictions"""
    opened_images = [zpy.image.open_image(i) for i in Path(prediction_image_dir_path).glob('**/*.jpg')]
    zpy.viz.image_grid_plot(images=opened_images, output_path='/tmp', show=True)

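# Each path has the form: log directory / dataset / model / iterations / learning rate / predictions.
# Point these at runs that exist in your own log directory.
print('\n\n Material Domain Randomization \n\n')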
visualize_predictions('/tmp/package_logs/package_sim_DR_mats/faster_rcnn_R_50_C4_1x/500/0.001/predictions')
print('\n\n Background Domain Randomization \n\n')
visualize_predictions('/tmp/package_logs/package_sim_DR_bg/faster_rcnn_R_50_C4_1x/128/0.001/predictions')
print('\n\n Lighting Domain Randomization \n\n')
visualize_predictions('/tmp/package_logs/package_sim_DR_lights/faster_rcnn_R_50_C4_1x/256/0.001/predictions')
print('\n\n All Domain Randomization \n\n')
visualize_predictions('/tmp/package_logs/package_sim_DR_all/faster_rcnn_R_50_C4_1x/256/0.001/predictions')
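
The prediction paths above are hard-coded to specific runs. To see which runs actually exist on disk, a sketch like the following walks a log directory and parses each run's settings out of its path. It assumes the `dataset / model / iterations / learning rate / predictions` layout seen in the paths above. `list_prediction_dirs` is a hypothetical helper, not part of zpy or `train_code.py`.

```python
from pathlib import Path


def list_prediction_dirs(log_root):
    """Return one record per run that has a 'predictions' folder."""
    root = Path(log_root)
    runs = []
    for pred_dir in sorted(root.glob('*/*/*/*/predictions')):
        # The first four path components below the root identify the run.
        dataset, model, iters, lr = pred_dir.relative_to(root).parts[:4]
        runs.append({
            'dataset': dataset,
            'model': model,
            'iters': int(iters),
            'lr': float(lr),
            'path': pred_dir,
        })
    return runs


# Example: list every run found under a log directory.
#   for run in list_prediction_dirs('/tmp/package_logs'):
#       print(run['dataset'], run['model'], run['iters'], run['lr'])
```

Each record's `path` can be passed as a string to `visualize_predictions`.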
