## Clone the Repository

In [0]:
!git clone https://github.com/sovit-123/fastercnn-pytorch-training-pipeline.git

We will execute all the code within the cloned project directory, that is `fastercnn-pytorch-training-pipeline`.

In [0]:
# Enter the repo directory.
%cd fastercnn-pytorch-training-pipeline/

In [0]:
# Install the Requirements
!pip install -r requirements.txt

## Download the Dataset

Here we are using the [Aquarium Dataset](https://public.roboflow.com/object-detection/aquarium) from Roboflow.

Download and unzip the dataset into the `custom_data` directory.

In [0]:
!curl -L "https://public.roboflow.com/ds/CNyGy97q45?key=eSpwiC1Ah7" > roboflow.zip; unzip roboflow.zip -d custom_data; rm roboflow.zip
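
As a quick sanity check after unzipping, the sketch below counts images and XML annotation files in each split. It assumes the export places `.jpg` images and their `.xml` label files side by side in `custom_data/train` and `custom_data/valid`, which matches the dataset config using the same path for images and labels. `count_pairs` is a hypothetical helper written for illustration, not part of the repository.

```python
from pathlib import Path


def count_pairs(split_dir):
    """Return (num_images, num_xml, image stems lacking an XML file)."""
    split = Path(split_dir)
    images = {p.stem for p in split.glob("*.jpg")}
    labels = {p.stem for p in split.glob("*.xml")}
    missing = sorted(images - labels)
    return len(images), len(labels), missing


# Assumed layout: custom_data/<split>/ holds both images and XML files.
for split_name in ("train", "valid"):
    split_path = Path("custom_data") / split_name
    if split_path.is_dir():
        n_img, n_xml, missing = count_pairs(split_path)
        print(f"{split_name}: {n_img} images, {n_xml} XML files, "
              f"{len(missing)} images without labels")
    else:
        print(f"{split_name}: directory not found at {split_path}")
```

If any images are reported without labels, it is worth inspecting the split before training.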

## Create the Custom Dataset YAML File

The YAML file should contain:
* `TRAIN_DIR_IMAGES`: Path to the training images directory.
* `TRAIN_DIR_LABELS`: Path to the training labels directory containing the XML files. Can be the same as `TRAIN_DIR_IMAGES`.
* `VALID_DIR_IMAGES`: Path to the validation images directory.
* `VALID_DIR_LABELS`: Path to the validation labels directory containing the XML files. Can be the same as `VALID_DIR_IMAGES`.
* `CLASSES`: All the class names in the dataset along with the `__background__` class as the first class.
* `NC`: The number of classes. This should be the number of classes in the dataset + the background class. If the number of classes in the dataset is 7, then `NC` should be 8.
* `SAVE_VALID_PREDICTION_IMAGES`: Whether to save the prediction results from the validation loop or not.

In [0]:
%%writefile data_configs/custom_data.yaml
# Image and label directories should be relative to train.py
TRAIN_DIR_IMAGES: 'custom_data/train'
TRAIN_DIR_LABELS: 'custom_data/train'
VALID_DIR_IMAGES: 'custom_data/valid'
VALID_DIR_LABELS: 'custom_data/valid'

# Class names.
CLASSES: [
    '__background__',
    'fish', 'jellyfish', 'penguin', 
    'shark', 'puffin', 'stingray',
    'starfish'
]

# Number of classes (object classes + 1 for background class in Faster RCNN).
NC: 8

# Whether to save the predictions of the validation set while training.
SAVE_VALID_PREDICTION_IMAGES: True
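
Because `NC` must equal the number of entries in `CLASSES` and `__background__` must come first, a small check can catch mismatches before training starts. This is a minimal sketch, assuming the config has already been loaded into a plain dictionary (for example with PyYAML's `yaml.safe_load`). `check_config` is a hypothetical helper, not part of the repository.

```python
def check_config(cfg):
    """Return a list of problems found in a dataset config dictionary."""
    problems = []
    classes = cfg.get("CLASSES", [])
    if not classes or classes[0] != "__background__":
        problems.append("CLASSES must start with '__background__'")
    if cfg.get("NC") != len(classes):
        problems.append(
            f"NC is {cfg.get('NC')} but CLASSES has {len(classes)} entries"
        )
    if len(set(classes)) != len(classes):
        problems.append("CLASSES contains duplicate names")
    return problems


# Same values as data_configs/custom_data.yaml, written inline so the
# sketch runs without a YAML parser.
config = {
    "CLASSES": ["__background__", "fish", "jellyfish", "penguin",
                "shark", "puffin", "stingray", "starfish"],
    "NC": 8,
}
print(check_config(config) or "config looks consistent")
```

An empty list means the class list and `NC` agree; any returned strings describe what to fix in the YAML file.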

## Training

For this training example we use:
* The official Faster RCNN ResNet50 FPN V2 model (`fasterrcnn_resnet50_fpn_v2`).
* Batch size of 4. You may change it according to the GPU memory available.

In [0]:
!wandb disabled

In [0]:
# Train on the Aquarium dataset for 5 epochs.
!python train.py --config data_configs/custom_data.yaml --epochs 5 --model fasterrcnn_resnet50_fpn_v2 --project-name custom_training --batch-size 4 --no-mosaic

## Visualize Validation Results

Check out a few validation results from the `outputs/training/custom_training` directory.

In [0]:
import matplotlib.pyplot as plt
import glob

In [0]:
results_dir_path = 'outputs/training/custom_training'
valid_images = glob.glob(f"{results_dir_path}/*.jpg")

# Show up to three saved validation predictions (fewer if fewer exist).
for image_path in valid_images[:3]:
    plt.figure(figsize=(10, 7))
    image = plt.imread(image_path)
    plt.imshow(image)
    plt.axis('off')
    plt.show()

## Check Out the Repo for Latest Updates

https://github.com/sovit-123/fastercnn-pytorch-training-pipeline

## Evaluation

In [0]:
# Non-verbose mAP.
!python eval.py --weights outputs/training/custom_training/best_model.pth --config data_configs/custom_data.yaml --model fasterrcnn_resnet50_fpn_v2

In [0]:
# Verbose mAP.
!python eval.py --weights outputs/training/custom_training/best_model.pth --config data_configs/custom_data.yaml --model fasterrcnn_resnet50_fpn_v2 --verbose