<a href="https://colab.research.google.com/github/lawsonk16/Object-Detection/blob/main/FasterRcnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Train Model - Mask Detection
 1. Import packages and set paths
 2. Create Data Loaders
 3. Define the Model Path
 4. Load the Model Parameters
 5. Train the Model
 6. Run on Test Data

<i> Note: These paths are not universal. They are relative and require you to set up your own data directory and adjust the paths to match, which is hard to avoid in Colab. </i>


In [1]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


### Step 1: Make Relevant Imports
The notebook relies on several custom scripts, so their folder must be added to the system path before they can be imported.
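
The path setup in this step can also be written a little more defensively. The sketch below uses only the standard library; the helper name `add_script_paths` is hypothetical and is not one of the notebook's scripts. It skips folders that do not exist (for example, when Drive did not mount) and avoids duplicate entries when the cell is re-run.

```python
import os
import sys

def add_script_paths(paths):
    """Append each existing directory in `paths` to sys.path once.

    Hypothetical helper for illustration; returns the paths actually added.
    """
    added = []
    for p in paths:
        if not os.path.isdir(p):
            # A missing folder usually means Drive is not mounted or the path is wrong
            print(f"Skipping missing script folder: {p}")
            continue
        if p not in sys.path:
            sys.path.append(p)
            added.append(p)
    return added

# Same folder the notebook uses; adjust to your own Drive layout
added = add_script_paths(['/content/drive/MyDrive/Colab Notebooks/scripts'])
```

Checking the returned list before the `coco_utils` imports makes a bad path easier to spot than a later `ModuleNotFoundError`.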

In [1]:
import sys
paths = ['/content/drive/MyDrive/Colab Notebooks/scripts']

for p in paths:
    sys.path.append(p)

import os
import json
import shutil
import torch

from coco_utils.pytorch_coco_detect import *
from coco_utils.coco_help import *
from coco_utils.eval_coco import *

In [2]:
# Fix a couple of small issues with the FAIR1M annotations
# (to be added directly to the FAIR1M COCO files later)
from tqdm import tqdm

def fix_fair1m_anns(ann_file):
    """Recompute annotation areas and drop empty entries, rewriting ann_file in place."""
    with open(ann_file, 'r') as f:
        gt = json.load(f)

    # Recompute each area from its bbox and drop zero-area annotations
    new_anns = []
    for a in gt['annotations']:
        new_a = a.copy()
        x, y, w, h = a['bbox']
        new_a['area'] = w * h
        if new_a['area'] > 0:
            new_anns.append(new_a)

    gt['annotations'] = new_anns

    # Keep only the images that still have at least one annotation
    new_images = []
    for img in tqdm(gt['images']):
        im_id = img['id']
        if len(anns_on_image(im_id, gt)) > 0:
            new_images.append(img)

    gt['images'] = new_images

    # Overwrite the original annotation file with the cleaned version
    with open(ann_file, 'w') as f:
        json.dump(gt, f)
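
To see what this cleanup does without touching any files, here is a minimal sketch that works on an in-memory COCO-style dict. The name `clean_coco_dict` and the toy data are hypothetical. The sketch also assumes that `anns_on_image` returns the annotations whose `image_id` matches the given image, and replaces that per-image lookup with a single set of used image ids.

```python
def clean_coco_dict(gt):
    """Return a cleaned copy of a COCO-style dict (hypothetical helper).

    - recomputes each annotation's area from its [x, y, w, h] bbox
    - drops annotations whose area is not positive
    - drops images left without any annotation
    """
    new_anns = []
    for ann in gt['annotations']:
        x, y, w, h = ann['bbox']
        area = w * h
        if area > 0:
            new_anns.append({**ann, 'area': area})

    # One pass to collect the ids of images that still have annotations
    used_image_ids = {ann['image_id'] for ann in new_anns}
    new_images = [img for img in gt['images'] if img['id'] in used_image_ids]

    return {**gt, 'annotations': new_anns, 'images': new_images}

# Toy data: image 2 only has a zero-width box, so both are removed
toy = {
    'images': [{'id': 1}, {'id': 2}],
    'annotations': [
        {'id': 10, 'image_id': 1, 'bbox': [0, 0, 4, 5], 'area': 0},
        {'id': 11, 'image_id': 2, 'bbox': [3, 3, 0, 7], 'area': 0},
    ],
    'categories': [{'id': 1, 'name': 'example'}],
}
cleaned = clean_coco_dict(toy)
print(len(cleaned['annotations']), len(cleaned['images']))
```

Building the set of used image ids once keeps the image filter linear in the number of annotations, which may matter on larger annotation files.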

### Step 2: Unzip Data, Set Data Paths and Create Data Loaders

In [3]:
# data_zip_path = '/content/drive/MyDrive/Colab Notebooks/Clean Datasets/FAIR1M/500/FAIR1M_500_50-20-30.zip'
# data_path = '/content/'

# shutil.copy2(data_zip_path, data_path)

# shutil.unpack_archive(data_zip_path.split('/')[-1])
# os.remove(data_zip_path.split('/')[-1])

In [4]:
# change the data tag to match your dataset, if applicable
data_tag = 'FAIR1M_500'

train_ims = 'train_images/'
val_ims = 'val_images/'
test_ims = 'test_images/'

train_anns = f'train_{data_tag}_gt.json'
val_anns = f'val_{data_tag}_gt.json'
test_anns = f'test_{data_tag}_gt.json'

In [5]:
# for a in [train_anns, val_anns, test_anns]:
#     fix_fair1m_anns(a)

In [6]:
num_workers = 0
train_batch_size = 2

train_data_loader = make_train_loader(train_ims, train_anns, train_batch_size, num_workers)

loading annotations into memory...
Done (t=4.53s)
creating index...
index created!


In [7]:
num_workers = 0
val_batch_size = 1
val_data_loader = make_test_loader(val_ims, val_anns, val_batch_size, num_workers)

loading annotations into memory...
Done (t=1.00s)
creating index...
index created!


### Step 3: Prepare the Model
 - Build a model name that encodes the hyperparameters of this experiment
   - If training has already been run with these settings, this is the path where its results are stored
 - Load a model with the chosen ResNet backbone depth

In [8]:
# Load the category list to count the number of categories
with open(train_anns, 'r') as f:
    gt = json.load(f)

cats = gt['categories']

In [9]:
model_folder = '/content/drive/MyDrive/Colab Notebooks/Experiments/Detection/FAIR1M/Train-50_Val-20_Test-30/'
resnet_backbone = 18
num_classes = len(cats) + 1  # +1 for the background class
model_name = f'resnet{resnet_backbone}fpn'
data_name = 'DOTA'
optim = 'SGD'
lr = 0.0001
mom = 0.9
wd = 0.0005
pretrained = True

model_path = name_model(model_folder, num_classes, model_name, data_name, optim, lr, mom, wd, pretrained, train_batch_size)
model_path

'/content/drive/MyDrive/Colab Notebooks/Experiments/Detection/FAIR1M/Train-50_Val-20_Test-30/resnet18fpn/DOTA_classes_38_optim_SGD_lr_0p0001_mom_0p9_wd_0p0005_pretrained_True_batch_2.pt'
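
The path above suggests a simple naming scheme: one subfolder per model, then a file name that lists each hyperparameter, with `.` replaced by `p` in numeric values. The sketch below reproduces that pattern. It is inferred from the printed path only; `sketch_model_path` and `fmt` are hypothetical names, and the real `name_model` in the custom scripts may differ (for example, it may also create folders).

```python
def fmt(value):
    """Make a number filename-safe by replacing '.' with 'p' (0.9 -> '0p9').

    Note: very small floats such as 1e-05 print in scientific notation,
    so they would not follow the same pattern.
    """
    return str(value).replace('.', 'p')

def sketch_model_path(folder, num_classes, model_name, data_name,
                      optim, lr, mom, wd, pretrained, batch_size):
    """Hypothetical stand-in for name_model, inferred from its printed output."""
    file_name = (f'{data_name}_classes_{num_classes}_optim_{optim}'
                 f'_lr_{fmt(lr)}_mom_{fmt(mom)}_wd_{fmt(wd)}'
                 f'_pretrained_{pretrained}_batch_{batch_size}.pt')
    return f"{folder.rstrip('/')}/{model_name}/{file_name}"

example_path = sketch_model_path(
    '/content/drive/MyDrive/Colab Notebooks/Experiments/Detection/FAIR1M/Train-50_Val-20_Test-30/',
    38, 'resnet18fpn', 'DOTA', 'SGD', 0.0001, 0.9, 0.0005, True, 2)
print(example_path)
```

Encoding the hyperparameters in the file name means a re-run with the same settings resolves to the same checkpoint, which is what allows training to resume.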

In [10]:
# get the model
model = get_fasterrcnn(num_classes, pretrained, resnet_backbone)

### Step 4: Train the Model
Set the remaining hyperparameters. Precision and recall on the validation set are displayed at a frequency you choose.
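
For readers unfamiliar with how precision and recall are obtained for detections, here is a minimal sketch for a single image and a single class. It is not the implementation in `coco_utils.eval_coco`, which is not shown here; the function names, the greedy matching, and the 0.5 IoU threshold are all assumptions made for illustration. Boxes use the COCO `[x, y, w, h]` format.

```python
def iou_xywh(a, b):
    """Intersection over union of two boxes in COCO [x, y, w, h] format."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def precision_recall(pred_boxes, gt_boxes, iou_thresh=0.5):
    """Greedy one-to-one matching of predictions to ground truth.

    Assumes pred_boxes is sorted by confidence, highest first.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in pred_boxes:
        best_iou, best_j = 0.0, None
        for j, g in enumerate(gt_boxes):
            if j in matched:
                continue
            iou = iou_xywh(p, g)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_thresh:
            matched.add(best_j)
            tp += 1
    fp = len(pred_boxes) - tp  # predictions with no matching ground truth
    fn = len(gt_boxes) - tp    # ground-truth boxes that were missed
    precision = tp / (tp + fp) if pred_boxes else 0.0
    recall = tp / (tp + fn) if gt_boxes else 0.0
    return precision, recall

# Two ground-truth boxes; one prediction overlaps the first, one is a false alarm
gt_boxes = [[0, 0, 10, 10], [20, 20, 10, 10]]
pred_boxes = [[1, 1, 10, 10], [50, 50, 5, 5]]
precision, recall = precision_recall(pred_boxes, gt_boxes)
print(precision, recall)
```

A full evaluation would repeat this per class and across all validation images, and would typically sweep the confidence threshold rather than use a single operating point.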

In [None]:
# Choose the total number of epochs to train with this configuration
num_epochs = 5

data_loaders = [train_data_loader, val_data_loader]

losses_train = train_fasterrcnn(model, model_path, data_loaders, optim, lr, mom, wd, num_epochs)

Loading 1 epochs of training for fasterrcnn
Training epoch 2 of 5


  1%|▏         | 109/7692 [00:20<23:39,  5.34it/s]