![remo_logo](assets/remo_normal.png)

In this tutorial, we use Remo to speed up building a transfer learning pipeline for Object Detection.

In [1]:
import sys
%load_ext autoreload
%autoreload 2
# Specify path to Remo
# Linux / Mac version
local_path_to_repo =  '/home/harsha/Documents/remo-python'
# Windows version
#local_path_to_repo =  'C:/Users/crows/Documents/GitHub/remo-python'

sys.path.insert(0, local_path_to_repo)

In [1]:
# Imports
import pandas as pd
import numpy as np
import os
import glob
import random
from PIL import Image
import csv
random.seed(4)

import tqdm

import torch
from torch.utils.data import DataLoader, Dataset

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
import torchvision.transforms as transforms


import remo
remo.set_viewer('jupyter')

## Adding Data to Remo
- The dataset used in this example is a subset of the [Open Images Dataset](https://storage.googleapis.com/openimages/web/index.html).

- The directory structure of the dataset is:

        ├── object_detection_dataset
            ├── images
                ├── image_1.jpg
                ├── image_2.jpg
                ├── ...
            ├── annotations
                ├── object_detection_annotations.csv

In [2]:
# The dataset will be extracted in a new folder
if not os.path.exists('object_detection_dataset.zip'):
    !wget https://s-3.s3-eu-west-1.amazonaws.com/object_detection_dataset.zip
    !unzip -qq object_detection_dataset.zip
else:
    print('Files already downloaded')

--2020-08-20 12:53:01--  https://s-3.s3-eu-west-1.amazonaws.com/object_detection_dataset.zip
Resolving s-3.s3-eu-west-1.amazonaws.com (s-3.s3-eu-west-1.amazonaws.com)...52.218.61.152
Connecting to s-3.s3-eu-west-1.amazonaws.com (s-3.s3-eu-west-1.amazonaws.com)|52.218.61.152|:443...connected.
HTTP request sent, awaiting response...200 OK
Length: 2041750 (1.9M) [application/zip]
Saving to: ‘object_detection_dataset.zip’


2020-08-20 12:53:04 (928 KB/s) - ‘object_detection_dataset.zip’ saved [2041750/2041750]



In [3]:
root_dir = 'object_detection_dataset'
images_path = os.path.join(root_dir, 'images')
#annotations_path = os.path.join(root_dir, 'annotations')

**Train / test split**

To organise our images, we can also generate a list of annotation tags. Among other things, this lets us create train / test splits without moving image files around.

To do this, we pass a dictionary that maps each tag to the relevant image paths to the function 
```remo.generate_tags_from_folders```.


In [5]:
## REQUIRES MERGE OF PULL REQUEST TO RUN

im_list = [os.path.basename(i) for i in glob.glob(str(images_path) + '/**/*.jpg', recursive=True)]
im_list = random.sample(im_list, len(im_list))
# Defining the train test split
train_idx = round(len(im_list) * 0.4)
valid_idx = train_idx + round(len(im_list) * 0.3)
test_idx  = valid_idx + round(len(im_list) * 0.3)

# Tags Dictionary
tags_dict = { 'train' : im_list[0:train_idx], 
              'valid' : im_list[train_idx:valid_idx], 
              'test'  : im_list[valid_idx:test_idx] }

# Generating Tags file
remo.generate_image_tags(tags_dictionary = tags_dict)

AttributeError: module 'remo' has no attribute 'generate_image_tags'
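
If the tag helper is not available in your installed version of Remo, the split can still be recorded without moving any files. The sketch below is a plain-Python fallback and not part of the Remo API: the function name `split_into_tags` and the example file names are hypothetical, and the `file_name` / `tag` column layout is an assumption based on how the tags file is read later in this tutorial.

```python
import csv
import io
import random


def split_into_tags(file_names, fractions, seed=4):
    """Shuffle file names and assign each one a tag according to `fractions`.

    `fractions` maps tag name -> share of the data. The last tag receives
    whatever is left, so no file is dropped because of rounding.
    """
    rng = random.Random(seed)
    shuffled = list(file_names)
    rng.shuffle(shuffled)

    tags = {}
    start = 0
    tag_names = list(fractions)
    for i, tag in enumerate(tag_names):
        if i == len(tag_names) - 1:
            end = len(shuffled)
        else:
            end = start + round(len(shuffled) * fractions[tag])
        tags[tag] = shuffled[start:end]
        start = end
    return tags


# Hypothetical file names standing in for the contents of the images folder
example_files = [f"image_{i}.jpg" for i in range(10)]
tags_dict = split_into_tags(example_files, {"train": 0.4, "valid": 0.3, "test": 0.3})

# Write one (file_name, tag) row per image; a real run would write to a file on disk
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["file_name", "tag"])
for tag, names in tags_dict.items():
    for name in names:
        writer.writerow([name, tag])
tags_csv = buffer.getvalue()
```

Giving the remainder to the last tag avoids the case where rounded split sizes do not add up to the number of images.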

### Creating a Dataset
To add a dataset, use ```remo.create_dataset()```, specifying the path to the data and annotations.

The class encoding is passed via a dictionary.

For a complete list of supported formats, please refer to the documentation.

In [6]:
object_detection_dataset = remo.create_dataset(name = 'Object_Detection_Dataset', 
                                               local_files = [root_dir], 
                                               annotation_task= 'Object Detection')

Acquiring data - completed                                                                           
Processing data - Processing annotation files: 1 of 1 files
Processing data - completed
Data upload completed


**Visualizing the dataset**

To view your data and labels using the Remo visual interface directly in the notebook, call the ```dataset.view()``` method.




In [7]:
object_detection_dataset.view()

Open http://localhost:8123/datasets/179


![dataset_view](assets/obj_dataset_view.png)

**Dataset Statistics**

Remo alleviates the need to write extra boilerplate for accessing dataset properties.

This can be done either using code, or via the visual interface.


In [8]:
object_detection_dataset.view_annotation_stats()

Open http://localhost:8123/annotation-detail/243/insights


![view_annotations_stats](assets/obj_view_annotations.png)


## Feeding Data into PyTorch

A custom PyTorch Dataset object defined below is used to load data.

In order to adapt this to your dataset, the following are required:

- **Path to Tags:** Path to the tags CSV file generated by Remo, which assigns each image to the train, validation or test split
- **Path to Annotations:** Path to the annotations CSV file (format: file_name, classes, xmin, ymin, xmax, ymax)
- **transforms:** Transforms to be applied to the images before passing them to the network
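
To make the expected inputs concrete, here is a minimal sketch of how the two CSV files relate. The file contents are made up for illustration, the helper `rows_for_mode` is hypothetical, and the `file_name` / `tag` layout of the tags file is an assumption inferred from the Dataset class below. It uses only the standard library, whereas the class itself uses pandas.

```python
import csv
import io
import os

# Hypothetical contents of the annotations CSV (one row per bounding box)
annotations_csv = """file_name,classes,xmin,ymin,xmax,ymax
images/image_1.jpg,Car,10,20,110,90
images/image_1.jpg,Wheel,15,60,40,88
images/image_2.jpg,Person,5,5,50,120
"""

# Hypothetical contents of the tags CSV (one row per image)
tags_csv = """file_name,tag
image_1.jpg,train
image_2.jpg,test
"""

# Map each image name to its tag
tag_of = {row["file_name"]: row["tag"] for row in csv.DictReader(io.StringIO(tags_csv))}


def rows_for_mode(mode):
    """Return the annotation rows whose image carries the given tag."""
    selected = []
    for row in csv.DictReader(io.StringIO(annotations_csv)):
        # Assumption: tags are keyed by base name, annotations may carry a path
        if tag_of.get(os.path.basename(row["file_name"])) == mode:
            selected.append(row)
    return selected


train_rows = rows_for_mode("train")
print(len(train_rows))
```

Each image can contribute several annotation rows, which is why the Dataset class groups rows by file name before building a target.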

In [9]:
class ObjectDetectionDataset(Dataset):

    def __init__(self, annotations, train_test_split, transform=None, mapping=None, mode="train"):
        self.mode = mode

        self.data = pd.read_csv(annotations)
        self.data["im_name"] = self.data["file_name"].apply(lambda x : os.path.basename(x))
        self.data = self.data.set_index("im_name")

        # Tags for Test Train Split
        self.train_test_split = pd.read_csv(train_test_split).set_index("file_name")
        self.data["tag"] = -1

        # Update Tags using Pandas, Column im_name in self.data is compared to file_name in self.train_test_split 
        self.data.update(self.train_test_split)
        
        # Keep only the rows of the split (train/valid/test) selected by mode
        self.data = self.data[self.data["tag"] == self.mode].reset_index(drop=True)
        
        self.file_names = self.data['file_name'].unique()
        self.transform = transform
        self.mapping = mapping

    def __len__(self) -> int:
        return self.file_names.shape[0]


    def __getitem__(self, index: int):

        file_name = self.file_names[index]
        records = self.data[self.data['file_name'] == file_name].reset_index()
        
        image = np.array(Image.open(file_name), dtype=np.float32)
        image /= 255.0

        if self.transform:
            image = self.transform(image)  
            
        if self.mode != "test":
            boxes = records[['xmin', 'ymin', 'xmax', 'ymax']].values
            
            area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
            area = torch.as_tensor(area, dtype=torch.float32)

            if self.mapping is not None:
                labels = np.zeros((records.shape[0],))
            
                for i in range(records.shape[0]):
                    labels[i] = self.mapping[records.loc[i, "classes"]]
                    
                labels = torch.as_tensor(labels, dtype=torch.int64)
            
            else:
                labels = torch.ones((records.shape[0],), dtype=torch.int64)

            iscrowd = torch.zeros((records.shape[0],), dtype=torch.int64)
            
            target = {}

            # Convert the (N, 4) array of boxes to a float32 tensor in one step
            target['boxes'] = torch.as_tensor(boxes, dtype=torch.float32)
            target['labels'] = labels
            target['image_id'] = torch.tensor([index])
            target['area'] = area
            target['iscrowd'] = iscrowd

            return image, target, file_name
        else:
            return image, file_name

def collate_fn(batch):
    return tuple(zip(*batch))
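
The custom `collate_fn` is needed because each image can have a different number of boxes, so targets cannot be stacked into a single tensor. The sketch below shows what the function does to a batch, with plain strings and lists standing in for tensors; the sample values are made up.

```python
def collate_fn(batch):
    # Transpose a list of samples into one tuple per field
    return tuple(zip(*batch))


# Two stand-in samples of the form (image, target, file_name)
sample_a = ("image_a", {"boxes": [[0, 0, 10, 10]]}, "a.jpg")
sample_b = ("image_b", {"boxes": [[1, 1, 5, 5], [2, 2, 8, 8]]}, "b.jpg")

images, targets, file_names = collate_fn([sample_a, sample_b])
print(images)      # all images of the batch
print(file_names)  # all file names of the batch
```

The targets stay as separate dictionaries, one per image, which is the format the torchvision detection models accept.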


The train and test datasets are instantiated and each is wrapped in a ```DataLoader```.



In [None]:
tensor_transform = transforms.Compose([transforms.ToTensor()])

# Mapping between Class name and Index
cat_to_index = {"Wheel" : 1, "Car" : 2, "Person" : 3, "Land vehicle" : 4, 
                "Human body" : 5, "Plant" : 6, "Tire" : 7, "Vehicle" : 8, 
                "Vehicle registration plate" : 9}

train_dataset = ObjectDetectionDataset(annotations="Object detection.csv",  
                                      train_test_split="tags.csv", 
                                      transform=tensor_transform,
                                      mapping=cat_to_index,
                                      mode="train")

test_dataset = ObjectDetectionDataset(annotations="Object detection.csv", 
                                      train_test_split="tags.csv", 
                                      transform=tensor_transform, 
                                      mapping=cat_to_index,
                                      mode="test")


train_data_loader = DataLoader(train_dataset, batch_size=1, shuffle=False, num_workers=0, collate_fn=collate_fn)
test_data_loader  = DataLoader(test_dataset, batch_size=1, shuffle=False, num_workers=0, collate_fn=collate_fn)

## Training the Model

The pre-trained ```Faster RCNN``` Model with the ```ResNet-50 Backbone``` is used in this tutorial.

To train the model, the following details are specified:

- **Model:** The pre-trained model, with its box predictor head replaced to match our classes.
- **num_classes:** The number of object classes in your dataset plus one for the background (here 9 + 1 = 10).
- **Optimizer:** The optimizer used for training the network.
- **num_epochs:** The number of epochs for which we would like to train the network.
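
Rather than hard-coding the class count, it can be derived from the class mapping. This is a small sketch that reuses the mapping from this tutorial. It relies on the torchvision convention that index 0 is reserved for the background class.

```python
# Mapping between class name and index, as used in this tutorial
cat_to_index = {"Wheel": 1, "Car": 2, "Person": 3, "Land vehicle": 4,
                "Human body": 5, "Plant": 6, "Tire": 7, "Vehicle": 8,
                "Vehicle registration plate": 9}

# Index 0 is reserved for background, so no class may use it
if 0 in cat_to_index.values():
    raise ValueError("Class indices must start at 1; 0 is the background class")

# One output per object class, plus one for the background
num_classes = len(cat_to_index) + 1
print(num_classes)
```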

In [None]:
device      = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
num_classes = 10
loss_value  = 0.0
num_epochs  = 10

In [None]:
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

model.to(device)

params = [p for p in model.parameters() if p.requires_grad]

optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

In [None]:
# Training Loop
model.train()  # training mode: the model returns a dictionary of losses
for epoch in range(num_epochs):
    # Create a fresh progress bar each epoch instead of re-wrapping the loader
    for images, targets, image_ids in tqdm.tqdm(train_data_loader):

        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)

        losses = sum(loss for loss in loss_dict.values())
        loss_value = losses.item()

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
    # loss_value holds the loss of the last batch in the epoch
    print(f"\n Epoch #{epoch} loss: {loss_value}")

torch.save(model.state_dict(), 'fasterrcnn_resnet50_fpn.pth') 
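
The loop above reports only the loss of the last batch in each epoch, which can be noisy with a batch size of 1. One option is to report the mean loss over the epoch instead. The helper below is a hypothetical sketch and not part of PyTorch or Remo; the batch losses are made-up numbers.

```python
class RunningMean:
    """Accumulate values and report their mean."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    @property
    def value(self):
        # Return 0.0 before any update to avoid dividing by zero
        return self.total / self.count if self.count else 0.0


# Made-up per-batch losses standing in for losses.item() in the training loop
epoch_loss = RunningMean()
for batch_loss in [2.0, 1.0, 0.0]:
    epoch_loss.update(batch_loss)

print(f"Epoch mean loss: {epoch_loss.value}")
```

In the training loop, a new `RunningMean` would be created at the start of each epoch and updated with `losses.item()` after every batch.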


## Visualizing Predictions

Using Remo, we can visualize predictions vs the original labels.

To do this, we create a new AnnotationSet and upload the predictions as a CSV file.

In [None]:
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=False)

in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
model.load_state_dict(torch.load('fasterrcnn_resnet50_fpn.pth'))
model.eval()

model.to(device)

In [None]:
# Mapping Between Predicted Index and Class Name
mapping = { value : key for (key, value) in cat_to_index.items()}

detection_threshold = 0.5
results = []

test_data_loader = tqdm.tqdm(test_data_loader)


with torch.no_grad():  # inference only, no gradients needed
    for images, image_ids in test_data_loader:

        images = list(image.to(device) for image in images)
        outputs = model(images)

        for i, image in enumerate(images):

            scores = outputs[i]['scores'].cpu().numpy()
            keep = scores >= detection_threshold
            # Apply the same mask to boxes and labels so they stay aligned
            boxes = outputs[i]['boxes'].cpu().numpy()[keep].astype(np.int32)
            labels = outputs[i]['labels'].cpu().numpy()[keep]
            image_id = image_ids[i]

            for box, label in zip(boxes, labels):
                results.append({"file_name" : os.path.basename(image_id), "classes" : mapping[int(label)],
                                "xmin" : box[0], "ymin" : box[1], "xmax" : box[2], "ymax" : box[3]})

with open('results.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=["file_name", "classes", "xmin", "ymin", "xmax", "ymax"])
    writer.writeheader()
    writer.writerows(results)
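
When filtering detections by score, boxes and labels need to be filtered together so that each box keeps its own label. Here is a minimal sketch with plain lists standing in for model outputs. The function name `filter_detections` and all values are hypothetical, and the row layout mirrors the prediction CSV columns used in this tutorial.

```python
def filter_detections(boxes, labels, scores, threshold=0.5):
    """Keep the (box, label, score) triples whose score reaches the threshold."""
    return [
        (box, label, score)
        for box, label, score in zip(boxes, labels, scores)
        if score >= threshold
    ]


# Made-up detections for a single image
boxes = [[10, 20, 110, 90], [15, 60, 40, 88], [5, 5, 50, 120]]
labels = [2, 1, 3]
scores = [0.9, 0.3, 0.6]
index_to_cat = {1: "Wheel", 2: "Car", 3: "Person"}

# Build one CSV row per detection that passed the threshold
rows = [
    {"file_name": "image_1.jpg", "classes": index_to_cat[label],
     "xmin": box[0], "ymin": box[1], "xmax": box[2], "ymax": box[3]}
    for box, label, _ in filter_detections(boxes, labels, scores)
]
print(rows)
```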



In [None]:
object_detection_dataset.create_annotation_set("Object Detection", name="model_predictions", path_to_annotation_file="./results.csv")

In [None]:
object_detection_dataset.view()

![visualize_predictions](assets/obj_visualize_results.png)