# Train YOLACT on VoTT annotated images

This notebook was tested in the Docker image `pytorch/pytorch:0.4.1-cuda9-cudnn7-runtime` with 4 GB of shared memory.

0. [Install](#0.-Install)
  1. [Install PyTorch](#0.1-Install-PyTorch)
  2. [Install some other packages](#0.2-Install-some-other-packages)
1. [Prepare pretrained weights and dataset](#1.-Prepare-pretrained-weights-and-dataset)
  1. [Download pretrained weights](#1.1-Download-pretrained-weights)
  2. [Prepare dataset](#1.2-Prepare-dataset)
  3. [Modify the config file](#1.3-Modify-the-config-file)
2. [Train](#2.-Train)
3. [Evaluate](#3.-Evaluate)

## 0. Install

### 0.1 Install PyTorch

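Install [PyTorch](https://pytorch.org/) 1.0.1 (or higher) and TorchVision. The tag of the Docker image named above indicates PyTorch 0.4.1, so upgrade PyTorch if you start from that image.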

### 0.2 Install some other packages

In [None]:
# Install opencv dependencies
!apt-get update
!apt-get -y install libglib2.0-0 libsm6 libxrender1 libxext-dev
# Install yolact dependencies
!pip install cython
!pip install opencv-python pillow pycocotools matplotlib
# Download yolact
!git clone https://github.com/dbolya/yolact yolact
!mkdir -p yolact/weights

## 1. Prepare pretrained weights and dataset

### 1.1 Download pretrained weights
Download the ImageNet-pretrained model from [here](https://drive.google.com/file/d/1tvqFPd4bJtakOlmn-uIA492g2qurRChj/view?usp=sharing) and put it in `yolact/weights`.
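
A quick check that the download landed in the right place can save a failed training run later. The sketch below lists the `.pth` files in a directory. The helper name `list_weight_files` is hypothetical, and the check assumes the downloaded file keeps a `.pth` extension.

```python
import os


def list_weight_files(weights_dir):
    """Return the sorted names of all .pth files in weights_dir.

    Returns an empty list if the directory does not exist.
    """
    if not os.path.isdir(weights_dir):
        return []
    return sorted(
        name for name in os.listdir(weights_dir)
        if name.endswith(".pth") and os.path.isfile(os.path.join(weights_dir, name))
    )


found = list_weight_files(os.path.join("yolact", "weights"))
print(found if found else "No .pth files found in yolact/weights yet.")
```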

### 1.2 Prepare dataset

Annotate images in [VoTT](https://github.com/microsoft/VoTT), export the project, and set the path as shown below. If you do not have annotated images, you can try out these [Azure logos](https://github.com/microsoft/AIVisualProvision/tree/master/Documents/Images/Training_DataSet), which come with VoTT annotations. Four of the 15 logos are annotated with polygons for instance segmentation.

In [None]:
# Run this cell if you want to use Azure logo images
!git clone https://github.com/microsoft/AIVisualProvision

Convert the VoTT annotations into the COCO-style format that YOLACT reads.
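
For orientation, the conversion cells below read only a handful of fields from the VoTT JSON export. The following sketch builds a tiny hand-written stand-in for such an export and derives the class names, label map, image id, and file name the same way the conversion does. The asset id, file name, tag names, and coordinates are made up for illustration, and the `example_` variable names are hypothetical.

```python
# A hand-written, minimal stand-in for a VoTT JSON export.
# Only the fields read by the conversion code are included; all values are illustrative.
example_vott = {
    "tags": [{"name": "logo_a"}, {"name": "logo_b"}],
    "assets": {
        "1a2b": {
            "asset": {
                "id": "1a2b",              # hexadecimal string, parsed with int(..., 16)
                "name": "my%20image.jpg",  # the conversion decodes %20 to a space
                "format": "jpg",
                "size": {"width": 640, "height": 480},
            },
            "regions": [{
                "id": "r1",
                "tags": ["logo_b"],
                "points": [{"x": 10, "y": 10}, {"x": 60, "y": 10}, {"x": 60, "y": 40}],
                "boundingBox": {"left": 10, "top": 10, "width": 50, "height": 30},
            }],
        }
    },
}

# Class names follow the tag order; category ids start at 1.
example_class_names = tuple(tag["name"] for tag in example_vott["tags"])
example_label_map = {tag["name"]: i for i, tag in enumerate(example_vott["tags"], 1)}

example_asset = example_vott["assets"]["1a2b"]["asset"]
example_image_id = int(example_asset["id"], 16)
example_file_name = example_asset["name"].replace("%20", " ")

print(example_class_names, example_label_map, example_image_id, example_file_name)
```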

In [None]:
import os

########################################
# Set paths
########################################

base_dir = os.getcwd()
# The path of VoTT project exported in VoTT JSON format
vott_path = os.path.join(base_dir, "vott", "vott-json-export", "logo_seg-export.json")
# Raw data path
data_path = os.path.join(base_dir, "AIVisualProvision", "Documents", "Images", "Training_DataSet")
# Processed data path
train_image_path = data_path
valid_image_path = data_path
# The output YOLACT info path
train_info_path = os.path.join(base_dir, "yolact", "yolact_info.json")
valid_info_path = train_info_path

In [None]:
import glob
import json

from PIL import Image, ExifTags
import numpy as np


with open(vott_path, "r") as f:
    vott = json.load(f)

########################################
# Prepare images
########################################

print("Processing images...", end="")

def rotate(origin, point, angle):
    """
    Rotate a point by a given angle (0/90/180/270) around a given origin.

    The rotation is counterclockwise in a y-up frame, which appears clockwise
    in image coordinates (y pointing down). For 90 and 270 the origin's x and y
    are swapped because the rotated image swaps width and height.
    """
    ox, oy = origin
    px, py = point

    if angle == 90:
        qx, qy = oy - (py - oy), ox + (px - ox)
    elif angle == 180:
        qx, qy = ox - (px - ox), oy - (py - oy)
    elif angle == 270:
        qx, qy = oy + (py - oy), ox - (px - ox)
    elif angle in [0, 360]:
        qx, qy = px, py
    else:
        raise ValueError("Unsupported angle: {}".format(angle))
    return qx, qy

vott_images = [image for image in vott["assets"].values() if image["asset"]["format"] in ["jpeg", "jpg", "png"]]

# Class names follow the VoTT tag order; category ids start at 1
class_names = tuple(tag["name"] for tag in vott["tags"])
label_map = {tag["name"]:i for i, tag in enumerate(vott["tags"], 1)}

# Numeric EXIF tag id for 'Orientation'
orientation_key = [k for k, v in ExifTags.TAGS.items() if v == 'Orientation'][0]
# EXIF orientation value -> rotation angle (degrees) applied to the annotations
orientation_map = {3:180, 6:90, 8:270}
images = list()
for i_im, image in enumerate(vott_images, 1):
    #print("\rProcessing images {}/{}".format(i_im, len(vott_images)), end="")
    
    file_name = image["asset"]["name"].replace("%20", " ")
    file_path = os.path.join(data_path, file_name)
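    with Image.open(file_path) as img:
        # _getexif() is a private Pillow method that not every image format provides
        exif = img._getexif() if hasattr(img, "_getexif") else None
    orientation = None if exif is None else exif.get(orientation_key, None)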
    yolact_image = {
        "id" : int(image["asset"]["id"], 16),
        "orientation" : orientation,
        "file_name" : file_name,
        "license" : None,
        "flickr_url" : None,
        "coco_url" : None,
        "date_captured" : None,
    }
    if orientation not in [6, 8]:
        yolact_image.update(image["asset"]["size"])
    else:
        yolact_image.update({
            "width" : image["asset"]["size"]["height"],
            "height" : image["asset"]["size"]["width"],
        })
    images.append(yolact_image)

print("\rProcessed "+str(len(vott_images))+" images.")

########################################
# Process annotations
########################################

print("Processing annotations...", end="")

def rotate_annotations(vott_points, angle, size):
    # Rotate every polygon vertex around the image center
    origin = [size["width"]//2, size["height"]//2]
    return [
        dict(zip(["x", "y"], rotate(origin, [p["x"], p["y"]], angle)))
        for p in vott_points
    ]

def get_poly_area(x,y):
    # Shoelace formula: polygon area from the vertex x and y coordinates
    return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1)))

def rotate_bbox(bbox, angle, size):
    origin = [size["width"]//2, size["height"]//2]
    
    points = [
        [bbox["left"], bbox["left"], bbox["left"]+bbox["width"], bbox["left"]+bbox["width"]],
        [bbox["top"], bbox["top"]+bbox["height"], bbox["top"], bbox["top"]+bbox["height"]]
    ]
    points = [rotate(origin, p, angle) for p in zip(*points)]
    xs, ys = zip(*points)
    left, top = min(xs), min(ys)
    width, height = max(xs) - left, max(ys) - top
    
    return left, top, width, height

annotations = [{
    "id" : region["id"],
    "image_id" : int(image["asset"]["id"], 16),
    "category_id" : label_map[tag],
    "segmentation" : [[
        coor for point in rotate_annotations(
            region["points"],
            orientation_map.get(yolact_image["orientation"], 0),
            image["asset"]["size"]) \
        for coor in [point["x"], point["y"]]
    ]],
    "area" : get_poly_area(*zip(*[(point["x"], point["y"]) for point in region["points"]])),
    #"bbox" : [region["boundingBox"][k] for k in ["left", "top", "width", "height"]],
    "bbox" : rotate_bbox(
        region["boundingBox"],
        orientation_map.get(yolact_image["orientation"], 0),
        image["asset"]["size"]),
    "iscrowd" : 0,
} for image, yolact_image in zip(vott_images, images) for region in image["regions"] for tag in region["tags"]]

print("\rProcessed "+str(len(annotations))+" annotations.")

########################################
# Wrap-up
########################################

yolact_info = {
    "images" : images,
    "annotations" : annotations
}

with open(train_info_path, "w") as f:
    json.dump(yolact_info, f)
with open(valid_info_path, "w") as f:
    json.dump(yolact_info, f)

print("Done.")
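
Before training, it can help to sanity-check the generated file. The sketch below is one possible check and is not part of YOLACT. The helper name `check_yolact_info` and the specific rules (known image ids, polygons with at least three points, positive box sizes) are assumptions about what a well-formed file looks like.

```python
def check_yolact_info(info):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    image_ids = {image["id"] for image in info.get("images", [])}
    for ann in info.get("annotations", []):
        if ann["image_id"] not in image_ids:
            problems.append("annotation {} refers to unknown image {}".format(
                ann["id"], ann["image_id"]))
        for polygon in ann["segmentation"]:
            # x and y are interleaved, so a polygon needs an even number of
            # values and at least three points (six values)
            if len(polygon) % 2 != 0 or len(polygon) < 6:
                problems.append("annotation {} has a malformed polygon".format(ann["id"]))
        bbox = ann["bbox"]
        if len(bbox) != 4 or bbox[2] <= 0 or bbox[3] <= 0:
            problems.append("annotation {} has an invalid bbox".format(ann["id"]))
    return problems


# Illustrative data only: one image, one well-formed triangle annotation
example_info = {
    "images": [{"id": 1}],
    "annotations": [{
        "id": "r1",
        "image_id": 1,
        "segmentation": [[0, 0, 4, 0, 4, 3]],
        "bbox": [0, 0, 4, 3],
    }],
}
print(check_yolact_info(example_info))
```

To check the real output, load the file at `train_info_path` with `json.load` and pass the resulting dict to the function.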

### 1.3 Modify the config file

In [None]:
import os

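config_path = os.path.join('yolact', 'data', 'config.py')
# Note: `class_names[:4]` in the format() call below keeps only the first four
# VoTT tags; adjust the slice if your dataset uses a different set of classes.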
config_add_txt = """

# Custom dataset config
my_yolact_dataset = dataset_base.copy({{
    'name' : 'vott2yolact',
    'train_images' : '{}',
    'train_info' : '{}',
    'valid_images' : '{}',
    'valid_info' : '{}',
    'has_gt' : True,
    'class_names' : {},
}})

my_yolact_config = yolact_base_config.copy({{
    'name': 'my_yolact',
    'dataset': my_yolact_dataset,
    'num_classes': len(my_yolact_dataset.class_names) + 1,
    'max_iter': 2000,
    'max_size': 400,
}})
""".format(train_image_path, train_info_path, valid_image_path, valid_info_path, class_names[:4])

with open(config_path, 'a') as f:
    f.write(config_add_txt)
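
Because the cell above opens `config.py` in append mode, running it more than once adds the same definitions again. One way to guard against that is to append only when a marker string is not yet present. This is a minimal sketch; `append_once` is a hypothetical helper, and the demo writes to a temporary file rather than the real config.

```python
import os
import tempfile


def append_once(path, text, marker):
    """Append text to the file at path unless marker already occurs in it.

    Returns True if the text was appended, False if it was skipped.
    """
    existing = ""
    if os.path.exists(path):
        with open(path, "r") as f:
            existing = f.read()
    if marker in existing:
        return False
    with open(path, "a") as f:
        f.write(text)
    return True


# Demo on a throwaway file: the second call is skipped
demo_path = os.path.join(tempfile.mkdtemp(), "config.py")
print(append_once(demo_path, "my_yolact_config = 1\n", "my_yolact_config"))
print(append_once(demo_path, "my_yolact_config = 1\n", "my_yolact_config"))
```

In this notebook the equivalent call would be `append_once(config_path, config_add_txt, "my_yolact_config")`.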

## 2. Train

Set `--config` to the name of your custom config and train the model.

In [None]:
%cd yolact
!CUDA_VISIBLE_DEVICES=0 python train.py \
    --config=my_yolact_config \
    --start_iter=-1 \
    --validation_epoch=10 \
    --keep_latest
%cd -

## 3. Evaluate

Find the trained weights in `yolact/weights` and pass the file name to `--trained_model`. The weight file has the custom config name as its prefix. Then pass the image to evaluate and the output path to `--image`, separated by a colon.
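
If several weight files have accumulated, picking the right one by hand is error-prone. The sketch below selects the file with the highest iteration count. It assumes the naming pattern `<config name>_<epoch>_<iteration>.pth`, which is inferred from the example file name `my_yolact_199_2000.pth` used in this notebook. The helper name `latest_checkpoint` is hypothetical.

```python
import re


def latest_checkpoint(file_names, config_name):
    """Return the file name with the highest iteration for config_name, or None.

    Assumes names of the form <config_name>_<epoch>_<iteration>.pth;
    anything else is ignored.
    """
    pattern = re.compile(r"^" + re.escape(config_name) + r"_(\d+)_(\d+)\.pth$")
    best_name, best_iteration = None, -1
    for name in file_names:
        match = pattern.match(name)
        if match and int(match.group(2)) > best_iteration:
            best_name, best_iteration = name, int(match.group(2))
    return best_name


# Illustrative file names only
example_names = [
    "my_yolact_99_1000.pth",
    "my_yolact_199_2000.pth",
    "other_5_9000.pth",
    "my_yolact_backup.pth",
]
print(latest_checkpoint(example_names, "my_yolact"))
```

For the real directory, pass `os.listdir(os.path.join("yolact", "weights"))` as `file_names`.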

In [None]:
%cd yolact
!CUDA_VISIBLE_DEVICES=0 python eval.py \
    --trained_model=weights/my_yolact_199_2000.pth \
    --score_threshold=0.5 \
    --top_k=100 \
    --image="../AIVisualProvision/Documents/Images/Training_DataSet/allmagnets.jpg:../output_image.jpg"
%cd -

Visualize the result.

In [None]:
from PIL import Image
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=(10,10))
plt.imshow(Image.open("output_image.jpg"))