# ML4CV project work

Summary: Improving and explaining instance segmentation on a litter detection dataset

Members:
- Dell'Olio Domenico
- Delvecchio Giovanni Pio
- Disabato Raffaele

The project was developed in order to improve instance segmentation results on the [TACO Dataset](http://tacodataset.org/).

We implemented and tested several architectures, chosen among the highest scoring on COCO instance segmentation, in order to compare their performance.
We also tested some explainability methods on these models to try to explain their predictions.

## This notebook contains:
- Validation and Test set resizing, dataset rotation

## Validation and Test set resizing, dataset rotation
As mentioned in the Training notebook for the MaskDINO architecture, we encountered some problems in getting the images of the validation and test sets resized during evaluation.
We found it faster to apply the preprocessing, including image rotation, directly to the stored images. This required modifying the annotations accordingly, so we relied on some functions from the Detectron2 framework.

In [None]:
#install detectron2
!pip install 'git+https://github.com/facebookresearch/detectron2.git@5aeb252b194b93dc2879b4ac34bc51a31b5aee13'

In [None]:
# pull and copy repository
%cd /content/
!git clone https://github.com/DomMcOyle/TACO-expl.git
%cd /content/TACO-expl
!git checkout maskdino
!git pull origin maskdino

/content
fatal: destination path 'TACO-expl' already exists and is not an empty directory.
/content/TACO-expl
Already on 'maskdino'
Your branch is up to date with 'origin/maskdino'.
From https://github.com/DomMcOyle/TACO-expl
 * branch            maskdino   -> FETCH_HEAD
Already up to date.


In [None]:
# install repository requirements and build the custom pixel decoder ops
%cd /content/TACO-expl/MaskDINO
!pip install -r requirements.txt
%cd /content/TACO-expl/MaskDINO/maskdino/modeling/pixel_decoder/ops
!sh make.sh

In [None]:
# mount Google Drive
from google.colab import drive
drive.mount("/content/MyDrive/", force_remount = True)

Mounted at /content/MyDrive/


In [None]:
# import some common libraries
import os, cv2
from google.colab.patches import cv2_imshow
import os.path
import json
import argparse
import numpy as np
import random
import datetime as dt
import copy
from pathlib import Path
from itertools import groupby

%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image

from skimage import measure
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split

import torch.utils.data
from torch.utils.data import DataLoader
from pycocotools import mask as coco_mask

%cd /content/
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.data.transforms.augmentation_impl import ResizeShortestEdge
from detectron2.data.transforms import ResizeTransform

%cd /content/TACO-expl/



/content


In [None]:
# setting seeds
DEFAULT_RANDOM_SEED = 42
# basic random seed
def seedBasic(seed=DEFAULT_RANDOM_SEED):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
# torch random seed
def seedTorch(seed=DEFAULT_RANDOM_SEED):
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
# combine
def seedEverything(seed=DEFAULT_RANDOM_SEED):
    seedBasic(seed)
    seedTorch(seed)

seedEverything()

Here we list the annotation JSON files to load and edit, together with the directory containing the original images.

In [None]:
train_annotation_file = '/content/TACO-expl/data/annotations_off_0_train.json'
val_annotation_file = '/content/TACO-expl/data/annotations_off_0_val.json'
test_annotation_file = '/content/TACO-expl/data/annotations_off_0_test.json'


img_dir = '/content/MyDrive/MyDrive/official/'

In [None]:
with open(val_annotation_file, "r") as f:
  val_annotations = json.load(f)

In [None]:
with open(test_annotation_file, "r") as f:
  test_annotations = json.load(f)

Here we declare the function that rotates the images and, if present, removes the alpha channel, as done in the original TACO repository, and the function that converts a binary mask to uncompressed Run-Length Encoding (RLE).

The latter function is adapted from the following Stack Overflow [question](https://stackoverflow.com/questions/49494337/encode-numpy-array-using-uncompressed-rle-for-coco-dataset). It is required because segmentations in the COCO format may be described in different ways, mainly as polygons (as in the official TACO dataset) or as uncompressed RLE. Since the masks are loaded from polygons and rendered as bitmaps at resizing time, we had to convert the bitmaps back to a "COCO-compatible" format.

We ruled out the polygon representation, as converting a bitmap back to polygons is a fairly convoluted operation that would also introduce some quantization error. Since pycocotools only handles and converts to its internal compressed RLE format, we had to resort to an external function.
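
To make the uncompressed RLE format concrete, here is a minimal pure-Python sketch of the same encoding on a toy mask, together with a decoder to check the round trip. The names `mask_to_uncompressed_rle`, `uncompressed_rle_to_mask` and `toy_mask` are hypothetical and only serve this illustration; the notebook itself uses the NumPy-based `binary_mask_to_rle` defined in the next cell.

```python
from itertools import groupby


def mask_to_uncompressed_rle(mask):
    """Encode a binary mask (list of rows) as uncompressed, column-major RLE."""
    height, width = len(mask), len(mask[0])
    # COCO RLE scans the mask column by column (Fortran order)
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    counts = []
    for i, (value, run) in enumerate(groupby(flat)):
        # counts always start with a run of zeros, so a mask that
        # starts with a 1 gets a leading run of length 0
        if i == 0 and value == 1:
            counts.append(0)
        counts.append(len(list(run)))
    return {"counts": counts, "size": [height, width]}


def uncompressed_rle_to_mask(rle):
    """Decode the dictionary produced above back into a list of rows."""
    height, width = rle["size"]
    flat, value = [], 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value  # runs alternate between 0 and 1
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


toy_mask = [
    [0, 1, 1],
    [0, 1, 0],
]
toy_rle = mask_to_uncompressed_rle(toy_mask)
print(toy_rle)  # {'counts': [2, 3, 1], 'size': [2, 3]}
```

Read column by column, the toy mask is `0, 0, 1, 1, 1, 0`, hence two zeros, three ones and one zero.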


In [None]:
def binary_mask_to_rle(binary_mask):
    """
    Function converting a bitmap mask to uncompressed Run-Length Encoding format.
    :param binary_mask: numpy array containing the binary map
    :return: dict with the run lengths ('counts') and the mask shape ('size')
    """
    rle = {'counts': [], 'size': list(binary_mask.shape)}
    counts = rle.get('counts')
    for i, (value, elements) in enumerate(groupby(binary_mask.ravel(order='F'))):
        if i == 0 and value == 1:
            counts.append(0)
        counts.append(len(list(elements)))
    return rle


def check_rotation_and_alpha(image):
    """
    Function checking the Exif rotation code and rotating the image accordingly.
    It also removes the alpha channel.
    :param image: PIL image to be rotated
    """
    # load metadata
    exif = image.getexif()
    if exif:
        exif = dict(exif.items())
        # Rotate portrait images if necessary (274 is the orientation tag code)
        if 274 in exif:
            if exif[274] == 3:
                image = image.rotate(180, expand=True)
            elif exif[274] == 6:
                image = image.rotate(270, expand=True)
            elif exif[274] == 8:
                image = image.rotate(90, expand=True)
    # If the image has an alpha channel, remove it for consistency.
    # A PIL image cannot be sliced like a numpy array, so we convert its mode.
    if image.mode == "RGBA":
        image = image.convert("RGB")
    return image
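
Since the annotations must match the rotated images, it helps to see how each handled orientation code affects the image size. The following is a small sketch under the assumption that only the codes 3, 6 and 8 matter, as in the function above; `EXIF_ROTATION_DEGREES` and `size_after_exif_rotation` are hypothetical names introduced for this illustration.

```python
# Orientation tag value -> counter-clockwise rotation in degrees,
# using the same values as check_rotation_and_alpha.
EXIF_ROTATION_DEGREES = {3: 180, 6: 270, 8: 90}


def size_after_exif_rotation(width, height, orientation):
    """Return (width, height) of the image once the rotation is applied."""
    degrees = EXIF_ROTATION_DEGREES.get(orientation, 0)
    # quarter turns swap the two sides, half turns keep them
    if degrees in (90, 270):
        return height, width
    return width, height


portrait_size = size_after_exif_rotation(4000, 3000, 6)
print(portrait_size)  # (3000, 4000)
```

Mirrored orientations (codes 2, 4, 5 and 7) are left untouched here, exactly as in the function above.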



The following function then rotates and resizes the images of a split and updates their annotations accordingly.

In [None]:
def resize_split(ann_json_string, new_json_path, new_img_folder):
  """
  Function resizing and rotating a given split.
  :param ann_json_string: annotations loaded from a JSON file (as a dict)
  :param new_json_path: path where to save the new annotation file
  :param new_img_folder: folder, relative to /content/, where the edited images must be saved
  :return: dict containing the updated annotations
  """
  # copying information from the previous annotation file
  resized_json = {}
  resized_json['info'] = ann_json_string['info']
  resized_json['scene_annotations'] = ann_json_string['scene_annotations']
  resized_json['licenses'] = ann_json_string['licenses']
  resized_json['categories'] = ann_json_string['categories']
  resized_json['scene_categories'] = ann_json_string['scene_categories']
  # creates the image folder if necessary
  os.makedirs(os.path.join("/content/", new_img_folder), exist_ok=True)
  res_imgs_json_path = new_json_path

  res_imgs_list = []
  res_annotation_list = []
  # for each metadata on the images
  for img_meta in ann_json_string["images"]:
    # the information is copied
    res_img_dict = img_meta.copy()
    id = img_meta["id"]
    height = img_meta["height"]
    width = img_meta["width"]
    img_path = img_meta["file_name"]
    sub_dir = img_path.split("/")[0]
    # creates subdir if necessary
    folder = os.path.join("/content/", new_img_folder, sub_dir)
    if not os.path.exists(folder):
      os.mkdir(folder)

    # computes the new size (shortest edge 800, longest edge at most 1333)
    new_h, new_w = ResizeShortestEdge.get_output_shape(height, width, 800, 1333)
    res_img_dict.update({"height": new_h, "width": new_w })
    res_imgs_list.append(res_img_dict)

    trans_func = ResizeTransform(height, width, new_h, new_w, Image.BILINEAR)

    img = Image.open(os.path.join("/content/MyDrive/MyDrive/official", img_path))
    img = check_rotation_and_alpha(img)
    img = np.array(img)
    res_img = trans_func.apply_image(img)
    res_pil = Image.fromarray(res_img.astype(np.uint8))
    res_pil.save(os.path.join(folder, img_path.split("/")[1]))

    for annotation in ann_json_string["annotations"]:
      # updates any annotation
      if annotation["image_id"] == id:
        res_segm_dict = annotation.copy()
        # decodes segmentation
        rles = coco_mask.frPyObjects(annotation["segmentation"], height, width)
        rle = coco_mask.merge(rles)
        mask = coco_mask.decode(rle)
        # transforms mask and encodes it
        res_mask = trans_func.apply_segmentation(mask)
        res_mask = np.asfortranarray(np.squeeze(res_mask))
        uncomp = binary_mask_to_rle(res_mask)

        encoded_ground_truth = coco_mask.encode(res_mask)
        # recomputes area and bounding box from the resized mask
        ground_truth_area = float(coco_mask.area(encoded_ground_truth))
        ground_truth_bounding_box = coco_mask.toBbox(encoded_ground_truth).tolist()

        res_segm_dict.update({"segmentation": uncomp,
                            "area": ground_truth_area,
                            "bbox": ground_truth_bounding_box})
        res_annotation_list.append(res_segm_dict)

  resized_json["images"] = res_imgs_list
  resized_json["annotations"] = res_annotation_list
  with open(res_imgs_json_path, "w") as f:
    json.dump(resized_json, f)
  return resized_json
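
To clarify how the target size is chosen, the sketch below reproduces the "shortest edge 800, longest edge at most 1333" rule in plain Python. It reflects our understanding of the computation and is not the library code: in the notebook the values come from `ResizeShortestEdge.get_output_shape`, while `shortest_edge_output_shape` is a hypothetical name used only here.

```python
def shortest_edge_output_shape(old_h, old_w, short_edge=800, max_size=1333):
    """Return (new_h, new_w), keeping the aspect ratio of the input."""
    # scale so that the shortest side becomes short_edge
    scale = short_edge / min(old_h, old_w)
    new_h, new_w = old_h * scale, old_w * scale
    # if the longest side is now too large, shrink both sides again
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    # round to the nearest integer
    return int(new_h + 0.5), int(new_w + 0.5)


landscape_shape = shortest_edge_output_shape(3000, 4000)
wide_shape = shortest_edge_output_shape(1824, 4000)
print(landscape_shape)  # (800, 1067)
print(wide_shape)       # (608, 1333)
```

In the second case the 1333 cap takes over, so the shortest edge ends up below 800.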


In [None]:
jso = resize_split(val_annotations, "/content/TACO-expl/data/annotations_off_0_resval.json", "res_val" )


In [None]:
jso = resize_split(test_annotations, "/content/TACO-expl/data/annotations_off_0_restest.json", "res_test" )
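
As a quick sanity check on a generated split, one could verify that every uncompressed RLE agrees with the size stored for its image, since the run lengths must add up to `height * width`. This is only a sketch; `find_inconsistent_rle_annotations` and `toy_split` are hypothetical names, and the toy dictionary merely imitates the structure of the files written above.

```python
def find_inconsistent_rle_annotations(coco_dict):
    """Return the ids of annotations whose RLE does not match the image size."""
    sizes = {img["id"]: (img["height"], img["width"]) for img in coco_dict["images"]}
    bad_ids = []
    for ann in coco_dict["annotations"]:
        rle = ann["segmentation"]
        height, width = sizes[ann["image_id"]]
        size_ok = list(rle["size"]) == [height, width]
        # the runs cover every pixel exactly once
        counts_ok = sum(rle["counts"]) == height * width
        if not (size_ok and counts_ok):
            bad_ids.append(ann["id"])
    return bad_ids


toy_split = {
    "images": [{"id": 1, "height": 2, "width": 3}],
    "annotations": [
        {"id": 10, "image_id": 1, "segmentation": {"counts": [2, 3, 1], "size": [2, 3]}},
        {"id": 11, "image_id": 1, "segmentation": {"counts": [2, 3], "size": [2, 3]}},
    ],
}
print(find_inconsistent_rle_annotations(toy_split))  # [11]
```

Calling the same function on the dictionary returned by `resize_split` should, if everything went well, give an empty list.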


Finally, we include a small script that applies only the rotation to the training images, since augmentation and pre-processing are handled automatically by the MaskDINO framework.

In [None]:
train_annotation_file = '/content/TACO-expl/data/annotations_off_0_train.json'

with open(train_annotation_file, "r") as f:
  train_annotations = json.load(f)

if not os.path.exists("/content/rot_train"):
  os.mkdir("/content/rot_train")

for img_info in train_annotations["images"]:
  sub_dir = img_info["file_name"].split("/")[0]
  folder = os.path.join("/content/rot_train", sub_dir)
  if not os.path.exists(folder):
      os.mkdir(folder)
  img = Image.open(os.path.join("/content/MyDrive/MyDrive/official", img_info["file_name"]))
  img = check_rotation_and_alpha(img)


  img.save(os.path.join("/content/rot_train/", img_info["file_name"]))

