# Loading Custom Annotations

In this notebook, we illustrate how to load an annotations file from one of the *Digital Humanities Datasets* (i.e., art-historical or archaeological data) and fit it into the COCO API.

In addition to the code examples, we also give an overview of the data structure.

In [1]:
import os 
import copy
import json 

import torch
import numpy as np
from matplotlib import pyplot as plt
from pycocotools.coco import COCO

from resources_angel.utils import print_pairs
from resources_angel.detection_coco_eval import CocoEvaluator

In [2]:
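version = np.version.version  # this notebook needs numpy 1.17 or below
# comparing (major, minor) so that numpy 2.x does not pass the check by accident
major, minor = (int(v) for v in version.split(".")[:2])
assert (major, minor) < (1, 18), "Downgrade your numpy to 1.17 or below"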

In [3]:
%reload_ext autoreload
%autoreload 2

In [10]:
# constants
JSON_FILE_NAME = "efi_arthist_valid.json"
JSON_PATH = os.path.join(os.getcwd(), "resources_coco", JSON_FILE_NAME)

## 1. Data Inspection

In this section, we load the *JSON* file containing the preprocessed annotations (converted from CSV) and display some statistics and a small subset of the data.

In [11]:
# loading annotations
with open(JSON_PATH) as file:
    annotations = json.load(file)

In [12]:
instances = annotations["annotations"]
n_instances = len(instances)

classes = [(ann["id"],ann["name"]) for ann in annotations["categories"]]
n_classes = len(classes)

imgs = [(img["id"],img["file_name"]) for img in annotations["images"]]
n_imgs = len(imgs)

In [13]:
print("Classes: ")
print("---------")
print(f"   Total of {n_classes} classes:")
print_pairs(classes)

print("\n\nImages: ")
print("-------")
print(f"Total of {n_imgs} images")
print_pairs(imgs[:3], n_pairs=1)

print("\n\nAnnotations: ")
print("------------")
print(f"Total of {n_instances} person instances annotated")
for i in range(5):
    print(f"   {instances[i]}")

Classes: 
---------
   Total of 21 classes:
      1:putto  2:column  3:mary  4:book  5:gabriel  6:flower  7:bookrest  8:god  9:annunciation  
      10:speech scroll  11:basket  12:dove  13:angel  14:scepter  15:flower vase  16:bed  17:stool  
      18:jesus child  19:cat  20:vase  21:window  

Images: 
-------
Total of 2543 images
      0:/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/Joconde/Joconde_iid0103193-000.jpg  
      1:/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/RMN/page_3_item_21_09-503957.jpg  
      2:/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/stadel/44_initial-m-ornament-mit-verkuendigung-an-maria.jpg  
      

Annotations: 
------------
Total of 5018 person instances annotated
   {'id': 0, 'image_id': 0, 'img_name': 'Joconde_iid0103193-000.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/Joconde/Joconde_iid0103193-000.jpg', 'bbox': '312,205,477,470', 'category_id': 1, 'iscrowd': 0, 'area': 43725}
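
Beyond the totals printed above, a per-class count is often a useful statistic. The following is a minimal sketch, assuming the same three-field layout as the JSON file; the function name `count_instances_per_class` and the toy data are hypothetical and only serve as illustration.

```python
from collections import Counter


def count_instances_per_class(annotations):
    """Return {class_name: number_of_annotated_instances}."""
    id_to_name = {cat["id"]: cat["name"] for cat in annotations["categories"]}
    counts = Counter(inst["category_id"] for inst in annotations["annotations"])
    # classes without any annotated instance are reported with a count of 0
    return {name: counts.get(cat_id, 0) for cat_id, name in id_to_name.items()}


# toy data in the same layout as the JSON file (values are illustrative only)
toy = {
    "categories": [{"id": 1, "name": "putto"}, {"id": 2, "name": "column"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 1},
        {"id": 1, "image_id": 1, "category_id": 2},
        {"id": 2, "image_id": 1, "category_id": 1},
    ],
}
class_counts = count_instances_per_class(toy)
print(class_counts)
```

With the real file, the same function could be called on the `annotations` dictionary loaded from `JSON_PATH`.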
   

The *annotations* JSON needs three top-level fields. The script **aux_process_arch_data.py** automatically extracts them from the original *.csv* annotation files:
 - **images**: Maps the unique ID of a dataset sample (*image_id*) to the name or path of the image used for loading.
 
 - **categories**: Maps the numeric identifier of a class to its semantic label (e.g., 1: 'putto').
 
 - **annotations**: List of dictionaries with the detection information for each annotated instance in the dataset. Each entry must contain the following fields:
     - **id**: Unique identifier of the detection instance.
     - **image_id**: Identifier of the image to which the current instance belongs.
     - **bbox**: Coordinates of the bounding box containing the detection. In the JSON file they are stored as a comma-separated string in the format (x_min, y_min, x_max, y_max), e.g. '312,205,477,470'.
     - **category_id**: Numeric identifier of the class to which the detection corresponds.
     - **iscrowd**: If 1, the detection is considered part of a crowd and its results do not count. It is hardcoded to 0 for our data.
     - **area**: Area (in pixels) of the bounding box. The COCO evaluation uses it to sort annotations into its small, medium, and large size ranges, so annotations outside a range are ignored for that range.
     - **img_name** & **filename**: These are not necessary, but I add them for completeness.
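
The structure described above can be checked with a few lines of plain Python. This is only a sketch under the assumption that the file follows the layout listed here; `REQUIRED_KEYS` and `find_invalid_annotations` are hypothetical names, and the second annotation in the example is deliberately broken.

```python
REQUIRED_KEYS = {"id", "image_id", "bbox", "category_id", "iscrowd", "area"}


def find_invalid_annotations(annotations):
    """Return the ids of annotations that miss a required key or
    reference an image or category that is not declared in the file."""
    image_ids = {img["id"] for img in annotations["images"]}
    category_ids = {cat["id"] for cat in annotations["categories"]}
    invalid = []
    for inst in annotations["annotations"]:
        ok = (
            REQUIRED_KEYS.issubset(inst)
            and inst["image_id"] in image_ids
            and inst["category_id"] in category_ids
        )
        if not ok:
            invalid.append(inst.get("id"))
    return invalid


# minimal example file: one image, one category, two annotations
example = {
    "images": [{"id": 0, "file_name": "Joconde_iid0103193-000.jpg"}],
    "categories": [{"id": 1, "name": "putto"}],
    "annotations": [
        {"id": 0, "image_id": 0, "bbox": "312,205,477,470",
         "category_id": 1, "iscrowd": 0, "area": 43725},
        # illustrative broken entry: image 99 is not declared above
        {"id": 1, "image_id": 99, "bbox": "0,0,10,10",
         "category_id": 1, "iscrowd": 0, "area": 100},
    ],
}
invalid_ids = find_invalid_annotations(example)
print(invalid_ids)
```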

## 2. Data Preprocessing

COCO expects bounding boxes in the format (x_min, y_min, width, height). Therefore, we need to convert the annotations from (x_min, y_min, x_max, y_max) into that format.

In [14]:
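# note: processed_instances aliases annotations["annotations"], so the conversion happens in place
processed_instances = annotations["annotations"]
n_instances = len(processed_instances)

for inst in processed_instances:
    # coords are stored as a comma-separated string: x_min,y_min,x_max,y_max
    xmin, ymin, xmax, ymax = [int(c) for c in inst["bbox"].split(",")]
    inst["bbox"] = [xmin, ymin, xmax - xmin, ymax - ymin]  # converting to x,y,w,h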

In [15]:
print("\n\nAnnotations: ")
print("------------")
print(f"Total of {n_instances} person instances annotated")
for i in range(3):
    print(f"   {processed_instances[i]}")



Annotations: 
------------
Total of 5018 person instances annotated
   {'id': 0, 'image_id': 0, 'img_name': 'Joconde_iid0103193-000.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/Joconde/Joconde_iid0103193-000.jpg', 'bbox': [312, 205, 165, 265], 'category_id': 1, 'iscrowd': 0, 'area': 43725}
   {'id': 1, 'image_id': 1, 'img_name': 'page_3_item_21_09-503957.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/RMN/page_3_item_21_09-503957.jpg', 'bbox': [0, 72, 57, 496], 'category_id': 2, 'iscrowd': 0, 'area': 28272}
   {'id': 2, 'image_id': 2, 'img_name': '44_initial-m-ornament-mit-verkuendigung-an-maria.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Art_history/latest/stadel/44_initial-m-ornament-mit-verkuendigung-an-maria.jpg', 'bbox': [455, 224, 62, 99], 'category_id': 3, 'iscrowd': 0, 'area': 6138}
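
As a quick sanity check of the conversion, the width and height of a converted box should multiply to the stored `area`. The sketch below is self-contained and uses values printed in this notebook; the helper names `xyxy_string_to_xywh` and `area_matches` are hypothetical.

```python
def xyxy_string_to_xywh(bbox):
    """Convert 'x_min,y_min,x_max,y_max' into [x_min, y_min, width, height]."""
    xmin, ymin, xmax, ymax = (int(c) for c in bbox.split(","))
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def area_matches(instance):
    """Check that width * height of a converted box equals the stored area."""
    _, _, width, height = instance["bbox"]
    return width * height == instance["area"]


# first annotation of the file, before the conversion
box = xyxy_string_to_xywh("312,205,477,470")
print(box)  # [312, 205, 165, 265]

# the three converted annotations printed above
converted = [
    {"bbox": [312, 205, 165, 265], "area": 43725},
    {"bbox": [0, 72, 57, 496], "area": 28272},
    {"bbox": [455, 224, 62, 99], "area": 6138},
]
all_consistent = all(area_matches(inst) for inst in converted)
print(all_consistent)  # True
```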


In [16]:
fit_annotations = {
    "annotations": processed_instances,
    "images": annotations["images"],
    "categories": annotations["categories"]
}

In [19]:
annotations["categories"]

import yaml

# building the category list used for training from the project config
with open("projects/efi_arthist.yml") as file:
    params = yaml.safe_load(file)
train_categories = params["obj_list"]
# category ids are 1-based and follow the order of obj_list
train_cat_list = [{"id": i + 1, "name": name} for i, name in enumerate(train_categories)]
print(train_cat_list)

[{'id': 1, 'name': 'putto'}, {'id': 2, 'name': 'mary'}, {'id': 3, 'name': 'gabriel'}, {'id': 4, 'name': 'book'}, {'id': 5, 'name': 'column'}, {'id': 6, 'name': 'dove'}, {'id': 7, 'name': 'bookrest'}, {'id': 8, 'name': 'flower'}, {'id': 9, 'name': 'angel'}, {'id': 10, 'name': 'annunciation'}, {'id': 11, 'name': 'flower vase'}, {'id': 12, 'name': 'speech scroll'}, {'id': 13, 'name': 'god'}, {'id': 14, 'name': 'scepter'}, {'id': 15, 'name': 'bed'}, {'id': 16, 'name': 'basket'}, {'id': 17, 'name': 'stool'}, {'id': 18, 'name': 'vase'}, {'id': 19, 'name': 'jesus child'}, {'id': 20, 'name': 'cat'}, {'id': 21, 'name': 'window'}, {'id': 22, 'name': 'door'}]
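
Note that the two category listings printed in this notebook do not assign the same ids to every name: for instance, 'mary' has id 3 in the JSON file and id 2 in the list built from the config, and 'door' only appears in the latter. Whether a remapping is needed depends on how the detector was trained, which this notebook does not show. If it is needed, a name-based mapping could look like the following sketch (`build_id_remap` is a hypothetical helper, and the category lists are shortened excerpts).

```python
def build_id_remap(source_categories, target_categories):
    """Map source category ids to target ids by matching on the class name.

    Source categories whose name is missing in the target list are skipped.
    """
    target_by_name = {cat["name"]: cat["id"] for cat in target_categories}
    return {
        cat["id"]: target_by_name[cat["name"]]
        for cat in source_categories
        if cat["name"] in target_by_name
    }


# shortened excerpts of the two listings shown in this notebook
json_cats = [
    {"id": 1, "name": "putto"},
    {"id": 2, "name": "column"},
    {"id": 3, "name": "mary"},
]
train_cats = [
    {"id": 1, "name": "putto"},
    {"id": 2, "name": "mary"},
    {"id": 5, "name": "column"},
]
remap = build_id_remap(json_cats, train_cats)
print(remap)  # {1: 1, 2: 5, 3: 2}
```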


## 3. Fitting COCO API

We now use the loaded annotations to fit the COCO API and a COCO Evaluator, which we use later to evaluate the detection results.


In [11]:
# initializing COCO dataset and fitting the annotations
coco_dataset = COCO()
coco_dataset.dataset = fit_annotations
coco_dataset.createIndex()

creating index...
index created!


In [12]:
coco_dataset

<pycocotools.coco.COCO at 0x7f78ee9f1dd8>

In [13]:
# this is necessary to let COCO know that we only do bbox detection, not keypoints or masks
iou_types = ["bbox"]
# initializing COCO evaluator
coco_evaluator = CocoEvaluator(coco_dataset, iou_types)

## 4. Simulating Evaluation

To make sure everything works correctly, we simulate the evaluation process. For simplicity, we use the ground-truth annotations (instead of the outputs of a CNN) as detections, so we would expect to obtain 100% mAP.

However, the run shown below only reaches 95.2% mAP. I have not yet found out why this happens, but I conjecture it might be due to rounding issues or to some annotations being filtered out because they do not fulfill the area condition.
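
To narrow down the conjecture, it may help to look at how the annotations are distributed over the COCO size ranges (by default, small is below 32² pixels and large is above 96² pixels) and to flag degenerate entries. The sketch below is a diagnostic aid, not an explanation of the gap. It also counts annotations whose `id` is 0: pycocotools records matches by annotation id, so an id of 0 may be indistinguishable from "no match". This is an assumption worth testing, not a confirmed cause. The name `diagnose_annotations` is hypothetical.

```python
SMALL_MAX = 32 ** 2    # COCO default upper bound of the "small" range
MEDIUM_MAX = 96 ** 2   # COCO default upper bound of the "medium" range


def diagnose_annotations(instances):
    """Count annotations per size range and flag suspicious entries.

    Range boundaries are simplified to half-open intervals here.
    """
    report = {"small": 0, "medium": 0, "large": 0,
              "zero_ids": 0, "non_positive_area": 0}
    for inst in instances:
        area = inst["area"]
        if area < SMALL_MAX:
            report["small"] += 1
        elif area < MEDIUM_MAX:
            report["medium"] += 1
        else:
            report["large"] += 1
        if area <= 0:
            report["non_positive_area"] += 1
        if inst["id"] == 0:
            report["zero_ids"] += 1
    return report


# ids and areas of the three annotations printed earlier in this notebook
sample = [
    {"id": 0, "area": 43725},
    {"id": 1, "area": 28272},
    {"id": 2, "area": 6138},
]
report = diagnose_annotations(sample)
print(report)
```

If `zero_ids` is non-zero for the real file, re-running the evaluation with annotation ids shifted to start at 1 would be a cheap way to test that assumption.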

In [14]:
# loading annotations
with open(JSON_PATH) as file:
    annotations = json.load(file)

# iterating all images
for i, img in enumerate(annotations["images"]):
    img_id = img["id"]
    boxes, scores, labels = [], [], []
    
    # obtaining all annotations with the current image id
    for j, instance in enumerate(annotations["annotations"]):
        inst_id = int(instance["image_id"])
        if(inst_id != img_id):
            continue
        # saving relevant features of current instance
        boxes.append([int(coord) for coord in instance["bbox"].split(",")])  # coords are string when reading from file
        labels.append(instance["category_id"])
        scores.append(1)
    
    # the results must be given in this shape to the COCO Evaluator
    output = {
        "scores": torch.Tensor(scores),
        "labels": torch.Tensor(labels),
        "boxes": torch.Tensor(boxes)
    }
    res = {img_id: output}
    # updating the evaluator with current results
    coco_evaluator.update(res)
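
The loop above scans all annotations once per image. For larger datasets, the annotations can instead be grouped by `image_id` in a single pass. The following is a sketch of that idea in plain Python, without torch, so that it runs on its own; `group_by_image` is a hypothetical helper and the second box is illustrative. Images without any annotation do not appear in the result and would have to be handled separately.

```python
from collections import defaultdict


def group_by_image(instances):
    """Group boxes, labels and scores by image id in a single pass."""
    grouped = defaultdict(lambda: {"boxes": [], "labels": [], "scores": []})
    for inst in instances:
        entry = grouped[int(inst["image_id"])]
        # coords are strings when read from the file
        entry["boxes"].append([int(c) for c in inst["bbox"].split(",")])
        entry["labels"].append(inst["category_id"])
        entry["scores"].append(1.0)  # ground truth used as detection
    return dict(grouped)


sample = [
    {"image_id": 0, "bbox": "312,205,477,470", "category_id": 1},
    {"image_id": 1, "bbox": "10,20,30,40", "category_id": 2},  # illustrative box
    {"image_id": 0, "bbox": "1,2,3,4", "category_id": 3},      # illustrative box
]
grouped = group_by_image(sample)
print(grouped[0]["boxes"])  # [[312, 205, 477, 470], [1, 2, 3, 4]]
```

Each entry of `grouped` could then be converted to tensors and passed to the evaluator in the same shape as `output` above.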
    

In [15]:
coco_evaluator.synchronize_between_processes()
coco_evaluator.accumulate()
valid_stats = coco_evaluator.summarize()["bbox"].tolist()

Accumulating evaluation results...
DONE (t=0.21s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.952
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.952
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.952
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.947
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.944
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.952
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.952
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= la

In [20]:
import pycocotools

In [22]:
pycocotools.coco.__version__

'2.0'

<div class=alert style="background-color:#F5F5F5; border-color:#C8C8C8">
   This notebook was created by <b>Angel Villar-Corrales</b>
</div> 