# Loading Custom Annotations

In this notebook, we illustrate how to load an annotations file from one of the *Digital Humanities Datasets* (i.e., artistic or archaeological data) and fit it to the COCO API.

In addition to the code examples, we also give an overview of the data structure.

In [1]:
import os 
import copy
import json 

import torch
import numpy as np
from matplotlib import pyplot as plt
from pycocotools.coco import COCO

from resources_angel.utils import print_pairs
from resources_angel.detection_coco_eval import CocoEvaluator

In [2]:
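version = np.version.version  # important to work with numpy 1.17 or below
major, minor = (int(part) for part in version.split(".")[:2])
# comparing (major, minor) so that a later major release (e.g., 2.0) does not pass the check
assert (major, minor) < (1, 18), "Downgrade your numpy to 1.17 or below"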

In [3]:
%reload_ext autoreload
%autoreload 2

In [5]:
# constants
JSON_FILE_NAME = "efi_classarch_train.json"
JSON_PATH = os.path.join(os.getcwd(), "resources_coco", JSON_FILE_NAME)

## 1. Data Inspection

In this section, we load the *JSON* file with the preprocessed (CSV to JSON) annotations and display some statistics and a subset of the entries.

In [6]:
# loading annotations
with open(JSON_PATH) as file:
    annotations = json.load(file)

In [7]:
instances = annotations["annotations"]
n_instances = len(instances)

classes = [(ann["id"],ann["name"]) for ann in annotations["categories"]]
n_classes = len(classes)

imgs = [(img["id"],img["file_name"]) for img in annotations["images"]]
n_imgs = len(imgs)
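
Beyond the totals, a per-class instance count is often useful during inspection. The following is a minimal sketch on a toy stand-in for the loaded JSON; the helper `instances_per_class` and the toy values are hypothetical and only mirror the field names used above.

```python
from collections import Counter

# Toy stand-in for the loaded JSON; values are illustrative, not the real dataset.
toy_annotations = {
    "categories": [{"id": 7, "name": "shield"}, {"id": 2, "name": "spear"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 7},
        {"id": 1, "image_id": 1, "category_id": 2},
        {"id": 2, "image_id": 2, "category_id": 7},
    ],
}


def instances_per_class(data):
    """Return {class name: number of annotated instances} (hypothetical helper)."""
    id_to_name = {cat["id"]: cat["name"] for cat in data["categories"]}
    counts = Counter(ann["category_id"] for ann in data["annotations"])
    return {id_to_name[cat_id]: n for cat_id, n in counts.items()}


class_counts = instances_per_class(toy_annotations)
print(class_counts)
```

Applied to the real file, the same function could be called as `instances_per_class(annotations)`.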

In [8]:
print("Classes: ")
print("---------")
print(f"   Total of {n_classes} classes:")
print_pairs(classes)

print("\n\nImages: ")
print("-------")
print(f"Total of {n_imgs} images")
print_pairs(imgs[:3], n_pairs=1)

print("\n\nAnnotations: ")
print("------------")
print(f"Total of {n_instances} instances annotated")
for i in range(5):
    print(f"   {instances[i]}")

Classes: 
---------
   Total of 68 classes:
      7:shield  2:spear  15:vessel  17:trident  3:wreath (worn)  1:palmette  24:torch  28:aulos  13:winged sandal  
      5:lyre  4:column  16:club  14:Eros  19:kerykeion  30:fish  22:petasos  6:dolphin  
      20:quiver  18:thyrsos  38:dog  29:phiale  23:scepter  46:kantharos  10:bow  40:vessel (oinochoe)  
      63:vessel (loutrophoros)  9:vessel (kantharos)  11:sword  27:lions skin (headdress)  21:stick  34:door  31:wreath  8:altar  
      12:cock  41:hoop  44:tauros  35:centaur  49:hippocamp  37:stephane (bride)  48:pomegranate  26:arrow  
      25:tripod  45:phrygian cap  33:cornucopia  59:harp  43:owl  57:panther  32:lion  36:sphinx  
      39:thunderbolt  68:harpe  56:ship  67:winged sandals  53:axe  51:box  52:lions skin  47:thymiaterion  
      50:basket  55:pom  61:octopus  42:vessel (amphora)  60:ram  54:chimaira  58:pegasus  64:bed  
      62:griffin  66:taenia  65:hand-held fan  

Images: 
-------
Total of 8636 images
      0:/lo

Basically, we need three different fields in the *annotations* JSON. The script **aux_process_arch_data.py** automatically extracts them from the original *.csv* annotation files:
 - **images**: Maps the unique ID of a dataset sample (*image_id*) to the name or path of the image used for loading.
 
 - **categories**: Maps the numeric identifier of a class to its semantic label (e.g., 1: 'palmette').
 
 - **annotations**: List with the detection information for each of the annotated instances in the dataset. Each entry must contain the following fields:
     - **id**: Unique identifier of the detection instance.
     - **image_id**: Identifier of the image to which the current instance belongs.
     - **bbox**: Coordinates of the bounding box containing the detection. In the JSON file, they are stored as a comma-separated string in the format (x_min, y_min, x_max, y_max).
     - **category_id**: Numeric identifier of the class to which the detection corresponds.
     - **iscrowd**: If 1, the detection is considered part of a crowd and is ignored during evaluation. It is hardcoded to 0 for our data.
     - **area**: Area (in pixels) of the bounding box. During evaluation, COCO ignores annotations whose area falls outside the range being evaluated (all, small, medium, or large).
     - **img_name** & **filename**: These are not necessary, but I add them for completeness.
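
To make the structure above concrete, here is a minimal sketch of one entry per field together with a small sanity check. The helper `missing_fields` is hypothetical (it is not part of **aux_process_arch_data.py**), and the example values are illustrative, with the raw `bbox` string assumed to be in the (x_min, y_min, x_max, y_max) format described above.

```python
REQUIRED_ANNOTATION_KEYS = {"id", "image_id", "bbox", "category_id", "iscrowd", "area"}


def missing_fields(data):
    """Return a list of problems found; an empty list means the structure looks fine."""
    problems = []
    for key in ("images", "categories", "annotations"):
        if key not in data:
            problems.append(f"missing top-level field '{key}'")
    for ann in data.get("annotations", []):
        missing = REQUIRED_ANNOTATION_KEYS - set(ann)
        if missing:
            problems.append(f"annotation {ann.get('id')} lacks {sorted(missing)}")
    return problems


# Illustrative example with a single image, class, and instance
example = {
    "images": [{"id": 0, "file_name": "herakles_1010.jpg"}],
    "categories": [{"id": 7, "name": "shield"}],
    "annotations": [{
        "id": 0, "image_id": 0, "bbox": "326,143,477,339",
        "category_id": 7, "iscrowd": 0, "area": 29596,
    }],
}

# Same data with one required field dropped, to show what the check reports
broken = {"images": [], "categories": [], "annotations": [{"id": 5, "image_id": 0}]}

print(missing_fields(example))
print(missing_fields(broken))
```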

## 2. Data Preprocessing

COCO expects bounding boxes in the format (x_min, y_min, width, height). Therefore, we need to convert the annotations into this format.

In [9]:
processed_instances = copy.deepcopy(annotations["annotations"])  # copy, so the raw annotations are not modified in place
n_instances = len(processed_instances)

for inst in processed_instances:
    xmin, ymin, xmax, ymax = [int(c) for c in inst["bbox"].split(",")]
    coords = [xmin, ymin, xmax - xmin, ymax - ymin]  # converting to x,y,w,h
    inst["bbox"] = coords

In [10]:
print("\n\nAnnotations: ")
print("------------")
print(f"Total of {n_instances} instances annotated")
for i in range(3):
    print(f"   {processed_instances[i]}")



Annotations: 
------------
Total of 22510 instances annotated
   {'id': 0, 'image_id': 0, 'img_name': 'herakles_1010.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Classical_Arch/latest/Herakles/herakles_1010.jpg', 'bbox': [326, 143, 151, 196], 'category_id': 7, 'iscrowd': 0, 'area': 29596}
   {'id': 1, 'image_id': 1, 'img_name': 'dolphin_0065.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Classical_Arch/latest/Dolphin/dolphin_0065.jpg', 'bbox': [188, 114, 85, 70], 'category_id': 2, 'iscrowd': 0, 'area': 5950}
   {'id': 2, 'image_id': 2, 'img_name': 'kantharos_1348.jpg', 'filename': '/localhome/prathmeshmadhu/work/EFI/Data/Classical_Arch/latest/Kantharos/kantharos_1348.jpg', 'bbox': [394, 9, 115, 95], 'category_id': 7, 'iscrowd': 0, 'area': 10925}
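
As a quick consistency check on the conversion, the width and height of each converted box should multiply to the stored `area`, and converting back and forth should return the same box. The sketch below uses the three boxes printed above; the helper names `xyxy_to_xywh` and `xywh_to_xyxy` are hypothetical.

```python
def xyxy_to_xywh(box):
    """(x_min, y_min, x_max, y_max) -> (x_min, y_min, width, height)."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def xywh_to_xyxy(box):
    """(x_min, y_min, width, height) -> (x_min, y_min, x_max, y_max)."""
    x_min, y_min, width, height = box
    return [x_min, y_min, x_min + width, y_min + height]


# (bbox in x, y, w, h; area) pairs copied from the printed annotations
printed = [
    ([326, 143, 151, 196], 29596),
    ([188, 114, 85, 70], 5950),
    ([394, 9, 115, 95], 10925),
]

for bbox, area in printed:
    assert bbox[2] * bbox[3] == area                 # width * height matches the stored area
    assert xyxy_to_xywh(xywh_to_xyxy(bbox)) == bbox  # round trip leaves the box unchanged
```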


In [11]:
fit_annotations = {
    "annotations": processed_instances,
    "images": annotations["images"],
    "categories": annotations["categories"]
}

## 3. Fitting COCO API

We now use the loaded annotations to fit the COCO API and a COCO Evaluator, which is later used to evaluate the detection results.


In [12]:
# initializing COCO dataset and fitting the annotations
coco_dataset = COCO()
coco_dataset.dataset = fit_annotations
coco_dataset.createIndex()

creating index...
index created!
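
For intuition, the index consists of lookup tables from IDs to entries and from images to their annotations. The following pure-Python function is a simplified sketch of that idea, not the library's actual code; `build_index` and the toy dataset are hypothetical.

```python
from collections import defaultdict


def build_index(dataset):
    """Build simple lookup tables over a COCO-style dictionary (simplified sketch)."""
    anns = {ann["id"]: ann for ann in dataset["annotations"]}
    imgs = {img["id"]: img for img in dataset["images"]}
    cats = {cat["id"]: cat for cat in dataset["categories"]}
    img_to_anns = defaultdict(list)   # image id -> annotations in that image
    cat_to_imgs = defaultdict(list)   # category id -> ids of images containing it
    for ann in dataset["annotations"]:
        img_to_anns[ann["image_id"]].append(ann)
        cat_to_imgs[ann["category_id"]].append(ann["image_id"])
    return anns, imgs, cats, img_to_anns, cat_to_imgs


# Toy dataset with illustrative values
toy = {
    "images": [{"id": 0, "file_name": "a.jpg"}, {"id": 1, "file_name": "b.jpg"}],
    "categories": [{"id": 7, "name": "shield"}, {"id": 2, "name": "spear"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 7, "bbox": [326, 143, 151, 196]},
        {"id": 1, "image_id": 0, "category_id": 2, "bbox": [188, 114, 85, 70]},
        {"id": 2, "image_id": 1, "category_id": 7, "bbox": [394, 9, 115, 95]},
    ],
}

anns, imgs, cats, img_to_anns, cat_to_imgs = build_index(toy)
print([ann["id"] for ann in img_to_anns[0]])
```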


In [13]:
# this is necessary to let COCO know that we only do bbox detection, not keypoint or mask prediction
iou_types = ["bbox"]
# initializing COCO evaluator
coco_evaluator = CocoEvaluator(coco_dataset, iou_types)

## 4. Simulating Evaluation

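To make sure everything works correctly, we simulate the evaluation process. For simplicity, we use the ground-truth annotations (instead of the outputs of a CNN) as detections, thus obtaining 100% mAP. Average recall with maxDets=1 can still fall below 1, because only one detection per image and category is considered and some images contain several instances of the same class.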
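
The reason the precision is perfect is that every detection coincides with a ground-truth box, so their intersection over union (IoU) is 1 and passes every threshold between 0.50 and 0.95. Below is a minimal sketch of the IoU of two boxes in (x_min, y_min, width, height) format; `iou_xywh` is a hypothetical helper written for illustration, not the routine the evaluator uses internally.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # overlap along each axis, clipped at zero when the boxes do not intersect
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


gt_box = [326, 143, 151, 196]
print(iou_xywh(gt_box, gt_box))                      # identical boxes
print(iou_xywh([0, 0, 10, 10], [20, 20, 10, 10]))    # disjoint boxes
```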

In [14]:
# loading annotations
with open(JSON_PATH) as file:
    annotations = json.load(file)

# iterating all images
for i, img in enumerate(annotations["images"]):
    img_id = img["id"]
    boxes, scores, labels = [], [], []
    
    # obtaining all annotations with the current image id
    for instance in annotations["annotations"]:
        inst_id = int(instance["image_id"])
        if inst_id != img_id:
            continue
        # saving relevant features of current instance
        boxes.append([int(coord) for coord in instance["bbox"].split(",")])  # coords are strings when reading from file
        labels.append(instance["category_id"])
        scores.append(1)
    
    # the results must be given in this shape to the COCO Evaluator
    output = {
        "scores": torch.tensor(scores, dtype=torch.float32),
        "labels": torch.tensor(labels, dtype=torch.int64),  # class ids are integers
        "boxes": torch.tensor(boxes, dtype=torch.float32)
    }
    res = {img_id: output}
    # updating the evaluator with current results
    coco_evaluator.update(res)
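
The loop above scans all annotations once per image. A possible alternative is to group the annotations by image in a single pass and then look them up per image. The sketch below shows this on toy data with plain lists (no tensors); `group_by_image` is a hypothetical helper and assumes, as in the raw file, that `bbox` is a comma-separated string.

```python
from collections import defaultdict


def group_by_image(instances):
    """Collect boxes, labels, and scores per image id in a single pass."""
    grouped = defaultdict(lambda: {"boxes": [], "labels": [], "scores": []})
    for inst in instances:
        entry = grouped[int(inst["image_id"])]
        entry["boxes"].append([int(c) for c in inst["bbox"].split(",")])
        entry["labels"].append(inst["category_id"])
        entry["scores"].append(1.0)  # ground truth used as a detection with full confidence
    return grouped


# Toy annotations with illustrative values
toy_instances = [
    {"image_id": 0, "bbox": "326,143,477,339", "category_id": 7},
    {"image_id": 1, "bbox": "188,114,273,184", "category_id": 2},
    {"image_id": 0, "bbox": "10,20,30,40", "category_id": 2},
]

grouped = group_by_image(toy_instances)
print(grouped[0]["labels"])
```

Images without any annotation do not appear in the grouped result, so a lookup such as `grouped.get(img_id)` would need to handle the missing case.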
    

In [15]:
coco_evaluator.synchronize_between_processes()
coco_evaluator.accumulate()
valid_stats = coco_evaluator.summarize()["bbox"].tolist()

Accumulating evaluation results...
DONE (t=2.33s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.853
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.999
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= la

<div class=alert style="background-color:#F5F5F5; border-color:#C8C8C8">
   This notebook was created by <b>Angel Villar-Corrales</b>
</div> 