In [1]:
import os
import glob
from entities import VOC2COCO

## Annotation conversion from PASCAL VOC to COCO
The annotations of the KAIST multispectral pedestrian dataset do not follow the official PASCAL VOC format.
Because of this custom format, we could not use an out-of-the-box VOC-to-YOLO converter to produce the YOLO labels required by the YOLOv8 model.
Instead, we first convert the annotations to COCO and subsequently convert the COCO annotations to the YOLO format.
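
To make the second step concrete, here is a minimal sketch of how a single COCO bounding box maps to a YOLO label line. COCO stores boxes as `[x_min, y_min, width, height]` in pixels, while YOLO expects `class x_center y_center width height` normalized by the image size. The function names `coco_bbox_to_yolo` and `yolo_label_line` are hypothetical and not part of this project's code; the 640x512 image size and the class id `0` are only example values.

```python
def coco_bbox_to_yolo(bbox, img_width, img_height):
    """Convert a COCO box [x_min, y_min, width, height] (pixels) to a
    YOLO box (x_center, y_center, width, height) normalized to [0, 1]."""
    x_min, y_min, box_w, box_h = bbox
    x_center = (x_min + box_w / 2) / img_width
    y_center = (y_min + box_h / 2) / img_height
    return x_center, y_center, box_w / img_width, box_h / img_height


def yolo_label_line(class_id, bbox, img_width, img_height):
    """Format one annotation as a line of a YOLO label file."""
    x_c, y_c, w, h = coco_bbox_to_yolo(bbox, img_width, img_height)
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


# Example values only: a box covering the central quarter of a 640x512 image.
print(yolo_label_line(0, [160, 128, 320, 256], 640, 512))
# 0 0.500000 0.500000 0.500000 0.500000
```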
### 1. Setting the day and night source and target paths

In [2]:
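# Root folder of the raw KAIST dataset (Windows path, trailing separator included)
base_path = "B:\\multispectral-ped-detection\\data\\RAW\\kaist-cvpr15\\"

# Recursive glob patterns for the sanitized VOC-style XML annotations
voc_day_annotation_paths = f'{base_path}annotations-xml-new-sanitized\\day\\**\\*.xml'
voc_night_annotation_paths = f'{base_path}annotations-xml-new-sanitized\\night\\**\\*.xml'

# Image folders that belong to the day and night annotations
day_image_path = f'{base_path}images\\day\\'
night_image_path = f'{base_path}images\\night\\'

# Collect all annotation files; recursive=True lets "**" match nested subfolders
voc_day_annotations = glob.glob(voc_day_annotation_paths, recursive=True)
voc_night_annotations = glob.glob(voc_night_annotation_paths, recursive=True)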


In [3]:
# Target files for the converted COCO annotations
coco_day_annotations_path = "B:/multispectral-ped-detection/data/RAW/kaist-cvpr15/annotations-coco/day.json"
coco_night_annotations_path = "B:/multispectral-ped-detection/data/RAW/kaist-cvpr15/annotations-coco/night.json"

### 2. Converting to COCO
We created a custom VOC2COCO converter based on the code from this GitHub repository:
[https://github.com/yukkyo/voc2coco](https://github.com/yukkyo/voc2coco)
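
As a rough illustration of what such a conversion involves, the sketch below turns one `<object>` element into a COCO annotation dictionary. It assumes the standard PASCAL VOC `bndbox` layout (`xmin`, `ymin`, `xmax`, `ymax`); the KAIST files deviate from the official format, so the real field names may differ. The helper `voc_object_to_coco` and the sample XML are hypothetical and are not the `VOC2COCO` class used in this notebook.

```python
import xml.etree.ElementTree as ET


def voc_object_to_coco(obj, image_id, annotation_id, category_ids):
    """Build a COCO annotation dict from one VOC <object> element.

    Assumes a standard VOC <bndbox> with xmin/ymin/xmax/ymax children.
    """
    box = obj.find("bndbox")
    xmin = float(box.findtext("xmin"))
    ymin = float(box.findtext("ymin"))
    xmax = float(box.findtext("xmax"))
    ymax = float(box.findtext("ymax"))
    width, height = xmax - xmin, ymax - ymin
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_ids[obj.findtext("name")],
        "bbox": [xmin, ymin, width, height],  # COCO: top-left corner plus size
        "area": width * height,
        "iscrowd": 0,
    }


# Made-up sample annotation, for illustration only.
sample_xml = """
<annotation>
  <object>
    <name>person</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>
"""
root = ET.fromstring(sample_xml)
annotations = [
    voc_object_to_coco(obj, image_id=1, annotation_id=i, category_ids={"person": 1})
    for i, obj in enumerate(root.iter("object"), start=1)
]
print(annotations[0]["bbox"])  # [10.0, 20.0, 40.0, 80.0]
```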

We convert the day and night annotations separately because we want to control the ratio of day to night images when splitting the dataset into train/val/test sets.
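
One possible way to use the separate files later is to split the day and night image ids independently with the same fractions and then merge the results, so each split keeps roughly the original day/night ratio. The sketch below is an assumption about how this could be done, not the split procedure used in this project; `split_ids`, the 70/15/15 fractions, and the placeholder ids are all hypothetical.

```python
import random


def split_ids(ids, train=0.7, val=0.15, seed=0):
    """Shuffle ids reproducibly and cut them into train/val/test lists.

    Whatever is left after the train and val fractions goes to test.
    """
    ids = sorted(ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * val)
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


# Placeholder ids; in practice these would come from day.json and night.json.
day_ids = [f"day_{i}" for i in range(100)]
night_ids = [f"night_{i}" for i in range(50)]

# Split each subset on its own, then merge split by split.
day_split = split_ids(day_ids)
night_split = split_ids(night_ids)
merged = {name: day_split[name] + night_split[name] for name in day_split}

for name, ids in merged.items():
    n_day = sum(1 for i in ids if i.startswith("day_"))
    print(name, len(ids), f"day share: {n_day / len(ids):.2f}")
```

Changing the fractions passed for one subset, or subsampling one of the two id lists before splitting, would shift the day/night ratio in a controlled way.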

In [4]:
voc2coco = VOC2COCO()
voc2coco.convert(voc_day_annotations, coco_day_annotations_path, day_image_path)
voc2coco.convert(voc_night_annotations, coco_night_annotations_path, night_image_path)

Converting started


100%|██████████| 54542/54542 [1:07:32<00:00, 13.46it/s]


Saved COCO annotations to B:/multispectral-ped-detection/data/RAW/kaist-cvpr15/annotations-coco/day.json.
Converting started


100%|██████████| 25550/25550 [20:42<00:00, 20.56it/s] 


Saved COCO annotations to B:/multispectral-ped-detection/data/RAW/kaist-cvpr15/annotations-coco/night.json.
