# COCO to YOLO Conversion

Converts COCO-formatted datasets into YOLO format.

Make sure to erase the existing YOLOv5 dataset (if any) BEFORE running this notebook.

**Author:** Charlie Clark \
**Email:** cclark339@gatech.edu

In [20]:
from typing import Tuple, Dict, List
import os, json, shutil
import tqdm

import numpy as np
import cv2

import matplotlib.pyplot as plt

## Parameters

**SRC:** the *full* filepath to the COCO JSON file. \
**IMGS:** the *full* path to the directory containing the images. \
**DEST:** the *full* path to the directory where the YOLO data will be stored.

In [21]:
SRC = '/home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_coco/train.json' # change as needed
IMGS = '/home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_coco/images/' # change as needed
DEST = '/home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_yolo/' # change as needed
mode = 'train' # change as needed

assert mode in ['train', 'val'], f'Error: mode must either be "train" or "val" (got "{mode}")'

## CAUTION!

Unless you've thoroughly reviewed the code and understand how it works, please do NOT change anything below this line.

For inquiries, reach out to me on Slack or via my email at the top of this notebook.

## Preparation and EDA

In [22]:
IMGS = IMGS.rstrip('/') + '/'
DEST = DEST.rstrip('/') + '/'

print(f'IMGS: {IMGS}')
print(f'DEST: {DEST}')

IMGS: /home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_coco/images/
DEST: /home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_yolo/


In [23]:
DEST_IMGS = DEST + 'images/'
DEST_LABS = DEST + 'labels/'

print(f'DEST_IMGS: {DEST_IMGS}')
print(f'DEST_LABS: {DEST_LABS}')

DEST_IMGS: /home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_yolo/images/
DEST_LABS: /home/hice1/cclark339/scratch/Data/Lindenthal_Camera_Traps/lindenthal-camera-traps/lindenthal_yolo/labels/


In [24]:
# point both destinations at the subfolder for the current mode
DEST_IMGS += f'{mode}/'
DEST_LABS += f'{mode}/'

# create the dataset root plus images/<mode>/ and labels/<mode>/;
# exist_ok=True lets this cell run when some of these folders already exist
os.makedirs(DEST_IMGS, exist_ok=True)
os.makedirs(DEST_LABS, exist_ok=True)

In [25]:
with open(SRC, 'r') as f:
    data = json.load(f)

print(f'JSON keys: {list(data.keys())}')

JSON keys: ['annotations', 'images', 'licenses', 'info', 'categories']


In [26]:
data_annotations = data['annotations']
data_images = data['images']
data_categories = data['categories']

In [27]:
data_annotations[0]

{'id': 458,
 'image_id': '20200807023726-31',
 'category_id': 2,
 'segmentation': [[432.21,
   240.17,
   433.6,
   239.66,
   433.85,
   238.15,
   432.97,
   236.38,
   432.46,
   234.61,
   434.74,
   234.86,
   435.49,
   236.76,
   435.87,
   240.55,
   435.72,
   243.71,
   435.22,
   246.01,
   435.42,
   248.9,
   435.82,
   251.81,
   434.62,
   253.91,
   432.42,
   254.4,
   431.42,
   253.2,
   429.43,
   251.61,
   428.43,
   249.8,
   426.52,
   248.8,
   424.82,
   247.41,
   422.42,
   245.4,
   419.22,
   243.1,
   418.53,
   244.71,
   419.02,
   246.91,
   419.53,
   248.31,
   420.73,
   250.21,
   422.22,
   251.9,
   423.72,
   253.2,
   423.83,
   255.01,
   421.92,
   255.5,
   420.82,
   254.4,
   419.53,
   252.61,
   418.02,
   250.41,
   416.92,
   248.11,
   416.03,
   245.71,
   416.03,
   245.71,
   415.13,
   243.61,
   412.73,
   242.81,
   411.32,
   241.9,
   409.43,
   240.81,
   408.92,
   243.71,
   408.82,
   245.61,
   408.12,
   248.01,
   408.2

In [28]:
data_images[0]

{'id': '20200807023726-31',
 'width': 848,
 'height': 480,
 'file_name': '20200807023726/color/000031.jpg',
 'license': 0,
 'flickr_url': '',
 'coco_url': '',
 'date_captured': 0,
 'seq_id': '20200807023726',
 'seq_num_frames': 451,
 'frame_num': 31,
 'corrupt': False,
 'location': 0,
 'datetime': '2020-08-07 02:37:26'}

In [29]:
data_categories

[{'id': 1, 'name': 'deer', 'supercategory': ''},
 {'id': 2, 'name': 'goat', 'supercategory': ''},
 {'id': 3, 'name': 'donkey', 'supercategory': ''},
 {'id': 4, 'name': 'goose', 'supercategory': ''}]
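
The conversion code in this notebook writes `category_id - 1` as the YOLO class index, which works here because the category IDs run from 1 to 4 without gaps. For a dataset whose IDs are not contiguous, an explicit mapping may be safer. The sketch below is only an illustration: `build_class_index` is a hypothetical helper that is not part of this notebook, and it assumes that class indices should follow ascending category ID order.

```python
from typing import Dict, List


def build_class_index(categories: List[Dict]) -> Dict[int, int]:
    '''
    Hypothetical helper: maps each COCO category ID to a zero-based class index,
    ordered by ascending category ID.
    '''
    sorted_ids = sorted(category['id'] for category in categories)
    return {coco_id: yolo_idx for yolo_idx, coco_id in enumerate(sorted_ids)}


# same structure as data_categories above
categories = [
    {'id': 1, 'name': 'deer', 'supercategory': ''},
    {'id': 2, 'name': 'goat', 'supercategory': ''},
    {'id': 3, 'name': 'donkey', 'supercategory': ''},
    {'id': 4, 'name': 'goose', 'supercategory': ''},
]

class_index = build_class_index(categories)
print(class_index)
```

For the four categories above, this mapping agrees with `category_id - 1`. If it were used with other data, the order of names in the labelmap would need to follow the same ascending-ID order.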

## Define Functions

In [30]:
def bbox_coco2yolo(coco_bbox: Tuple[float, float, float, float], img_dims: Tuple[float, float], epsilon=1) -> Tuple[float, float, float, float]:
    '''
    Converts a COCO JSON bbox into a YOLOv5 bbox.

    Inputs:
        coco_bbox: A tuple of form (x_tl, y_tl, w, h), where...
          - x_tl is the top-left x-coordinate
          - y_tl is the top-left y-coordinate
          - w is the width of the bbox
          - h is the height of the bbox
        img_dims: A tuple of form (W, H), where...
          - W is the width of the image
          - H is the height of the image
        epsilon: A small stabilization term, in pixels, by which x_c and y_c are each moved toward the image center to prevent
                 bounding boxes from having coordinates outside the normalized range [0, 1] because of floating point instability.

    Returns:
        A YOLOv5-formatted bbox of form (x_c, y_c, w, h), where...
          - x_c is the (normalized) center x-coordinate
          - y_c is the (normalized) center y-coordinate
          - w is the (normalized) width of the bbox
          - h is the (normalized) height of the bbox
    '''
    
    # unpack arguments
    x_tl, y_tl, w, h = coco_bbox
    W, H = img_dims

    # compute bbox center coordinates
    x_br = x_tl + w
    y_br = y_tl + h

    x_c = (x_tl + x_br) / 2
    y_c = (y_tl + y_br) / 2

    # move bbox towards center using epsilon (for stability)
    W_mid = W / 2
    H_mid = H / 2

    dx_c = -1 if x_c > W_mid else 1 if x_c < W_mid else 0
    dy_c = -1 if y_c > H_mid else 1 if y_c < H_mid else 0

    x_c += dx_c * epsilon
    y_c += dy_c * epsilon

    # normalize
    x_c /= W
    y_c /= H

    w /= W
    h /= H

    # round coordinates
    x_c = round(x_c, 4)
    y_c = round(y_c, 4)

    w = round(w, 4)
    h = round(h, 4)

    # return YOLOv5 bbox coordinates
    return x_c, y_c, w, h
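
To make the arithmetic concrete, here is a minimal sketch of the same conversion with the epsilon shift and the rounding left out, plus the inverse so the result can be checked by a round trip. Both function names and the sample bbox are made up for illustration; only the 848x480 image size comes from the sample image above.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]


def coco_to_yolo_plain(coco_bbox: BBox, img_dims: Tuple[float, float]) -> BBox:
    '''(x_tl, y_tl, w, h) in pixels -> normalized (x_c, y_c, w, h). No epsilon, no rounding.'''
    x_tl, y_tl, w, h = coco_bbox
    W, H = img_dims
    x_c = (x_tl + w / 2) / W
    y_c = (y_tl + h / 2) / H
    return x_c, y_c, w / W, h / H


def yolo_to_coco_plain(yolo_bbox: BBox, img_dims: Tuple[float, float]) -> BBox:
    '''Normalized (x_c, y_c, w, h) -> (x_tl, y_tl, w, h) in pixels.'''
    x_c, y_c, w, h = yolo_bbox
    W, H = img_dims
    box_w, box_h = w * W, h * H
    return x_c * W - box_w / 2, y_c * H - box_h / 2, box_w, box_h


bbox = (100.0, 50.0, 200.0, 100.0)  # made-up bbox, in pixels
dims = (848, 480)                   # (W, H) of the sample image above

yolo = coco_to_yolo_plain(bbox, dims)
back = yolo_to_coco_plain(yolo, dims)

print(f'YOLO bbox:  {yolo}')
print(f'Round trip: {back}')
```

The round trip recovers the original bbox up to floating point error. With `bbox_coco2yolo`, the center would additionally be shifted by `epsilon` pixels toward the image center before normalizing, so its output differs slightly from this plain version.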

In [31]:
def find_img(images: List[Dict], target_id: str, debug=False) -> Dict:
    '''
    Given a target image ID, finds and returns the appropriate image.

    Inputs:
        images: A list of image objects, as obtained from the COCO JSON file.
        target_id: A string representing the target image ID to be returned.
        debug: A boolean flag which puts the function into debug mode (adds extra print statements) when True.

    Returns:
        img: the image associated with the entered target image ID (if one exists). If none exists, raises a ValueError.
    '''

    for img in images:
        img_id = img['id']

        if img_id == target_id:
            if debug:
                print(f'Found image with ID "{target_id}"!')

            return img

    raise ValueError(f'Error: No image with ID "{target_id}" found!')
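
`find_img` scans the whole image list for every annotation. On larger datasets it may be worth building a lookup table once instead. The sketch below shows one possible way to do that; `index_images_by_id` is a hypothetical helper that is not used elsewhere in this notebook, and the sample image entries are invented.

```python
from typing import Dict, List


def index_images_by_id(images: List[Dict]) -> Dict[str, Dict]:
    '''Hypothetical helper: builds a dictionary from image ID to image object.'''
    return {img['id']: img for img in images}


# invented entries with the same fields the conversion code reads
images = [
    {'id': 'seq-1', 'width': 848, 'height': 480, 'file_name': 'seq/color/000001.jpg'},
    {'id': 'seq-2', 'width': 848, 'height': 480, 'file_name': 'seq/color/000002.jpg'},
]

images_by_id = index_images_by_id(images)

found = images_by_id.get('seq-2')      # the matching image object
missing = images_by_id.get('seq-999')  # None, where find_img would raise a ValueError

print(found['file_name'])
print(missing)
```

Note that `dict.get` returns `None` for an unknown ID rather than raising, so a caller that relies on the `ValueError` from `find_img` would need to handle that case differently.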

In [32]:
def convert_annotations(images: List[Dict], annotations: List[Dict], debug=False) -> Dict[str, List[str]]:
    '''
    Converts all COCO JSON annotations into YOLOv5 annotations

    Inputs:
        images: A list of image objects, as obtained from the COCO JSON file.
        annotations: A list of annotation objects, as obtained from a COCO JSON file.
        debug: A boolean flag which puts the function into debug mode (adds extra print statements) when True.
    
    Returns:
        yolo_annotations: A dictionary containing all converted YOLOv5 annotations, with image filenames used as the keys.
    '''
    
    yolo_annotations = dict()

    for annotation in tqdm.tqdm(annotations):
        # unpack necessary annotation data
        annotation_id = annotation['id']
        target_id = annotation['image_id']
        category_id = annotation['category_id']
        coco_bbox = annotation['bbox']

        # find and unpack necessary image data
        try:
            img = find_img(images, target_id, debug=debug)
            img_dims = (img['width'], img['height'])
            img_name = img['file_name']

            # create and store YOLOv5 annotation
            x_c, y_c, w, h = bbox_coco2yolo(coco_bbox, img_dims)
            yolo_annotation = f'{category_id - 1} {x_c} {y_c} {w} {h}'

            if img_name in yolo_annotations:
                yolo_annotations[img_name].append(yolo_annotation)
            else:
                yolo_annotations[img_name] = [yolo_annotation]

            if debug:
                print(f'Converted and stored annotation with ID "{annotation_id}" as a YOLOv5 annotation!')
        except ValueError:
            if debug:
                print(f'Couldn\'t find image with ID "{target_id}"... skipping.')

    return yolo_annotations

In [33]:
def generate_empty_annotations(images: List[Dict], annotations: List[Dict], debug=False) -> Dict[str, List]:
    '''
    Converts all empty images into (empty) YOLOv5 annotations.

    Inputs:
        images: A list of image objects, as obtained from the COCO JSON file.
        annotations: A list of annotation objects, as obtained from a COCO JSON file.
        debug: A boolean flag which puts the function into debug mode (adds extra print statements) when True.

    Returns:
        empty_annotations: A dictionary containing all empty YOLOv5 annotations, with image filenames used as the keys.
    '''

    # get all annotated image IDs
    annotations_set = set()

    for annotation in annotations:
        img_id = annotation['image_id']
        annotations_set.add(img_id)

    if debug:
        print(f'{len(annotations_set)} Annotated Images.')

    # get all image IDs (annotated and unannotated)
    images_set = set()

    for img in images:
        img_id = img['id']
        images_set.add(img_id)

    if debug:
        print(f'{len(images_set)} Images.')

    # find difference between the sets to get empty images
    empties_set = images_set - annotations_set
    if debug:
        print(f'{len(empties_set)} Empty Images.')

    # store and return empty annotations
    empty_annotations = dict()

    for img in images:
        img_id = img['id']
        img_name = img['file_name']
        
        if img_id in empties_set:
            empty_annotations[img_name] = []

    return empty_annotations

In [34]:
def write_annotations(yolo_annotations: Dict[str, List[str]], empty_annotations: Dict[str, List], src_imgs: str, dest_imgs: str, dest_labels: str, debug=False) -> None:
    '''
    Builds a YOLOv5-formatted dataset in the destination folders.

    Inputs:
        yolo_annotations: a dictionary of generated YOLOv5 annotations, as returned by the "convert_annotations" function.
        empty_annotations: a dictionary of generated (empty) YOLOv5 annotations, as returned by the "generate_empty_annotations" function.
        src_imgs: the source images folder (from the COCO dataset).
        dest_imgs: the destination images folder (in the new YOLOv5 dataset).
        dest_labels: the destination annotations folder (in the new YOLOv5 dataset).
        debug: A boolean flag which puts the function into debug mode (adds extra print statements) when True.
    
    Returns: Nothing :)
    '''

    all_annotations = dict()
    all_annotations.update(yolo_annotations)
    all_annotations.update(empty_annotations)

    for img_name, annotations_list in all_annotations.items():
        # extract necessary information
        filename = img_name.split('/')[-1]
        basename = '.'.join(filename.split('.')[:-1])
        extension  = filename.split('.')[-1]

        # if collision with existing file, then rename to avoid overwriting
        count = 0

        prop_full_dest_img_path = dest_imgs + basename + f'-{count}.' + extension
        prop_full_dest_labels_path = dest_labels + basename + f'-{count}.txt'

        while os.path.exists(prop_full_dest_img_path) or os.path.exists(prop_full_dest_labels_path):
            count += 1
            
            prop_full_dest_img_path = dest_imgs + basename + f'-{count}.' + extension
            prop_full_dest_labels_path = dest_labels + basename + f'-{count}.txt'

        full_src_img_path = src_imgs + img_name
        full_dest_img_path = prop_full_dest_img_path
        full_dest_labels_path = prop_full_dest_labels_path

        del prop_full_dest_img_path
        del prop_full_dest_labels_path

        # copy images to destination images folder, and write annotations in destination labels folder
        shutil.copy(full_src_img_path, full_dest_img_path)
        with open(full_dest_labels_path, 'a') as f:
            # an empty annotations list still creates an (empty) label file
            for annotations_str in annotations_list:
                f.write(annotations_str + '\n')

        if debug:
            print(f'Successfully wrote all annotations for image "{img_name}"!')
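
After writing, it can be useful to spot-check the generated label files. Each line of a YOLO label file holds a class index followed by four normalized values (`x_c y_c w h`). The sketch below is one possible check, not part of the notebook: `is_valid_label_line` is a hypothetical helper, and `num_classes=4` assumes the four categories listed earlier.

```python
def is_valid_label_line(line: str, num_classes: int) -> bool:
    '''
    Hypothetical helper: returns True if the line has the form
    "<class> <x_c> <y_c> <w> <h>" with a class index in [0, num_classes)
    and all four bbox values inside [0, 1].
    '''
    parts = line.split()
    if len(parts) != 5:
        return False

    try:
        class_idx = int(parts[0])
        bbox_values = [float(part) for part in parts[1:]]
    except ValueError:
        return False

    if not 0 <= class_idx < num_classes:
        return False

    return all(0.0 <= value <= 1.0 for value in bbox_values)


# made-up lines for illustration
sample_lines = [
    '1 0.5012 0.5104 0.0318 0.0398',  # well-formed
    '4 0.5012 0.5104 0.0318 0.0398',  # class index out of range for 4 classes
    '1 1.0200 0.5104 0.0318 0.0398',  # x_c outside [0, 1]
    '1 0.5012 0.5104',                # too few fields
]

results = [is_valid_label_line(line, num_classes=4) for line in sample_lines]
print(results)
```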

In [35]:
def write_labelmap(categories: List[Dict], dest: str, debug=False) -> None:
    '''
    Writes a labelmap.txt in the root of the YOLOv5 dataset (if it doesn't exist already).

    Inputs:
        categories: a list of category objects, as obtained from the COCO JSON file, giving each class and its natural language name.
        dest: the root of the destination YOLOv5 dataset.
        debug: A boolean flag which puts the function into debug mode (adds extra print statements) when True.
    
    Returns: Nothing :)
    '''
    
    filename = 'labelmap.txt'
    full_filepath = dest + filename

    if not os.path.exists(full_filepath):
        with open(full_filepath, 'a') as f:
            for category in categories:
                f.write(f'{category["name"]}\n')

        if debug:
            print('labelmap.txt written in root of dataset!')
    else:
        if debug:
            print('labelmap.txt already exists!')

## Use Functions to Generate a YOLOv5 Dataset from our Parameters.

Please run the cells below when you want to generate your YOLOv5 dataset using your previously-defined parameters.

In [36]:
debug = True

In [37]:
yolo_annotations = convert_annotations(data_images, data_annotations, debug=debug)
empty_annotations = generate_empty_annotations(data_images, data_annotations, debug=debug)

100%|██████████| 976/976 [00:00<00:00, 33173.21it/s]

Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "458" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "459" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "460" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "461" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "462" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "463" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "464" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "465" as a YOLOv5 annotation!
Found image with ID "20200807023726-31"!
Converted and stored annotation with ID "466" as a YOLOv5 annotation!
F




In [38]:
write_annotations(yolo_annotations, empty_annotations, IMGS, DEST_IMGS, DEST_LABS, debug=debug)
write_labelmap(data_categories, DEST, debug=debug)

Successfully wrote all annotations for image "20200807023726/color/000031.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000041.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000051.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000061.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000071.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000081.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000091.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000101.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000111.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000121.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000131.jpg"!
Successfully wrote all annotations for image "20200807023726/color/000141.jpg"!
Successfully wrote all annotations for i