<span style="color: #000000; font-family: Tahoma; font-size: 3em;">Convert Annotations to COCO Format</span>

### **Hi Everyone! Welcome!**

**This notebook demonstrates how to convert custom segmentation annotation formats to the widely supported COCO format, for use in any object detection or segmentation task.**


#### With COCO annotations, popular segmentation frameworks such as AdelaiDet (BlendMask), Detectron2, and Matterport's TensorFlow Mask R-CNN can be used.


I have converted & registered the HuBMAP Dataset created in this notebook as **'HuBMap COCO Dataset 512x512 Tiled'**

#### Please visit the dataset and upvote it if you find it useful
DATASET LINK - https://www.kaggle.com/sreevishnudamodaran/hubmap-coco-dataset-512x512-tiled


The dataset is structured with the images split into directories, and no downscaling has been applied, so all of the original information is preserved.

#### Directory Format:

```
    - coco_train
        - images (contains images in jpg format)
            - original_tiff_image_name
                - tile_column_number
                    - image
                    - ...
                - ...
            - ...
        - train.json (contains all the segmentation annotations in COCO
                      format with proper relative paths of the images)
```
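
Because `train.json` stores image paths relative to its own location, a consumer has to join each `file_name` with the directory that holds the annotation file. The following is a minimal sketch under that assumption; the helper name `resolve_image_paths` is hypothetical and not part of this notebook.

```python
import json
import os.path as osp


def resolve_image_paths(ann_file):
    """Map each COCO image id to a path resolved against the annotation file.

    Assumes every entry in "images" has an "id" and a "file_name" that is
    relative to the directory containing ann_file.
    """
    with open(ann_file) as f:
        coco = json.load(f)
    root = osp.dirname(osp.abspath(ann_file))
    return {
        img["id"]: osp.normpath(osp.join(root, img["file_name"]))
        for img in coco["images"]
    }
```

Calling `resolve_image_paths("coco_train/train.json")` would then give a dictionary from image id to a path that can be opened directly, regardless of the current working directory.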

Many thanks to Marcos Novaes (https://www.kaggle.com/marcosnovaes) for helping with the Tiling and TFRecords creation notebook and the TFRecords Dataset.


##### I will keep updating this notebook with further details of the implementation. Thanks to the whole Kaggle community and staff for all the support!

#### Please upvote and comment if you like my work :)


##### Hope everyone finds this useful!

In [None]:
!pip install pycocotools

In [None]:
%matplotlib inline

import numpy as np
import os
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import rcParams
sns.set(rc={"font.size":9,"axes.titlesize":15,"axes.labelsize":9,
            "axes.titlepad":11, "axes.labelpad":9, "legend.fontsize":7, "legend.title_fontsize":7, 'axes.grid' : False})
import cv2
import json
import pandas as pd
import tensorflow as tf
import glob
import os.path as osp
from pathlib import Path  # standard library; only .name and .stem are used below

import re
import datetime
from skimage import measure
from PIL import Image
import pycocotools
from pycocotools import mask

import collections
import sys
import uuid


In [None]:
BASE_PATH = "/kaggle/input/hubmap-kidney-segmentation"
df_info = pd.read_csv(
    osp.join(BASE_PATH, "HuBMAP-20-dataset_information.csv")
)
df_info.head(13)

## hubmap-tfrecord-512 Dataset Loading and Exploration

We will be using the hubmap-tfrecord-512 dataset, which holds the HuBMAP Kidney Segmentation dataset images. These images are very large and vary in size, from about 13k x 18k pixels up to 49k x 34k pixels.

#### **No downscaling of the images has been done in this notebook, in order to preserve information**
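
To get a feel for how many tiles such image sizes produce, here is a rough sketch. It assumes non-overlapping 512-pixel tiles where a partial tile at the border counts as a full tile; the actual tiling notebook may use a different scheme, and `tile_grid` is a hypothetical helper.

```python
import math

TILE_SIZE = 512  # tile edge length in pixels, taken from the dataset name


def tile_grid(height, width, tile_size=TILE_SIZE):
    """Return (rows, cols) needed to cover an image with square tiles.

    Assumption: tiles do not overlap and a partial tile at the border
    still counts as a full tile.
    """
    rows = math.ceil(height / tile_size)
    cols = math.ceil(width / tile_size)
    return rows, cols


# Approximate extremes quoted above (13k x 18k and 49k x 34k pixels).
for h, w in [(13_000, 18_000), (49_000, 34_000)]:
    rows, cols = tile_grid(h, w)
    print(f"{h}x{w}: {rows} x {cols} = {rows * cols} tiles")
```

Under these assumptions a single image yields between several hundred and several thousand tiles, which is why the conversion job further down iterates over a helper DataFrame rather than over whole images.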

In [None]:
file_list = glob.glob('/kaggle/input/hubmap-tfrecord-512/train/*.csv')
file_list

## Combine Image CSV Files into a DataFrame

In [None]:
li = []
for filename in file_list:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

coco_helper_df = pd.concat(li, axis=0, ignore_index=True)
coco_helper_df.drop('Unnamed: 0', axis=1, inplace=True)
coco_helper_df.head(5)

## Define Functions to Parse & Load TFRecords

In [None]:
# Read back a record to make sure the decoding works.
# Create a dictionary describing the features.
image_feature_description = {
    'img_index': tf.io.FixedLenFeature([], tf.int64),
    'height': tf.io.FixedLenFeature([], tf.int64),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'num_channels': tf.io.FixedLenFeature([], tf.int64),
    'img_bytes': tf.io.FixedLenFeature([], tf.string),
    'mask': tf.io.FixedLenFeature([], tf.string),
    'tile_id': tf.io.FixedLenFeature([], tf.int64),
    'tile_col_pos': tf.io.FixedLenFeature([], tf.int64),
    'tile_row_pos': tf.io.FixedLenFeature([], tf.int64),
}

def _parse_image_function(example_proto):
    # Parse the input tf.Example proto using the dictionary above.
    single_example = tf.io.parse_single_example(example_proto, image_feature_description)
    img_index = single_example['img_index']
    img_height = single_example['height']
    img_width = single_example['width']
    num_channels = single_example['num_channels']
    
    img_bytes =  tf.io.decode_raw(single_example['img_bytes'],out_type='uint8')
   
    img_array = tf.reshape( img_bytes, (img_height, img_width, num_channels))
   
    mask_bytes =  tf.io.decode_raw(single_example['mask'],out_type='bool')
    
    mask = tf.reshape(mask_bytes, (img_height,img_width))
    mtd = dict()
    mtd['img_index'] = single_example['img_index']
    mtd['width'] = single_example['width']
    mtd['height'] = single_example['height']
    mtd['tile_id'] = single_example['tile_id']
    mtd['tile_col_pos'] = single_example['tile_col_pos']
    mtd['tile_row_pos'] = single_example['tile_row_pos']
    struct = {
        'img_array': img_array,
        'mask': mask,
        'mtd': mtd
    } 
    return struct

def read_tf_dataset(storage_file_path):
    encoded_image_dataset = tf.data.TFRecordDataset(storage_file_path, compression_type="GZIP")
    parsed_image_dataset = encoded_image_dataset.map(_parse_image_function)
    return parsed_image_dataset

## Exploration with a Sample

In [None]:
BASE_TRAIN_PATH = "/kaggle/input/hubmap-tfrecord-512/train"
sample_tile_path = osp.join(BASE_TRAIN_PATH,
                            coco_helper_df.loc[850,"img_id"],
                            'col'+str(coco_helper_df.loc[850,"tile_col_num"]),
                            str(Path(coco_helper_df.loc[850,"tile_rel_path"]).name))
sample_tile_path

## Convert Binary Mask to Polygons
### Define Functions for the Conversion

In [None]:
convert = lambda text: int(text) if text.isdigit() else text.lower()
natural_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]

def close_contour(contour):
    if not np.array_equal(contour[0], contour[-1]):
        contour = np.vstack((contour, contour[0]))
    return contour

def binary_mask_to_polygon(binary_mask, tolerance=0):
    """Converts a binary mask to COCO polygon representation
    Args:
        binary_mask: a 2D binary numpy array where '1's represent the object
        tolerance: Maximum distance from original points of polygon to approximated
            polygonal chain. If tolerance is 0, the original coordinate array is returned.
    """
    polygons = []
    poly_nogrp = []
    # pad mask to close contours of shapes which start and end at an edge
    padded_binary_mask = np.pad(binary_mask, pad_width=1, mode='constant', constant_values=0)
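    contours = measure.find_contours(padded_binary_mask, 0.5)
    for contour in contours:
        # undo the 1-pixel padding offset; done per contour because the
        # contours have different lengths and cannot form one regular array
        contour = close_contour(contour - 1)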
        contour = measure.approximate_polygon(contour, tolerance)
        if len(contour) < 3:
            continue
        contour = np.flip(contour, axis=1)
        segmentation = contour.ravel().tolist()
        # after padding and subtracting 1 we may get -0.5 points in our segmentation 
        segmentation = [0 if i < 0 else i for i in segmentation]
        poly_nogrp.append(segmentation)
        it = iter(segmentation)
        seg_grouped = list(map(list, zip(it, it)))
        
        polygons.append(seg_grouped)

    return polygons, poly_nogrp
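
The function above returns each polygon twice: as a flat `[x1, y1, x2, y2, ...]` list (the form COCO expects) and grouped into `[x, y]` pairs (convenient for drawing). The pairing idiom and the clipping of negative coordinates can be tried in isolation with the small sketch below; `group_xy` and `clip_negative` are hypothetical helper names and the coordinates are made up.

```python
def group_xy(flat):
    """Group a flat [x1, y1, x2, y2, ...] list into [[x1, y1], [x2, y2], ...]."""
    it = iter(flat)
    # zip pulls two consecutive items from the same iterator for each pair
    return [list(pair) for pair in zip(it, it)]


def clip_negative(flat):
    """Replace negative coordinates (e.g. -0.5 from the padding offset) with 0."""
    return [0 if v < 0 else v for v in flat]


flat = [-0.5, 2.0, 3.0, -0.5, 3.0, 4.0]  # made-up coordinates
print(group_xy(clip_negative(flat)))  # [[0, 2.0], [3.0, 0], [3.0, 4.0]]
```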

### Visualize Before and After Conversion

In [None]:
ds = read_tf_dataset(sample_tile_path)

for struct in ds.as_numpy_iterator():
    #struct = g_dataset.get_next()
    img_mtd = struct["mtd"]
    img_array  = struct["img_array"]
    img_mask = struct["mask"]

    fig, ax = plt.subplots(1,3,figsize=(20,8))
    
    fig.suptitle("Tile ID = {} Xpos = {} Ypos = {}".format(img_mtd['tile_id'],
                  img_mtd['tile_col_pos'],img_mtd['tile_row_pos']), fontsize=25)

    
    ax[0].set_title("Tile Image")
    ax[0].imshow(img_array)
    #ax[1].set_title("Pixelarray distribution");
    #sns.distplot(img_array.flatten(), ax=ax[1]);
    ax[1].imshow(img_mask)
    ax[1].set_title("Tile Mask From Tfrecord")


    poly, _ = binary_mask_to_polygon(img_mask)
    im = np.zeros([512,512],dtype=np.uint8)
    for mask_pts in poly:
        pts = np.array(mask_pts, np.int32)
        pts = pts.reshape((-1,1,2))
        #cv2.polylines(im, pts, True, (0,255,255))
        # the outlines are not guaranteed to be convex, so use fillPoly
        cv2.fillPoly(im, [pts], 255)
    ax[2].imshow(im)
    ax[2].set_title("Tile Mask After Polygon Conversion")

## Create Output Directories

In [None]:
input_dir = osp.join(BASE_PATH, "train")
output_dir = "coco_train"

if osp.exists(output_dir):
    print('Output directory already exists:', output_dir)
    sys.exit(1)
os.makedirs(output_dir)
os.makedirs(osp.join(output_dir, 'images'))
print('Coco Dataset Directory:', output_dir)

## Analyzing the Popular COCO Format



## COCO Annotation Basic Syntax

```python
{
  "type": "instances",
  "images": [
    {
      "file_name": "<image_name.jpg>",
      "height": "<height>",
      "width": "<width>",
      "id": "<Unique ID for each image. This is the 'image_id' used to tag each annotation>"
    }
#    .
#    .
#    .

  ],

  "annotations": [
    {
      "id": 1,
      "bbox": [
        "<xmin>",
        "<ymin>",
        "<width>",
        "<height>"
      ],
      "image_id": "<ID of the image the polygon annotation belongs to, as defined in the 'images' block above>",

      "segmentation": [
        [
          "<x1>",
          "<y1>",
          "<x2>",
          "<y2>"
#          .
#          .
#          .

        ]
      ],
      "ignore": 0,
      "area": "<Area of the polygon represented by the points in the 'segmentation' block>",
      "iscrowd": 0,
      "category_id": "<Class category ID as an integer, as defined in the 'categories' block below>"
    }
#    .
#    .
#    .

  ],

  "categories": [
    {
      "supercategory": null,
      "id": "<Integer ID for the class label>",
      "name": "<Class label as a string>"
    }
#    .
#    .
#    .

  ]
}
```
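
To make the `bbox` and `area` fields concrete, here is a small pure-Python sketch that derives both from a flat polygon. It computes the geometric area with the shoelace formula, whereas this notebook uses `pycocotools`, which works on a rasterized mask, so the two values can differ slightly. The helper names and the sample rectangle are illustrative assumptions.

```python
def polygon_bbox(flat):
    """Return [x_min, y_min, width, height] for a flat [x1, y1, ...] polygon."""
    xs, ys = flat[0::2], flat[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def polygon_area(flat):
    """Return the polygon area using the shoelace formula."""
    xs, ys = flat[0::2], flat[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2.0


rect = [10, 20, 40, 20, 40, 60, 10, 60]  # made-up 30 x 40 rectangle
print(polygon_bbox(rect))  # [10, 20, 30, 40]
print(polygon_area(rect))  # 1200.0
```

Note that the third and fourth `bbox` values are the width and height of the box, not its far corner.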

## Defining Blocks in the Annotation
### Adding the Glomerule Class as '0'

In [None]:
labels =  ["__ignore__",
                "glomerule"
                ]
now = datetime.datetime.now()

### Define the Basic Structure

In [None]:
data = dict(
    info=dict(
        description=None,
        url=None,
        version=None,
        year=now.year,
        contributor=None,
        date_created=now.strftime('%Y-%m-%d %H:%M:%S.%f'),
    ),
    licenses=[dict(
        url=None,
        id=0,
        name=None,
    )],
    images=[
        # license, url, file_name, height, width, date_captured, id
    ],
    type='instances',
    annotations=[
        # segmentation, area, iscrowd, image_id, bbox, category_id, id
    ],
    categories=[
        # supercategory, id, name
    ],
)


In [None]:
class_name_to_id = {}
for i, each_label in enumerate(labels):
    class_id = i - 1  # starts with -1
    class_name = each_label
    if class_id == -1:
        assert class_name == '__ignore__'
        continue
    class_name_to_id[class_name] = class_id
    data['categories'].append(dict(
        supercategory=None,
        id=class_id,
        name=class_name,
    ))
data

In [None]:
out_ann_file = osp.join(output_dir, 'train.json')
tf_folders = glob.glob('/kaggle/input/hubmap-tfrecord-512/train/*/')
tf_folders

## Data Dict Before Adding Annotations

In [None]:
data

## Start the Conversion Job

In [None]:
for i in range(coco_helper_df.shape[0]):
#for i in range(850,852):  ##for testing
    tile_path = osp.join(BASE_TRAIN_PATH,
                        coco_helper_df.loc[i,"img_id"],
                        'col'+str(coco_helper_df.loc[i,"tile_col_num"]),
                        str(Path(coco_helper_df.loc[i,"tile_rel_path"]).name))
    print("tile_path", tile_path)

    ds = read_tf_dataset(tile_path)

    for struct in ds.as_numpy_iterator():
        #struct = g_dataset.get_next()
        img_mtd = struct["mtd"]
        img_array  = struct["img_array"]
        img_mask = struct["mask"]

# Code to Visualize Each Annotation  ##for testing
#         fig, ax = plt.subplots(1,3,figsize=(20,8))
#         fig.suptitle("Tile ID = {} Xpos = {} Ypos = {}".format(img_mtd['tile_id'],
#                      img_mtd['tile_col_pos'],img_mtd['tile_row_pos']), fontsize=25)

#         ax[0].set_title("Tile Image")
#         ax[0].imshow(img_array)
#         ax[1].imshow(img_mask)
#         ax[1].set_title("Tile Mask From Tfrecord")

        poly, poly_nogrp = binary_mask_to_polygon(img_mask)

#         im = np.zeros([512,512],dtype=np.uint8)
#         for mask_pts in poly:
#             pts = np.array(mask_pts, np.int32)
#             pts = pts.reshape((-1,1,2))
#             #cv2.polylines(im, pts, True, (0,255,255))
#             cv2.fillConvexPoly(im, pts, 255)
#         ax[2].imshow(im)
#         ax[2].set_title("Tile Mask After Polygon Conversion")
        
        print('Generating annotations from: {}'.format(coco_helper_df.loc[i,"img_id"]))

        jpg_folder_path = osp.join(output_dir, 'images/', coco_helper_df.loc[i,"img_id"],
                    'col'+str(coco_helper_df.loc[i,"tile_col_num"]))

        if not osp.exists(jpg_folder_path):
            os.makedirs(jpg_folder_path)

        #str(Path(train_df.loc[i,"tile_rel_path"]).stem)
        #print('JPG Directory:', jpg_folder_path)

        out_img_file = osp.join(jpg_folder_path,
                                str(Path(coco_helper_df.loc[i,"tile_rel_path"]).stem)+'.jpg'
                                )
        print('out_img_file:', out_img_file)
        im_out = Image.fromarray(img_array)
        im_out.save(out_img_file)

        data['images'].append(dict(
            license=0,
            url=None,
            file_name=osp.relpath(out_img_file, osp.dirname(out_ann_file)),
            height=img_array.shape[0],
            width=img_array.shape[1],
            date_captured=None,
            id=int(coco_helper_df.loc[i,'tile_id'])
        ))

        img_h, img_w = img_mask.shape

        for shape in poly_nogrp:
            # Rasterize only this polygon so that area and bbox describe the
            # individual instance instead of the whole tile mask.
            rles = pycocotools.mask.frPyObjects([shape], img_h, img_w)
            rle = pycocotools.mask.merge(rles)
            area = float(pycocotools.mask.area(rle))
            bbox = pycocotools.mask.toBbox(rle).flatten().tolist()

            data['annotations'].append(dict(
                id=len(data['annotations']),
                image_id=int(coco_helper_df.loc[i,'tile_id']),
                category_id=class_name_to_id['glomerule'],  # integer id, not the string '0'
                segmentation=[shape],
                area=area,
                bbox=bbox,
                iscrowd=0,
            ))
    
    print("{}/{} Tile Image done.".format(i + 1, coco_helper_df.shape[0]))
    
with open(out_ann_file, 'w') as f:
    json.dump(data, f)
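
Before packaging the dataset, it can help to run a quick consistency check over the generated dictionary. The sketch below is not an official COCO validator; `check_coco` is a hypothetical helper that only tests a few structural rules described in the format section above.

```python
def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    image_ids = [img["id"] for img in coco.get("images", [])]
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")
    known_images = set(image_ids)
    known_categories = {cat["id"] for cat in coco.get("categories", [])}

    for ann in coco.get("annotations", []):
        if ann["image_id"] not in known_images:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in known_categories:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        if len(ann["bbox"]) != 4:
            problems.append(f"annotation {ann['id']}: bbox needs 4 values")
        for poly in ann["segmentation"]:
            # a polygon needs at least 3 points, stored as x, y pairs
            if len(poly) < 6 or len(poly) % 2:
                problems.append(f"annotation {ann['id']}: malformed polygon")
    return problems
```

In this notebook the check would be run as `check_coco(data)`; an empty list means none of these rules were violated. A `category_id` stored as a string such as `'0'` is reported as unknown when the categories use integer ids.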

## Verify Number of Files 

In [None]:
!find /kaggle/working/coco_train -type f | wc -l

In [None]:
coco_helper_df.shape

## Package Dataset into Zip for Upload

In [None]:
!zip -r hubmap-coco-512x512-tiled.zip ./coco_train

In [None]:
!rm -R coco_train

In [None]:
!ls -ahl