<a href="https://colab.research.google.com/github/amanjain487/panoptic-segmentation-using-DETR/blob/anubhav/Dataset_Creation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Create Dataset for Construction Material + COCO Classes

## Operations Performed in This Colab Notebook
- Start from the given construction dataset, whose classes are the things in our case
- Use the pretrained DETR panoptic model to predict stuff and things for all given images
- Add all stuff predictions to the ground truth of our dataset, and add all things predictions as a single `misc` stuff class
- Split the data into train and test sets
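
The train/test split in the last step can be sketched as follows. This is only an illustration: the helper name `split_train_test`, the 80/20 ratio and the fixed seed are assumptions, not values taken from this notebook.

```python
import random

def split_train_test(file_names, train_fraction=0.8, seed=0):
    """Shuffle a copy of file_names and split it into (train, test)."""
    shuffled = sorted(file_names)          # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # local RNG, leaves the global random state untouched
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names, used only for the demonstration.
names = [f"image_{i}.jpg" for i in range(10)]
train, test = split_train_test(names)
print(len(train), len(test))
```

With ten files and a fraction of 0.8 this yields eight training files and two test files. A fixed seed keeps the split reproducible between runs.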

## Install Requirements and Prepare the Notebook

In [None]:
import time
import glob
import torch
import os

from IPython.display import Image, clear_output 

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print('PyTorch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

!pip install git+https://github.com/cocodataset/panopticapi.git

PyTorch 1.10.0+cu111 _CudaDeviceProperties(name='Tesla K80', major=3, minor=7, total_memory=11441MB, multi_processor_count=13)
Collecting git+https://github.com/cocodataset/panopticapi.git
  Cloning https://github.com/cocodataset/panopticapi.git to /tmp/pip-req-build-x_08amhc
  Running command git clone -q https://github.com/cocodataset/panopticapi.git /tmp/pip-req-build-x_08amhc


## Mount Drive 

We need Drive access for the following:
- To read our dataset, which is stored in Drive
- To extract the original dataset directly in Drive and use it for ground truth creation
- To create the ground truths directly in Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


## Unzip the given original dataset

- The dataset is uploaded as a zip file
- Unzip the dataset in Drive itself
- Once unzipped, delete the zip file to save Drive's limited space
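
The same steps can also be done without shell commands. The sketch below uses Python's standard `zipfile` module; the helper name `unzip_and_remove` and the demo archive are hypothetical and stand in for the real dataset file.

```python
import os
import tempfile
import zipfile

def unzip_and_remove(zip_path, target_dir):
    """Extract zip_path into target_dir, then delete the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    os.remove(zip_path)  # only reached if extraction did not raise

# Demo on a throwaway archive; the file name below is hypothetical.
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo_dataset.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("images/sample.txt", "placeholder")

unzip_and_remove(demo_zip, work_dir)
print(sorted(os.listdir(work_dir)))
```

Deleting the archive only after `extractall` returns means a failed extraction leaves the zip file in place.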

In [None]:
# import os
# os.chdir("/content/drive/MyDrive/Panoptic Segmentation using DETR/Original Dataset")

# import zipfile
# !unzip construction_materials_dataset.zip 
# os.remove("construction_materials_dataset.zip")

## Clone official DETR Repo to Drive
- We pass our dataset images through the official pretrained DETR model by Facebook to get predictions for the COCO classes
- The DETR model outputs many predictions, with confidence ranging from 0% to 100%
- Predictions with confidence greater than 85% become our ground truth for the COCO classes. They are later combined with the `construction materials` annotations to form our final dataset.
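
The confidence filter can be illustrated without PyTorch. The sketch below mirrors the idea used later in this notebook (softmax over the class logits, ignore the trailing "no-object" class, keep queries whose best score exceeds 0.85). The function names and the toy logits are made up for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confident_queries(pred_logits, threshold=0.85):
    """Return indices of queries whose best real-class probability exceeds threshold.

    The last logit of each query is the "no-object" class and is excluded.
    """
    keep = []
    for index, logits in enumerate(pred_logits):
        probabilities = softmax(logits)[:-1]  # drop the trailing "no-object" entry
        if max(probabilities) > threshold:
            keep.append(index)
    return keep

# Three toy queries with two real classes plus "no-object" (made-up numbers).
toy_logits = [
    [6.0, 0.0, 0.0],   # confident about class 0
    [0.0, 0.0, 6.0],   # confident that there is no object
    [1.0, 1.0, 1.0],   # undecided
]
print(confident_queries(toy_logits))
```

Only the first query survives: the second is confident about "no-object", which does not count, and the third spreads its probability evenly.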

In [None]:
import os
import sys

os.chdir("/content/drive/MyDrive/Panoptic Segmentation using DETR/")
!git clone https://github.com/facebookresearch/detr.git
sys.path.append(os.path.join(os.getcwd(), "detr/"))


fatal: destination path 'detr' already exists and is not an empty directory.


## Define COCO Classes

- Define the existing COCO classes that are predicted by Facebook DETR
- Define the COCO classes that we will be using
- Establish a mapping from the existing classes to the new ones
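
As a small illustration of how such a mapping is applied, the sketch below remaps a few segments. The entries in `mapping_excerpt` are copied from `old_to_new_category_mapping` in the cell below. The `segments` list is made up and only imitates the shape of DETR's `segments_info` entries.

```python
# Excerpt of the mapping: old COCO panoptic id -> new project id.
mapping_excerpt = {1: 1, 3: 1, 149: 9, 187: 15, 199: 12}
new_names = {1: "misc", 9: "ground", 12: "wall", 15: "sky"}

# Made-up segments in the shape of DETR's `segments_info` entries.
segments = [
    {"id": 0, "category_id": 1},    # person
    {"id": 1, "category_id": 3},    # car
    {"id": 2, "category_id": 149},  # road
    {"id": 3, "category_id": 187},  # sky-other-merged
]

# Build new dicts instead of mutating the originals.
remapped = [
    {**segment, "category_id": mapping_excerpt[segment["category_id"]]}
    for segment in segments
]
print([new_names[s["category_id"]] for s in remapped])
```

Both things (person, car) collapse into `misc`, while the stuff classes move to their coarser project categories.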

In [None]:
existing_coco_categories = [
    {"color": [220, 20, 60], "isthing": 1, "id": 1, "name": "person"},
    {"color": [119, 11, 32], "isthing": 1, "id": 2, "name": "bicycle"},
    {"color": [0, 0, 142], "isthing": 1, "id": 3, "name": "car"},
    {"color": [0, 0, 230], "isthing": 1, "id": 4, "name": "motorcycle"},
    {"color": [106, 0, 228], "isthing": 1, "id": 5, "name": "airplane"},
    {"color": [0, 60, 100], "isthing": 1, "id": 6, "name": "bus"},
    {"color": [0, 80, 100], "isthing": 1, "id": 7, "name": "train"},
    {"color": [0, 0, 70], "isthing": 1, "id": 8, "name": "truck"},
    {"color": [0, 0, 192], "isthing": 1, "id": 9, "name": "boat"},
    {"color": [250, 170, 30], "isthing": 1, "id": 10, "name": "traffic light"},
    {"color": [100, 170, 30], "isthing": 1, "id": 11, "name": "fire hydrant"},
    {"color": [220, 220, 0], "isthing": 1, "id": 13, "name": "stop sign"},
    {"color": [175, 116, 175], "isthing": 1, "id": 14, "name": "parking meter"},
    {"color": [250, 0, 30], "isthing": 1, "id": 15, "name": "bench"},
    {"color": [165, 42, 42], "isthing": 1, "id": 16, "name": "bird"},
    {"color": [255, 77, 255], "isthing": 1, "id": 17, "name": "cat"},
    {"color": [0, 226, 252], "isthing": 1, "id": 18, "name": "dog"},
    {"color": [182, 182, 255], "isthing": 1, "id": 19, "name": "horse"},
    {"color": [0, 82, 0], "isthing": 1, "id": 20, "name": "sheep"},
    {"color": [120, 166, 157], "isthing": 1, "id": 21, "name": "cow"},
    {"color": [110, 76, 0], "isthing": 1, "id": 22, "name": "elephant"},
    {"color": [174, 57, 255], "isthing": 1, "id": 23, "name": "bear"},
    {"color": [199, 100, 0], "isthing": 1, "id": 24, "name": "zebra"},
    {"color": [72, 0, 118], "isthing": 1, "id": 25, "name": "giraffe"},
    {"color": [255, 179, 240], "isthing": 1, "id": 27, "name": "backpack"},
    {"color": [0, 125, 92], "isthing": 1, "id": 28, "name": "umbrella"},
    {"color": [209, 0, 151], "isthing": 1, "id": 31, "name": "handbag"},
    {"color": [188, 208, 182], "isthing": 1, "id": 32, "name": "tie"},
    {"color": [0, 220, 176], "isthing": 1, "id": 33, "name": "suitcase"},
    {"color": [255, 99, 164], "isthing": 1, "id": 34, "name": "frisbee"},
    {"color": [92, 0, 73], "isthing": 1, "id": 35, "name": "skis"},
    {"color": [133, 129, 255], "isthing": 1, "id": 36, "name": "snowboard"},
    {"color": [78, 180, 255], "isthing": 1, "id": 37, "name": "sports ball"},
    {"color": [0, 228, 0], "isthing": 1, "id": 38, "name": "kite"},
    {"color": [174, 255, 243], "isthing": 1, "id": 39, "name": "baseball bat"},
    {"color": [45, 89, 255], "isthing": 1, "id": 40, "name": "baseball glove"},
    {"color": [134, 134, 103], "isthing": 1, "id": 41, "name": "skateboard"},
    {"color": [145, 148, 174], "isthing": 1, "id": 42, "name": "surfboard"},
    {"color": [255, 208, 186], "isthing": 1, "id": 43, "name": "tennis racket"},
    {"color": [197, 226, 255], "isthing": 1, "id": 44, "name": "bottle"},
    {"color": [171, 134, 1], "isthing": 1, "id": 46, "name": "wine glass"},
    {"color": [109, 63, 54], "isthing": 1, "id": 47, "name": "cup"},
    {"color": [207, 138, 255], "isthing": 1, "id": 48, "name": "fork"},
    {"color": [151, 0, 95], "isthing": 1, "id": 49, "name": "knife"},
    {"color": [9, 80, 61], "isthing": 1, "id": 50, "name": "spoon"},
    {"color": [84, 105, 51], "isthing": 1, "id": 51, "name": "bowl"},
    {"color": [74, 65, 105], "isthing": 1, "id": 52, "name": "banana"},
    {"color": [166, 196, 102], "isthing": 1, "id": 53, "name": "apple"},
    {"color": [208, 195, 210], "isthing": 1, "id": 54, "name": "sandwich"},
    {"color": [255, 109, 65], "isthing": 1, "id": 55, "name": "orange"},
    {"color": [0, 143, 149], "isthing": 1, "id": 56, "name": "broccoli"},
    {"color": [179, 0, 194], "isthing": 1, "id": 57, "name": "carrot"},
    {"color": [209, 99, 106], "isthing": 1, "id": 58, "name": "hot dog"},
    {"color": [5, 121, 0], "isthing": 1, "id": 59, "name": "pizza"},
    {"color": [227, 255, 205], "isthing": 1, "id": 60, "name": "donut"},
    {"color": [147, 186, 208], "isthing": 1, "id": 61, "name": "cake"},
    {"color": [153, 69, 1], "isthing": 1, "id": 62, "name": "chair"},
    {"color": [3, 95, 161], "isthing": 1, "id": 63, "name": "couch"},
    {"color": [163, 255, 0], "isthing": 1, "id": 64, "name": "potted plant"},
    {"color": [119, 0, 170], "isthing": 1, "id": 65, "name": "bed"},
    {"color": [0, 182, 199], "isthing": 1, "id": 67, "name": "dining table"},
    {"color": [0, 165, 120], "isthing": 1, "id": 70, "name": "toilet"},
    {"color": [183, 130, 88], "isthing": 1, "id": 72, "name": "tv"},
    {"color": [95, 32, 0], "isthing": 1, "id": 73, "name": "laptop"},
    {"color": [130, 114, 135], "isthing": 1, "id": 74, "name": "mouse"},
    {"color": [110, 129, 133], "isthing": 1, "id": 75, "name": "remote"},
    {"color": [166, 74, 118], "isthing": 1, "id": 76, "name": "keyboard"},
    {"color": [219, 142, 185], "isthing": 1, "id": 77, "name": "cell phone"},
    {"color": [79, 210, 114], "isthing": 1, "id": 78, "name": "microwave"},
    {"color": [178, 90, 62], "isthing": 1, "id": 79, "name": "oven"},
    {"color": [65, 70, 15], "isthing": 1, "id": 80, "name": "toaster"},
    {"color": [127, 167, 115], "isthing": 1, "id": 81, "name": "sink"},
    {"color": [59, 105, 106], "isthing": 1, "id": 82, "name": "refrigerator"},
    {"color": [142, 108, 45], "isthing": 1, "id": 84, "name": "book"},
    {"color": [196, 172, 0], "isthing": 1, "id": 85, "name": "clock"},
    {"color": [95, 54, 80], "isthing": 1, "id": 86, "name": "vase"},
    {"color": [128, 76, 255], "isthing": 1, "id": 87, "name": "scissors"},
    {"color": [201, 57, 1], "isthing": 1, "id": 88, "name": "teddy bear"},
    {"color": [246, 0, 122], "isthing": 1, "id": 89, "name": "hair drier"},
    {"color": [191, 162, 208], "isthing": 1, "id": 90, "name": "toothbrush"},
    {"color": [255, 255, 128], "isthing": 0, "id": 92, "name": "banner"},
    {"color": [147, 211, 203], "isthing": 0, "id": 93, "name": "blanket"},
    {"color": [150, 100, 100], "isthing": 0, "id": 95, "name": "bridge"},
    {"color": [168, 171, 172], "isthing": 0, "id": 100, "name": "cardboard"},
    {"color": [146, 112, 198], "isthing": 0, "id": 107, "name": "counter"},
    {"color": [210, 170, 100], "isthing": 0, "id": 109, "name": "curtain"},
    {"color": [92, 136, 89], "isthing": 0, "id": 112, "name": "door-stuff"},
    {"color": [218, 88, 184], "isthing": 0, "id": 118, "name": "floor-wood"},
    {"color": [241, 129, 0], "isthing": 0, "id": 119, "name": "flower"},
    {"color": [217, 17, 255], "isthing": 0, "id": 122, "name": "fruit"},
    {"color": [124, 74, 181], "isthing": 0, "id": 125, "name": "gravel"},
    {"color": [70, 70, 70], "isthing": 0, "id": 128, "name": "house"},
    {"color": [255, 228, 255], "isthing": 0, "id": 130, "name": "light"},
    {"color": [154, 208, 0], "isthing": 0, "id": 133, "name": "mirror-stuff"},
    {"color": [193, 0, 92], "isthing": 0, "id": 138, "name": "net"},
    {"color": [76, 91, 113], "isthing": 0, "id": 141, "name": "pillow"},
    {"color": [255, 180, 195], "isthing": 0, "id": 144, "name": "platform"},
    {"color": [106, 154, 176], "isthing": 0, "id": 145, "name": "playingfield"},
    {"color": [230, 150, 140], "isthing": 0, "id": 147, "name": "railroad"},
    {"color": [60, 143, 255], "isthing": 0, "id": 148, "name": "river"},
    {"color": [128, 64, 128], "isthing": 0, "id": 149, "name": "road"},
    {"color": [92, 82, 55], "isthing": 0, "id": 151, "name": "roof"},
    {"color": [254, 212, 124], "isthing": 0, "id": 154, "name": "sand"},
    {"color": [73, 77, 174], "isthing": 0, "id": 155, "name": "sea"},
    {"color": [255, 160, 98], "isthing": 0, "id": 156, "name": "shelf"},
    {"color": [255, 255, 255], "isthing": 0, "id": 159, "name": "snow"},
    {"color": [104, 84, 109], "isthing": 0, "id": 161, "name": "stairs"},
    {"color": [169, 164, 131], "isthing": 0, "id": 166, "name": "tent"},
    {"color": [225, 199, 255], "isthing": 0, "id": 168, "name": "towel"},
    {"color": [137, 54, 74], "isthing": 0, "id": 171, "name": "wall-brick"},
    {"color": [135, 158, 223], "isthing": 0, "id": 175, "name": "wall-stone"},
    {"color": [7, 246, 231], "isthing": 0, "id": 176, "name": "wall-tile"},
    {"color": [107, 255, 200], "isthing": 0, "id": 177, "name": "wall-wood"},
    {"color": [58, 41, 149], "isthing": 0, "id": 178, "name": "water-other"},
    {"color": [183, 121, 142], "isthing": 0, "id": 180, "name": "window-blind"},
    {"color": [255, 73, 97], "isthing": 0, "id": 181, "name": "window-other"},
    {"color": [107, 142, 35], "isthing": 0, "id": 184, "name": "tree-merged"},
    {"color": [190, 153, 153], "isthing": 0, "id": 185, "name": "fence-merged"},
    {"color": [146, 139, 141], "isthing": 0, "id": 186, "name": "ceiling-merged"},
    {"color": [70, 130, 180], "isthing": 0, "id": 187, "name": "sky-other-merged"},
    {"color": [134, 199, 156], "isthing": 0, "id": 188, "name": "cabinet-merged"},
    {"color": [209, 226, 140], "isthing": 0, "id": 189, "name": "table-merged"},
    {"color": [96, 36, 108], "isthing": 0, "id": 190, "name": "floor-other-merged"},
    {"color": [96, 96, 96], "isthing": 0, "id": 191, "name": "pavement-merged"},
    {"color": [64, 170, 64], "isthing": 0, "id": 192, "name": "mountain-merged"},
    {"color": [152, 251, 152], "isthing": 0, "id": 193, "name": "grass-merged"},
    {"color": [208, 229, 228], "isthing": 0, "id": 194, "name": "dirt-merged"},
    {"color": [206, 186, 171], "isthing": 0, "id": 195, "name": "paper-merged"},
    {"color": [152, 161, 64], "isthing": 0, "id": 196, "name": "food-other-merged"},
    {"color": [116, 112, 0], "isthing": 0, "id": 197, "name": "building-other-merged"},
    {"color": [0, 114, 143], "isthing": 0, "id": 198, "name": "rock-merged"},
    {"color": [102, 102, 156], "isthing": 0, "id": 199, "name": "wall-other-merged"},
    {"color": [250, 141, 255], "isthing": 0, "id": 200, "name": "rug-merged"},
]

categories_for_pp_project = [{'color': [220, 20, 60], 'isthing': 0, 'id': 1, 'name': 'misc'},
            {'color': [255, 255, 128], 'isthing': 0, 'id': 2, 'name': 'textile'},
            {'color': [150, 100, 100], 'isthing': 0, 'id': 3, 'name': 'building'},
            {'color': [168, 171, 172], 'isthing': 0, 'id': 4, 'name': 'rawmaterial'},
            {'color': [146, 112, 198], 'isthing': 0, 'id': 5, 'name': 'furniture'},
            {'color': [218, 88, 184], 'isthing': 0, 'id': 6, 'name': 'floor'},
            {'color': [241, 129, 0], 'isthing': 0, 'id': 7, 'name': 'plant'},
            {'color': [217, 17, 255], 'isthing': 0, 'id': 8, 'name': 'food'},
            {'color': [124, 74, 181], 'isthing': 0, 'id': 9, 'name': 'ground'},
            {'color': [193, 0, 92], 'isthing': 0, 'id': 10, 'name': 'structural'},
            {'color': [60, 143, 255], 'isthing': 0, 'id': 11, 'name': 'water'},
            {'color': [137, 54, 74], 'isthing': 0, 'id': 12, 'name': 'wall'},
            {'color': [183, 121, 142], 'isthing': 0, 'id': 13, 'name': 'window'},
            {'color': [146, 139, 141], 'isthing': 0, 'id': 14, 'name': 'ceiling'},
            {'color': [70, 130, 180], 'isthing': 0, 'id': 15, 'name': 'sky'},
            {'color': [64, 170, 64], 'isthing': 0, 'id': 16, 'name': 'solid'}]

old_to_new_category_mapping = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1, 13: 1, 14: 1, 15: 1, 16: 1, 17: 1, 18: 1,
           19: 1, 20: 1, 21: 1, 22: 1, 23: 1, 24: 1, 25: 1, 27: 1, 28: 1, 31: 1, 32: 1, 33: 1, 34: 1, 35: 1, 36: 1,
           37: 1, 38: 1, 39: 1, 40: 1, 41: 1, 42: 1, 43: 1, 44: 1, 46: 1, 47: 1, 48: 1, 49: 1, 50: 1, 51: 1, 52: 1,
           53: 1, 54: 1, 55: 1, 56: 1, 57: 1, 58: 1, 59: 1, 60: 1, 61: 1, 62: 1, 63: 1, 64: 1, 65: 1, 67: 1, 70: 1,
           72: 1, 73: 1, 74: 1, 75: 1, 76: 1, 77: 1, 78: 1, 79: 1, 80: 1, 81: 1, 82: 1, 84: 1, 85: 1, 86: 1, 87: 1,
           88: 1, 89: 1, 90: 1, 92: 2, 93: 2, 95: 3, 100: 4, 107: 5, 109: 2, 112: 5, 118: 6, 119: 7, 122: 8, 125: 9,
           128: 3, 130: 5, 133: 5, 138: 10, 141: 2, 144: 9, 145: 9, 147: 9, 148: 11, 149: 9, 151: 3, 154: 9, 155: 11,
           156: 5, 159: 9, 161: 5, 166: 3, 168: 2, 171: 12, 175: 12, 176: 12, 177: 12, 178: 11, 180: 13, 181: 13,
           184: 7, 185: 10, 186: 14, 187: 15, 188: 5, 189: 5, 190: 6, 191: 9, 192: 16, 193: 7, 194: 9, 195: 4, 196: 8,
           197: 3, 198: 16, 199: 12, 200: 2}

## Assign Category Names to COCO Classes and "N/A" to Unassigned ID Numbers
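
COCO ids are not contiguous (for example, there is no id 12), so a list indexed by id needs a placeholder for the gaps. A minimal sketch of the idea, using two entries copied from the COCO list above and a shortened table:

```python
# Two entries copied from the COCO list; id 12 is one of the unassigned ids.
categories_excerpt = [
    {"id": 11, "name": "fire hydrant"},
    {"id": 13, "name": "stop sign"},
]

# One slot per possible id (0..13 here) so the id can be used directly as an index.
names = ["N/A"] * 14
for category in categories_excerpt:
    names[category["id"]] = category["name"]

print(names[11], "|", names[12], "|", names[13])
```

This prints `fire hydrant | N/A | stop sign`.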

In [None]:
coco_category_names = ['N/A'] * 201
for c in existing_coco_categories:
    coco_category_names[c['id']] = c['name']

## List of random colours

- Each existing COCO class has a unique RGB value that represents its category in the mask image
- The list below holds the colours that will be assigned to the construction classes; they are reused from the COCO things, which are all merged into `misc` here

In [None]:
# since we are treating all things as misc and that belongs to single color class, we can use colors of other things
available_colors_for_new_categories = [
    [119, 11, 32], 
    [0, 0, 142], 
    [0, 0, 230], 
    [106, 0, 228], 
    [0, 60, 100],
    [0, 80, 100], 
    [0, 0, 70],
    [0, 0, 192], 
    [250, 170, 30], 
    [100, 170, 30], 
    [220, 220, 0], 
    [175, 116, 175], 
    [250, 0, 30],
    [165, 42, 42], 
    [255, 77, 255], 
    [0, 226, 252], 
    [182, 182, 255], 
    [0, 82, 0], 
    [120, 166, 157],
    [110, 76, 0], 
    [174, 57, 255], 
    [199, 100, 0], 
    [72, 0, 118], 
    [255, 179, 240], 
    [0, 125, 92],
    [209, 0, 151], 
    [188, 208, 182], 
    [0, 220, 176],
    [255, 99, 164], 
    [92, 0, 73], 
    [133, 129, 255],
    [78, 180, 255], 
    [0, 228, 0], 
    [174, 255, 243], 
    [45, 89, 255], 
    [134, 134, 103], 
    [145, 148, 174],
    [255, 208, 186], 
    [197, 226, 255], 
    [171, 134, 1], 
    [109, 63, 54], 
    [207, 138, 255], 
    [151, 0, 95],
    [9, 80, 61], 
    [84, 105, 51], 
    [74, 65, 105], 
    [166, 196, 102], 
    [208, 195, 210], 
    [255, 109, 65],
    [0, 143, 149], 
    [179, 0, 194], 
    [209, 99, 106], 
    [5, 121, 0], 
    [227, 255, 205]
]

## Construction Classes
- List of all construction classes on which the model will be trained.



In [None]:
new_categories = [
    "aac_blocks",
    "adhesives",
    "ahus",
    "aluminium_frames_for_false_ceiling",
    "chiller",
    "concrete_mixer_machine",
    "concrete_pump",
    "control_panel",
    "cu_piping",
    "distribution_transformer",
    "dump_truck_tipper_truck",
    "emulsion_paint",
    "enamel_paint",
    "fine_aggregate",
    "fire_buckets",
    "fire_extinguishers",
    "glass_wool",
    "grader",
    "hoist",
    "hollow_concrete_blocks",
    "hot_mix_plant",
    "hydra_crane",
    "interlocked_switched_socket",
    "junction_box",
    "lime",
    "marble",
    "metal_primer",
    "pipe_fittings",
    "rcc_hume_pipes",
    "refrigerant_gas",
    "river_sand",
    "rmc_batching_plant",
    "rmu_units",
    "sanitary_fixtures",
    "skid_steer_loader",
    "smoke_detectors",
    "split_units",
    "structural_steel_channel",
    "switch_boards_and_switches",
    "texture_paint",
    "threaded_rod",
    "transit_mixer",
    "vcb_panel",
    "vitrified_tiles",
    "vrf_units",
    "water_tank",
    "wheel_loader",
    "wood_primer"
]


## Final Categories

- Merge the existing COCO classes (in our format) with the construction classes
- This gives the final list of classes on which our model will be trained
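
The merge can be sketched on a shortened example. The three lists below are small stand-ins for the ones defined in the earlier cells, and the variable names are illustrative.

```python
# Short stand-ins for the lists defined in the earlier cells.
base_categories = [{"id": 16, "name": "solid", "isthing": 0, "color": [64, 170, 64]}]
new_names = ["aac_blocks", "adhesives", "ahus"]
free_colors = [[119, 11, 32], [0, 0, 142], [0, 0, 230]]

# New classes get consecutive ids after the last base id and one colour each.
# zip() stops at the shorter list, so a missing colour cannot raise IndexError here.
first_new_id = max(c["id"] for c in base_categories) + 1
merged = list(base_categories)
for offset, (name, color) in enumerate(zip(new_names, free_colors)):
    merged.append({"id": first_new_id + offset, "name": name, "isthing": 1, "color": color})

# Lookup tables in both directions.
name_to_id = {c["name"]: c["id"] for c in merged}
id_to_name = {category_id: name for name, category_id in name_to_id.items()}
print(name_to_id["aac_blocks"], id_to_name[19])
```

This prints `17 ahus`, matching the ids that the full list receives when numbering starts at 17.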

In [None]:
category_id = 17
available_color_id = 0
for category in new_categories:
    categories_for_pp_project.append({'color': available_colors_for_new_categories[available_color_id], 'isthing': 1, 'id': category_id, 'name': category})
    category_id += 1
    available_color_id += 1

category_to_id = {
    category['name']: category['id'] for category in categories_for_pp_project
}

# Invert the mapping so that ids map back to names
id_to_category = {
    category_id: name for name, category_id in category_to_id.items()
}
category_to_id

{'aac_blocks': 17,
 'adhesives': 18,
 'ahus': 19,
 'aluminium_frames_for_false_ceiling': 20,
 'building': 3,
 'ceiling': 14,
 'chiller': 21,
 'concrete_mixer_machine': 22,
 'concrete_pump': 23,
 'control_panel': 24,
 'cu_piping': 25,
 'distribution_transformer': 26,
 'dump_truck_tipper_truck': 27,
 'emulsion_paint': 28,
 'enamel_paint': 29,
 'fine_aggregate': 30,
 'fire_buckets': 31,
 'fire_extinguishers': 32,
 'floor': 6,
 'food': 8,
 'furniture': 5,
 'glass_wool': 33,
 'grader': 34,
 'ground': 9,
 'hoist': 35,
 'hollow_concrete_blocks': 36,
 'hot_mix_plant': 37,
 'hydra_crane': 38,
 'interlocked_switched_socket': 39,
 'junction_box': 40,
 'lime': 41,
 'marble': 42,
 'metal_primer': 43,
 'misc': 1,
 'pipe_fittings': 44,
 'plant': 7,
 'rawmaterial': 4,
 'rcc_hume_pipes': 45,
 'refrigerant_gas': 46,
 'river_sand': 47,
 'rmc_batching_plant': 48,
 'rmu_units': 49,
 'sanitary_fixtures': 50,
 'skid_steer_loader': 51,
 'sky': 15,
 'smoke_detectors': 52,
 'solid': 16,
 'split_units': 53,
 'st

## Utility Functions for Dataset Creation

- Resize Binary Mask
- Close the Contour
- Binary Mask to Polygon
- Create Image Data and JSON
- Create Annotation Data
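
One of the helpers in the cell below, `binary_mask_to_rle`, produces an uncompressed run-length encoding for crowd annotations. The dependency-free sketch below shows the same idea on a nested list instead of a NumPy array; the function name `mask_to_uncompressed_rle` is made up for this example.

```python
from itertools import groupby

def mask_to_uncompressed_rle(mask_rows):
    """Run-length encode a 2D 0/1 mask given as a list of rows.

    Pixels are read column by column (Fortran order) and the counts alternate
    between runs of 0s and runs of 1s, always starting with a run of 0s.
    """
    height, width = len(mask_rows), len(mask_rows[0])
    column_major = [mask_rows[r][c] for c in range(width) for r in range(height)]
    counts = []
    for index, (value, run) in enumerate(groupby(column_major)):
        if index == 0 and value == 1:
            counts.append(0)  # the mask starts with a 1, so the leading 0-run is empty
        counts.append(len(list(run)))
    return {"counts": counts, "size": [height, width]}

toy_mask = [
    [0, 1, 1],
    [0, 1, 0],
]
print(mask_to_uncompressed_rle(toy_mask))
```

Read column by column, the toy mask is `0, 0, 1, 1, 1, 0`, so the counts are `[2, 3, 1]` with size `[2, 3]`.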

In [None]:
#!/usr/bin/env python3

import os
import re
import datetime
import numpy as np
from itertools import groupby
from skimage import measure
from PIL import Image
from pycocotools import mask

convert = lambda text: int(text) if text.isdigit() else text.lower()
natrual_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]

def resize_binary_mask(array, new_size):
    image = Image.fromarray(array.astype(np.uint8)*255)
    image = image.resize(new_size)
    return np.asarray(image).astype(np.bool_)

def close_contour(contour):
    if not np.array_equal(contour[0], contour[-1]):
        contour = np.vstack((contour, contour[0]))
    return contour

def binary_mask_to_rle(binary_mask):
    rle = {'counts': [], 'size': list(binary_mask.shape)}
    counts = rle.get('counts')
    for i, (value, elements) in enumerate(groupby(binary_mask.ravel(order='F'))):
        if i == 0 and value == 1:
            counts.append(0)
        counts.append(len(list(elements)))

    return rle

def binary_mask_to_polygon(binary_mask, tolerance=0):
    """Converts a binary mask to COCO polygon representation

    Args:
        binary_mask: a 2D binary numpy array where '1's represent the object
        tolerance: Maximum distance from original points of polygon to approximated
            polygonal chain. If tolerance is 0, the original coordinate array is returned.

    """
    polygons = []
    # pad mask to close contours of shapes which start and end at an edge
    padded_binary_mask = np.pad(binary_mask, pad_width=1, mode='constant', constant_values=0)
    contours = measure.find_contours(padded_binary_mask, 0.5)
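    # undo the 1-pixel padding offset; contours have different lengths, so shift
    # each one separately instead of subtracting from the ragged list as a whole
    contours = [contour - 1 for contour in contours]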
    for contour in contours:
        contour = close_contour(contour)
        contour = measure.approximate_polygon(contour, tolerance)
        if len(contour) < 3:
            continue
        contour = np.flip(contour, axis=1)
        segmentation = contour.ravel().tolist()
        # after padding and subtracting 1 we may get -0.5 points in our segmentation 
        segmentation = [0 if i < 0 else i for i in segmentation]
        polygons.append(segmentation)

    return polygons

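def create_image_info(image_id, file_name, image_size,
                      date_captured=None,
                      license_id=1, coco_url="", flickr_url=""):

    # Compute the timestamp on every call; a default written in the signature
    # would be evaluated only once, when the function is defined.
    if date_captured is None:
        date_captured = datetime.datetime.utcnow().isoformat(' ')
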
    image_info = {
            "id": image_id,
            "file_name": file_name,
            "width": image_size[0],
            "height": image_size[1],
            "date_captured": date_captured,
            "license": license_id,
            "coco_url": coco_url,
            "flickr_url": flickr_url
    }

    return image_info

def create_annotation_info(annotation_id, image_id, category_info, binary_mask, 
                           image_size=None, tolerance=2, bounding_box=None):

    if image_size is not None:
        binary_mask = resize_binary_mask(binary_mask, image_size)

    binary_mask_encoded = mask.encode(np.asfortranarray(binary_mask.astype(np.uint8)))

    area = mask.area(binary_mask_encoded)
    if area < 1:
        return None

    if bounding_box is None:
        bounding_box = mask.toBbox(binary_mask_encoded)

    if category_info["is_crowd"]:
        is_crowd = 1
        segmentation = binary_mask_to_rle(binary_mask)
    else:
        is_crowd = 0
        segmentation = binary_mask_to_polygon(binary_mask, tolerance)
        if not segmentation:
            return None

    annotation_info = {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_info["id"],
        "iscrowd": is_crowd,
        "area": area.tolist(),
        "bbox": bounding_box.tolist(),
        "segmentation": segmentation,
        "width": binary_mask.shape[1],
        "height": binary_mask.shape[0],
    } 

    return annotation_info

## Superimpose Function

- Overlays our construction materials on top of the COCO output predicted by Facebook DETR
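
The overlay itself is a per-pixel replacement: wherever the construction material's mask is set, its id wins over the stuff prediction. A minimal sketch on nested lists follows. The helper name `overlay_thing` is hypothetical, and the notebook itself works on NumPy arrays.

```python
def overlay_thing(segmentation, thing_mask, thing_id):
    """Return a copy of segmentation where pixels set in thing_mask take thing_id."""
    return [
        [thing_id if covered else current for current, covered in zip(seg_row, mask_row)]
        for seg_row, mask_row in zip(segmentation, thing_mask)
    ]

# Toy 2x3 map: 15 and 9 are the project ids of "sky" and "ground", 17 is "aac_blocks".
stuff_map = [
    [15, 15, 15],
    [9, 9, 9],
]
blocks_mask = [
    [0, 0, 1],
    [0, 1, 1],
]
print(overlay_thing(stuff_map, blocks_mask, thing_id=17))
```

The result is `[[15, 15, 17], [9, 17, 17]]`: the covered pixels now carry the construction class, and the rest keep their stuff labels.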

In [None]:
import numpy as np
import cv2
from math import floor

def superimpose_thing(image_size, annotations):
    # `annotations` is expected to be one COCO annotation dict whose 'segmentation'
    # entry is a list of flat polygons [x1, y1, x2, y2, ...]
    height, width = image_size

    # create a single channel black image
    superimposed_image = np.zeros((height, width))

    polygons_list = []
    # Add the polygon segmentation, truncating coordinates to integers
    for segmentation_points in annotations['segmentation']:
        segmentation_points = np.asarray(segmentation_points).astype(int)
        polygons_list.append(segmentation_points)

    # convert segmentation points to contour
    for x in polygons_list:
        end = []
        if len(x) % 2 != 0:
            # an odd number of values cannot be paired into (x, y) points
            print(x)
        for l in range(0, len(x), 2):
            coords = [floor(x[l]), floor(x[l + 1])]
            end.append(coords)
        if end == []:
            continue
        # cv2.fillPoly expects int32 point coordinates
        contours = np.array(end, dtype=np.int32)

        # plot and fill the contour
        cv2.fillPoly(superimposed_image, pts=[contours], color=(1, 1, 1))

    return superimposed_image

In [None]:
from pycocotools import mask
from skimage import measure

def make_contours_closed(contour):
    if not np.array_equal(contour[0], contour[-1]):
        contour = np.vstack((contour, contour[0]))
    return contour
    

def binary_mask_to_polygon(binary_mask, tolerance=0):
    """Converts a binary mask to COCO polygon representation

    Args:
        binary_mask: a 2D binary numpy array where '1's represent the object
        tolerance: Maximum distance from original points of polygon to approximated
            polygonal chain. If tolerance is 0, the original coordinate array is returned.

    """
    polygons = []
    # pad mask to close contours of shapes which start and end at an edge
    padded_binary_mask = np.pad(binary_mask, pad_width=1, mode='constant', constant_values=0)
    contours = measure.find_contours(padded_binary_mask, 0.5)
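    # undo the 1-pixel padding offset; contours have different lengths, so shift
    # each one separately instead of subtracting from the ragged list as a whole
    contours = [contour - 1 for contour in contours]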
    for contour in contours:
        contour = make_contours_closed(contour)
        contour = measure.approximate_polygon(contour, tolerance)
        if len(contour) < 3:
            continue
        contour = np.flip(contour, axis=1)
        segmentation = contour.ravel().tolist()
        # after padding and subtracting 1 we may get -0.5 points in our segmentation 
        segmentation = [0 if i < 0 else i for i in segmentation]
        polygons.append(segmentation)

    return polygons

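# Converts NumPy scalars to plain Python values and rejects anything else.
# Note: this redefines the `convert` lambda that `natrual_key` relies on above.
def convert(o):
    if isinstance(o, np.generic):
        return o.item()
    raise TypeError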



def get_annotation_info(binary_mask, image_size, image_id, class_id, segmentation_id, iscrowd):

    category_info = {'id': class_id, 'is_crowd': iscrowd}

    tolerance = 2
    bounding_box = None    

    if image_size is not None:
        binary_mask = resize_binary_mask(binary_mask, image_size)

    binary_mask_encoded = mask.encode(np.asfortranarray(binary_mask.astype(np.uint8)))

    area = mask.area(binary_mask_encoded)
    if area < 1:
        return None

    if bounding_box is None:
        bounding_box = mask.toBbox(binary_mask_encoded)

    if category_info["is_crowd"]:
        is_crowd = 1
        segmentation = binary_mask_to_rle(binary_mask)
    else:
        is_crowd = 0
        segmentation = binary_mask_to_polygon(binary_mask, tolerance)
        if not segmentation:
            return None

    annotation_info = {
        # reads the notebook-level `annotation_id`; the `segmentation_id` argument is unused
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_info["id"],
        "iscrowd": is_crowd,
        "area": area.tolist(),
        "bbox": bounding_box.tolist(),
        "segmentation": segmentation,
        "width": binary_mask.shape[1],
        "height": binary_mask.shape[0],
    } 


    return annotation_info

## Import and Load FACEBOOK DETR Pre-Trained Model

In [None]:
import torchvision.transforms as T

# standard PyTorch mean-std input image normalization
transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Load the DETR panoptic model from the local clone
model, postprocessor = torch.hub.load('detr', 'detr_resnet101_panoptic', source='local', pretrained=True, return_postprocessor=True, num_classes=250)
# Move the model to the selected device and switch to eval mode
model = model.to(device)
model.eval()

print("Model Loaded")

print(os.getcwd())

Model Loaded
/content/drive/MyDrive/Panoptic Segmentation using DETR


## Mask Image Creation and JSON Creation
1. Loop over all categories and images
2. Pass each image to the Facebook model
3. Gather the predictions
4. Convert the predictions to our mapping
5. Overlay the construction materials on top of the COCO classes
6. Adjust the mask images and mappings accordingly
7. Save the mask image and JSON
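
The mask image stores one segment id per pixel, packed into the three colour channels. `panopticapi`'s `rgb2id` and `id2rgb` (imported in the cell below) do this conversion on arrays. The dependency-free sketch below shows the same arithmetic on a single pixel; the function names `rgb_to_id` and `id_to_rgb` are illustrative and not part of `panopticapi`.

```python
def rgb_to_id(color):
    """Pack an [R, G, B] colour into one integer: R + 256*G + 256**2*B."""
    red, green, blue = color
    return red + 256 * green + 256 * 256 * blue

def id_to_rgb(segment_id):
    """Inverse of rgb_to_id for ids in the range 0 .. 256**3 - 1."""
    return [segment_id % 256, (segment_id // 256) % 256, segment_id // (256 * 256)]

print(rgb_to_id([3, 1, 0]), id_to_rgb(259))
```

This prints `259 [3, 1, 0]`. Because the id is spread over the channels, the mask has to be saved in a lossless format such as PNG; lossy compression would alter the colour values and therefore the ids.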


In [None]:
from PIL import Image
import requests
import io
import panopticapi
from panopticapi.utils import id2rgb, rgb2id
import torch
import torchvision.transforms as T
import numpy
torch.set_grad_enabled(False)
import json



image_id = 1
annotation_id = 1
segment_id = 1

detection_coco = {
    "categories": categories_for_pp_project,
    "annotations": [],
    "images": []
}

panoptic_coco = {
    "categories": categories_for_pp_project,
    "annotations": [],
    "images": []
}

os.chdir("/content/drive/MyDrive/Panoptic Segmentation using DETR/Dataset")
cats = os.listdir()

os.makedirs("/content/drive/MyDrive/Panoptic Segmentation using DETR/data/train", exist_ok=True)
os.makedirs("/content/drive/MyDrive/Panoptic Segmentation using DETR/data/panoptic_train", exist_ok=True)

final_images_path = "/content/drive/MyDrive/Panoptic Segmentation using DETR/data/"


# run through all folders in dataset
for cat in cats:
    if os.path.isfile(cat):
      continue

    # get category name
    category_name = cat
    print("Category:", category_name)
    with open(os.path.join(cat, "coco.json"), "r") as json_file:
        category_json = json.load(json_file)
        
    images_path = os.path.join(cat, 'images')
        
    temporary_annotations = {}
    
    # Run over all images
    for image_info in category_json["images"]:
        image_info['annotations'] = []
        temporary_annotations[image_info['id']] = image_info
        
    for annnotation_info in category_json["annotations"]:
        temporary_annotations[annnotation_info['image_id']]["annotations"].append(annnotation_info)
        
    for i, image_item in temporary_annotations.items():

        output_file_name = category_name + "_" + str(image_id) + ".jpg"
        output_file_path = os.path.join(final_images_path, "train", output_file_name)
        
        output_mask_name = category_name + "_" + str(image_id) + ".png"
        output_mask_path = os.path.join(final_images_path, "panoptic_train", output_mask_name)
        
        try:

            # Read the image and get shape of image
            original_image = Image.open(os.path.join(images_path, image_item['file_name'])).convert('RGB')

            try:
                h, w, c = np.array(original_image).shape
            except ValueError:
                # a 2D array means a single-channel image
                h, w = np.array(original_image).shape
                c = 1

            # if the number of channels is not 3, convert the image to 3-channel RGB
            if c == 4 or c == 1:
                original_image = original_image.convert('RGB')
                h, w, c = np.array(original_image).shape

            duplicate_image = original_image.copy()

            # Apply transform and convert image to batch
            # mean-std normalize the input image (batch-size: 1)
            img = transform(duplicate_image).unsqueeze(0).to(device)  # [h, w, c] -> [1, c, ht, wt]

            # Generate output for image
            out = model(img)

            # Generate score
            # compute the scores, excluding the "no-object" class (the last one)
            scores = out["pred_logits"].softmax(-1)[..., :-1].max(-1)[0]

            # threshold the confidence
            keep = scores > 0.85

            # Keep only ones above threshold
            pred_logits, pred_boxes = out["pred_logits"][keep][:, :len(
                coco_category_names) - 1], out["pred_boxes"][keep]

            # the post-processor expects as input the target size of the predictions (which we set here to the image size)
            result = postprocessor(out, torch.as_tensor(img.shape[-2:]).unsqueeze(0))[0]

            # The segmentation is stored in a special-format png
            panoptic_seg = Image.open(io.BytesIO(result['png_string'])).resize((w, h), Image.NEAREST)
            # (wp, hp) = panoptic_seg.size
            panoptic_seg = np.array(panoptic_seg, dtype=np.uint8).copy()

            # We retrieve the ids corresponding to each mask
            panoptic_seg_id = rgb2id(panoptic_seg)
            
            # Map each predicted category to our category ids
            for segment in result['segments_info']:
                segment["category_id"] = old_to_new_category_mapping[segment["category_id"]]

            # Sorted list of the distinct category ids predicted for this image
            unique_category_id = sorted({segment["category_id"] for segment in result['segments_info']})
            
            unique_category_id_to_id =  {category_id: i for i, category_id in enumerate(unique_category_id)}
            unique_id_to_category_id =  {i: category_id for category_id, i in unique_category_id_to_id.items()}
            
            for segment in result['segments_info']:
                segment["new_id"] = unique_category_id_to_id[segment["category_id"]]

            # Build a new id map: segments that share a category after the remapping are merged into one id.
            custom_panoptic_seg_id = np.zeros(panoptic_seg_id.shape, dtype=np.uint8)

            # Paint every pixel of each predicted segment with its merged id
            for segment in result['segments_info']:
                custom_panoptic_seg_id[panoptic_seg_id == segment['id']] = segment['new_id']
            
            custom_panoptic_segments_info = []
            for category_id in unique_category_id:
                custom_panoptic_segments_info.append({
                    'segment_id': unique_category_id_to_id[category_id], 
                    'category_id': category_id,
                    'bbox': [],
                    'area': 0,
                    'iscrowd': 0,
                    'isthing': 0
                })

            # annotations of our construction things
            original_mask = image_item['annotations']
            
            to_append_annotations = []
            
            # Overlay things mask one at a time, giving each thing a fresh id above the DETR ids.
            # A running counter keeps the recorded segment_id in step with the painted pixels
            # even when a mask turns out to be empty.
            next_segment_id = int(custom_panoptic_seg_id.max()) + 1
            for annotation in original_mask:
                # overlay mask of construction things on top of detr output
                omask_image_id = superimpose_thing((h, w), annotation)
                custom_panoptic_seg_id[omask_image_id.astype(np.bool_)] = next_segment_id
                custom_panoptic_segments_info.append({
                    'segment_id': next_segment_id,
                    'category_id': category_to_id[category_name],
                    'bbox': annotation['bbox'],
                    'area': annotation['area'],
                    'iscrowd': 0,
                    'isthing': 1
                })
                next_segment_id += 1

                # append annotation of construction things in json file
                annotation["category_id"] = category_to_id[category_name]
                annotation["image_id"] = image_id
                to_append_annotations.append(annotation)
            
            # Convert to binary segment
            binary_masks = np.zeros((
                custom_panoptic_seg_id.max() + 1,
                custom_panoptic_seg_id.shape[0],
                custom_panoptic_seg_id.shape[1]),
                dtype=np.uint8
            )

            # For each DETR-predicted category, fill its binary mask and create an annotation from its contours.
            # Segments added from our own annotations are skipped here; they already carry annotations.
            for category_id in unique_category_id:
                new_id = unique_category_id_to_id[category_id]
                binary_masks[new_id, :, :] = custom_panoptic_seg_id == new_id
                annotation_info = get_annotation_info(binary_masks[new_id], None, image_id, category_id, new_id, 0)
                if annotation_info is not None:
                    annotation_info["image_id"] = image_id
                    annotation_info["category_id"] = category_id
                    to_append_annotations.append(annotation_info)

                    # The first len(unique_category_id) entries of the list are indexed by new_id
                    custom_panoptic_segments_info[new_id]['bbox'] = annotation_info['bbox']
                    custom_panoptic_segments_info[new_id]['area'] = annotation_info['area']
            
            # save image to new path as .jpg
            original_image.save(output_file_path)
            
            # save panoptic image
            Image.fromarray(id2rgb(custom_panoptic_seg_id), 'RGB').save(output_mask_path)

            # create image_info object and append it to original list
            image_info = {
                "id": image_id,
                "file_name": output_file_name,
                "width": original_image.size[0],
                "height": original_image.size[1]
                }
            
            detection_coco["images"].append(image_info)
            panoptic_coco["images"].append(image_info)

            for annotation in to_append_annotations:
                annotation["id"] = annotation_id
                detection_coco["annotations"].append(annotation)
                annotation_id += 1
                
            for segment_info in custom_panoptic_segments_info:
                segment_info["id"] = segment_id
                segment_id += 1
                
            panoptic_coco["annotations"].append({
                "segments_info": custom_panoptic_segments_info,
                "file_name": output_mask_name,
                "image_id": image_id
            })

            # increment the image_count
            image_id += 1
    
        except Exception as e:
            # print("Error ******** :", os.path.join(images_path, image_item['file_name']))
            continue    

    # open the final json, and commit changes in that file
    with open(os.path.join(final_images_path, "train.json"), 'w') as output_json_file:
        json.dump(detection_coco, output_json_file)
        
    with open(os.path.join(final_images_path, "panoptic_train.json"), 'w') as output_json_file:
        json.dump(panoptic_coco, output_json_file, default=convert)
        
    print(image_id, annotation_id, segment_id)

Category: distribution_transformer


  "See the documentation of nn.Upsample for details.".format(mode)
  "Palette images with Transparency expressed in bytes should be "


412 1817 1820
Category: aac_blocks
675 2853 2857
Category: ahus
808 3217 3227
Category: control_panel
1053 3905 3928
Category: cu_piping
1547 5254 5284
Category: concrete_pump_(50_)
1547 5254 5284
Category: concrete_mixer_machine
1597 5430 5461
Category: adhesives
1697 5638 5669
Category: aluminium_frames_for_false_ceiling
1747 5780 5812
Category: chiller
1792 5930 5966
Category: dump_truck___tipper_truck
1792 5930 5966
Category: glass_wool
1862 6238 6274
Category: hollow_concrete_blocks
1912 6398 6435
Category: hoist
2447 8392 8451
Category: emulsion_paint
2479 8471 8532
Category: fire_extinguishers
2707 9097 9161
Category: enamel_paint
2758 9217 9283
Category: grader
3058 10620 10692
Category: fire_buckets
3516 12073 12152
Category: fine_aggregate
4016 13053 13134
Category: junction_box
4068 13193 13275
Category: marble
4118 13378 13463
Category: metal_primer
4177 13499 13584
Category: lime
4689 15901 15989
Category: rcc_hume_pipes
4929 16613 16704
Category: hot_mix_plant
5029 17087 
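
The relabelling performed in the cell above can be checked on a toy array. The following is a minimal sketch, not the notebook's code: the 4x4 id map, the `segments_info` values and the `thing_mask` are made up for illustration, and the helpers `ids_to_rgb` / `rgb_to_ids` are hypothetical stand-ins written by hand so the block runs with NumPy alone. They follow the `id = R + 256*G + 256^2*B` packing that `id2rgb` / `rgb2id` from `panopticapi` use.

```python
import numpy as np

# Hypothetical DETR-style output: three segments, two of which share a category.
panoptic_seg_id = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
])
segments_info = [
    {"id": 0, "category_id": 7},
    {"id": 1, "category_id": 3},
    {"id": 2, "category_id": 7},
]

# Merge segments by category: one new id per distinct category, in sorted order.
unique_category_id = sorted({s["category_id"] for s in segments_info})
category_to_new_id = {cat: i for i, cat in enumerate(unique_category_id)}

merged = np.zeros(panoptic_seg_id.shape, dtype=np.uint8)
for s in segments_info:
    merged[panoptic_seg_id == s["id"]] = category_to_new_id[s["category_id"]]

# Overlay one made-up "thing" mask with a fresh id above all existing ones.
thing_mask = np.zeros((4, 4), dtype=bool)
thing_mask[1:3, 1:3] = True
new_segment_id = int(merged.max()) + 1
merged[thing_mask] = new_segment_id


def ids_to_rgb(id_map):
    """Pack an id map into an (h, w, 3) uint8 image: id = R + 256*G + 256^2*B."""
    id_map = id_map.astype(np.uint32)
    channels = [id_map % 256, (id_map // 256) % 256, (id_map // 65536) % 256]
    return np.stack(channels, axis=-1).astype(np.uint8)


def rgb_to_ids(rgb):
    """Inverse of ids_to_rgb."""
    rgb = rgb.astype(np.uint32)
    return rgb[..., 0] + 256 * rgb[..., 1] + 65536 * rgb[..., 2]


rgb = ids_to_rgb(merged)
restored = rgb_to_ids(rgb)
print(merged)
print(np.array_equal(restored, merged))
```

Segments 0 and 2 end up with the same id because they share category 7, and the overlaid thing receives id 2, one above the largest merged id.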

## Pending

- JSON creation
- Train/validation split (80 : 20)
- Preprocessing of mask images and JSON
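
A ratio-based split can be sketched as a small helper. This is an illustration under stated assumptions rather than the notebook's method: `split_coco`, its `val_fraction` and `seed` parameters, and the toy data are hypothetical, and the helper assumes a COCO-style dict whose annotations carry an `image_id` key (true for both the detection and the panoptic JSON written above).

```python
import random


def split_coco(coco, val_fraction=0.2, seed=0):
    """Split a COCO-style dict into (train, val) dicts by image id."""
    rng = random.Random(seed)
    image_ids = [img["id"] for img in coco["images"]]
    val_ids = set(rng.sample(image_ids, round(len(image_ids) * val_fraction)))

    def subset(keep_val):
        # Copy the other top-level keys (e.g. "categories") into both splits.
        out = {k: v for k, v in coco.items() if k not in ("images", "annotations")}
        out["images"] = [im for im in coco["images"] if (im["id"] in val_ids) == keep_val]
        out["annotations"] = [a for a in coco["annotations"] if (a["image_id"] in val_ids) == keep_val]
        return out

    return subset(False), subset(True)


# Toy data: 10 images with two annotations each.
toy = {
    "categories": [{"id": 1, "name": "hoist"}],
    "images": [{"id": i, "file_name": f"hoist_{i}.jpg"} for i in range(10)],
    "annotations": [{"id": 2 * i + k, "image_id": i} for i in range(10) for k in range(2)],
}
train, val = split_coco(toy, val_fraction=0.2, seed=0)
print(len(train["images"]), len(val["images"]), len(train["annotations"]), len(val["annotations"]))
# 8 2 16 4
```

Calling the helper with the same seed on the detection and panoptic JSON picks the same validation images only if both files list the same images in the same order.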


In [None]:
import json
import os
import random

os.chdir("/content/drive/MyDrive/Panoptic Segmentation using DETR/data")

with open("train.json") as f:
    train_json = json.load(f)

with open("panoptic_train.json") as f:
    panoptic_train_json = json.load(f)

train_images = train_json["images"]
train_annotations = train_json["annotations"]

val_images = []
val_annotations = []

panoptic_train_images = panoptic_train_json["images"]
panoptic_train_annotations = panoptic_train_json["annotations"]

panoptic_val_images = []
panoptic_val_annotations = []


print(len(train_images), len(val_images), len(train_annotations), len(val_annotations), len(panoptic_train_images), len(panoptic_train_annotations), len(panoptic_val_images), len(panoptic_val_annotations))

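# Pick 2000 image ids for validation; only ids present in both JSON files qualify
panoptic_image_ids = {x["id"] for x in panoptic_train_images}
candidate_ids = [x["id"] for x in train_images if x["id"] in panoptic_image_ids]
val_ids = set(random.sample(candidate_ids, min(2000, len(candidate_ids))))

# Partition with list comprehensions: calling remove() on a list while iterating
# over it skips the element that follows each removal
val_images = [x for x in train_images if x["id"] in val_ids]
train_images = [x for x in train_images if x["id"] not in val_ids]

val_annotations = [x for x in train_annotations if x["image_id"] in val_ids]
train_annotations = [x for x in train_annotations if x["image_id"] not in val_ids]

panoptic_val_images = [x for x in panoptic_train_images if x["id"] in val_ids]
panoptic_train_images = [x for x in panoptic_train_images if x["id"] not in val_ids]

panoptic_val_annotations = [x for x in panoptic_train_annotations if x["image_id"] in val_ids]
panoptic_train_annotations = [x for x in panoptic_train_annotations if x["image_id"] not in val_ids]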


print(len(train_images), len(val_images), len(train_annotations), len(val_annotations), len(panoptic_train_images), len(panoptic_train_annotations), len(panoptic_val_images), len(panoptic_val_annotations))


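train_json["images"] = train_images
train_json["annotations"] = train_annotations

# Start the validation JSONs from the other top-level keys of the train files
# (for example "categories"); this assumes both splits should share them
val_json = {k: v for k, v in train_json.items() if k not in ("images", "annotations")}
val_json["images"] = val_images
val_json["annotations"] = val_annotations

panoptic_train_json["images"] = panoptic_train_images
panoptic_train_json["annotations"] = panoptic_train_annotations

panoptic_val_json = {k: v for k, v in panoptic_train_json.items() if k not in ("images", "annotations")}
panoptic_val_json["images"] = panoptic_val_images
panoptic_val_json["annotations"] = panoptic_val_annotations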

with open("train.json", "w") as f:
    json.dump(train_json, f)

with open("val.json", "w") as f:
    json.dump(val_json, f)

with open("panoptic_train.json", "w") as f:
    json.dump(panoptic_train_json, f)

with open("panoptic_val.json", "w") as f:
    json.dump(panoptic_val_json, f)
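
After writing the four files, a quick consistency check can catch a leaky split. The sketch below is a hypothetical helper (`check_split` is not part of the notebook) run on made-up data. It assumes COCO-style dicts with `images[*].id` and `annotations[*].image_id`, and reports image ids shared by both splits as well as annotations whose image is missing from their own split.

```python
def check_split(train, val):
    """Return a list of problems found in a train/val pair of COCO-style dicts."""
    problems = []
    train_ids = {im["id"] for im in train["images"]}
    val_ids = {im["id"] for im in val["images"]}

    shared = train_ids & val_ids
    if shared:
        problems.append(f"{len(shared)} image ids appear in both splits")

    for name, part, ids in (("train", train, train_ids), ("val", val, val_ids)):
        orphans = [a for a in part["annotations"] if a["image_id"] not in ids]
        if orphans:
            problems.append(f"{len(orphans)} {name} annotations point at images outside {name}")
    return problems


# Toy data: a clean split first.
train = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [{"id": 10, "image_id": 1}, {"id": 11, "image_id": 2}],
}
val = {"images": [{"id": 3}], "annotations": [{"id": 12, "image_id": 3}]}
print(check_split(train, val))  # []

# A leaked annotation: image 3 lives in val, but one of its annotations stayed in train.
train["annotations"].append({"id": 13, "image_id": 3})
print(check_split(train, val))
```

The same function could be applied to the loaded `train.json` / `val.json` pair and to the panoptic pair, since both formats key annotations by `image_id`.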

In [None]:
import json
import os

os.chdir("/content/drive/MyDrive/Panoptic Segmentation using DETR/data")

files = ["train.json", "val.json"]

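for file in files:
    with open(file) as f:
        j = json.load(f)

    # Report the annotations that have no segmentation
    for x in j["annotations"]:
        if x["segmentation"] == []:
            print(x)

    # Rebuild the list instead of calling remove() inside the loop,
    # which would skip the annotation after each removal
    j["annotations"] = [x for x in j["annotations"] if x["segmentation"] != []]

    with open(file, "w") as f:
        json.dump(j, f)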



files = ["panoptic_train.json", "panoptic_val.json"]

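for file in files:
    with open(file) as f:
        j = json.load(f)

    # Keep only segments with a complete 4-value bbox; rebuild the list rather than
    # calling remove() while iterating, which would skip the segment after each removal
    for x in j["annotations"]:
        x["segments_info"] = [y for y in x["segments_info"] if len(y["bbox"]) == 4]

    with open(file, "w") as f:
        json.dump(j, f)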
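
When cleaning lists such as `annotations` or `segments_info`, it helps to remember that calling `remove()` on a list while looping over it skips the element that follows each removal. The sketch below uses made-up annotations and two hypothetical helpers to contrast that pattern with building a filtered list.

```python
def drop_empty_in_loop(annotations):
    """Remove empty-segmentation entries with remove() inside the loop."""
    annotations = list(annotations)  # work on a copy of the input list
    for x in annotations:
        if x["segmentation"] == []:
            annotations.remove(x)  # shifts later items left, so the next one is skipped
    return annotations


def drop_empty_filtered(annotations):
    """Keep only the entries that have a non-empty segmentation."""
    return [x for x in annotations if x["segmentation"] != []]


toy = [
    {"id": 1, "segmentation": []},
    {"id": 2, "segmentation": []},
    {"id": 3, "segmentation": [[0, 0, 1, 0, 1, 1]]},
    {"id": 4, "segmentation": []},
]
print([x["id"] for x in drop_empty_in_loop(toy)])   # [2, 3]
print([x["id"] for x in drop_empty_filtered(toy)])  # [3]
```

The in-loop version leaves annotation 2 behind because it sits directly after a removed entry, while the filtered version drops every empty one.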
