Triclops, from the Dragon Ball Z world, are an ancient group of three-eyed aliens.
Projekt Triclops: a DNN for object detection, segmentation and depth estimation. The entire dataset was created by scraping images from the internet. The DNN is custom-built and inspired by the encoder-decoder architecture.
The idea is to create a single network that performs three different tasks simultaneously:
- Object detection
- Depth map generation
- Plane surface identification
The training data consists of images scraped from the internet showing people wearing hardhats, masks, PPE and boots. Rather than labelling every task by hand, the outputs of pre-trained networks are used as ground truth:
- MidasNet for depth maps
- YOLOv3 for object detection
- PlaneRCNN for identifying plane surfaces
The steps taken to create the dataset are:
- For object detection, use the YOLOv3 annotation tool to draw bounding boxes for the labels and generate the required files as mentioned here.
- Use MidasNet by Intel to generate depth maps for the above images.
- Use PlaneRCNN to generate plane segmentations of the above images.
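For the object detection step, YOLO-style annotation tools typically write one text line per box: a class index followed by the box centre, width and height, all normalised by the image size. The sketch below converts a pixel-space box to that form. The function name `to_yolo_line` and the class order in `CLASSES` are illustrative assumptions, not taken from this project's code.

```python
# Hypothetical class order; the real project may index its labels differently.
CLASSES = ["hardhat", "mask", "ppe", "boots"]


def to_yolo_line(label, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # YOLO stores the box centre and size as fractions of the image dimensions.
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{CLASSES.index(label)} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


# A 200x200 pixel mask box inside a 400x500 image.
print(to_yolo_line("mask", (100, 50, 300, 250), (400, 500)))
# 1 0.500000 0.300000 0.500000 0.400000
```

Normalised coordinates keep the labels valid when images are resized for training.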
A YouTube video of indoor surfaces will be used to create additional data: frames are extracted from the video and then passed through PlaneRCNN to generate the plane segmentation output.
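Consecutive video frames are nearly identical, so it is usually enough to keep one frame every few seconds. Below is a minimal sketch of choosing which frame indices to save, assuming the frame rate and frame count are known. The helper `frame_indices` and the two-second interval are illustrative choices; the actual decoding would be done with a video library such as OpenCV or ffmpeg.

```python
def frame_indices(total_frames, fps, every_seconds=2.0):
    """Return indices of frames to keep, one every `every_seconds` seconds."""
    # Never let the step drop below one frame, even for tiny intervals.
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


# A 10-second clip at 30 fps, sampled every 2 seconds.
print(frame_indices(300, 30))  # [0, 60, 120, 180, 240]
```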
The model is based on an encoder-decoder architecture. The strategy is to use a common backbone as the encoder and pass its final activations to three different decoders, one per task.
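The data flow of this strategy can be sketched without any deep learning framework. In the sketch below the encoder and decoders are plain Python callables standing in for real network modules; `SharedEncoderModel`, `toy_encoder` and the three lambda decoders are hypothetical names used only for illustration. The point is that the encoder runs once per image and all three decoders reuse its output.

```python
from typing import Callable, Dict


class SharedEncoderModel:
    """One encoder whose output is fed to several task-specific decoders."""

    def __init__(self, encoder: Callable, decoders: Dict[str, Callable]):
        self.encoder = encoder
        self.decoders = decoders

    def forward(self, image):
        # The encoder runs once per image; every decoder reuses its output.
        features = self.encoder(image)
        return {task: decode(features) for task, decode in self.decoders.items()}


# Stand-in callables so the data flow can be run without a DL framework.
calls = {"encoder": 0}


def toy_encoder(image):
    calls["encoder"] += 1
    return [sum(row) for row in image]  # pretend feature vector


model = SharedEncoderModel(
    encoder=toy_encoder,
    decoders={
        "detection": lambda f: {"boxes": len(f)},
        "depth": lambda f: [v / 10 for v in f],
        "planes": lambda f: [v > 3 for v in f],
    },
)

outputs = model.forward([[1, 2], [3, 4]])
print(sorted(outputs))   # ['depth', 'detection', 'planes']
print(calls["encoder"])  # 1
```

In a real implementation each callable would be a network module, with the shared backbone's weights receiving gradients from all three task losses.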
Each of the three source networks uses a different backbone:
- MidasNet: ResNext101_32x8d_wsl
- PlaneRCNN: ResNet101
- YOLOv3: Darknet-53
ResNext101 has been finalised as the encoder because Facebook provides it pre-trained on millions of images.
- The dataset and dataloaders are in development. Once they are ready, the data (images) can be loaded through dataloaders for each type of decoder.
- Then the training loop will be set up.
- Loss functions will be identified and experimented with.
- The inference loop will be developed.
- Code documentation and project documentation.
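For the dataset and dataloader step, each training sample needs one image together with its three ground-truth files. Below is a minimal sketch of pairing them by file stem. The directory names (`images`, `labels`, `depth`, `planes`), the file extensions and the function `pair_samples` are assumptions for illustration; the project's actual layout may differ.

```python
from pathlib import Path


def pair_samples(root):
    """Match each image with its three ground-truth files by file stem."""
    root = Path(root)
    samples = []
    for image in sorted((root / "images").glob("*.jpg")):
        sample = {
            "image": image,
            "boxes": root / "labels" / f"{image.stem}.txt",   # detection labels
            "depth": root / "depth" / f"{image.stem}.png",    # depth map
            "planes": root / "planes" / f"{image.stem}.png",  # plane segmentation
        }
        # Keep only images for which all three targets were generated.
        if all(path.exists() for path in sample.values()):
            samples.append(sample)
    return samples
```

A dataset class can then index into the returned list and load the four files for each sample, so that every decoder receives targets for the same image.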