<center>

# 6D Pose Object Detector and Refiner

## Introduction
6D pose estimation of an object is a ubiquitous problem in robotics, with applications in pick and place, service robotics, autonomous driving, and more. The program below is our attempt to solve the problem using deep learning in tandem with algorithms such as Perspective-n-Point (PnP) and RANSAC. We use the LineMOD dataset for training and testing. The dataset contains images of objects in cluttered scenes, separated into classes. The images are accompanied by the ground-truth 6D poses (rotation and translation), 3D meshes, and point cloud data for each class.

In [None]:
!conda create --name 6POD --file requirements.txt

In [None]:
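# Explicit imports for names used directly in this notebook
import argparse
import os
from urllib.request import urlopen
from zipfile import ZipFile

import numpy as np
import torch

from Helper import *
from ground_truth import create_GT_masks
from UV_mapping import create_UV_XYZ_dictionary
from LineMOD import LineMODDataset
from PoseRefinement import *
from Correspondence import *
from Pose_estimation import *
from Test import test

# Fix the seed so the train/test shuffle is reproducible
np.random.seed(50)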

## Dataset Download

The dataset is downloaded the first time the program is executed. The download links for each class are listed in dataset_install.txt; users can include additional classes by editing this file. For our purposes we have used 15 classes of objects. The fraction of images assigned to training can be changed by varying the default 0.2 setting in the argument parser.
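
As a rough illustration of how the split value is applied, the sketch below shuffles a list of indices and assigns the first fraction to training and the rest to testing, mirroring the slicing used later in this notebook. The function name `split_indices` is hypothetical, and the sketch uses the standard library `random` module rather than NumPy.

```python
import random

def split_indices(n_items, train_fraction, seed=50):
    """Shuffle range(n_items) and split it into (train, test) index lists.

    Hypothetical helper for illustration; not part of the project code.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    n_train = int(train_fraction * n_items)    # first fraction goes to training
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(1000, 0.2)
print(len(train_idx), len(test_idx))  # 200 800
```

With the default of 0.2, 20% of the images go to training and the remaining 80% to testing.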

In [None]:
try:
    os.mkdir("LineMOD_Dataset")
    # One download URL per line; strip newlines and skip blank lines
    with open('dataset_install.txt', 'r') as url_file:
        urls = [line.strip() for line in url_file if line.strip()]

    for i, url in enumerate(urls):
        # Download the archive to a temporary file
        with urlopen(url) as zipresp, open("tempfile.zip", "wb") as tempzip:
            tempzip.write(zipresp.read())

        # Every archive except the last is extracted into LineMOD_Dataset;
        # the last one is extracted into the current working directory
        extract_path = 'LineMOD_Dataset' if i < len(urls) - 1 else None
        with ZipFile("tempfile.zip") as zf:
            zf.extractall(path=extract_path)
except FileExistsError:
    print("Data set exists")

In [None]:
parser = argparse.ArgumentParser(description='Script to create the Ground Truth masks')
parser.add_argument("--root_dir", default="LineMOD_Dataset/",
                    help="path to dataset directory")

parser.add_argument("--bgd_dir", default="val2017/",
                    help="path to background images dataset directory")
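# type=float so a value passed on the command line is not treated as a string
parser.add_argument("--split", default=0.20, type=float,
                    help="fraction of images used for training")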

args, unknown = parser.parse_known_args()

In [None]:
root_dir = args.root_dir
background_dir = args.bgd_dir

imageList = []
for root, dirs, files in os.walk(root_dir):
    for file in files:
        if file.endswith(".jpg"):  # images that exist
            imageList.append(os.path.join(root, file))

nImages = len(imageList)
ind = list(range(nImages))

np.random.shuffle(ind)

split = int(args.split * nImages)
trainInd, testInd = ind[:split], ind[split:]
print("Training Samples:", len(trainInd))
print("Testing Samples:", len(testInd))

save_obj(imageList, root_dir + "all_images_adr")
save_obj(trainInd, root_dir + "train_images_indices")
save_obj(testInd, root_dir + "test_images_indices")

In [None]:
classes = {'ape': 1, 
           'phone':2, 
           'cam': 3, 
           'duck': 4,
           'can': 5, 
           'cat': 6, 
           'driller': 7,
           'iron': 8, 
           'eggbox': 9, 
           'glue': 10, 
           'holepuncher': 11, 
           'benchviseblue': 12, 
           'lamp': 13 
           }
class_names = list(classes.keys())
dataset_dir_structure(root_dir, class_names)

## Directory Structure
After executing the above blocks, the LineMOD_Dataset directory should contain separate directories for masks, pose predictions, refinement, etc. The processes described below are time-consuming, so intermediate results are cached, both for debugging and to save progress.
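
The caching idea can be sketched as follows: a result is computed once, written to disk, and loaded on later runs. This is only an illustration using `pickle`; the helper name `load_or_compute` and the cache file name are hypothetical, and the project's own `save_obj` helper may store data differently.

```python
import os
import pickle
import tempfile

def load_or_compute(cache_path, compute_fn):
    """Return the cached result at cache_path, computing and saving it if absent.

    Hypothetical helper for illustration; not part of the project code.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    result = compute_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(result, f)
    return result

calls = []

def expensive_step():
    calls.append(1)          # record each real computation
    return {"ape": [0, 1, 2]}

with tempfile.TemporaryDirectory() as cache_dir:
    path = os.path.join(cache_dir, "masks.pkl")
    first = load_or_compute(path, expensive_step)   # computes and writes the cache
    second = load_or_compute(path, expensive_step)  # loads from disk

print(first == second, len(calls))  # True 1
```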

In [None]:
fx = 572.41140
px = 325.26110
fy = 573.57043
py = 242.04899

intrinsicCameraMatrix = np.zeros((3, 3))
intrinsicCameraMatrix[0, 0] = fx
intrinsicCameraMatrix[0, 2] = px
intrinsicCameraMatrix[1, 1] = fy
intrinsicCameraMatrix[1, 2] = py
intrinsicCameraMatrix[2, 2] = 1
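
For reference, the intrinsic matrix above maps a 3D point in the camera frame to pixel coordinates through the standard pinhole model, u = fx * X / Z + px and v = fy * Y / Z + py. The sketch below applies that formula in plain Python; the function name `project_point` and the sample point are illustrative and not part of the project code.

```python
def project_point(point_xyz, fx, fy, px, py):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point_xyz
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = fx * x / z + px
    v = fy * y / z + py
    return u, v

fx, fy = 572.41140, 573.57043
px, py = 325.26110, 242.04899

# A point on the optical axis projects to the principal point (px, py).
u, v = project_point((0.0, 0.0, 1.0), fx, fy, px, py)
print(u, v)  # 325.2611 242.04899
```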

In [None]:
print("===================Creating Ground Truth Masks=========================")
create_GT_masks(background_dir, root_dir, classes, intrinsicCameraMatrix)
print("====================Creating UV Dictionary=============================")
create_UV_XYZ_dictionary(root_dir)
print("Done")
print("===========================Finished====================================")

In [None]:
print("------ Started training of the correspondence block ------")
torch.cuda.empty_cache()
train_correspondence_block(root_dir, classes, numEpoch=5, batchSize=5, validationSplit = 0.2)
print("==================== Training Finished ===================")

After each epoch, the validation loss is compared with the lowest validation loss seen so far, and the model is saved only if the loss has improved. This guards against overfitting and preserves progress if training is interrupted. The model is saved under the name "correspondance_block.pt", following the original nomenclature used by the author.
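
The checkpointing rule described above can be sketched independently of any particular framework. In the sketch below, `train_one_epoch`, `validate`, and `save_model` are hypothetical callables standing in for the real training code; the actual implementation lives in `train_correspondence_block` and may differ.

```python
import math

def train_with_checkpointing(num_epochs, train_one_epoch, validate, save_model):
    """Run training and save the model only when validation loss improves.

    All three callables are hypothetical placeholders for illustration.
    Returns the best validation loss seen.
    """
    best_val_loss = math.inf
    for epoch in range(num_epochs):
        train_one_epoch(epoch)
        val_loss = validate(epoch)
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            save_model(epoch)      # e.g. write the weights to disk
    return best_val_loss

# Toy run with made-up validation losses: epochs 0, 1 and 3 improve on the best so far.
fake_losses = [0.9, 0.7, 0.8, 0.6, 0.65]
saved_epochs = []
best = train_with_checkpointing(
    num_epochs=len(fake_losses),
    train_one_epoch=lambda epoch: None,
    validate=lambda epoch: fake_losses[epoch],
    save_model=saved_epochs.append,
)
print(saved_epochs, best)  # [0, 1, 3] 0.6
```

Because the file is overwritten only on improvement, the checkpoint on disk always corresponds to the best validation loss observed so far.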

In [None]:
print("========== Pose Estimation Started ==========")
torch.cuda.empty_cache()
initial_pose_estimation(root_dir, classes, intrinsicCameraMatrix)
print("========== Pose Estimation Finished =========")

In [None]:
print("=========== Pose Refinement Started ===========")
create_refinement_inputs(root_dir, classes, intrinsicCameraMatrix)
train_pose_refinement(root_dir, classes, epochs=3)
print("======== Pose Refinement Finished =============")

In [None]:
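classScore, classInst = test(50, intrinsicCameraMatrix, classes)
# Per-class performance: classScore divided by classInst
# (presumably the number of test instances per class)
classPerformance = {}
for key in classScore:
    if classInst[key] > 0:  # skip empty classes to avoid division by zero
        classPerformance[key] = classScore[key] / classInst[key]

print(classPerformance)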