# Project 4 - Mc907/Mo651 - Mobile Robotics

### Student:
Luiz Eduardo Cartolano - RA: 183012

### Instructor:
Esther Luna Colombini

### Github Link:
[Project Repository](https://github.com/luizcartolano2/mc907-mobile-robotics)

### Youtube Link:
[Link to Video](https://youtu.be/uqNeEhWo0dA)

### Subject of this Work:
The general objective of this work is to implement a deep learning approach to solve the Visual Odometry problem.

### Goals:
1. Implement and evaluate a Deep VO strategy using images from the [AirSim](https://github.com/microsoft/AirSim) simulator.

In [126]:
import pandas as pd
import glob
import numpy as np
import os
import cv2
import torch
import torch.nn as nn
import torch.nn.functional as F
import time
from torch.autograd import Function
from torch.autograd import Variable
from torchvision import models
import math
from scipy.spatial.transform import Rotation as R

## Data Pre-Processing

### Clean wrong images

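While loading the images obtained from the AirSim simulator, we noticed that some of them were missing or corrupted. The cell below lists, for each sequence, the entries of `poses.csv` that have no matching image file, so that they can be removed and do not add noise to the dataset.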

In [47]:
for dt in ['1','2','3','4','5','6']:
    path = 'dataset/'+'seq'+dt+'/'
    print("-------------------------------------------")
    print('|    '+path)
    # set with the paths of the images that actually exist on disk
    all_images = set(glob.glob(path+'images'+'/*'))
    # image file names referenced by the pose file
    image_files = pd.read_csv(path+'poses.csv')['ImageFile'].values
    for img in image_files:
        # report the pose entries whose image is missing
        if (path+'images/'+img) not in all_images:
            print('|        '+img)
    print("-------------------------------------------")

-------------------------------------------
|    dataset/seq1/
|        img__0_1574447220765996000.png
-------------------------------------------
-------------------------------------------
|    dataset/seq2/
-------------------------------------------
-------------------------------------------
|    dataset/seq3/
-------------------------------------------
-------------------------------------------
|    dataset/seq4/
-------------------------------------------
-------------------------------------------
|    dataset/seq5/
-------------------------------------------
-------------------------------------------
|    dataset/seq6/
-------------------------------------------
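
The listing above only reports the missing files. A minimal sketch of the actual clean-up step, assuming the `ImageFile` column and the folder layout used above, could drop the affected rows from the pose table before training. The helper name `drop_missing_images` is hypothetical and not part of the original notebook.

```python
import os
import pandas as pd

def drop_missing_images(df_poses, image_dir):
    """Keep only the pose rows whose image file exists in image_dir (hypothetical helper)."""
    exists = df_poses['ImageFile'].apply(
        lambda name: os.path.isfile(os.path.join(image_dir, name)))
    return df_poses[exists].reset_index(drop=True)
```

For example, `drop_missing_images(pd.read_csv('dataset/seq1/poses.csv'), 'dataset/seq1/images')` would return the pose table of `seq1` without the entry reported above.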


### Images

In [99]:
def get_image(path,img_size=(1280,384)):
    """
        Function to read an image from a given path.
        
        :param: path - image path
        :param: img_size - image size
        
        :return: img - numpy array with the images pixels (converted to grayscale and normalized)
        
    """
    # read image from path
    img = cv2.imread(path)
    # resize image to a given size
    img = cv2.resize(img, img_size, interpolation=cv2.INTER_LINEAR)
    # convert image to grayscale
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # normalize image pixels
    img = cv2.normalize(img, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)

    return img

In [107]:
def load_images(img_dir, img_size):

    """
        Function to coordinate the load of all the images that are going to be used.
        
        :param: img_dir - path to the directory containing the images
        :param: img_size - image size
        
        :return: images_set - numpy array with all images at the set
        
    """
    print("----------------------------------------------------------------------")
    print ("|    Loading images from: ", img_dir)
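    # list of single frames and list of stacked consecutive-frame pairs
    images = []
    images_set = []
    # read the images in sorted order so that consecutive list entries are
    # consecutive frames (glob does not guarantee any ordering; this assumes
    # that the file names sort chronologically)
    for img in sorted(glob.glob(img_dir+'/*')):
        images.append(get_image(img,img_size))
    # group the frames two by two
    for i in range(len(images)-1):
        img1 = images[i]
        img2 = images[i+1]
        # stack the two grayscale frames along a new leading axis -> (2, H, W);
        # concatenating along the last axis would give (H, 2*W) and the reshape
        # below would then mix rows of both frames
        img = np.stack([img1, img2], axis=0)
        images_set.append(img)
    print("|    Images count : ",len(images_set))

    # build a single array of shape (N-1, 2, H, W); img_size is (width, height)
    images_set = np.reshape(images_set, (-1, 2, img_size[1], img_size[0]))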
    print("----------------------------------------------------------------------")

    return images_set
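
To make the expected layout of the output concrete, the sketch below pairs consecutive frames with `np.stack` on small constant arrays instead of real images. The helper name `pair_consecutive_frames` and the toy sizes are assumptions for illustration only.

```python
import numpy as np

def pair_consecutive_frames(frames):
    """Stack each frame with its successor: N frames of (H, W) -> (N-1, 2, H, W)."""
    pairs = [np.stack([frames[i], frames[i + 1]], axis=0)
             for i in range(len(frames) - 1)]
    return np.asarray(pairs)

# toy example: 4 "frames" of height 3 and width 5, frame i filled with the value i
frames = [np.full((3, 5), i, dtype=np.float32) for i in range(4)]
pairs = pair_consecutive_frames(frames)
print(pairs.shape)
```

In each pair, channel 0 holds the earlier frame and channel 1 the later one, which is the layout that the `(N, 2, 384, 1280)` tensors used for training are meant to have.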

### Pose

The next functions are used to process the poses of the KITTI dataset.

In [108]:
def isRotationMatrix(R):
    """ 
        Checks if a matrix is a valid rotation matrix referred from 
        https://www.learnopencv.com/rotation-matrix-to-euler-angles/
        
        :param: R - rotation matrix
        
        :return: True or False
        
    """
    # calc the transpose
    Rt = np.transpose(R)

    # check identity
    shouldBeIdentity = np.dot(Rt, R)
    I = np.identity(3, dtype = R.dtype)
    n = np.linalg.norm(I - shouldBeIdentity)
    
    return n < 1e-6

In [109]:
def rotationMatrixToEulerAngles(R):
    """ 
        Converts a rotation matrix to Euler angles,
        referred from https://www.learnopencv.com/rotation-matrix-to-euler-angles
        
        :param: R - rotation matrix
        
        :return: numpy array with the Euler angles (x, y, z), in radians
    """
    assert(isRotationMatrix(R))
    sy = math.sqrt(R[0,0] * R[0,0] +  R[1,0] * R[1,0])
    singular = sy < 1e-6

    if  not singular :
        x = math.atan2(R[2,1] , R[2,2])
        y = math.atan2(-R[2,0], sy)
        z = math.atan2(R[1,0], R[0,0])
    else :
        x = math.atan2(-R[1,2], R[1,1])
        y = math.atan2(-R[2,0], sy)
        z = 0

    return np.array([x, y, z])
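
As a sanity check of the convention used by this conversion, the sketch below builds a rotation as R = Rz(z)·Ry(y)·Rx(x) from known angles and recovers them with the same formulas (non-singular branch only). The helper names `euler_to_matrix` and `matrix_to_euler` are hypothetical and the test angles are arbitrary.

```python
import math
import numpy as np

def euler_to_matrix(x, y, z):
    """Build R = Rz(z) @ Ry(y) @ Rx(x) from angles in radians."""
    Rx = np.array([[1, 0, 0],
                   [0, math.cos(x), -math.sin(x)],
                   [0, math.sin(x), math.cos(x)]])
    Ry = np.array([[math.cos(y), 0, math.sin(y)],
                   [0, 1, 0],
                   [-math.sin(y), 0, math.cos(y)]])
    Rz = np.array([[math.cos(z), -math.sin(z), 0],
                   [math.sin(z), math.cos(z), 0],
                   [0, 0, 1]])
    return Rz @ Ry @ Rx

def matrix_to_euler(rot):
    """Recover (x, y, z) from R = Rz @ Ry @ Rx, assuming cos(y) is not close to zero."""
    sy = math.sqrt(rot[0, 0] ** 2 + rot[1, 0] ** 2)
    x = math.atan2(rot[2, 1], rot[2, 2])
    y = math.atan2(-rot[2, 0], sy)
    z = math.atan2(rot[1, 0], rot[0, 0])
    return np.array([x, y, z])

angles = np.array([0.1, -0.2, 0.3])
recovered = matrix_to_euler(euler_to_matrix(*angles))
print(np.allclose(angles, recovered))
```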

In [110]:
def getMatrices(all_poses):
    """
        Function to extract matrices from poses
        
        :param: all_poses - list with all poses from the sequence
        
        :return: all_matrices - list with all matrices obtained from the poses
    """
    all_matrices = []
    for i in range(len(all_poses)):
        #print("I: ",i)
        j = all_poses[i]
        #print("J:   ",j)
        p = np.array([j[3], j[7], j[11]])
        #print("P:   ", p)
        R = np.array([[j[0],j[1],j[2]],
                      [j[4],j[5],j[6]],
                      [j[8],j[9],j[10]]
                     ])
        #print("R:   ", R)
        angles = rotationMatrixToEulerAngles(R)
        #print("Angles: ",angles)
        matrix = np.concatenate((p,angles))
        #print("MATRIX: ", matrix)
        all_matrices.append(matrix)
    return all_matrices

In [113]:
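def load_kitti_images(pose_file):
    """
        Function to load the KITTI poses (despite its name, it reads poses, not images).
        
        :param: pose_file - path to the KITTI pose file (12 values per line, a flattened 3x4 [R|t] matrix)
        
        :return: poses_set - list with the pose differences between consecutive frames
    """
    poses = []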
    poses_set = []

    with open(pose_file, 'r') as f:
        lines = f.readlines()
        for line in lines:
            pose = np.fromstring(line, dtype=float, sep=' ')
            poses.append(pose)
    
    poses = getMatrices(poses)
    for i in range(len(poses)-1):
        pose1 = poses[i]
        pose2 = poses[i+1]
        finalpose = pose2-pose1
        poses_set.append(finalpose)

    return poses_set
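
For reference, each line of a KITTI ground-truth pose file holds 12 numbers, the row-major flattening of a 3x4 matrix [R | t]. The sketch below, which uses a made-up line with an identity rotation, shows how the indices used in `getMatrices` map to the rotation and translation parts.

```python
import numpy as np

# made-up example line: identity rotation, translation (1.5, -0.2, 10.0)
line = "1 0 0 1.5 0 1 0 -0.2 0 0 1 10.0"

values = np.array(line.split(), dtype=float)
T = values.reshape(3, 4)      # rows of the 3x4 matrix [R | t]
rotation = T[:, :3]           # same entries as j[0..2], j[4..6], j[8..10]
translation = T[:, 3]         # same entries as j[3], j[7], j[11]
print(rotation)
print(translation)
```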

The next two functions are used for the poses obtained from the AirSim simulator.

In [152]:
def quat_to_euler_angles(quat_matrix):
    # create a scipy Rotation object from the quaternion
    # (scipy expects the scalar-last order: x, y, z, w)
    rot_mat = R.from_quat(quat_matrix)
    # convert the quaternion to Euler angles (in radians, since degrees=False)
    euler_mat = rot_mat.as_euler('zxy', degrees=False)

    #TODO: convert from (-pi,pi) to (0,2pi) ?
    
    return euler_mat
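
Because quaternion component order is a common source of silent errors, here is a small check of what `Rotation.from_quat` expects. SciPy uses the scalar-last order (x, y, z, w) by default. The example quaternion (a 90° rotation about z) is made up for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# 90-degree rotation about the z axis, written scalar-first (w, x, y, z)
half = np.pi / 4
quat_wxyz = np.array([np.cos(half), 0.0, 0.0, np.sin(half)])

# reorder to scalar-last (x, y, z, w) before handing it to SciPy
quat_xyzw = np.roll(quat_wxyz, -1)
euler = Rotation.from_quat(quat_xyzw).as_euler('zxy', degrees=False)
print(euler)
```

Passing the scalar-first quaternion unchanged would be interpreted as a different rotation, so the order in which the `Q_*` columns are assembled matters.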

In [153]:
def load_airsim_pose(pose_file):
    poses = []
    poses_set = []
    
    df_poses = pd.read_csv(pose_file)
    for index, row in df_poses.iterrows():
        # get the (x,y,z) positions of the camera
        position = np.array([row['POS_X'],row['POS_Y'],row['POS_Z']])
        # get the camera quaternion in scalar-last order (x, y, z, w),
        # which is the order expected by scipy's Rotation.from_quat
        quat_matrix = np.array([row['Q_X'],row['Q_Y'],row['Q_Z'],row['Q_W']])
        # call the func that convert the quaternions to euler angles
        euler_matrix = quat_to_euler_angles(quat_matrix)
        # concatenate both vectors
        poses.append(np.concatenate((position,euler_matrix)))
        
    for i in range(len(poses)-1):
        pose1 = poses[i]
        pose2 = poses[i+1]
        finalpose = pose2-pose1
        poses_set.append(finalpose)
        
    return poses_set
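
One caveat of subtracting Euler angles directly is the discontinuity at ±π: two nearly identical headings on opposite sides of it differ by almost 2π. A minimal sketch of one possible remedy, which is not part of the original pipeline, wraps the angular part of each difference back into [-π, π). The helper names `wrap_to_pi` and `pose_difference` are hypothetical.

```python
import numpy as np

def wrap_to_pi(angles):
    """Map angles (radians) into the interval [-pi, pi)."""
    return (np.asarray(angles) + np.pi) % (2 * np.pi) - np.pi

def pose_difference(pose1, pose2):
    """Relative pose as pose2 - pose1, with wrapped angle differences.

    Each pose is assumed to be [x, y, z, angle_1, angle_2, angle_3].
    """
    diff = np.asarray(pose2, dtype=float) - np.asarray(pose1, dtype=float)
    diff[3:] = wrap_to_pi(diff[3:])
    return diff

# headings of -3.1 rad and 3.1 rad are only about 0.083 rad apart
p1 = [0.0, 0.0, 0.0, -3.1, 0.0, 0.0]
p2 = [1.0, 0.0, 0.0, 3.1, 0.0, 0.0]
print(pose_difference(p1, p2))
```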

In [154]:
def load_poses(pose_file, pose_format):
    """
        Function to load the image poses.
        
        :param: pose_file - path to the pose file
        :param: pose_format - where the pose were obtained from (AirSim, VREP, Kitti, etc...)
        
        :return: pose_set - set of the poses for the sequence
    """
    print("----------------------------------------------------------------------")
    print ("|    Pose from: ",pose_file)
    
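    if pose_format.lower() == 'kitti':
        poses_set = load_kitti_images(pose_file)
    elif pose_format.lower() == 'airsim':
        poses_set = load_airsim_pose(pose_file)
    else:
        # fail early instead of hitting an undefined poses_set below
        raise ValueError("Unsupported pose format: " + str(pose_format))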
        
    print("|        Poses count: ",len(poses_set))
    print("----------------------------------------------------------------------")    
    return poses_set

### General

Function that acquires all the data that will be used for training (or for testing, when `test=True`).

In [171]:
def VODataLoader(datapath,img_size=(1280,384), test=False):
    if test:
        sequences = ['4']
    else:
        sequences = ['1','2','3','5','6']
        
    images_set = []
    odometry_set = []
    
    for sequence in sequences:
        dir_path = os.path.join(datapath,'seq'+sequence)
        image_path = os.path.join(dir_path,'images')
        pose_path = os.path.join(dir_path,'poses.csv')
        print("-----------------------------------------------------------------------")
        print("|Load from: ", dir_path)
        images_set.append(torch.FloatTensor(load_images(image_path,img_size)))
        odometry_set.append(torch.FloatTensor(load_poses(pose_path, 'AirSim')))
        print("-----------------------------------------------------------------------")
    
    print("---------------------------------------------------")
    print("|   Total Images: ", sum(len(s) for s in images_set))
    print("|   Total Odometry: ", sum(len(s) for s in odometry_set))
    print("---------------------------------------------------")    
    return images_set, odometry_set

In [173]:
X,y = VODataLoader(datapath='dataset', test=False)

-----------------------------------------------------------------------
|Load from:  dataset/seq1
----------------------------------------------------------------------
|    Loading images from:  dataset/seq1/images
|    Images count :  326
----------------------------------------------------------------------
----------------------------------------------------------------------
|    Pose from:  dataset/seq1/poses.csv
|        Poses count:  326
----------------------------------------------------------------------
-----------------------------------------------------------------------
-----------------------------------------------------------------------
|Load from:  dataset/seq2
----------------------------------------------------------------------
|    Loading images from:  dataset/seq2/images
|    Images count :  283
----------------------------------------------------------------------
----------------------------------------------------------------------
|    Pose from:  dataset

## Data Acquire

Converting the lists of tensors into single tensors, grouped into batches of 7 consecutive samples (1435 = 205 × 7).

In [192]:
X_train = [item for x in X for item in x]
Y_train = [item for a in y for item in a]

Some info about the training data

In [193]:
print("---------------------------------")
print("Details of X :")
print(type(X_train)) 
print(type(X_train[0]))
print(len(X_train)) 
print(X_train[0].size())
print("---------------------------------")
print("Details of y :")
print(type(Y_train))
print(type(Y_train[0]))
print(len(Y_train))
print(Y_train[0].size())
print("---------------------------------")

---------------------------------
Details of X :
<class 'list'>
<class 'torch.Tensor'>
1435
torch.Size([2, 384, 1280])
---------------------------------
Details of y :
<class 'list'>
<class 'torch.Tensor'>
1435
torch.Size([6])
---------------------------------


In [205]:
X_stack = torch.stack(X_train)

In [213]:
y_stack = torch.stack(Y_train)

In [228]:
X_batch = X_stack.view(205,7,2,384,1280)
y_batch = y_stack.view(205,7,6)

In [229]:
print("Details of X :")
print(X_batch.size())
print("Details of y :")
print(y_batch.size())

Details of X :
torch.Size([205, 7, 2, 384, 1280])
Details of y :
torch.Size([205, 7, 6])
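
The hard-coded `view(205, 7, ...)` works because 1435 happens to be divisible by 7, and it groups samples regardless of which sequence they came from. A hedged alternative, sketched here with NumPy on toy data, splits each sequence separately into windows of fixed length and drops the remainder, so that no window spans two sequences. The helper `split_into_windows` is hypothetical.

```python
import numpy as np

def split_into_windows(sequence, window):
    """Split an array of shape (N, ...) into (N // window, window, ...), dropping the tail."""
    n_windows = len(sequence) // window
    trimmed = sequence[:n_windows * window]
    return trimmed.reshape((n_windows, window) + sequence.shape[1:])

# toy "odometry" for two sequences of 16 and 9 samples, 6 values each
seq_a = np.zeros((16, 6))
seq_b = np.ones((9, 6))
batches = np.concatenate([split_into_windows(s, 7) for s in (seq_a, seq_b)])
print(batches.shape)
```

The same slicing and reshaping idea should carry over to the per-sequence tensors returned by `VODataLoader`, at the cost of discarding a few samples at the end of each sequence.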
