# Assignment 1: Transformations and Representations

Roll number: \<Roll number here\>


# Instructions

- Code must be written in Python in Jupyter Notebooks. We highly recommend using the Anaconda distribution or, at a minimum, virtual environments for this assignment.
- Save all your results in ```results/<question_number>/<sub_topic_number>/```
- For this assignment, you will be using Open3D extensively. Refer to the [Open3D Documentation](http://www.open3d.org/docs/release/): you can use the built-in methods and, **unless explicitly mentioned**, do not need to code from scratch for this assignment.
- Make sure your code is modular since you may need to reuse parts for future assignments.
- Place any extra files that you need to submit in the *'results'* folder.
- Answer the descriptive questions in your own words with context & clarity. Do not copy answers from online resources or lecture notes.
- The **deadline** for this assignment is on **23/08/2022 at 11:55pm**. Please note that there will be no extensions.
- Plagiarism is **strictly prohibited**.


# Submission Instructions

1. Make sure your code runs without any errors after reinitializing the kernel and removing all saved variables.
2. After completing your code and saving your results, zip the folder and name it ``<roll_number>_MR2022_<assignment_number>.zip``

## 1. 3D Data and Open3D

1. Please find mesh files in the **data/Q1** folder. Using these mesh files and your own creativity/visualisation, create a "Table" **pointcloud** scene. The table scene should be realistic and scaled appropriately. Use all the meshes given in the folder and treat them as objects kept on the table.

    You are expected to perform different functions on the individual mesh files: first convert the mesh files to pointclouds and on each pointcloud perform operations such as scaling, rotation, translation. Next, visualize them together. The visualization should represent a pointcloud of a realistic table scene. Save the scene as **.pcd** file. 

    **Please do not copy as we may use your contribution to create a table top dataset.**

    Refer to the image below for an example of a table-top scene:

    <img src="img/1.jpeg"  width="500" >
<br>
<br>

2. Use the final table scene pointcloud obtained from part 1. 
    - Use Open3D to generate partial pointclouds from different camera views (at least 4 views). This means that you need to crop or capture only the points in the pointcloud that are visible from a given viewpoint.
    - Using these partial pointclouds, you are now expected to generate the full scene pointcloud back by registering the pointclouds to a global frame. Save the partial and reconstructed pointclouds in different files. 
    - **[ BONUS ]** Finally, compute the error using "Chamfer's Distance (CD)" between the ground truth scene pointcloud and the reconstructed pointcloud. Perform an analysis: 
      1. Why is the CD not 0?
      2. How does the CD change as the number of viewpoints increases / decreases?
      3. Can we optimize the viewpoints (by hit and trial) such that the CD reduces?

Refer to the following links for solving Q1:
- Hidden-Point-Removal Open3D API: http://www.open3d.org/docs/latest/tutorial/Basic/pointcloud.html#Hidden-point-removal

- Chamfer's Distance:  https://pytorch3d.readthedocs.io/en/latest/modules/loss.html
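
As a starting point for the bonus analysis, here is a minimal NumPy sketch of one common definition of Chamfer's Distance: the mean squared nearest-neighbour distance from each cloud to the other, summed over both directions. The function name `chamfer_distance` and the toy arrays are illustrative assumptions, and other definitions (for example without the square, or averaging the two terms) give different numbers.

```python
import numpy as np

def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between (N, 3) and (M, 3) point arrays.

    Assumed definition: mean squared nearest-neighbour distance in each
    direction, with the two directional terms added together.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise squared distances, shape (N, M). This brute-force version is
    # only practical for small clouds; a KD-tree is needed for large ones.
    diff = a[:, None, :] - b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    a_to_b = sq_dist.min(axis=1).mean()  # each point of A to its nearest in B
    b_to_a = sq_dist.min(axis=0).mean()  # each point of B to its nearest in A
    return a_to_b + b_to_a

# Toy data (not from the assignment): the "reconstruction" misses one point.
ground_truth = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
reconstructed = ground_truth[:2]

cd_same = chamfer_distance(ground_truth, ground_truth)
cd_partial = chamfer_distance(ground_truth, reconstructed)
print(cd_same, cd_partial)
```

For Open3D pointclouds, the point arrays can be obtained with `np.asarray(pcd.points)`. Because the pairwise matrix grows as N x M, a full table scene would need a nearest-neighbour structure instead; Open3D's `compute_point_cloud_distance` is one option for the directional distances.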



In [None]:
import open3d as o3d
import numpy as np

def create_pcd(file_name, translate, rotate, scale):
    """Load a mesh, sample it into a pointcloud, then translate, rotate, and scale it."""
    mesh = o3d.io.read_triangle_mesh(file_name)
    # Sample 20000 points uniformly over the mesh surface (seed left at its default).
    pcd = mesh.sample_points_uniformly(number_of_points=20000, use_triangle_normal=False)
    pcd.translate(translate)
    # Rotate (XYZ Euler angles in radians) and scale about the cloud's own centre,
    # so neither operation moves the object away from where it was translated.
    pcd.rotate(pcd.get_rotation_matrix_from_xyz(rotate),
               center=pcd.get_center())
    pcd.scale(scale, center=pcd.get_center())
    return pcd

q1_dirname = "./data/Q1/"
boat_pcd = create_pcd(q1_dirname + "boat.obj", (-1, 0, -1), (0, np.pi/2, 0), 2)
car_pcd = create_pcd(q1_dirname + "car.obj", (2, 0, -5), (0, 0, 0), 2)
laptop_pcd = create_pcd(q1_dirname + "laptop.obj", (2, 0.4, 1), (0, -np.pi/2, 0), 3)
plane_pcd = create_pcd(q1_dirname + "plane.obj", (-1.5, 0, -4), (0, np.pi/4, 0), 4)
table_pcd = create_pcd(q1_dirname + "table.obj", (-90, -92.25, -27), (-np.pi/2, 0, 0), 0.3)
trashcan_pcd = create_pcd(q1_dirname + "trashcan.obj", (-0.5, 1, 2), (0, 0, 0), 3)
mesh = o3d.io.read_triangle_mesh(q1_dirname + "table.obj")
mesh1 = o3d.geometry.TriangleMesh.create_coordinate_frame()

tabletop_scene_pcd = boat_pcd + car_pcd + laptop_pcd + plane_pcd + table_pcd + trashcan_pcd

o3d.visualization.draw_geometries([tabletop_scene_pcd])
o3d.io.write_point_cloud("tabletop.pcd", tabletop_scene_pcd)


In [None]:
# Diagonal of the scene's bounding box; used to place the cameras and size the HPR radius.
diameter = np.linalg.norm(
    np.asarray(tabletop_scene_pcd.get_max_bound()) - np.asarray(tabletop_scene_pcd.get_min_bound()))
radius = diameter * 100

def get_partial_pcd(scene_pcd, camera, radius):
    """Return the points of scene_pcd that are visible from the given camera position."""
    _, pt_map = scene_pcd.hidden_point_removal(camera, radius)
    return scene_pcd.select_by_index(pt_map)

# Four viewpoints around the scene.
cameras = [
    [0, 20, diameter],
    [0, 20, -diameter],
    [-20, 0, -diameter],
    [20, 0, -diameter],
]

print("Get all points that are visible from each view point")
pcd, pcd1, pcd2, pcd3 = [get_partial_pcd(tabletop_scene_pcd, camera, radius) for camera in cameras]
# o3d.visualization.draw_geometries([pcd3, mesh1])

# The partial clouds are still expressed in the scene frame, so merging is a concatenation.
final_pcd = pcd + pcd1 + pcd2 + pcd3

print("Visualize result")
o3d.visualization.draw_geometries([final_pcd, mesh1])

## 2. Euler Angles, Rotation Matrices, and Quaternions
1. Write a function (do not use inbuilt libraries for this question):
    - that returns a rotation matrix given the angles $\alpha$, $\beta$, and $\gamma$ in radians (X-Y-Z).
    - to convert a rotation matrix to quaternion and vice versa. 

2. What is a Gimbal lock? Suppose an airplane increases its pitch from $0°$ to $90°$. 

    - Let $R_{gmb\beta}$ be the rotation matrix for $\beta=90°$. Find $R_{gmb\beta}$.
    - Consider the point $p = [0, 1, 0]^T$ on the pitched airplane, i.e. the tip of the wing. Does there exist any $\alpha, \gamma$ such that $p_{new} = R_{gmb\beta}\; p$ for:
      1. $p_{new} = [1, 0, 0]^T$
      2. For some  $p_{new}$ on the XY unit circle?
      3. For some  $p_{new}$ on the YZ unit circle?
      
      Show your work for all three parts and briefly explain your reasoning. Why is $\beta=90°$  a “certain problematic value”?

    <img src="img/2.3.jpeg"  width="500" ><br>
    
    <img src="img/2.1.jpeg"  width="500" ><br>

    <img src="img/2.2.jpeg"  width="500" >
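
The conversions asked for in this question can be sketched with plain NumPy array arithmetic (no rotation library). The sketch below is one possible reading, not the reference solution: it assumes that "X-Y-Z" means rotating about the fixed X axis first, then Y, then Z, i.e. $R = R_z(\gamma) R_y(\beta) R_x(\alpha)$, and that quaternions are ordered $(w, x, y, z)$. All function names are illustrative.

```python
import numpy as np

def euler_to_rotation(alpha, beta, gamma):
    """Assumed convention: R = Rz(gamma) @ Ry(beta) @ Rx(alpha), angles in radians."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return rz @ ry @ rx

def quaternion_to_rotation(q):
    """Rotation matrix of a quaternion q = (w, x, y, z); q is normalised first."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def rotation_to_quaternion(R):
    """Unit quaternion (w, x, y, z) of a rotation matrix.

    Branches on the trace so that the divisor is never close to zero.
    """
    R = np.asarray(R, dtype=float)
    trace = np.trace(R)
    q = [0.0, 0.0, 0.0, 0.0]
    if trace > 0:
        s = 2.0 * np.sqrt(1.0 + trace)  # s = 4w
        q[0] = 0.25 * s
        q[1] = (R[2, 1] - R[1, 2]) / s
        q[2] = (R[0, 2] - R[2, 0]) / s
        q[3] = (R[1, 0] - R[0, 1]) / s
    else:
        i = int(np.argmax(np.diag(R)))  # largest of the x, y, z components
        j, k = (i + 1) % 3, (i + 2) % 3
        s = 2.0 * np.sqrt(1.0 + R[i, i] - R[j, j] - R[k, k])  # s = 4 * q[1 + i]
        q[0] = (R[k, j] - R[j, k]) / s
        q[1 + i] = 0.25 * s
        q[1 + j] = (R[j, i] + R[i, j]) / s
        q[1 + k] = (R[k, i] + R[i, k]) / s
    return np.array(q)

# Gimbal lock check: at beta = 90 degrees, shifting alpha and gamma by the
# same amount leaves the rotation unchanged, so one degree of freedom is lost.
R1 = euler_to_rotation(0.3, np.pi / 2, 0.1)
R2 = euler_to_rotation(0.3 + 0.5, np.pi / 2, 0.1 + 0.5)
gimbal_locked = np.allclose(R1, R2)
print(gimbal_locked)
```

Under the assumed convention, the matrix at $\beta = 90°$ depends only on the difference $\alpha - \gamma$, which is why the check compares two angle pairs with the same difference. With a different axis order the locked combination may be $\alpha + \gamma$ instead.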
    


## 3. Transformations and Homogeneous Coordinates

1. Watch this [video](https://www.youtube.com/watch?v=PvEl63t-opM) to briefly understand homogeneous coordinates. 
    1. What are points at infinity? What type of transformation can you apply to transform a point from infinity to a point that is not at infinity? 
    2. Find the vanishing point for the given images in the **data/Q3** folder. Complete the functions **FilterLines()** and **GetVanishingPoint()** in the given starter code.

<br>
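
One way to approach the vanishing point is through homogeneous coordinates: the line through two image points is their cross product, and the point closest (in an algebraic least-squares sense) to a set of lines is the right singular vector with the smallest singular value. The sketch below is illustrative only and is not the starter code's intended solution. The function names are hypothetical, and treating `REJECT_DEGREE_TH` as a threshold for discarding near-horizontal and near-vertical segments is an assumption.

```python
import numpy as np

def filter_lines_by_angle(lines, reject_degree_th=4.0):
    """Drop near-horizontal and near-vertical segments (assumed heuristic).

    `lines` has the shape returned by cv2.HoughLinesP: (N, 1, 4), with each
    row holding (x1, y1, x2, y2).
    """
    segments = np.asarray(lines, dtype=float).reshape(-1, 4)
    dx = segments[:, 2] - segments[:, 0]
    dy = segments[:, 3] - segments[:, 1]
    angle = np.degrees(np.arctan2(dy, dx)) % 180.0  # orientation in [0, 180)
    near_horizontal = np.minimum(angle, 180.0 - angle) < reject_degree_th
    near_vertical = np.abs(angle - 90.0) < reject_degree_th
    return segments[~(near_horizontal | near_vertical)]

def estimate_vanishing_point(segments):
    """Least-squares intersection of the segments' lines, or None."""
    segments = np.asarray(segments, dtype=float).reshape(-1, 4)
    if len(segments) < 2:
        return None
    ones = np.ones(len(segments))
    p1 = np.column_stack([segments[:, 0], segments[:, 1], ones])
    p2 = np.column_stack([segments[:, 2], segments[:, 3], ones])
    lines = np.cross(p1, p2)  # homogeneous line through each endpoint pair
    lines /= np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    # The unit vector v minimising sum((l_i . v)^2) is the last right singular vector.
    _, _, vt = np.linalg.svd(lines)
    v = vt[-1]
    if abs(v[2]) < 1e-9:
        return None  # point at infinity: the lines are (nearly) parallel
    return v[:2] / v[2]

# Synthetic segments (not from data/Q3): three lie on lines through (100, 50),
# and the fourth is horizontal and should be filtered out.
demo_lines = np.array([
    [[0, 0, 50, 25]],
    [[0, 100, 50, 75]],
    [[200, 0, 150, 25]],
    [[0, 50, 50, 50]],
])
demo_filtered = filter_lines_by_angle(demo_lines)
demo_vp = estimate_vanishing_point(demo_filtered)
print(demo_vp)
```

The `None` branch ties back to the first part of this question: when the third homogeneous coordinate is zero, the intersection is a point at infinity and has no finite pixel location. On real images, outlier segments usually dominate a plain least-squares fit, so a voting or RANSAC step on top of this is worth considering.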

2. Using homogeneous coordinates, we can represent different types of transformations as point transforms vs. frame transforms. How transforms are concatenated (whether you post-multiply or pre-multiply transformation matrices) depends on the problem and how you are viewing it. Try to understand the difference between frame vs. point transformations from this [video](https://youtu.be/Za7Sdegf8m8?t=1834). We have 5 camera frames A, B, C, D and E. Given the *frame* transformations $A \rightarrow B$, $B \rightarrow C$, $D \rightarrow C$, $D \rightarrow E$, compute the *frame transformation* $D \rightarrow E$. Also, given the coordinates of a point *x* in *D's* frame, what transformation is required to get *x's* coordinates in *E's* frame?

    <img src="img/3.jpeg"  width="500" >
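
To make chaining and inverting concrete, here is a small sketch with $4 \times 4$ homogeneous matrices. It assumes the convention that `T_ab` is the pose of frame B expressed in frame A, so that $p_a = T_{ab}\, p_b$; the video may use a different convention, in which case the multiplication order flips. The numeric poses are made up for illustration and are not the ones in the figure.

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def invert_transform(T):
    """Inverse of a rigid transform, using R^T rather than a general inverse."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def rot_z(theta):
    """Rotation by theta radians about the Z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Hypothetical poses (assumed convention: p_a = T_ab @ p_b).
T_ab = make_transform(rot_z(np.pi / 2), [1.0, 0.0, 0.0])  # B expressed in A
T_bc = make_transform(rot_z(np.pi / 2), [0.0, 2.0, 0.0])  # C expressed in B
T_dc = make_transform(np.eye(3), [0.0, 0.0, 3.0])         # C expressed in D

# Chain A -> B -> C, then traverse the D -> C edge against its direction.
T_ac = T_ab @ T_bc
T_ad = T_ac @ invert_transform(T_dc)

# A point known in D's frame, re-expressed in A's frame.
x_d = np.array([1.0, 1.0, 1.0, 1.0])
x_a = T_ad @ x_d
print(x_a)
```

Under this convention, the matrix that describes frame D relative to frame A is the same matrix that converts a point's coordinates from D's frame into A's frame. Moving the frame itself and re-expressing a fixed point are inverse views of one transform, which is the distinction the question asks about.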



In [None]:
import os
import cv2
import math
import numpy as np
import matplotlib.pyplot as plt

def ReadImage(InputImagePath):
    Images = []                     # Input Images will be stored in this list.
    ImageNames = []                 # Names of input images will be stored in this list.
    
    # Checking if path is of file or folder.
    if os.path.isfile(InputImagePath):						    # If path is of file.
        InputImage = cv2.imread(InputImagePath)                 # Reading the image.
        
        # Checking if image is read.
        if InputImage is None:
            print("Image not read. Provide a correct path")
            exit()
        
        Images.append(InputImage)                               # Storing the image.
        ImageNames.append(os.path.basename(InputImagePath))     # Storing the image's name.

    # If path is of a folder containing images.
    elif os.path.isdir(InputImagePath):
        # Getting all image's name present inside the folder.
        for ImageName in os.listdir(InputImagePath):
            # Reading images one by one.
            InputImage = cv2.imread(os.path.join(InputImagePath, ImageName))

            # Skipping files that could not be read as images.
            if InputImage is None:
                continue

            Images.append(InputImage)                           # Storing images.
            ImageNames.append(ImageName)                        # Storing image's names.
        
    # If it is neither file nor folder(Invalid Path).
    else:
        print("\nEnter valid Image Path.\n")
        exit()

    return Images, ImageNames
        
def GetLines(Image):
    # Converting to grayscale
    GrayImage = cv2.cvtColor(Image, cv2.COLOR_BGR2GRAY)
    # Blurring image to reduce noise.
    BlurGrayImage = cv2.GaussianBlur(GrayImage, (5, 5), 1)
    # Generating Edge image
    EdgeImage = cv2.Canny(BlurGrayImage, 40, 255)

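    # Finding Lines in the image. minLineLength and maxLineGap are passed by keyword:
    # positionally, the fifth argument of cv2.HoughLinesP is the optional output array `lines`.
    Lines = cv2.HoughLinesP(EdgeImage, 1, np.pi / 180, 50, minLineLength=10, maxLineGap=15)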

    # print(Lines)

    # from matplotlib import pyplot as plt
    # plt.imshow(EdgeImage, interpolation='nearest')
    # plt.show()

    # Check if lines found and exit if not.
    if Lines is None:
        print("Not enough lines found in the image for Vanishing Point detection.")
        exit(0)
    
    return Lines
    
REJECT_DEGREE_TH = 4.0

def FilterLines(Lines):
    pass

def GetVanishingPoint(FilteredLines):
    pass


Images, ImageNames = ReadImage("./data/Q3/hyperspace.jpeg")            # Reading all input images

# print(Images)
print(ImageNames)

for i in range(len(Images)):
    Image = Images[i]

    # Getting the lines from the image
    Lines = GetLines(Image)

    FilteredLines = FilterLines(Lines)
    # Get vanishing point
    VanishingPoint = GetVanishingPoint(FilteredLines)

    # Checking if vanishing point found
    if VanishingPoint is None:
        print("Vanishing Point not found. Possible reason is that not enough lines are found in the image for determination of vanishing point.")
        continue

    # Drawing lines and vanishing point
    # print(Lines)
    for Line in Lines:
        cv2.line(Image, (Line[0][0], Line[0][1]), (Line[0][2], Line[0][3]), (0, 255, 0), 2)
    cv2.circle(Image, (int(VanishingPoint[0]), int(VanishingPoint[1])), 10, (0, 0, 255), -1)

    # Showing the final image
    cv2.imshow("OutputImage", Image)
    cv2.waitKey(0)

## 4. LiDAR and Registration

Point clouds are a collection of points that represent a 3D shape or feature. Each point has its own set of X, Y and Z coordinates and in some cases additional attributes. A popular way to obtain this is by photogrammetry, though here we will use LiDAR data.

LiDAR is a remote sensing process which collects measurements used to create 3D models and maps of objects and environments. Using ultraviolet, visible, or near-infrared light, LiDAR gauges spatial relationships and shapes by measuring the time it takes for signals to bounce off objects and return to the scanner.

Download the data from [here](https://iiitaphyd-my.sharepoint.com/:f:/g/personal/venkata_surya_students_iiit_ac_in/EnYAMaTVIhJItzKYqtahE30BRKB6p6UfHN3TyJzvo6Mw0g?e=PegWds). It contains the LIDAR sensor output and odometry information per frame.

  The .bin files contain the 3D point cloud captured by the LIDAR in this format - x, y, z, and reflectance. 

  The odometry information is given in the `odometry.txt` file, in which each row is a 12-element vector. Reshape each of the first 77 rows to a 3x4 matrix to obtain the pose.
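
The loading step described above might look like the sketch below. It assumes the `.bin` files store consecutive 32-bit floats (as in KITTI-style dumps) and that each odometry row is the 3x4 pose flattened row by row; neither is stated explicitly here, so check both against the data. The function names and the synthetic files written to a temporary folder are illustrative only.

```python
import os
import tempfile

import numpy as np

def load_lidar_bin(path):
    """Read one scan as an (N, 4) array of x, y, z, reflectance.

    Assumption: values are stored as consecutive float32 numbers.
    """
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

def load_poses(path, num_frames=77):
    """Read odometry rows and return (F, 4, 4) homogeneous pose matrices.

    Assumption: each 12-element row is the 3x4 pose in row-major order.
    """
    rows = np.loadtxt(path, ndmin=2)[:num_frames]
    poses = np.tile(np.eye(4), (len(rows), 1, 1))
    poses[:, :3, :] = rows.reshape(-1, 3, 4)  # bottom row stays [0, 0, 0, 1]
    return poses

# Round trip on synthetic files, so the sketch runs without the dataset.
with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "000000.bin")
    points = np.arange(8, dtype=np.float32).reshape(2, 4)
    points.tofile(bin_path)

    odometry_path = os.path.join(tmp, "odometry.txt")
    np.savetxt(odometry_path, np.arange(24, dtype=float).reshape(2, 12))

    cloud = load_lidar_bin(bin_path)
    poses = load_poses(odometry_path)

print(cloud.shape, poses.shape)
```

Padding each 3x4 pose to 4x4 up front keeps later steps to plain matrix products on homogeneous coordinates.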
    
The point cloud obtained is with respect to the LiDAR frame. The poses, however, are in the camera frame. If we want to combine the point clouds from various frames, we need to bring them to the camera frame.

1. Refer to the image below and apply the required transformation to the point cloud. 
<br>

    <img src="img/4.jpeg"  width="500" >

<br>

2. Then, register all point clouds into a common reference frame and visualise it (Open3D). It is helpful to use homogeneous coordinates to keep track of the different frames.

3. Write a function to transform the registered point cloud from the world to the $i^{th}$ camera frame, wherein $i$ is the input to the function.
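
The three steps above can be sketched with homogeneous coordinates as follows. The sketch assumes that each pose maps camera-frame coordinates of that frame into the common (world) frame, and it uses a placeholder LiDAR-to-camera axis swap (camera x = -LiDAR y, camera y = -LiDAR z, camera z = LiDAR x). The actual mapping has to be read off the figure, so treat `T_cam_lidar` and all names below as assumptions.

```python
import numpy as np

def to_homogeneous(points):
    """Append a column of ones to an (N, 3) array."""
    return np.hstack([points, np.ones((len(points), 1))])

def register_clouds(clouds_lidar, poses, T_cam_lidar):
    """Stack per-frame LiDAR clouds in the common (world) frame.

    clouds_lidar: list of (N_i, 3) arrays in each frame's LiDAR coordinates.
    poses:        (F, 4, 4) array, assumed to map camera coordinates to world.
    T_cam_lidar:  4x4 transform from LiDAR to camera coordinates.
    """
    registered = []
    for points, pose in zip(clouds_lidar, poses):
        points_world = (pose @ T_cam_lidar @ to_homogeneous(points).T).T
        registered.append(points_world[:, :3])
    return np.vstack(registered)

def world_to_camera(points_world, poses, i):
    """Express world-frame points in the i-th camera frame."""
    T_cam_world = np.linalg.inv(poses[i])
    return (T_cam_world @ to_homogeneous(points_world).T).T[:, :3]

# Placeholder axis swap (assumption, see the lead-in).
T_cam_lidar = np.array([
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

# Synthetic example: the sensor moves 5 units along the camera z axis while
# observing the same landmark straight ahead.
demo_poses = np.tile(np.eye(4), (2, 1, 1))
demo_poses[1, :3, 3] = [0.0, 0.0, 5.0]
demo_clouds = [np.array([[10.0, 0.0, 0.0]]), np.array([[5.0, 0.0, 0.0]])]

demo_world = register_clouds(demo_clouds, demo_poses, T_cam_lidar)
demo_in_cam1 = world_to_camera(demo_world, demo_poses, 1)
print(demo_world)
print(demo_in_cam1)
```

To visualise the result, the stacked array can be wrapped in an Open3D pointcloud via `o3d.utility.Vector3dVector` and passed to `o3d.visualization.draw_geometries`.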


