## SHREC 2022 - PRIMITIVE FITTING

### Table of Contents

* [Preliminaries](#prelims)
* [Data Loading](#dl)
* [Dataset](#dset)
* [Baseline Model](#bmodel)
* [Losses](#losses)
* [Training Loop](#tloop)
* [Sources](#refs)

## Preliminaries <a class="anchor" id="prelims"></a>

The _"SHREC 2022: Fitting and recognition of simple geometric primitives on point clouds"_ track poses the challenge of recovering primitive shape parameters from 3D point clouds. The data consists of unordered points sampled from various primitives, possibly perturbed by noise, dropout, deformation and similar artifacts. The possible primitives and their parameters are as follows:

  * **Plane**, represented as its normal vector and a point sampled from the surface of the plane,
  * **Cylinder**, represented as its radius, rotation axis and a point sampled along said axis,
  * **Sphere**, represented as its radius and center,
  * **Cone**, represented as its rotation axis, half the aperture (the angle $\theta$ between the axis and any generatrix line) and its vertex,
  * **Torus**, represented as the major and minor radii, the rotational axis and the center.

The task is to build a framework that, given a point cloud, predicts the parameters of the primitive it was sampled from.
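
To make the setup concrete, the following is a minimal sketch of what such a sample might look like: points drawn from a sphere, perturbed by Gaussian noise and thinned by random dropout. The generator `sample_noisy_sphere`, its parameters and the noise model are illustrative assumptions written in plain Python; they are not the procedure used to create the SHREC 2022 data.

```python
import math
import random

def sample_noisy_sphere(center, radius, n, noise_std=0.01, dropout=0.2, seed=0):
    # Hypothetical generator: NOT the track's actual sampling procedure.
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # random dropout: skip this point with probability `dropout`
        if rng.random() < dropout:
            continue
        # a normalized Gaussian vector is uniformly distributed on the unit sphere
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in d))
        # perturbing the radius with Gaussian noise
        r = radius + rng.gauss(0.0, noise_std)
        points.append([center[i] + r * d[i] / length for i in range(3)])
    # the point cloud is unordered, so shuffling does not change its meaning
    rng.shuffle(points)
    return points

cloud = sample_noisy_sphere(center=[1.0, 2.0, 3.0], radius=2.18, n=1000)
```

The label for this sample would consist of the radius `2.18` and the center `[1.0, 2.0, 3.0]`, which is what a model is expected to recover from `cloud` alone.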

### Data Loading <a class="anchor" id="dl"></a>

In [1]:
import open3d as o3d
import os
import torch
import einops
from tqdm import tqdm

Having imported the necessary libraries for displaying and manipulating the data as tensors, let us proceed to create the functions for loading, conversion and visualization.

In [2]:
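def parse_point_cloud(fname):
    
    points = []
    
    #reading one comma-separated point per line; the context manager closes the file
    with open(fname) as file:
        for line in file:
            pts = torch.Tensor(list(map(float, line.split(","))))
            points.append(pts)
    
    #stacking the list of (d,) tensors into a single (n, d) tensor
    return einops.rearrange(points, "n d -> n d")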

def tensor_to_o3d(pcloud):
    
    #sanity check
    assert pcloud.dim() == 2
    assert pcloud.shape[1] == 3
    
    #converting to numpy and removing device and associated gradients
    pcloud = pcloud.cpu().detach().numpy()
    
    #converting to open3d's data structures
    pcloud = o3d.utility.Vector3dVector(pcloud)
    pcloud = o3d.geometry.PointCloud(pcloud)
    
    return pcloud

#Displays a given point cloud in a native open3d window
def show_point_cloud_o3d(pcloud):
    
    if isinstance(pcloud, torch.Tensor):
        pcloud = tensor_to_o3d(pcloud)
    
    o3d.visualization.draw_geometries([pcloud])

#Displays a given point cloud inside a notebook using open3d's web visualizer
def show_point_cloud_jupyter(pcloud):
    
    if isinstance(pcloud, torch.Tensor):
        pcloud = tensor_to_o3d(pcloud)
    
    o3d.web_visualizer.draw(pcloud)


## Dataset <a class="anchor" id="dset"></a>

For a learning approach to this problem, we need a dataset object and a dataloader that can be iterated over. To that end we will use the parser we created earlier and PyTorch's `Dataset` class as a template. Unfortunately, the data for this problem is highly irregular, even if the task seems easy at first glance.

There are 5 different types of primitives, each with its own set of parameters. The parameters themselves can take wildly different values, which are hard for a typical neural network to predict. Moreover, the number of parameters differs between primitives, which rules out a "normal" fixed-length representation of the labels.

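The label parsers of the following dataset class return each label as a dictionary. It contains a string that specifies the type of primitive, its integer class index, its individual parameters as key-value pairs (for example `"radius" : 2.18`) and a torch tensor (`"data"`) that holds the class index followed by all of the parameters, padded with `-1` to a fixed length of 9. Each sample is then returned as a `torch_geometric` `Data` object that stores the point cloud as `x` and that tensor as `y`.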
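
As an illustration of the flat label layout, the sketch below packs a class index and a variable-length parameter list into a fixed-length vector padded with `-1`, following the way the `data` entry is built by the parsers in the dataset class. The helper `pack_label` and the constant `LABEL_WIDTH` are hypothetical names, and plain lists stand in for tensors.

```python
LABEL_WIDTH = 9  # class index + at most 8 parameters (torus)

def pack_label(class_id, params, width=LABEL_WIDTH, pad=-1.0):
    # Hypothetical helper: class index first, then the parameters, then padding.
    label = [float(class_id)] + [float(p) for p in params]
    if len(label) > width:
        raise ValueError("too many parameters for the chosen label width")
    return label + [pad] * (width - len(label))

# sphere: radius followed by the center
sphere = pack_label(2, [2.18, 0.0, 1.0, -0.5])
# torus: major radius, minor radius, axis, center (already fills the vector)
torus = pack_label(4, [3.0, 1.0, 0.0, 0.0, 1.0, 0.5, 0.5, 0.5])
```

Since `-1` can also be a legitimate parameter value, it is the class index in the first slot that determines how many of the remaining entries are meaningful.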

Naturally, a set of transformations is also included. A particularly important one is the unit-sphere normalization, which ensures that every point vector in the point cloud has a length of at most one. This is accomplished by dividing the coordinates by the length of the longest point vector. It is important to save that normalization factor at this step for later use (for example, to map predicted parameters back to the original scale).
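
The role of the stored factor can be sketched in plain Python as follows. The function names are hypothetical, and the example assumes pure scaling without centering, as described above; lengths such as a radius are then recovered by multiplying with the same factor.

```python
import math

def unit_sphere_normalize(points):
    # the largest distance from the origin becomes the normalization factor
    factor = max(math.sqrt(sum(c * c for c in p)) for p in points)
    scaled = [[c / factor for c in p] for p in points]
    return scaled, factor

def denormalize_length(value, factor):
    # lengths (radii, point coordinates) scale linearly with the factor;
    # directions such as a unit axis or normal are unaffected by uniform scaling
    return value * factor

points = [[3.0, 0.0, 4.0], [1.0, 2.0, 2.0], [0.0, -1.0, 0.0]]
scaled, factor = unit_sphere_normalize(points)

# a radius predicted in normalized space maps back to the original scale
radius_original = denormalize_length(0.4, factor)
```

Angles, such as the half aperture of a cone, are also unchanged by uniform scaling, so only lengths and positions need the factor.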

In [3]:
from torch_geometric.data import Data

class SHREC2022Dataset(torch.utils.data.Dataset):
    
    def __init__(self, path, train=True):
        
        self.path = os.path.join(path, "training" if train else "test")
        self.pc_prefix = "pointCloud"
        self.gt_prefix = "GTpointCloud"
        self.format = ".txt"
        
        self.size = len(os.listdir(os.path.join(self.path, self.pc_prefix)))

        
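    def parse_point_cloud(self, fname):
    
        points = []

        #reading one comma-separated point per line; the context manager closes the file
        with open(fname) as file:
            for line in file:
                pts = torch.Tensor(list(map(float, line.split(","))))
                points.append(pts)

        #stacking the list of (d,) tensors into a single (n, d) tensor
        return einops.rearrange(points, "n d -> n d")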
    
    def parse_label(self, fname):
        
        #assigning a distinct function to handle each type of primitive
        handlers ={
                   "1": self.parse_plane,
                   "2": self.parse_cylinder,
                   "3": self.parse_sphere,
                   "4": self.parse_cone,
                   "5": self.parse_torus
                  }
        
        #parsing the contents of the file. The first character corresponds to a specific type of primitive
        with open(fname) as file:
            contents = file.readlines()
        
        #handling the primitive and returning the label
        return handlers[contents[0][0]](contents)
    
    def parse_plane(self, lines):
        
        normal = torch.Tensor(list(map(float, lines[1:4])))
        vertex = torch.Tensor(list(map(float, lines[4:])))
        data = torch.Tensor([0] + list(map(float, lines[1:])) + [-1, -1])
    
        return {"type": "plane", "class": 0, "vertex": vertex, "normal":normal, "data": data}
    
    def parse_cylinder(self, lines):
        
        radius = float(lines[1])
        axis = torch.Tensor(list(map(float, lines[2:5])))
        vertex = torch.Tensor(list(map(float, lines[5:])))
        data = torch.Tensor([1] + list(map(float, lines[1:])) + [-1])
    
        return {"type": "cylinder", "class": 1, "radius": radius, "axis": axis, "vertex": vertex, "data": data}
    
    def parse_sphere(self, lines):
        
        radius = float(lines[1])
        center = torch.Tensor(list(map(float, lines[2:])))
        data = torch.Tensor([2] + list(map(float, lines[1:])) + [-1]*4)
    
        return {"type": "sphere", "class": 2, "radius": radius, "center": center, "data": data}
    
    def parse_cone(self, lines):
        
        angle = float(lines[1])
        axis = torch.Tensor(list(map(float, lines[2:5])))
        vertex = torch.Tensor(list(map(float, lines[5:])))
        data = torch.Tensor([3] + list(map(float, lines[1:])) + [-1])
    
        return {"type": "cone", "class": 3, "angle": angle, "axis": axis, "vertex": vertex, "data": data}
    
    def parse_torus(self, lines):
        
        major_radius = float(lines[1])
        minor_radius = float(lines[2])
        axis = torch.Tensor(list(map(float, lines[3:6])))
        center = torch.Tensor(list(map(float, lines[6:])))
        data = torch.Tensor([4] + list(map(float, lines[1:])))
    
        return {"type": "torus", "class": 4, "major_radius": major_radius, "minor_radius": minor_radius, "axis": axis, "center": center, "data": data}
    
    def __getitem__(self, index):
        
        #adding 1 because the files are not 0-indexed
        index += 1
        
        #assembling the file name for the data and labels
        pc_name = os.path.join(self.path, self.pc_prefix, self.pc_prefix + str(index) + self.format)
        gt_name = os.path.join(self.path, self.gt_prefix, self.gt_prefix + str(index) + self.format)
        
        #parsing the point cloud
        pcloud = self.parse_point_cloud(pc_name)
        label = self.parse_label(gt_name)
        
        #data = {"x": pcloud, "y": label['data']}
        data = Data(x=pcloud, y=label['data'].unsqueeze(0))
        
        return self.transform(data)
        
    def __len__(self):
        
        return self.size
    
    def transform(self, data):
        
        return self.unit_sphere_normalize(data)
        
    def unit_sphere_normalize(self, x):
        
        max_norm = (x["x"]*x["x"]).sum(-1).max().sqrt()
        x["x"] /= max_norm
        
        x["norm_factor"] = max_norm
        
        return x


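#Alternative collate function for dictionary samples (see the commented-out line in __getitem__).
#Zero-pads every point cloud to the size of the largest one in the batch.
#It is not used by the torch_geometric loader in the next cell.
def batch_collate_fn(batch_list):
    
    max_sz = max(item['x'].shape[0] for item in batch_list)
    pad = torch.zeros(1, 3)
    
    x = torch.stack([torch.cat((item['x'], pad.repeat((max_sz - item['x'].shape[0], 1))), dim=0) for item in batch_list])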
    
    return {
           "x":   x,
           "y":   torch.Tensor([item['y']['class'] for item in batch_list]),
           #"z":   torch.stack([item['y']['data'] for item in batch_list]),
           "w":   [item['y'] for item in batch_list] 
    }
        

In [4]:
from torch_geometric.loader import DataLoader

path = "/home/ioannis/Desktop/programming/data/SHREC/SHREC2022/dataset"
dataset = SHREC2022Dataset(path, train=True)

batch_size = 16

train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True )
next(iter(train_loader))

DataBatch(x=[51696, 3], y=[16, 9], norm_factor=[16], batch=[51696], ptr=[17])
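
The printed `DataBatch` reflects how `torch_geometric` collates clouds of different sizes: the points of all 16 clouds are concatenated into one `x` tensor, `batch` records which cloud each point belongs to, and `ptr` holds the cumulative offsets. A minimal plain-Python sketch of that bookkeeping, with made-up cloud sizes and a hypothetical helper name:

```python
def build_batch_index(sizes):
    # Hypothetical helper mimicking the `batch` and `ptr` fields of a collated batch.
    batch, ptr = [], [0]
    for cloud_id, n in enumerate(sizes):
        batch.extend([cloud_id] * n)   # one entry per point
        ptr.append(ptr[-1] + n)        # start offset of the next cloud
    return batch, ptr

batch, ptr = build_batch_index([3, 1, 2])
# batch -> [0, 0, 0, 1, 2, 2], ptr -> [0, 3, 4, 6]

# the points of cloud i are the slice ptr[i]:ptr[i + 1] of the concatenated tensor
second_cloud = slice(ptr[1], ptr[2])
```

This is why no padding is needed: clouds keep their own number of points, and per-cloud operations use the `batch` vector to group them.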

### Baseline Model <a class="anchor" id="bmodel"></a>

For the baseline model we will be testing out the PointNet we implemented in <a href="../../implementations/notebooks/pointnet.ipynb">this</a> notebook. Since the data has been affected by dropout, there are bound to be point clouds with considerably different cardinality. PointNet can consume point clouds with any number of input points, it is unaffected by permutations of those points and it is designed to be robust to transformations such as rotation. This makes it a suitable baseline before we move on to a more complex architecture.
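
The first two properties follow from aggregating per-point features with a symmetric function, which is max pooling in PointNet. The plain-Python sketch below uses a stand-in per-point feature function (`point_features`, hypothetical and far simpler than a learned shared MLP) to show that the pooled descriptor has a fixed size and does not depend on the order of the points.

```python
import random

def point_features(p):
    # Stand-in for the shared MLP: the same function is applied to every point.
    x, y, z = p
    return [x + y, y * z, x * x, -z]

def global_max_pool(features):
    # column-wise maximum over all points: symmetric in its inputs
    return [max(column) for column in zip(*features)]

def describe(cloud):
    return global_max_pool([point_features(p) for p in cloud])

cloud = [[0.1, 0.2, 0.3], [0.9, -0.4, 0.0], [-0.5, 0.5, 0.7], [0.3, 0.3, -0.2]]
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)

same_descriptor = describe(cloud) == describe(shuffled)
# a cloud with fewer points still yields a descriptor of the same length
smaller = describe(cloud[:2])
```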

Our implementation of PointNet outputs $k$ unnormalized scores which can be used for $k$-class classification. Since our data consists of 5 distinct primitives, each with its own set of parameters, we are going to use PointNet as a feature extractor and feed the $k$ outputs to 5 regression heads, each one responsible for predicting a different set of parameters. We have no way of knowing in advance which primitive will be predicted, but we do know that each data sample contains exactly one, which leaves us with several choices:

  * Stack a 6th linear head on top of the model, with 5 outputs, responsible for classifying the input shape into one of the 5 categories. We can then predict parameters for all types of primitives, but only use the set corresponding to the classified primitive.
  
  * We can use all 5 of the regressors, and measure the distance between the input shape and each one of the predicted primitives. Then we choose the primitive that yielded the lowest distance as the correct one.
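
The second option can be sketched as follows for two of the five primitives. The distance formulas are the standard point-to-surface distances; the function names, the candidate parameters and the use of the mean absolute distance as the score are assumptions made for illustration, with plain Python lists standing in for tensors.

```python
import math

def dist_to_plane(p, normal, point):
    # |n . (p - q)| / ||n|| for a normal n and a point q on the plane
    norm = math.sqrt(sum(c * c for c in normal))
    return abs(sum(normal[i] * (p[i] - point[i]) for i in range(3))) / norm

def dist_to_sphere(p, center, radius):
    # | ||p - c|| - r |
    return abs(math.dist(p, center) - radius)

def mean_distance(cloud, dist_fn, **params):
    return sum(dist_fn(p, **params) for p in cloud) / len(cloud)

# points lying on a sphere of radius 2 centered at the origin
cloud = [[2, 0, 0], [0, 2, 0], [0, 0, 2], [-2, 0, 0], [0, -2, 0], [0, 0, -2]]

# hypothetical outputs of two regression heads
candidates = {
    "plane": (dist_to_plane, {"normal": [0, 0, 1], "point": [0, 0, 0]}),
    "sphere": (dist_to_sphere, {"center": [0, 0, 0], "radius": 2.0}),
}

scores = {name: mean_distance(cloud, fn, **params)
          for name, (fn, params) in candidates.items()}
best = min(scores, key=scores.get)
```

Compared with the first option, this approach needs no classification head, at the cost of evaluating a distance function for every primitive type on every sample.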

In [5]:
regressor_outputs = {
    "plane": 6, "cylinder": 7, "cone": 7,
    "sphere": 4, "torus": 8
}

class Regressor(torch.nn.Module):
    
    def __init__(self, in_channels, out_channels):
        super(Regressor, self).__init__()
        
        #hidden layer followed by the projection to the primitive's parameters
        self.reg = torch.nn.Sequential(
            torch.nn.Linear(in_channels, in_channels),
            torch.nn.LeakyReLU(),
            torch.nn.Linear(in_channels, out_channels)
        )
        
    def forward(self, x):
        
        return self.reg(x)

# PointNet backbone

In [6]:
import torch
import einops
import math
import torch_geometric.nn as gnn


#Convenience module, includes a sharedMLP (conv1d or fully connected)
#Includes batch normalization and relu non-linearity if specified
class SharedMLP(torch.nn.Module):
    
    def __init__(self, in_channels, out_channels, conv = False, include_bn = True, include_relu = True):
        super(SharedMLP, self).__init__()
        
        modules = [torch.nn.Conv1d(in_channels, out_channels, 1)] if conv else [torch.nn.Linear(in_channels, out_channels)]
        
        if include_bn:
            modules.append(torch.nn.BatchNorm1d(out_channels))
            
        if include_relu:
            modules.append(torch.nn.ReLU())
            
        self.net = torch.nn.Sequential(*modules)
    
    def forward(self, x):
        
        return self.net(x)

    
#PointNet architecture
class PointNet(torch.nn.Module):
    
    def __init__(self, in_channels, num_classes):
        super(PointNet, self).__init__()
        
        self.in_channels = in_channels
        self.k = num_classes
                        
        #Shared MLP layers in_channels->64, 64->64
        self.seq1 = torch.nn.Sequential(
            SharedMLP(self.in_channels, 64),
            SharedMLP(64, 64),
        )
                        
        
        self.seq2 = torch.nn.Sequential(
            #Shared MLP layers 64->64, 64->128, 128->1024
            SharedMLP(64, 64),
            SharedMLP(64, 128),
            SharedMLP(128, 1024),
        )
        
        self.seq3 = torch.nn.Sequential(
            #Linear layers
            SharedMLP(1024, 512, conv=False),
            SharedMLP(512, 256, conv=False),
            torch.nn.Dropout(p=0.3),
            SharedMLP(256, self.k, conv=False, include_bn=False, include_relu=False)
        )
            
    def forward(self, x):
        
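        #Input: 
        #    x -> torch_geometric batch with
        #         x.x:     Tensor (total points, in_channels)
        #         x.batch: Tensor (total points,), the cloud index of each point
        #
        #Output:
        #    f -> Tensor (B, k) of unnormalized class scores
            
        #note: no input or feature transform is applied in this version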
        
        #applying shared mlp
        f = self.seq1(x.x)
                
        #applying shared mlp
        f = self.seq2(f)
        
        f = gnn.global_max_pool(f, x.batch)
        
        f = self.seq3(f)
                
        return f


## Training Loop <a class="anchor" id="tloop"></a>

In [7]:
from tqdm import tqdm
import torch.nn as nn
#cross entropy criterion for classification
cls_loss = torch.nn.CrossEntropyLoss()

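#using the GPU when available, falling back to the CPU otherwise
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")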
print(device)
param_loss = torch.nn.MSELoss()

num_epochs = 100
model = PointNet(3, 5).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)



for i in range(num_epochs):
    
    m_loss = 0
    acc = 0
    for batch in tqdm(train_loader):
        
        batch = batch.to(device)        
        optimizer.zero_grad()
        pred = model(batch)
        labels = batch.y[:, 0].long()
        
        loss = cls_loss(pred, labels)
        
        loss.backward()
        
        optimizer.step()
        
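        #note: cls_loss already averages over the batch, so this tracks the mean loss divided by the batch size
        m_loss += loss.item() / batch.num_graphs
        
        #fraction of correctly classified point clouds in this batch
        acc += (pred.argmax(dim=-1) == labels).sum().item() / batch.num_graphs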
        
    m_loss /= len(train_loader)
    acc /= len(train_loader)
    
    print(f"epoch {i} - loss: {m_loss}")
    print(f"---- accuracy {acc}")

cuda


100%|██████████| 2875/2875 [11:59<00:00,  4.00it/s]


epoch 0 - loss: 0.07429158687591553
---- accuracy 0.5091738700866699


100%|██████████| 2875/2875 [11:43<00:00,  4.09it/s]


epoch 1 - loss: 0.06451764702796936
---- accuracy 0.5784565210342407


 66%|██████▌   | 1903/2875 [07:28<03:48,  4.25it/s]


KeyboardInterrupt: 

In [1]:
# TODO: Data analytics (number of point clouds in each class)

### Sources <a class="anchor" id="refs"></a>