# 15.3 - Adapting VoteNet to ALS Data

This notebook shows how to adapt VoteNet to airborne laser scanning (ALS) point cloud data. The adaptation mainly concerns the hyperparameters of the layers of the neural network model. A smaller aspect is the weighting of the individual parts of the multi-task loss function.

For this purpose, you will change individual files in the "models" folder of the cloned GitHub repository, of which we made backup copies in the first notebook. To open one of these Python files, double-click it in the file browser of JupyterLab. After you have modified a Python file, make sure to save it with "Save Python File" in the File menu before you use it for training the model in the terminal window. (Unfortunately, Python files are not saved automatically, and there is no convenient save button above the editor.)

**Disclaimer: Please be aware that the exercise on VoteNet as well as the provided files are the result of a rather quick hack and are probably not free of errors.**

# Backbone Module

The backbone of the network is the PointNet++ module for feature learning and extraction, which also outputs the seed points that are used in the remainder of VoteNet. (See the lecture notes for the complete VoteNet architecture.) Open the Python file "backbone_module.py" and study it for a moment. 

In the *\_\_init\_\_()* constructor method, the layers of the PointNet++ module are constructed and their hyperparameters set. These hyperparameter values have been chosen for indoor scenes, where the extents of the objects and the scene itself are much smaller than in urban scenes with buildings, trees, cars, streets, etc. 

The second major method is the *forward()* method, which is used in the forward pass of network training. Here, you can see how the outputs of one layer (x,y,z-coordinates and features) are forwarded to the next layer. Since PyTorch keeps track of the computations performed in the forward pass, it can automatically derive the computations of the backward pass from them. Therefore, the backward pass does not need to be provided by the programmer.

(In the dictionary called "end_points", VoteNet collects the outputs of all the layers for further use and debugging. This dictionary also contains a lot of data that is not really needed when operating the network. One could also implement the VoteNet network without the many lines of code in which the variable "end_points" occurs.)

To adapt the backbone module, perform the following changes:

**Change the hyperparameters of the four set abstraction layers (PointnetSAModuleVotes) to the following values:**

Layer 1: Number of points: 8192, radius:  1.0, number of samples: 16, mlp: input_feature_dim, 64, 64, 128.  
Layer 2: Number of points: 4096, radius:  5.0, number of samples: 64, mlp: 128, 128, 256.  
Layer 3: Number of points: 2048, radius: 15.0, number of samples: 64, mlp: 256, 128, 256.  
Layer 4: Number of points:  512, radius: 20.0, number of samples: 32, mlp: 256, 128, 256.

(Leave the parameters "use_xyz" and "normalize_xyz" set to the Boolean value True.)
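
Before typing the values into "backbone_module.py", it can help to write them down once and check them for consistency. The following is a minimal sketch, not code from the repository: the helper `check_sa_layers` and the value 1 for `input_feature_dim` are made up for illustration, and the keyword names `npoint`, `radius`, `nsample` and `mlp` are assumed to match the ones used in the constructor calls of the file. The checks encode common PointNet++ conventions (fewer points and larger radii in deeper layers, matching channel counts), not strict requirements.

```python
# Hypothetical stand-in; in backbone_module.py this is a constructor argument.
input_feature_dim = 1

# Set abstraction settings from the list above (keyword names are assumed).
SA_LAYERS = [
    dict(npoint=8192, radius=1.0, nsample=16, mlp=[input_feature_dim, 64, 64, 128]),
    dict(npoint=4096, radius=5.0, nsample=64, mlp=[128, 128, 256]),
    dict(npoint=2048, radius=15.0, nsample=64, mlp=[256, 128, 256]),
    dict(npoint=512, radius=20.0, nsample=32, mlp=[256, 128, 256]),
]


def check_sa_layers(layers):
    """Return a list of problems found in the layer settings (empty if none)."""
    problems = []
    for i in range(1, len(layers)):
        prev, cur = layers[i - 1], layers[i]
        if cur["npoint"] >= prev["npoint"]:
            problems.append(f"layer {i + 1}: npoint does not decrease")
        if cur["radius"] <= prev["radius"]:
            problems.append(f"layer {i + 1}: radius does not grow")
        if cur["mlp"][0] != prev["mlp"][-1]:
            problems.append(f"layer {i + 1}: mlp input != previous mlp output")
    return problems


for number, layer in enumerate(SA_LAYERS, start=1):
    print(f"sa{number}:", layer)
print("problems:", check_sa_layers(SA_LAYERS))
```

If you later experiment with other hyperparameters, rerunning such a check is a quick way to catch a mismatch between the output channels of one layer and the input channels of the next.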

**Change the hyperparameters of the two feature propagation layers (PointnetFPModule) to the following values:**

Layer 1: mlp: 256+256, 256  
Layer 2: mlp: 256+256, 256

The parameter values were chosen following the PointNet++ implementation for ALS data. After you have trained the network once and evaluated the quality metrics, feel free to choose other hyperparameters and try to find better ones. You can also try to use only 3 set abstraction layers, but then remember that the output from the last set abstraction layer is passed into the first feature propagation layer. In addition, the number of feature propagation layers must then also be reduced, so as not to generate too many seed points. Note that the feature propagation layers restore the number of points that are sampled in the set abstraction layers and that these are the seed points.
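
The relation between set abstraction (SA) layers, feature propagation (FP) layers and the number of seed points can be sketched with a little arithmetic. The sketch assumes that each FP layer upsamples to the point set of the SA layer one level above the one it starts from, beginning at the deepest SA layer; the function name `num_seed_points` is hypothetical and does not exist in the repository.

```python
def num_seed_points(sa_npoints, num_fp_layers):
    """Number of points left after upsampling through the FP layers.

    sa_npoints: points kept by each set abstraction layer, shallow to deep.
    num_fp_layers: number of feature propagation layers that follow.
    Assumption: every FP layer restores the point set of the SA layer
    one level above the one it starts from.
    """
    if not 0 <= num_fp_layers < len(sa_npoints):
        raise ValueError("need fewer FP layers than SA layers")
    # Start at the deepest SA layer and step back once per FP layer.
    return sa_npoints[len(sa_npoints) - 1 - num_fp_layers]


print(num_seed_points([8192, 4096, 2048, 512], 2))  # four SA layers, two FP layers
print(num_seed_points([8192, 4096, 2048], 2))       # three SA layers, two FP layers
print(num_seed_points([8192, 4096, 2048], 1))       # three SA layers, one FP layer
```

Under this assumption, the settings above give 4096 seed points. Dropping one SA layer while keeping both FP layers would raise this to 8192, whereas also dropping one FP layer keeps it at 4096.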

# Proposal Radius

In the second part of the VoteNet architecture, the vote points are clustered to get a fixed number (defined by a hyperparameter) of object centers as object proposals. For this clustering process, a radius for a spherical neighborhood must be specified. This is necessary because the network is not capable of predicting all vote points that vote for the same object at exactly the same position. Instead, the votes will scatter somewhat around the object center, and clustering with a certain radius compensates for these deviations.

For indoor scenes, the clustering radius is set to 0.3m, which might be sufficient for smaller objects (of sizes 1m to 3m) like furniture. However, it is too small for larger outdoor objects like buildings or trees. 

The clustering of vote points in the object proposal module of VoteNet is implemented as a special set abstraction layer called "PointnetSAModuleVotes". Open the "proposal_module.py" file and look for this layer in the constructor (*\_\_init\_\_()*) of the "ProposalModule" class.

**Change the radius of the set abstraction layer (PointnetSAModuleVotes) of the proposal module to 5.0m.**

A radius of 5.0m should be sufficient to cluster the vote points for larger objects like buildings. (If the radius is chosen too large, then the votes for several objects might be clustered to one object proposal. Therefore, the clustering radius should also not be chosen too large.)
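
The effect of the clustering radius can be illustrated with a simple ball query in plain Python. This is only a sketch with made-up vote coordinates and a hypothetical helper `votes_in_radius`; the actual layer works on tensors and is assumed to additionally sample the cluster centers and limit the number of neighbors per cluster.

```python
import math


def votes_in_radius(center, votes, radius):
    """Return the votes that lie within `radius` of `center` (ball query)."""
    return [v for v in votes if math.dist(center, v) <= radius]


# Made-up votes (x, y, z in meters) scattered around a building center at the
# origin, plus one vote that belongs to a neighboring object 12 m away.
votes = [
    (0.1, 0.2, 0.0),
    (1.5, -2.0, 0.5),
    (-3.0, 1.0, 1.0),
    (2.5, 3.5, -0.5),
    (12.0, 0.0, 0.0),
]
center = (0.0, 0.0, 0.0)

for radius in (0.3, 5.0, 15.0):
    print(radius, len(votes_in_radius(center, votes, radius)))
```

With these example votes, the indoor radius of 0.3m collects only one vote, 5.0m collects the four votes of the building, and 15.0m also swallows the vote of the neighboring object.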

# Loss Function

The loss function consists of several parts (multi-task loss). One of its parts is the objectness loss, for which the positions of the predicted object centers are compared with the true object centers. As a simplified explanation, there are two thresholds: if the predicted center of an object is closer than NEAR_THRESHOLD to the center of a ground truth object, the prediction gets an objectness label of 1 (and otherwise a label of 0). However, not all predictions actually contribute to the loss function. Only predictions that are closer than NEAR_THRESHOLD (and should therefore be predicted as objects) and predictions that are farther away than FAR_THRESHOLD (and should therefore be predicted as non-objects) contribute to the loss value. Predictions between NEAR_THRESHOLD and FAR_THRESHOLD are ignored, so the loss function counts them neither as right nor as wrong.

The default threshold values are again for indoor scenes and defined as 0.3m and 0.6m for the near and far thresholds. These values are obviously too small for outdoor scenes.

Open the file "loss_helper.py" and **change NEAR_THRESHOLD to 5.0m and FAR_THRESHOLD to 10.0m.**
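
The role of the two thresholds can be sketched per prediction as follows. The sketch assumes that predictions nearer than NEAR_THRESHOLD count as positives, predictions farther away than FAR_THRESHOLD count as negatives, and everything in between is masked out of the objectness loss. The function `objectness_label_and_mask` is hypothetical; the code in "loss_helper.py" operates on whole tensors rather than on single distances.

```python
NEAR_THRESHOLD = 5.0   # meters, value suggested above for ALS data
FAR_THRESHOLD = 10.0   # meters, value suggested above for ALS data


def objectness_label_and_mask(distance):
    """Label and loss mask for one prediction.

    distance: distance from the predicted center to the nearest
    ground truth center, in meters.
    Returns (label, mask); a mask of 0 means that the prediction is
    ignored by the objectness loss (assumed behavior).
    """
    label = 1 if distance < NEAR_THRESHOLD else 0
    mask = 1 if (distance < NEAR_THRESHOLD or distance > FAR_THRESHOLD) else 0
    return label, mask


for d in (2.0, 7.5, 14.0):
    print(d, objectness_label_and_mask(d))
```

A prediction 2.0m from a true center is a positive that counts, one at 14.0m is a negative that counts, and one at 7.5m receives label 0 but is excluded from the loss.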

The multi-task loss weights the different parts of the loss function with a number of hyperparameters. From experiments, we noticed better results when increasing the influence of the heading classification in the loss for the oriented bounding boxes (box loss). 

**Find the code line that calculates the box_loss (box_loss = ...) in the *get_loss()* function and change the weight of the heading class loss (heading_cls_loss) from 0.1 to 1.0.** This should improve the prediction of the orientation class of the oriented bounding boxes.
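
How much the weight changes the box loss can be seen in a small stand-alone sketch. Only the weight of the heading class loss (0.1 before, 1.0 after) is taken from the text; the names of the other terms, the weight 0.1 on the size class loss and the weight 1.0 on the remaining terms are assumptions for illustration, so compare them with the line you actually find in *get_loss()*. The loss values are made up.

```python
def box_loss(center_loss, heading_cls_loss, heading_reg_loss,
             size_cls_loss, size_reg_loss, heading_cls_weight=1.0):
    """Weighted sum of the box loss terms.

    Only heading_cls_weight is taken from the text; the weight 0.1 on
    size_cls_loss and the weight 1.0 on the other terms are assumptions.
    """
    return (center_loss
            + heading_cls_weight * heading_cls_loss
            + heading_reg_loss
            + 0.1 * size_cls_loss
            + size_reg_loss)


# Made-up loss values, only to show the effect of the weight.
terms = dict(center_loss=0.5, heading_cls_loss=2.0, heading_reg_loss=0.25,
             size_cls_loss=1.0, size_reg_loss=0.25)
print(box_loss(**terms, heading_cls_weight=0.1))
print(box_loss(**terms, heading_cls_weight=1.0))
```

With these example numbers, the contribution of the heading class term grows from 0.2 to 2.0, so errors in the orientation class are penalized ten times more strongly than before.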

**Final words:**

This concludes the changes that are needed to adapt VoteNet to ALS data. Once more, make sure you have saved all the modified Python files with "Save Python File" from the File menu.

**Continue now with the next notebook (15.4).**