# Shape Parts Segmentation using PointNet

Here we train PointNet [1] to perform point-wise semantic segmentation on the ShapeNet dataset. The semantically labelled point cloud data comes from
https://shapenet.cs.stanford.edu/ericyi/shapenetcore_partanno_segmentation_benchmark_v0.zip.

<img src="src/images/pc_seg_plane.png" alt="airplane" style="width: 500px;"/><img src="src/images/pc_seg_bike.png" alt="bike" style="width: 500px;"/>

### Install Pre-requisites

Install the packages required by this notebook:

In [None]:
!pip install -r requirements.txt

### Defining the model

The model architecture of PointNet is visualized below:


<img src="src/images/pointnet.png" alt="pointnet_architecture" style="width: 800px;"/>



We now go one step further: instead of predicting a single class label for a whole shape, we predict, for each point of the shape, the part it belongs to. This task is called part segmentation.
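The difference between the two tasks can be illustrated with a minimal, dependency-free sketch. The helper names `classify_shape` and `segment_points` and the score values are hypothetical and only serve to show the output shapes: one label per shape versus one label per point.

```python
def classify_shape(shape_scores):
    """Classification: one score per class for the whole shape -> one class id."""
    return max(range(len(shape_scores)), key=lambda c: shape_scores[c])


def segment_points(point_scores):
    """Part segmentation: one row of part scores per point -> one part id per point."""
    return [max(range(len(row)), key=lambda c: row[c]) for row in point_scores]


# Made-up scores for a shape with 4 points and 3 possible classes/parts
shape_scores = [0.1, 2.3, -0.7]
point_scores = [
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.0, 0.4, 3.1],
    [1.1, 0.9, 0.2],
]

shape_label = classify_shape(shape_scores)  # a single integer
part_labels = segment_points(point_scores)  # a list with one integer per point
```

In the notebook, the same idea applies to the network output of shape `(batch, points, parts)`: taking the highest-scoring entry along the last axis gives one part label per point.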

### Download the ShapeNetPart dataset

The data layout follows the same idea of shape class identifiers and shape IDs as before; only the shape categories differ slightly. In addition, each point cloud now has a corresponding file that specifies the part class of every point.

We put the shape class labels for this dataset in `PointNet_Segmentation/data/shape_parts_info.json`, analogous to `shape_info.json` from exercise parts 2.3 and 2.4.

The point cloud data is stored as `.pts` files, essentially an even simpler version of `.obj`: the format omits the `v` at the start of each point line and does not support faces. Each line therefore represents one point by its xyz coordinates, separated by spaces.
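A minimal sketch of readers for this format is shown here. The function names `read_pts` and `read_seg` are hypothetical and are not the loaders used by `ShapeNetParts`; the label reader assumes one integer part label per line, in the same order as the points.

```python
import io


def read_pts(stream):
    """Parse lines of the form 'x y z' into a list of (x, y, z) float tuples."""
    points = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        x, y, z = (float(value) for value in line.split())
        points.append((x, y, z))
    return points


def read_seg(stream):
    """Parse one integer part label per line (assumed layout of the label files)."""
    return [int(line) for line in stream if line.strip()]


# In-memory stand-ins for a .pts file and its label file
sample_pts = io.StringIO("0.1 0.2 0.3\n-0.5 0.0 0.25\n")
sample_seg = io.StringIO("1\n2\n")

points = read_pts(sample_pts)
labels = read_seg(sample_seg)

# Every point must have exactly one part label
assert len(points) == len(labels)
```

With real files, the same functions can be called on `open(path)` handles instead of the `io.StringIO` objects.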

```
# contents of exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0

02691156/                                         # Shape category folder with all its shapes
    ├── points                                    # All point clouds go here
        ├── 1a04e3eab45ca15dd86060f189eb133.pts   # Point cloud data
        ├── 1a32f10b20170883663e90eaf6b4ca52.pts  # Another point cloud
        :
        :
    ├── points_label                              # Part labels for each point in the corresponding pts file
        ├── 1a04e3eab45ca15dd86060f189eb133.seg   # Each line represents the local part class of a point
        ├── 1a32f10b20170883663e90eaf6b4ca52.seg  # Another segmentation file
        :
        :
    ├── seg_img                                   # Visualizations of the original mesh part segmentation
02773838/                                         # Another shape category folder
02954340/                                         # In total you should have 16 shape category folders
:
:
train_test_split/                                 # Official split IDs
```
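The layout above suggests how shape IDs of the form `category/shape` can be collected. The following sketch assumes that a shape is usable only when both its `.pts` file and its `.seg` file exist; `list_shapes` is a hypothetical helper, and the tiny directory tree is created in a temporary folder purely for demonstration.

```python
import tempfile
from pathlib import Path


def list_shapes(root):
    """Return 'category/shape_id' for every point cloud that has a matching label file."""
    root = Path(root)
    shapes = []
    for pts_path in sorted(root.glob('*/points/*.pts')):
        category_dir = pts_path.parent.parent
        seg_path = category_dir / 'points_label' / (pts_path.stem + '.seg')
        if seg_path.exists():
            shapes.append(f'{category_dir.name}/{pts_path.stem}')
    return shapes


# Build a miniature copy of the layout: one labelled shape, one unlabelled shape
with tempfile.TemporaryDirectory() as tmp:
    category = Path(tmp) / '02691156'
    (category / 'points').mkdir(parents=True)
    (category / 'points_label').mkdir()
    (category / 'points' / 'shape_a.pts').write_text('0 0 0\n')
    (category / 'points_label' / 'shape_a.seg').write_text('1\n')
    (category / 'points' / 'shape_b.pts').write_text('1 1 1\n')  # no .seg file
    found = list_shapes(tmp)
```

The resulting ID format matches the strings passed to `get_point_cloud_with_labels` later in this notebook.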

In [2]:
print('Downloading ...')
!wget https://shapenet.cs.stanford.edu/ericyi/shapenetcore_partanno_segmentation_benchmark_v0.zip --no-check-certificate -P exercise_2/data
print('Extracting ...')
!unzip -q exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0.zip -d exercise_2/data
!rm exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0.zip
print('Done.')

Downloading ...
--2024-07-28 15:11:50--  https://shapenet.cs.stanford.edu/ericyi/shapenetcore_partanno_segmentation_benchmark_v0.zip
Resolving shapenet.cs.stanford.edu (shapenet.cs.stanford.edu)... failed: Temporary failure in name resolution.
wget: unable to resolve host address ‘shapenet.cs.stanford.edu’
Extracting ...
unzip:  cannot find or open exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0.zip, exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0.zip.zip or exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0.zip.ZIP.
rm: cannot remove 'exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0.zip': No such file or directory
Done.
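The log above shows that the cell prints `Done.` even when the download fails. A small check like the following sketch could catch this before the dataset is used. `count_category_folders` is a hypothetical helper; it assumes that category folders have purely numeric names and contain a `points` subfolder, as in the layout listing.

```python
import tempfile
from pathlib import Path


def count_category_folders(data_root):
    """Count sub-folders that look like shape categories (numeric name, 'points' inside)."""
    root = Path(data_root)
    if not root.is_dir():
        return 0
    return sum(
        1
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.isdigit() and (entry / 'points').is_dir()
    )


# Demonstration on a temporary folder instead of the real dataset
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / '02691156' / 'points').mkdir(parents=True)
    (Path(tmp) / 'train_test_split').mkdir()  # not a category folder
    n_found = count_category_folders(tmp)

# A missing directory (e.g. after a failed download) yields zero categories
n_missing = count_category_folders(Path(tmp) / 'does_not_exist')
```

For the real data, the count for `exercise_2/data/shapenetcore_partanno_segmentation_benchmark_v0` should be 16 after a successful download and extraction.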


### Dataset implementation

In [3]:
%load_ext autoreload
%autoreload 2
from src.data.shapenet_parts import ShapeNetParts

# Create one dataset per split
train_dataset = ShapeNetParts('train')
val_dataset = ShapeNetParts('val')
overfit_dataset = ShapeNetParts('overfit')

# len() calls each dataset's __len__ method
print(f'Length of train set: {len(train_dataset)}')  # expected output: 12137
print(f'Length of val set: {len(val_dataset)}')  # expected output: 1870
print(f'Length of overfit set: {len(overfit_dataset)}')  # expected output: 64


Length of train set: 12137
Length of val set: 1870
Length of overfit set: 64


### Initializing the PointNet Model

In [4]:
import torch
from src.model.pointnet import PointNetSegmentation
from src.util.model import summarize_model

pointnet = PointNetSegmentation(50)
print(summarize_model(pointnet))  # Expected: Rows 0-40 and TOTAL = 3533563

input_tensor = torch.randn(8, 3, 1024)
predictions = pointnet(input_tensor)

print('Output tensor shape: ', predictions.shape)  # Expected: 8, 1024, 50
num_trainable_params = sum(p.numel() for p in pointnet.parameters() if p.requires_grad) / 1e6
print(f'Number of trainable params: {num_trainable_params:.2f}M')  # Expected: ~3M

   | Name                                    | Type                 | Params 
-----------------------------------------------------------------------------------
0  | encoder                                 | PointNetEncoder      | 2803529
1  | encoder.conv1                           | Conv1d               | 256    
2  | encoder.bn1                             | BatchNorm1d          | 128    
3  | encoder.conv2                           | Conv1d               | 8320   
4  | encoder.bn2                             | BatchNorm1d          | 256    
5  | encoder.conv3                           | Conv1d               | 132096 
6  | encoder.bn3                             | BatchNorm1d          | 2048   
7  | encoder.input_transform_net             | TNet                 | 803081 
8  | encoder.input_transform_net.conv_1      | Conv1d               | 256    
9  | encoder.input_transform_net.bn_conv_1   | BatchNorm1d          | 128    
10 | encoder.input_transform_net.conv_2      | Conv1d     

### Training Script and Overfitting


In [7]:
from src.training import train_pointnet_segmentation
config = {
    'experiment_name': 'pointnet_segmentation_overfitting',
    'device': 'cuda:0',                   # change this to cpu if you do not have a GPU
    'is_overfit': True,                   # True since we're doing overfitting
    'batch_size': 32,
    'resume_ckpt': None,                  # or './src/runs/pointnet_segmentation_overfitting/model_best.ckpt' to resume
    'learning_rate': 0.001,
    'max_epochs': 500,
    'print_every_n': 100,
    'validate_every_n': 100,
}

train_pointnet_segmentation.main(config)  # should be able to get <0.1 loss, >97% accuracy, >0.95 iou

Using device: cuda:0
[049/00001] train_loss: 1.034
[049/00001] val_loss: 0.377, val_accuracy: 88.017%, val_iou: 0.749
[099/00001] train_loss: 0.361
[099/00001] val_loss: 0.252, val_accuracy: 90.613%, val_iou: 0.785
[149/00001] train_loss: 0.222
[149/00001] val_loss: 0.185, val_accuracy: 93.692%, val_iou: 0.837
[199/00001] train_loss: 0.162
[199/00001] val_loss: 0.162, val_accuracy: 93.980%, val_iou: 0.853
[249/00001] train_loss: 0.130
[249/00001] val_loss: 0.115, val_accuracy: 95.808%, val_iou: 0.898
[299/00001] train_loss: 0.111
[299/00001] val_loss: 0.152, val_accuracy: 94.737%, val_iou: 0.867
[349/00001] train_loss: 0.193
[349/00001] val_loss: 0.132, val_accuracy: 94.987%, val_iou: 0.882
[399/00001] train_loss: 0.123
[399/00001] val_loss: 0.096, val_accuracy: 96.445%, val_iou: 0.919
[449/00001] train_loss: 0.104
[449/00001] val_loss: 0.087, val_accuracy: 96.519%, val_iou: 0.928
[499/00001] train_loss: 0.106
[499/00001] val_loss: 0.074, val_accuracy: 97.066%, val_iou: 0.936
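The accuracy and IoU values in the log above are per-point metrics. How `train_pointnet_segmentation` computes them is not shown here, so the following is only a sketch of one common definition: accuracy as the fraction of correctly labelled points, and IoU averaged over a given set of part IDs. The helper names and the convention for absent parts are assumptions.

```python
def point_accuracy(pred, target):
    """Fraction of points whose predicted part label equals the ground truth."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(target)


def mean_part_iou(pred, target, part_ids):
    """Average intersection-over-union across the given part ids."""
    ious = []
    for part in part_ids:
        intersection = sum(p == part and t == part for p, t in zip(pred, target))
        union = sum(p == part or t == part for p, t in zip(pred, target))
        # Assumed convention: a part absent from both prediction and target counts as 1.0
        ious.append(1.0 if union == 0 else intersection / union)
    return sum(ious) / len(ious)


# Toy example: six points, three parts, one misclassified point
pred = [0, 0, 1, 1, 2, 2]
target = [0, 0, 1, 2, 2, 2]

accuracy = point_accuracy(pred, target)
miou = mean_part_iou(pred, target, part_ids=[0, 1, 2])
```

A single wrong point lowers the IoU of both the part it was assigned to and the part it belongs to, which is why IoU typically reacts more strongly to errors than accuracy does.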


### Training over the entire training set

Once the overfitting run succeeds, you can move on to training on the entire training set.

In [8]:
from src.training import train_pointnet_segmentation
config = {
    'experiment_name': 'pointnet_segmentation_generalization',
    'device': 'cuda:0',                   # change this to cpu if you do not have a GPU
    'is_overfit': False,
    'batch_size': 28,
    'resume_ckpt': None,
    'learning_rate': 0.001,
    'max_epochs': 6,
    'print_every_n': 100,
    'validate_every_n': 250,
}

train_pointnet_segmentation.main(config)  # Should be able to get > 90% accuracy and > 0.8 iou on the val set

Using device: cuda:0
[000/00099] train_loss: 1.575
[000/00199] train_loss: 0.898
[000/00249] val_loss: 0.652, val_accuracy: 80.119%, val_iou: 0.672
[000/00299] train_loss: 0.737
[000/00399] train_loss: 0.716
[001/00065] train_loss: 0.609
[001/00065] val_loss: 0.745, val_accuracy: 78.263%, val_iou: 0.667
[001/00165] train_loss: 0.523
[001/00265] train_loss: 0.502
[001/00315] val_loss: 0.426, val_accuracy: 85.878%, val_iou: 0.754
[001/00365] train_loss: 0.430
[002/00031] train_loss: 0.432
[002/00131] train_loss: 0.393
[002/00131] val_loss: 0.378, val_accuracy: 87.405%, val_iou: 0.770
[002/00231] train_loss: 0.389
[002/00331] train_loss: 0.393
[002/00381] val_loss: 0.323, val_accuracy: 89.085%, val_iou: 0.789
[002/00431] train_loss: 0.377
[003/00097] train_loss: 0.358
[003/00197] train_loss: 0.327
[003/00197] val_loss: 0.302, val_accuracy: 89.862%, val_iou: 0.812
[003/00297] train_loss: 0.358
[003/00397] train_loss: 0.334
[004/00013] val_loss: 0.274, val_accuracy: 90.782%, val_iou: 0.820
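The bracketed prefix in the logs appears to be `[epoch/batch index within the epoch]`, both zero-based. The training script is not shown here, so this reading is an assumption, but it matches the numbers: with 12137 training samples, a batch size of 28, and the last partial batch kept, one epoch has 434 batches. The sketch below maps a global step count to that pair; `locate_step` is a hypothetical helper.

```python
import math


def locate_step(global_step, num_samples, batch_size):
    """Map a zero-based global step to (epoch, batch index within the epoch)."""
    batches_per_epoch = math.ceil(num_samples / batch_size)  # last partial batch is kept
    return divmod(global_step, batches_per_epoch)


batches_per_epoch = math.ceil(12137 / 28)

# Validation runs every 250 steps, i.e. at global steps 249, 499, ..., 1749
first_val = locate_step(249, 12137, 28)
second_val = locate_step(499, 12137, 28)
last_logged_val = locate_step(1749, 12137, 28)

# Overfitting run: 64 samples, batch size 32, first print at global step 99
overfit_first_print = locate_step(99, 64, 32)
```

Under this reading, the second validation entry `[001/00065]` corresponds to global step 499, and `[049/00001]` in the overfitting log corresponds to global step 99 with two batches per epoch.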


### Inference using the trained model

In [16]:
from src.inference.infer_pointnet_segmentation import InferenceHandlerPointNetSegmentation
from src.util.visualization import visualize_pointcloud
from matplotlib import cm, colors
import numpy as np

# create a handler for inference using a trained checkpoint
inferer = InferenceHandlerPointNetSegmentation('src/runs/pointnet_segmentation_generalization/model_best.ckpt')

In [18]:
# Get shape point cloud, predict labels, and visualize colored point cloud
shapenet_obj = ShapeNetParts('val')
shape_points = shapenet_obj.get_point_cloud_with_labels('02691156/1c4b8662938adf41da2b0f839aba40f9')[0]
point_labels = inferer.infer_single(shape_points)
point_labels = (point_labels - min(point_labels)) / (max(point_labels) - min(point_labels))
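point_colors = cm.hsv(point_labels)[:, :3]  # cm.get_cmap is deprecated in recent Matplotlib
# Pack 8-bit RGB into one integer per point (assumes the viewer expects 0xRRGGBB)
point_colors = np.sum((point_colors * 255).astype(int) * [256*256, 256, 1], axis=1)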
visualize_pointcloud(shape_points.T, colors=point_colors, point_size=0.025, flip_axes=True)


Output()

In [19]:
# Get shape point cloud, predict labels, and visualize colored point cloud
shape_points = shapenet_obj.get_point_cloud_with_labels('03948459/e017cf5dac1e39b013d74211a209ce')[0]
point_labels = inferer.infer_single(shape_points)
point_labels = (point_labels - min(point_labels)) / (max(point_labels) - min(point_labels))
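point_colors = cm.hsv(point_labels)[:, :3]  # cm.get_cmap is deprecated in recent Matplotlib
# Pack 8-bit RGB into one integer per point (assumes the viewer expects 0xRRGGBB)
point_colors = np.sum((point_colors * 255).astype(int) * [256*256, 256, 1], axis=1)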
visualize_pointcloud(shape_points.T, colors=point_colors, point_size=0.025, flip_axes=True)


Output()

In [20]:
# Get shape point cloud, predict labels, and visualize colored point cloud
shape_points = shapenet_obj.get_point_cloud_with_labels('03790512/86b6dc954e1ca8e948272812609617e2')[0]
point_labels = inferer.infer_single(shape_points)
point_labels = (point_labels - min(point_labels)) / (max(point_labels) - min(point_labels))
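point_colors = cm.hsv(point_labels)[:, :3]  # cm.get_cmap is deprecated in recent Matplotlib
# Pack 8-bit RGB into one integer per point (assumes the viewer expects 0xRRGGBB)
point_colors = np.sum((point_colors * 255).astype(int) * [256*256, 256, 1], axis=1)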
visualize_pointcloud(shape_points.T, colors=point_colors, point_size=0.025, flip_axes=True)


Output()
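The three visualization cells share the same colouring steps: rescale the predicted part labels to [0, 1], look up a colour for each value, and pack the RGB channels into a single integer per point. The sketch below illustrates the first and last step without NumPy. It assumes that `visualize_pointcloud` expects one packed 24-bit `0xRRGGBB` integer per point, which is not confirmed by the code shown here; `normalize_labels` and `pack_rgb` are hypothetical helpers.

```python
def normalize_labels(labels):
    """Rescale integer part labels to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(labels), max(labels)
    if hi == lo:
        return [0.0 for _ in labels]  # avoids a division by zero
    return [(label - lo) / (hi - lo) for label in labels]


def pack_rgb(r, g, b):
    """Pack float channels in [0, 1] into one 0xRRGGBB integer."""
    r8, g8, b8 = (int(channel * 255) for channel in (r, g, b))
    # Each channel occupies 8 bits, so the weights are 256*256, 256 and 1
    return r8 * 256 * 256 + g8 * 256 + b8


normalized = normalize_labels([12, 13, 14, 15])
red = pack_rgb(1.0, 0.0, 0.0)
yellow = pack_rgb(1.0, 1.0, 0.0)
```

Because each channel takes values from 0 to 255, it needs 256 distinct slots. With weights of 255 instead of 256, a full-intensity channel would spill into its neighbour and change the displayed colour.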

## References

[1] Qi, C. R., Su, H., Mo, K., and Guibas, L. J. “PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation.” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 77-85.