<a href="https://colab.research.google.com/github/nicolas-chaulet/torch-points3d/blob/new_notebook/notebooks/PartSegmentationPointNet2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Setup packages
!pip install torch==1.3.1 pyvista torchvision==0.4.2 pytorch-lightning
!pip install --upgrade jsonschema
!pip install torch-points3d
!apt-get install -qq xvfb libgl1-mesa-glx

In [0]:
import os
import sys
from omegaconf import OmegaConf
import pyvista as pv
import torch
import numpy as np
import torch.nn.functional as F
from omegaconf import OmegaConf

In [0]:
DIR = "" # Replace with your root directory, the data will go in DIR/data.

<p align="center">
  <img width="40%" src="https://raw.githubusercontent.com/nicolas-chaulet/torch-points3d/master/docs/logo.png" />
</p>

# Segmenting objects in part with PointNet2
In this notebook we will solve the task of segmenting an object into its sub parts by using a [Pointnet2](https://arxiv.org/abs/1706.02413) deep neural network.
We will work on [ShapeNet](https://www.shapenet.org/) dataset which contains 48,600 3D models over 55 common categories with part annotations. We will show you how you can use Torch Points3D to setup a Pointnet2 backbone with a multi head classifier and train it on ShapeNet.

## The dataset
We use Torch Points3D version of ShapeNet that provides automatic download (be patient, it takes some time...) of the data, a tested metric tracker as well as methods for pre computing the spatial operations such as neighbour search and grid sampling on CPU.

Let's start with the data config (if you want more details about that part of Torch Points3D please refer to this notebook).

In [0]:
shapenet_yaml = """
class: shapenet.ShapeNetDataset
task: segmentation
dataroot: %s
normal: True                                    # Use normal vectors as features
first_subsampling: 0.02                       # Grid size of the input data
category: "Airplane"
pre_transforms:                               # Offline transforms, done only once
    - transform: NormalizeScale           
    - transform: GridSampling
      params:
        size: ${first_subsampling}
train_transforms:                             # Data augmentation pipeline
    - transform: RandomNoise
      params:
        sigma: 0.01
        clip: 0.05
    - transform: RandomScaleAnisotropic
      params:
        scales: [0.9,1.1]
    - transform: FixedPoints
      lparams: [2048]
test_transforms:
    - transform: FixedPoints
      lparams: [2048]
""" % (os.path.join(DIR,"data")) 

params = OmegaConf.create(shapenet_yaml)

In [9]:
from torch_points3d.datasets.segmentation import ShapeNetDataset
dataset = ShapeNetDataset(params)
dataset

Dataset: ShapeNetDataset 
[0;95mpre_transform [0m= Compose([
    NormalizeScale(),
    GridSampling(grid_size=0.02, quantize_coords=False, mode=mean),
])
[0;95mtest_transform [0m= Compose([
    FixedPoints(2048, replace=True),
])
[0;95mtrain_transform [0m= Compose([
    RandomNoise(sigma=0.01, clip=0.05),
    RandomScaleAnisotropic([0.9, 1.1]),
    FixedPoints(2048, replace=True),
])
[0;95mval_transform [0m= None
[0;95minference_transform [0m= Compose([
    NormalizeScale(),
    GridSampling(grid_size=0.02, quantize_coords=False, mode=mean),
    FixedPoints(2048, replace=True),
])
Size of [0;95mtrain_dataset [0m= 2349
Size of [0;95mtest_dataset [0m= 341
Size of [0;95mval_dataset [0m= 0
[0;95mBatch size =[0m None

In [10]:
dataset.class_to_segments

{'Airplane': [0, 1, 2, 3]}

Get the tracker which has already built-in metric for this **dataset**

In [0]:
tracker = dataset.get_tracker(False, True)

## Model for part segmentation


**The** model we implement here follows the main architecture proposed in the [original repo](https://github.com/charlesq34/pointnet2):

It is built using the U-Net architecture with an encoder and decoder.

The encoder is a succession of 2 steps

### The encoder

* sampling and grouping: First, it downsamples the pointcloud using [Farthest Point Sampling method](https://en.wikipedia.org/wiki/Farthest-first_traversal) and secondly, finds local neighbours using radius search (could be knn, random or anything else)

<p align="center">
  <img width="70%" src="http://www.open3d.org/docs/release/_images/kdtree.png">
</p>

* pointnet: It is a shared MLP applied on all the local neighborhood independently generating the features for the next layer

### The decoder

For each point coming from the skip link concatenation, we find the k closest in the previously downsampled point cloud.

The features are then interpolated using the distance squared as follow:

$\mathbf{f}(y) = \frac{\sum_{i=1}^k w(x_i) \mathbf{f}(x_i)}{\sum_{i=1}^k
        w(x_i)} \textrm{, where } w(x_i) = \frac{1}{d(\mathbf{p}(y),
        \mathbf{p}(x_i))^2}$


<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/charlesq34/pointnet2/master/doc/teaser.jpg" />
</p>

Our `sample` contains as expected a batch of 16 samples with 2048 3d points.

## Training loop

# Using the BaseModel

The BaseModel is inspired from the excellent [junyanz/pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/master/models/base_model.py) project on github.

We extended its core functionality to be adapted to pointcloud.

The user will have to implement only 

* ```set_input(self, data, device)```: This function will receive an object data from the dataloader with Pytorch Tensors as attributes as well as the current devide to be used. The user will have to implement its own logic to handle the sample data brought to its model.

* ```forward(self, *args, **kwargs)```: Forward data through the model and compute loss

* ```backward(self)```: Backward loss





In [0]:
from torch_points3d.models.base_model import BaseModel
from torch_points3d.applications.pointnet2 import PointNet2

class Model(BaseModel):
    def __init__(self, params):
        super(Model, self).__init__(params)

        self.unet = PointNet2(
            architecture="unet", 
            input_nc=3, 
            num_layers=3, 
            multiscale=True,
            output_nc = int(params.num_classes)
            )
        self.log_softmax = torch.nn.LogSoftmax(dim=-1)
        self.loss_names = ["loss_seg"]

    def set_input(self, data, device):
        """Unpack input data from the dataloader and perform necessary pre-processing steps.
        Parameters:
            input: a dictionary that contains the data itself and its metadata information.
        """
        data = data.to(device)
        self.data = data
        self.batch_size = data.pos.shape[0]
        self.labels = data.y
        self.batch_idx = data.batch

    def forward(self, *args, **kwargs):
        """Run forward pass. This will be called by both functions <optimize_parameters> and <test>."""

        # Forward through unet and classifier
        data_out = self.unet(self.data)
        self.output = self.log_softmax(data_out.x.squeeze()).permute((0, 2, 1))

        # Set loss for the backward pass
        self.loss_seg = F.nll_loss(self.output.contiguous().view((-1, 4)), self.labels.flatten())
        return self.output

    def backward(self):
        """Calculate losses, gradients, and update network weights; called in every training iteration"""
        # caculate the intermediate results if necessary; here self.output has been computed during function <forward>
        # calculate loss given the input and intermediate results
        self.loss_seg.backward()  # calculate gradients of network G w.r.t. loss_G



In [0]:
opt = """conv_type: "dense"
"""
params = OmegaConf.create(opt)
params.cat_to_seg = dataset.class_to_segments
params.num_classes = dataset.num_classes
model = Model(params)

In [25]:
model.unet

PointNet2Unet(
  (down_modules): ModuleList(
    (0): PointNetMSGDown(
      (mlps): ModuleList(
        (0): MLP2D(
          (0): Conv2D(
            (0): Conv2d(6, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): LeakyReLU(negative_slope=0.01)
          )
          (1): Conv2D(
            (0): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): LeakyReLU(negative_slope=0.01)
          )
          (2): Conv2D(
            (0): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): LeakyReLU(negative_slope=0.01)
          )
        )
        (1): MLP2D(
          (0): Conv2D(
            (0): Conv2d(6, 64, kernel_size=(1, 1), str

## The data loaders
Pointnet2 has been implemented using the "dense" data format.

In [0]:
NUM_WORKERS = 4
BATCH_SIZE = 4
dataset.create_dataloaders(
    model,
    batch_size=BATCH_SIZE, 
    num_workers=NUM_WORKERS, 
    shuffle=True, 
    precompute_multi_scale=False 
    )

In [19]:
sample = next(iter(dataset.train_dataloader))
sample.keys

['x', 'y', 'pos', 'category']

Let's create an Adam optimizer for training our model

In [0]:
model._optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

In [0]:
from torch_points3d.metrics.colored_tqdm import Coloredtqdm as Ctq
from torch_points3d.utils.colors import COLORS

def train_epoch(epoch, device):
    model.to(device)
    model.train()
    tracker.reset("train")
    train_loader = dataset.train_dataloader
    iter_data_time = time.time()
    with Ctq(train_loader) as tq_train_loader:
        for i, data in enumerate(tq_train_loader):
            t_data = time.time() - iter_data_time
            iter_start_time = time.time()
            model.set_input(data, device)
            model.optimize_parameters(epoch, dataset.batch_size)

            tq_train_loader.set_postfix(
                **tracker.get_metrics(),
                data_loading=float(t_data),
                iteration=float(time.time() - iter_start_time),
                color=COLORS.TRAIN_COLOR
            )

            iter_data_time = time.time()

def test_epoch(epoch, device):
    model.to(device)
    model.eval()
    tracker.reset("test")
    test_loader = dataset.test_dataloaders[0]
    iter_data_time = time.time()
    with Ctq(test_loader) as tq_test_loader:
        for i, data in enumerate(tq_test_loader):
            t_data = time.time() - iter_data_time
            iter_start_time = time.time()
            model.set_input(data, device)
            model.optimize_parameters(epoch, dataset.batch_size)

            tq_test_loader.set_postfix(
                **tracker.get_metrics(),
                data_loading=float(t_data),
                iteration=float(time.time() - iter_start_time),
                color=COLORS.TRAIN_COLOR
            )

            iter_data_time = time.time()

In [27]:
import time
EPOCHS = 50
for epoch in range(EPOCHS):
  print("=========== EPOCH %i ===========" % epoch)
  time.sleep(0.5)
  train_epoch(epoch, 'cuda')
  tracker.publish(epoch)
  test_epoch(epoch, 'cuda')
  tracker.publish(epoch)



  3%|▎         | 19/588 [00:02<01:00,  9.37it/s, [0;92mdata_loading=0.003, iteration=0.083, train_Cmiou=0    , train_Imiou=0    [0m)]


KeyboardInterrupt: ignored

In [0]:
%load_ext tensorboard
%tensorboard --logdir lightning_logs/version_4/ # Change for your log location