# Introduction
In this project, you will be asked to implement [PointNet](https://arxiv.org/abs/1612.00593) architecture and train a classification network (left) and a segmentation network (middle).
![title](img/cls_sem.jpg)

### Grading Points
* Task 1.1 - 5
* Task 1.2 - 5
* Task 2.1 - 10
* Task 2.2 - 5
* Task 2.3 - 5
* Task 2.4 - 5
* Task 2.5 - 5
* Task 2.6 - 5
* Task 2.7 - 5
* Task 2.8 - 10
* Task 2.9 - 10
* Task 2.10 - 5 
* Task 2.11 - 5
* Task 2.12 - 5
* Task 2.13 - 10
* Task 2.14 - 5

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import random
import math
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as data

from torchvision.transforms import Compose

import dataset # custom dataset for ModelNet10 and ShapeNet

# 1. Data Loading

Usually, we write the point cloud as $X\in\mathbb{R}^{N\times 3}$. While in programming, we use `B x 3 x N` layout, where `B` is the batch-size and `N` is the number of points in a single point cloud.

## 1.1 Jitter the position of each points by a zero mean Gaussian
For input $X\in\mathbb{R}^{N\times 3}$, we transform $X$ by $X \leftarrow X + \mathcal{N}(0, \sigma^2)$.

In [3]:
class RandomJitter(object):
    def __init__(self, sigma):
        self.sigma = sigma
        
    def __call__(self, data):
        ## hint: useful function `torch.randn` and `torch.randn_like`
        ## TASK 1.1
        ## This function takes as input a point cloud of layout `3 x N`, 
        ## and output the jittered point cloud of layout `3 x N`.
        
        noise = torch.randn(data.shape)*self.sigma
        data = data + noise
      
        
        return data

In [4]:
## random generate data and test your transform here
randomJitter = RandomJitter(0.5)
jitter_data = randomJitter.__call__(torch.randn(3,16))
torch.std(jitter_data)   #std jitter_data = sqrt(var(data)+sigma^2)

tensor(1.1593)

## 1.2 Rotate the object along the z-axis randomly
For input $X\in\mathbb{R}^{N\times 3}$, we rotate all points along z-axis (up-axis) by a degree $\theta$.


Suppose $T$ is the transformation matrix,
$$X\leftarrow XT,$$
where $$T=\begin{bmatrix}\cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

In [5]:
class RandomZRotate(object):
    def __init__(self, degrees):
        ## here `self.degrees` is a tuple (0, 360) which defines the range of degree
        self.degrees = degrees
        
    def __call__(self, data):
        ## TASK 1.2
        ## This function takes as input a point cloud of layout `3 x N`, 
        ## and output the rotated point cloud of layout `3 x N`.
        ##
        ## The rotation is along z-axis, and the degree is uniformly distributed
        ## between [0, 360]
        ##
        ## hint: useful function `torch.randn`， `torch.randn_like` and `torch.matmul`
        ##
        ## Notice:   
        ## Different from its math notation `N x 3`, the input has size of `3 x N`
        
        theta = random.random() * 2. * math.pi 
        T = torch.Tensor([[np.cos(theta), np.sin(theta),0],
                         [-np.sin(theta),np.cos(theta),0],
                         [0,0,1]])
        data = torch.matmul(T,data)
        
        return data

In [6]:
## random generate data and test your transform here
data_ = torch.randn(3,16)
print(data_)
randomZRotate = RandomZRotate((0,360))
randomZRotate.__call__(data_)

tensor([[-0.2353,  0.7301, -0.4874,  0.0224,  0.6520,  0.4408, -1.4210, -0.6609,
         -0.7314,  0.8657,  2.4664, -0.6494, -2.6774, -0.0770, -1.2609, -0.2580],
        [-0.4670,  0.6016, -1.4237,  0.3318,  2.6455,  1.3588, -1.0875,  0.4159,
          0.8073, -1.1750, -0.1032,  1.9778,  0.5814, -2.0564, -0.5687, -0.5860],
        [ 1.5230,  0.5756, -0.7889, -1.1402, -0.2148, -0.5296, -0.7789, -0.2542,
         -1.4677,  0.3994, -0.2039, -0.5506,  0.4056, -1.7912,  0.2908, -1.6654]])


tensor([[ 0.4754, -0.9459,  1.2697, -0.2247, -2.1610, -1.1927,  1.7890,  0.2566,
          0.0672,  0.0574, -1.8622, -0.7276,  1.7285,  1.3441,  1.3400,  0.5674],
        [ 0.2179, -0.0141,  0.8078, -0.2452, -1.6594, -0.7862, -0.0377, -0.7375,
         -1.0873,  1.4583,  1.6205, -1.9504, -2.1258,  1.5582, -0.3430,  0.2967],
        [ 1.5230,  0.5756, -0.7889, -1.1402, -0.2148, -0.5296, -0.7789, -0.2542,
         -1.4677,  0.3994, -0.2039, -0.5506,  0.4056, -1.7912,  0.2908, -1.6654]])

## 1.3 Load dataset ModelNet10 for Point Cloud Classification

### ModelNet10
By loading this dataset, we have data of shape `B x 3 x N` and label of shape `B`.

In [7]:
# It may taske some time to download and pre-process the dataset.
train_transform = Compose([RandomZRotate((0, 360)), RandomJitter(0.02)])
train_cls_dataset = dataset.ModelNet(root='./ModelNet10', transform=train_transform, train=True)
test_cls_dataset = dataset.ModelNet(root='./ModelNet10', train=False)
train_cls_loader = data.DataLoader(
    train_cls_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=1,
)
test_cls_loader = data.DataLoader(
    test_cls_dataset,
    batch_size=1,
    shuffle=False,
    num_workers=1,
)

In [8]:
print(train_cls_dataset.num_classes)

10


## ShapeNet
By loading this dataset, we have data of shape `B x 3 x N` and target of shape `B x N`.

Here is the list of categories:
['Airplane', 'Bag', 'Cap', 'Car', 'Chair', 'Earphone', 'Guitar', 'Knife', 'Lamp', 'Laptop', 'Motorbike', 'Mug', 'Pistol', 'Rocket', 'Skateboard', 'Table']

In [9]:
## Here as an example, we choose the cateogry 'Airplane'
category = 'Airplane'
train_seg_dataset = dataset.ShapeNet(root='./ShapeNet', category=category, train=True)
test_seg_dataset = dataset.ShapeNet(root='./ShapeNet', category=category, train=False)
train_seg_loader = data.DataLoader(
    train_seg_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=1,
)
test_seg_loader = data.DataLoader(
    test_seg_dataset,
    batch_size=1,
    shuffle=False,
    num_workers=1,
)

Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/train_data.zip
Extracting ./ShapeNet/raw/train_data.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/train_label.zip
Extracting ./ShapeNet/raw/train_label.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/val_data.zip
Extracting ./ShapeNet/raw/val_data.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/val_label.zip
Extracting ./ShapeNet/raw/val_label.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/test_data.zip
Extracting ./ShapeNet/raw/test_data.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/test_label.zip
Extracting ./ShapeNet/raw/test_label.zip
Done!
Processing...
Done.


In [10]:
print(train_seg_dataset.num_classes)

5


# 2 PointNet Architecture (Read PointNet Section 4.2 and Appendix C)
In this section, you will be asked to implement classification and segmentation step by step.
![pointnet](img/pointnet.jpg)

## 2.1 Joint Alignment Network 
This mini-network takes as input matrix of size $N \times K$, and outputs a transformation matrix of size $K \times K$. 

In programming, the input size of this module is `B x K x N` and output size is `B x K x K`.

For the shared MLP, use structure like this `(FC(64), BN, ReLU, FC(128), BN, ReLU, FC(1024), BN, ReLU)`.

For the MLP after global max pooling, use structure like this `(FC(512), BN, ReLU, FC(256), BN, ReLU, FC(K*K)`.


In [21]:
#To ensure invariance to transformations
class Transformation(nn.Module):
    def __init__(self, k=3):
        super(Transformation, self).__init__()
        
        self.k = k
        
        ## TASK 2.1
        
        ## define your network layers here
        ## shared mlp
        ## input size: B x K x N
        ## output size: B x 1024 x N
        ## hint: you may want to use `nn.Conv1d` here. Why?
        ## First of all, our tensors will have size (B, K, 3). 
        ## In this case MLP with shared weights is just 1-dim convolution with a kernel of size 1.
        self.share_mlp = nn.Sequential(nn.Conv1d(k, 64, 1, stride=1),
                                      nn.BatchNorm1d(64),
                                      nn.ReLU(),
                                      nn.Conv1d(64, 128,1, stride=1),
                                      nn.BatchNorm1d(128),
                                      nn.ReLU(),
                                      nn.Conv1d(128, 1024, 1, stride=1),
                                      nn.BatchNorm1d(1024),
                                      nn.ReLU())

        ## define your network layers here
        ## mlp
        ## input size: B x 1024
        ## output size: B x (K*K)
        
        self.mlp = nn.Sequential(nn.Linear(1024, 512), 
                                 nn.BatchNorm1d(512),
                                 nn.ReLU(),
                                 nn.Linear(512, 256),
                                 # batch size should be larger than 1, otherwise there will have an error
                                 nn.BatchNorm1d(256),
                                 nn.ReLU(),
                                 nn.Linear(256, k*k))
    
    def forward(self, x):
        B, K, N = x.shape # batch-size, dim, number of points
        ## TASK 2.1

        ## forward of shared mlp
        # input - B x K x N
        # output - B x 1024 x N
        x = self.share_mlp(x)
        
        ## global max pooling
        # input - B x 1024 x N
        # output - B x 1024
        #To provide permutation invariance, we apply a symmetric function (max pooling) to the extracted
        #and transformed features so the result does not depend on the order of input points anymore.
        x = nn.MaxPool1d(N)(x)
        x = x.view(-1,1024)
        
        ## mlp
        # input - B x 1024
        # output - B x (K*K)
        x = self.mlp(x)
        
        ## reshape the transformation matrix to B x K x K
        identity = torch.eye(self.k, device=x.device)
        x = x.view(B, self.k, self.k) + identity[None]
        return x

In [23]:
## random generate data and test this network
transformation = Transformation()
transformation(torch.randn(5,3,8)).shape

torch.Size([5, 3, 3])

## 2.2 Regularization Loss
$$L_{reg}=\|I-TT^\intercal\|^2_F$$

The output of `Transformation` network is of size `B x K x K`. The module `OrthoLoss` has no trainable parameters, only computes this norm.

In [24]:
class OrthoLoss(nn.Module):
    def __init__(self):
        super(OrthoLoss, self).__init__()
        
    def forward(self, x):
        ## hint: useful function `torch.bmm` and `torch.matmul`
        
        ## TASK 2.2
        ## compute the matrix product
        prod = torch.bmm(x,torch.transpose(x, 1, 2))

        norm = torch.norm(prod - torch.eye(x.shape[1], device=x.device)[None], dim=(1,2))
        return norm.mean()

In [27]:
## random generate data and test this module
orthoLoss = OrthoLoss()
orthoLoss(torch.randn(5,3,3))

tensor(5.1356)

## 2.3 Feature Network
In this subsection, you will be asked to implement the feature network (the top branch).

Local features are a matrix of size `B x 64 x N`, which will be used in the segmentation task.

Global features are a matrix of size `B x 1024`, which will be used in the classification task.

In [32]:
class Feature(nn.Module):
    def __init__(self, alignment=False):
        super(Feature, self).__init__()
        
        self.alignment = alignment
        
        ## `input_transform` calculates the input transform matrix of size `3 x 3`
        if self.alignment:
            self.input_transform = Transformation(3)
        
        ## TASK 2.3
        ## define your network layers here
        ## local feature
        ## shared mlp
        ## input size: B x 3 x N
        ## output size: B x 64 x N
        ## hint: you may want to use `nn.Conv1d` here.
        self.local_feature = nn.Sequential(nn.Conv1d(3, 64, 1, stride=1),
                                      nn.BatchNorm1d(64),
                                      nn.ReLU(),
                                      nn.Conv1d(64, 64, 1, stride=1),
                                      nn.BatchNorm1d(64),
                                      nn.ReLU())
        
        ## `feature_transform` calculates the feature transform matrix of size `64 x 64`
        if self.alignment:
            self.feature_transform = Transformation(64)
        
        ## TASK 2.4
        ## define your network layers here
        ## global feature
        ## shared mlp
        ## input size: B x 64 x N
        ## output size: B x 1024 x N      
        self.global_feature = nn.Sequential(
                              nn.Conv1d(64, 128,1, stride=1),
                              nn.BatchNorm1d(128),
                              nn.ReLU(),
                              nn.Conv1d(128, 1024, 1, stride=1),
                              nn.BatchNorm1d(1024),
                              nn.ReLU())
    
    def forward(self, x):
        
        ## apply the input transform
        if self.alignment:
            transform = self.input_transform(x)
            ## TASK 2.5
            ## apply the input transform
            x = torch.bmm(transform,x)

        ## TASK 2.3
        ## forward of shared mlp
        # input - B x K x N
        # output - B x 64 x N
        x = self.local_feature(x)
        
        if self.alignment:
            transform = self.feature_transform(x)
            ## TASK 2.5
            ## apply the feature transform
            x = torch.bmm(transform,x)
        else:
            ## do not modify this line
            transform = None
        
        local_feature = x
        
        ## TASK 2.4
        ## forward of shared mlp
        # input - B x 64 x N
        # output - B x 1024 x N
        x = self.global_feature(x)
        
        ## TASK 2.4
        ## global max pooling
        # input - B x 1024 x N
        # output - B x 1024
        x = nn.MaxPool1d(x.shape[2])(x)
        x = x.view(-1,1024)
        
        global_feature = x
        
        ## summary:
        ## global_feature: B x 1024
        ## local_feature: B x 64 x N
        ## transform: B x K x K
        return global_feature, local_feature, transform

In [36]:
## random generate data and test this network
feature = Feature(True)
global_feature,local_feature,transform = feature(torch.randn(3,3,5))
global_feature.shape,local_feature.shape,transform.shape

(torch.Size([3, 1024]), torch.Size([3, 64, 5]), torch.Size([3, 64, 64]))

## 2.4 Classification Network
In this network, you will use the global features generated by the `Feature` network defined above.

In [37]:
class Classification(nn.Module):
    def __init__(self, num_classes, alignment=False):
        super(Classification, self).__init__()
                
        self.feature = Feature(alignment=alignment)
        
        ## TASK 2.6
        ## define your network layers here
        ## mlp
        ## input size: B x 1024
        ## output size: B x num_classes
        self.mlp = nn.Sequential(nn.Linear(1024, 512),
                         nn.BatchNorm1d(512),
                         nn.ReLU(),
                         nn.Linear(512, 256),
                         nn.Dropout(0.3),
                         nn.BatchNorm1d(256),
                         nn.ReLU(),
                         nn.Linear(256, 128),
                         nn.BatchNorm1d(128),
                         nn.ReLU(),
                         nn.Linear(128, num_classes))
        self.logsoftmax = nn.LogSoftmax(dim=1)
        
    def forward(self, x):
        # x is the global feature matrix
        # here we don't use local feature matrix
        x, _, trans = self.feature(x)
        
        ## TASK 2.6
        ## forward of mlp
        # input - B x 1024
        # output - B x num_classes        
        x = self.mlp(x)
        x = self.logsoftmax(x)
        #The log-softmax penalty has a exponential nature 
        #compared to the linear penalisation of softmax. 
        #i.e More heavy peanlty for being more wrong.  
        ## x: B x num_classes
        ## trans: B x K x K
        return x, trans

In [48]:
## random generate data and test this network
classification = Classification(10,alignment=True)
output,trans = classification(torch.randn(3,3,5))
output.shape,trans.shape

(torch.Size([3, 10]), torch.Size([3, 64, 64]))

### 2.4.1 Train this network on ModelNet10

In [52]:
# main train function for classification
def train_cls(train_loader, test_loader, network, optimizer, epochs, scheduler):
    reg = OrthoLoss()
    for epoch in range(epochs):
        print('Epoch:[{:02d}/{:02d}]'.format(epoch+1, epochs))
        print('Training...')
        network.train()
        train_loss = 0
        correct = 0
        for batch, (pos, label) in enumerate(train_loader):
            network.zero_grad()
            pos, label = pos.cuda(), label.cuda()
            
            ## TASK 2.7
            ## forward propagation
            output, trans = network(pos)
            #As we used LogSoftmax for stability, 
            #we should apply NLLLoss instead of CrossEntropyLoss
            loss = nn.NLLLoss()(output,label) 
            ##########
            
            ## regularizer
            if trans is not None:
                loss += reg(trans) * 0.001

            pred = output.max(1)[1]
            correct += pred.eq(label).sum().item()

            loss.backward()
            optimizer.step()
            train_loss += loss.item()
            print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(train_loader), loss.item()), end='', flush=True)
        
        scheduler.step()
        print('\nAverage Train Loss: {:.4f}; Train Acc: {:.4f}'.format(train_loss/len(train_loader), correct/len(train_loader.dataset) * 100))
        
        print('\nTesting...')
        with torch.no_grad():
            network.eval()
            test_loss = 0
            correct = 0
            for batch, (pos, label) in enumerate(test_loader):
                pos, label = pos.cuda(), label.cuda()
    
                ## TASK 2.7
                ## forward propagation
                output, trans = network(pos)
                loss = nn.NLLLoss()(output,label) 
                ##########

                if trans is not None:
                    loss += reg(trans) * 0.001

                pred = output.max(1)[1]
                correct += pred.eq(label).sum().item()

                test_loss += loss.item()
                print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(test_loader), loss.item()), end='', flush=True)

            print('\nAverage Test Loss: {:.4f}; Test Acc: {:.4f}'.format(test_loss/len(test_loader), correct/len(test_loader.dataset) * 100))
        print('-------------------------------------------')


In [75]:
network = Classification(10, alignment=True).cuda()
epochs = 100 # you can change the value to a small number for debugging

## TASK 2.8
# see Appendix C
# choose an optimizer and an initial learning rate
optimizer = torch.optim.Adam(network.parameters(), lr=0.001)
# choose a lr scheduler
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,20,gamma=0.5)
#######3

# start training
train_cls(train_cls_loader, test_cls_loader, network, optimizer, epochs, scheduler)

Epoch:[01/100]
Training...
Iter: [250/250] Loss: 1.5615
Average Train Loss: 1.5048; Train Acc: 49.5866

Testing...
Iter: [908/908] Loss: 3.2281
Average Test Loss: 1.4132; Test Acc: 48.0176
-------------------------------------------
Epoch:[02/100]
Training...
Iter: [250/250] Loss: 2.4022
Average Train Loss: 1.0857; Train Acc: 64.4701

Testing...
Iter: [908/908] Loss: 0.4890
Average Test Loss: 1.1109; Test Acc: 62.0044
-------------------------------------------
Epoch:[03/100]
Training...
Iter: [250/250] Loss: 2.1093
Average Train Loss: 0.9585; Train Acc: 68.3538

Testing...
Iter: [908/908] Loss: 2.2361
Average Test Loss: 1.0871; Test Acc: 57.5991
-------------------------------------------
Epoch:[04/100]
Training...
Iter: [250/250] Loss: 1.9409
Average Train Loss: 0.8604; Train Acc: 72.3127

Testing...
Iter: [908/908] Loss: 2.3093
Average Test Loss: 0.8047; Test Acc: 70.8150
-------------------------------------------
Epoch:[05/100]
Training...
Iter: [250/250] Loss: 2.9482
Average Trai

Iter: [250/250] Loss: 0.0762
Average Train Loss: 0.1498; Train Acc: 94.9136

Testing...
Iter: [908/908] Loss: 0.01119
Average Test Loss: 0.3779; Test Acc: 88.4361
-------------------------------------------
Epoch:[72/100]
Training...
Iter: [250/250] Loss: 0.3298
Average Train Loss: 0.1319; Train Acc: 95.2643

Testing...
Iter: [908/908] Loss: 0.01324
Average Test Loss: 0.3792; Test Acc: 87.4449
-------------------------------------------
Epoch:[73/100]
Training...
Iter: [250/250] Loss: 0.0712
Average Train Loss: 0.1386; Train Acc: 95.4397

Testing...
Iter: [908/908] Loss: 0.01663
Average Test Loss: 0.3601; Test Acc: 88.9868
-------------------------------------------
Epoch:[74/100]
Training...
Iter: [250/250] Loss: 2.0074
Average Train Loss: 0.1433; Train Acc: 95.2894

Testing...
Iter: [908/908] Loss: 0.03186
Average Test Loss: 0.3425; Test Acc: 89.9780
-------------------------------------------
Epoch:[75/100]
Training...
Iter: [250/250] Loss: 0.0852
Average Train Loss: 0.1349; Train A

### Report the best test accuracy you can get.
You can change the architecture, batch size, epochs and the scheduler

## 2.5 Segmentation Network
In this network, you will use the global features and local features generated by the `Feature` network defined above.

The global feature matrix is of size `B x 1024` and the local feature matrix is of size `B x 64 x N`.

They need to be stacked together to a new matrix of size `B x 1088 x n` (How?). 

In [69]:
# main train function for classification
class Segmentation(nn.Module):
    def __init__(self, num_classes, alignment=False):
        super(Segmentation, self).__init__()
               
        self.feature = Feature(alignment=alignment)

        ## TASK 2.9
        ## shared mlp
        ## input size: B x 1088 x N
        ## output size: B x num_classes x N
        self.shared_mlp = nn.Sequential(nn.Conv1d(1088, 512, 1, stride=1),
                              nn.BatchNorm1d(512),
                              nn.ReLU(),
                              nn.Conv1d(512, 256, 1, stride=1),
                              nn.BatchNorm1d(256),
                              nn.ReLU(),
                              nn.Conv1d(256, 128, 1, stride=1), 
                              nn.BatchNorm1d(128), 
                              nn.ReLU(), 
                              nn.Conv1d(128, num_classes, 1, stride=1),
                              nn.Softmax(dim=1))
        
        self.logsoftmax = nn.LogSoftmax(dim=1)
        
    def forward(self, x):
        g, l, trans = self.feature(x)
        
        ## TASK 2.10
        # concat global features and local features to a single matrix
        # g - B x 1024, global features
        # l - B x 64 x N, local features
        # x - B x 1088 x N, concatenated features
        g = torch.repeat_interleave(g.view(-1,1024,1), repeats=l.shape[2], dim=2)
        x = torch.cat((l, g),dim=1)
            
        ## TASK 2.9
        ## forward of shared mlp
        # input - B x 1088 x N
        # output - B x num_classes x N  
        x = self.shared_mlp(x)
        x = self.logsoftmax(x)
        
        return x, trans

In [70]:
## random generate data and test this network
segmentation = Segmentation(5,alignment=True)
# B x num_classes x N 
x,trans = segmentation(torch.randn(2,3,10))
x.shape,trans.shape

(torch.Size([2, 5, 10]), torch.Size([2, 64, 64]))

### 2.5.1 Calculating Intersection over Union (IoU) 
For 2D image, the IoU is calculated as follows,
![iou](img/iou.png)

How is it used in the literature of point clouds?

In [71]:
## TASK 2.11

## If you don't my template, you are free to write your own function.

# implement the helper functions to calculate the IoU
def get_i_and_u(pred, target, num_classes):
    """Calculate intersection and union between pred and target.
    
    pred -- B x N matrix
    target -- B x N matrix
    num_classes -- number of classes
    
    return i, u
    i -- B x C binary matrix, intersection, i[b, c] equals 1 if and only if it is a true-positive.
    u -- B x C binary matrix, union, u[b, c] equals 0 if and only if it is a true-negative
    """
    ## TASK 2.11
    ## calculate i and u here
    ## hint: useful function `F.one_hot`    
    ## hint: use element-wise logical tensor operation (`&` and `|`)
    target_onehot = F.one_hot(target, num_classes=num_classes)
    pre_onehot = F.one_hot(pred, num_classes=num_classes)
    i = torch.sum((target_onehot & pre_onehot).type(torch.float64), dim=1)
    u = torch.sum((target_onehot | pre_onehot).type(torch.float64), dim=1)

    return i, u

def get_iou(pred, target, num_classes):
    """Calculate IoU
    pred -- B x N matrix
    target -- B x N matrix
    num_classes -- number of classes
    
    return iou
    iou -- B matrix, iou[b] is the IoU of b-th point cloud in this batch
    """
    
    ## use the helper function `i_and_u` defined above
    i, u = get_i_and_u(pred, target, num_classes)
    
    ## TASK 2.11
    ## calculate iou
    iou = torch.sum(i,dim=1) / torch.sum(u,dim=1)
    
    return iou

### 2.5.2 Train this network on ShapeNet

In [72]:
# main train function for segmentation
def train_seg(train_loader, test_loader, network, optimizer, epochs, scheduler):  
    reg = OrthoLoss()
    for epoch in range(epochs):
        print('Epoch:[{:02d}/{:02d}]'.format(epoch+1, epochs))
        print('Training...')
        network.train()
        train_loss = 0
        correct = 0
        total = 0
        ious = []
        for batch, (pos, label) in enumerate(train_loader):
            network.zero_grad()
            pos, label = pos.cuda(), label.cuda()
            ## TASK 2.12
            ## forward propagation
            output, trans = network(pos)
            loss = nn.NLLLoss()(output,label) 
            ##########
            if trans is not None:
                loss += reg(trans) * 0.001        

            pred = output.max(1)[1]
            correct += pred.eq(label).sum().item()
            total += label.numel()

            loss.backward()
            optimizer.step()
            train_loss += loss.item()

            ious += [get_iou(pred, label, train_loader.dataset.num_classes)]
            print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(train_loader), loss.item()), end='', flush=True)
        
        scheduler.step()
        print('\nAverage Train Loss: {:.4f}; Train Acc: {:.4f}; Train mean IoU: {:.4f}'.format(train_loss/len(train_loader), correct/total * 100, torch.cat(ious, dim=0).mean().item()))

        print('\nTesting...')
        with torch.no_grad():
            network.eval()
            test_loss = 0
            correct = 0
            total = 0
            ious = []
            for batch, (pos, label) in enumerate(test_loader):
                pos, label = pos.cuda(), label.cuda()
                
                ## TASK 2.12
                ## forward propagation
                output, trans = network(pos)
                loss = nn.NLLLoss()(output,label) 
                ##########
                
                if trans is not None:
                    loss += reg(trans) * 0.001   

                pred = output.max(1)[1]
                correct += pred.eq(label).sum().item()
                total += label.numel()

                test_loss += loss.item()

                ious += [get_iou(pred, label, train_loader.dataset.num_classes)]
                print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(test_loader), loss.item()), end='', flush=True)

            print('\nAverage Test Loss: {:.4f}; Test Acc: {:.4f}; Test mean IoU: {:.4f}'.format(test_loss/len(test_loader), correct/total * 100, torch.cat(ious, dim=0).mean().item()))
        print('-------------------------------------------')

In [74]:
network = Segmentation(train_seg_dataset.num_classes, alignment=True).cuda()
epochs = 100 # you can change the value to a small number for debugging

## TASK 2.13
# see Appendix C
# choose an optimizer and an initial learning rate
optimizer = torch.optim.Adam(network.parameters(),lr=0.001)
# choose a lr scheduler
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,20,gamma=0.5)
#######3

train_seg(train_seg_loader, test_seg_loader, network, optimizer, epochs, scheduler)

Epoch:[01/100]
Training...
Iter: [147/147] Loss: 1.0927
Average Train Loss: 1.1598; Train Acc: 79.1493; Train mean IoU: 0.6681

Testing...
Iter: [341/341] Loss: 1.0434
Average Test Loss: 1.1083; Test Acc: 81.8806; Test mean IoU: 0.7072
-------------------------------------------
Epoch:[02/100]
Training...
Iter: [147/147] Loss: 1.0423
Average Train Loss: 1.0614; Train Acc: 86.3619; Train mean IoU: 0.7661

Testing...
Iter: [341/341] Loss: 1.0254
Average Test Loss: 1.0580; Test Acc: 85.8185; Test mean IoU: 0.7588
-------------------------------------------
Epoch:[03/100]
Training...
Iter: [147/147] Loss: 1.0300
Average Train Loss: 1.0376; Train Acc: 88.0358; Train mean IoU: 0.7918

Testing...
Iter: [341/341] Loss: 1.0487
Average Test Loss: 1.0727; Test Acc: 85.2770; Test mean IoU: 0.7555
-------------------------------------------
Epoch:[04/100]
Training...
Iter: [147/147] Loss: 1.0257
Average Train Loss: 1.0316; Train Acc: 88.3621; Train mean IoU: 0.7973

Testing...
Iter: [341/341] Loss:

Iter: [341/341] Loss: 0.9953
Average Test Loss: 1.0460; Test Acc: 85.7850; Test mean IoU: 0.7574
-------------------------------------------
Epoch:[31/100]
Training...
Iter: [147/147] Loss: 0.9749
Average Train Loss: 0.9873; Train Acc: 91.7603; Train mean IoU: 0.8514

Testing...
Iter: [341/341] Loss: 0.9625
Average Test Loss: 1.0163; Test Acc: 88.7891; Test mean IoU: 0.8071
-------------------------------------------
Epoch:[32/100]
Training...
Iter: [147/147] Loss: 0.9745
Average Train Loss: 0.9853; Train Acc: 91.9575; Train mean IoU: 0.8547

Testing...
Iter: [341/341] Loss: 0.9615
Average Test Loss: 1.0089; Test Acc: 89.5607; Test mean IoU: 0.8188
-------------------------------------------
Epoch:[33/100]
Training...
Iter: [147/147] Loss: 0.9775
Average Train Loss: 0.9839; Train Acc: 92.1167; Train mean IoU: 0.8572

Testing...
Iter: [341/341] Loss: 1.1202
Average Test Loss: 1.1621; Test Acc: 74.0807; Test mean IoU: 0.5971
-------------------------------------------
Epoch:[34/100]
Trai

Iter: [147/147] Loss: 0.9634
Average Train Loss: 0.9625; Train Acc: 94.2382; Train mean IoU: 0.8929

Testing...
Iter: [341/341] Loss: 0.9607
Average Test Loss: 1.0016; Test Acc: 90.2477; Test mean IoU: 0.8309
-------------------------------------------
Epoch:[61/100]
Training...
Iter: [147/147] Loss: 0.9556
Average Train Loss: 0.9606; Train Acc: 94.4288; Train mean IoU: 0.8963

Testing...
Iter: [341/341] Loss: 0.9521
Average Test Loss: 1.0008; Test Acc: 90.3327; Test mean IoU: 0.8327
-------------------------------------------
Epoch:[62/100]
Training...
Iter: [147/147] Loss: 0.9538
Average Train Loss: 0.9601; Train Acc: 94.4796; Train mean IoU: 0.8972

Testing...
Iter: [341/341] Loss: 0.9624
Average Test Loss: 0.9963; Test Acc: 90.7927; Test mean IoU: 0.8408
-------------------------------------------
Epoch:[63/100]
Training...
Iter: [147/147] Loss: 0.9549
Average Train Loss: 0.9595; Train Acc: 94.5480; Train mean IoU: 0.8983

Testing...
Iter: [341/341] Loss: 0.9646
Average Test Loss: 

Iter: [341/341] Loss: 0.9885
Average Test Loss: 0.9927; Test Acc: 91.1554; Test mean IoU: 0.8461
-------------------------------------------
Epoch:[90/100]
Training...
Iter: [147/147] Loss: 0.9460
Average Train Loss: 0.9538; Train Acc: 95.1115; Train mean IoU: 0.9082

Testing...
Iter: [341/341] Loss: 1.0051
Average Test Loss: 0.9925; Test Acc: 91.1823; Test mean IoU: 0.8462
-------------------------------------------
Epoch:[91/100]
Training...
Iter: [147/147] Loss: 0.9575
Average Train Loss: 0.9538; Train Acc: 95.1170; Train mean IoU: 0.9084

Testing...
Iter: [341/341] Loss: 0.9964
Average Test Loss: 0.9959; Test Acc: 90.8286; Test mean IoU: 0.8403
-------------------------------------------
Epoch:[92/100]
Training...
Iter: [147/147] Loss: 0.9689
Average Train Loss: 0.9537; Train Acc: 95.1258; Train mean IoU: 0.9085

Testing...
Iter: [341/341] Loss: 0.9930
Average Test Loss: 0.9923; Test Acc: 91.2012; Test mean IoU: 0.8464
-------------------------------------------
Epoch:[93/100]
Trai

### Report the best test mIoU you can get.

## 2.7 PointNet++ and DGCNN (https://arxiv.org/abs/1801.07829)
Read these two papers and answer:
1. What's the major difference over PointNet?
2. If you are going to implement PointNet++ or DGCNN, describe your plan briefly (what kind of modules/layers or functions you need based on this project)