# Introduction
In this project, you will implement the [PointNet](https://arxiv.org/abs/1612.00593) architecture and train a classification network (left) and a segmentation network (middle).
![title](img/cls_sem.jpg)

### Grading Points
* Task 1.1 - 5
* Task 1.2 - 5
* Task 2.1 - 10
* Task 2.2 - 5
* Task 2.3 - 5
* Task 2.4 - 5
* Task 2.5 - 5
* Task 2.6 - 10
* Task 2.7 - 5
* Task 2.8 - 10
* Task 2.9 - 10
* Task 2.10 - 5 
* Task 2.11 - 5
* Task 2.12 - 5
* Task 2.13 - 10

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import random
import math

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as data

from torchvision.transforms import Compose

import dataset # custom dataset for ModelNet10 and ShapeNet

# 1. Data Loading

A point cloud is usually written as $X\in\mathbb{R}^{N\times 3}$. In code, however, we use the `B x 3 x N` layout, where `B` is the batch size and `N` is the number of points in a single point cloud.
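
The two layouts differ only by a transpose of the last two axes. A minimal sketch, assuming a hypothetical batch `points_bn3` stored in the math layout `B x N x 3` (the name and the toy values are illustrative only):

```python
import torch

# Hypothetical batch of 2 point clouds with 5 points each, in the math layout B x N x 3.
points_bn3 = torch.arange(30, dtype=torch.float32).reshape(2, 5, 3)

# Swap the last two axes to obtain the B x 3 x N layout used in code.
points_b3n = points_bn3.transpose(1, 2).contiguous()

print(points_b3n.shape)
# Point n of cloud b is the row points_bn3[b, n] and the column points_b3n[b, :, n].
print(torch.equal(points_bn3[0, 4], points_b3n[0, :, 4]))
```

With this layout, the channel axis (x, y, z) sits where `nn.Conv1d` and `nn.BatchNorm1d` expect it.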

## 1.1 Jitter the position of each point with zero-mean Gaussian noise
For input $X\in\mathbb{R}^{N\times 3}$, we add independent Gaussian noise to every coordinate: $X \leftarrow X + \mathcal{N}(0, \sigma^2)$.

In [40]:
class RandomJitter(object):
    def __init__(self, sigma):
        self.sigma = sigma
        
    def __call__(self, data):
        ## hint: useful function `torch.randn` and `torch.randn_like`
        ## TASK 1.1
        ## This function takes as input a point cloud of layout `3 x N`, 
        ## and output the jittered point cloud of layout `3 x N`.
        ## out-of-place addition, so the caller's tensor is not modified
        data = data + self.sigma * torch.randn_like(data)
        
        return data

### RandomJitter test

In [41]:
data_points = torch.rand(3, 10)
data_points

tensor([[0.8555, 0.7744, 0.3557, 0.0290, 0.9944, 0.7991, 0.8695, 0.2459, 0.7106,
         0.1056],
        [0.4840, 0.7698, 0.3784, 0.5257, 0.8475, 0.6497, 0.7240, 0.7742, 0.5801,
         0.5337],
        [0.7942, 0.6461, 0.0714, 0.7804, 0.7026, 0.5663, 0.8791, 0.2466, 0.0454,
         0.3172]])

In [42]:
random_jitter = RandomJitter(1)
random_jitter(data_points)

tensor([[ 3.6268e-01, -2.7050e-01, -1.0870e+00,  1.0459e+00,  1.9071e+00,
          4.5823e-01, -5.5846e-01, -1.1910e+00, -4.1022e-01, -7.5829e-02],
        [-9.8406e-01, -1.7992e-03,  1.6487e+00, -6.1576e-02,  1.2490e+00,
          2.7764e+00,  2.6892e+00,  2.1901e+00,  7.6386e-01,  1.0697e+00],
        [ 3.4452e-01,  4.1316e-01,  2.2089e-01, -9.3460e-01,  3.3523e-01,
         -1.2419e-01,  2.1317e-01, -1.6605e+00,  4.1668e-01, -9.2074e-01]])
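
As a sanity check, the noise added by a jitter transform should have roughly zero mean and a standard deviation close to $\sigma$. The sketch below uses a hypothetical stand-alone helper `jitter` (not the class above) on a large all-zero cloud so that the statistics are easy to read off:

```python
import torch

def jitter(points, sigma):
    """Return a jittered copy of a 3 x N point cloud (hypothetical helper)."""
    return points + sigma * torch.randn_like(points)

torch.manual_seed(0)
points = torch.zeros(3, 100000)
noise = jitter(points, sigma=0.02) - points

print(noise.mean().item())  # close to 0
print(noise.std().item())   # close to 0.02
```

Because the helper returns a new tensor, `points` itself is left unchanged.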

## 1.2 Rotate the object along the z-axis randomly
For input $X\in\mathbb{R}^{N\times 3}$, we rotate all points about the z-axis (the up-axis) by an angle $\theta$.


Suppose $T$ is the transformation matrix,
$$X\leftarrow XT,$$
where $$T=\begin{bmatrix}\cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
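
Since the code stores points as columns (`3 x N`) rather than rows (`N x 3`), the same rotation is applied as $X^\intercal \leftarrow T^\intercal X^\intercal$. A small sketch under that assumption, with an arbitrary angle of 90 degrees and two made-up points:

```python
import math
import torch

theta = math.radians(90.0)
c, s = math.cos(theta), math.sin(theta)

# T as written above, acting on row vectors: X <- X T with X of size N x 3.
T = torch.tensor([[c, s, 0.0],
                  [-s, c, 0.0],
                  [0.0, 0.0, 1.0]])

X = torch.tensor([[1.0, 0.0, 2.0],
                  [0.0, 1.0, -1.0]])  # N x 3, two points

rotated_rows = X @ T            # math layout, N x 3
rotated_cols = T.t() @ X.t()    # code layout, 3 x N

print(torch.allclose(rotated_rows, rotated_cols.t()))
print(rotated_rows)  # (1, 0, 2) maps to about (0, 1, 2); z is unchanged
```

Both products describe the same rotation; only the side on which the matrix multiplies changes with the layout.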

In [33]:
import numpy as np

In [50]:
class RandomZRotate(object):
    def __init__(self, degrees):
        ## here `self.degrees` is a tuple (0, 360) which defines the range of degree
        self.degrees = degrees
        
    def __call__(self, data):
        ## TASK 1.2
        ## This function takes as input a point cloud of layout `3 x N`, 
        ## and output the rotated point cloud of layout `3 x N`.
        ##
        ## The rotation is along z-axis, and the degree is uniformly distributed
        ## between [0, 360]
        ##
        ## hint: useful functions `torch.randn`, `torch.randn_like` and `torch.matmul`
        ##
        ## Notice:
        ## Different from its math notation `N x 3`, the input has size of `3 x N`
        # sample the angle uniformly from [degrees[0], degrees[1])
        degree = self.degrees[0] + np.random.random(1).item() * (self.degrees[1] - self.degrees[0])
        sin_ = np.sin(np.deg2rad(degree))
        cos_ = np.cos(np.deg2rad(degree))
        # transpose of T, because the points are stored as columns (`3 x N`)
        rotation_matrix = np.array([[cos_, -sin_, 0], [sin_, cos_, 0], [0, 0, 1]])
        rotation_matrix = torch.from_numpy(rotation_matrix).float()
        data = torch.matmul(rotation_matrix, data)
        
        return data

### RandomZRotate test

In [51]:
data_points = torch.randn(3, 5)
data_points

tensor([[-0.9378, -1.5279, -0.6808, -0.6366, -0.8037],
        [ 0.8694,  1.7657, -0.1443, -0.8529, -1.5374],
        [ 1.4797,  1.3284,  1.0826, -0.5144,  0.3143]])

In [53]:
rndZrot = RandomZRotate((0, 0))
rndZrot(data_points)

tensor([[-0.9378, -1.5279, -0.6808, -0.6366, -0.8037],
        [ 0.8694,  1.7657, -0.1443, -0.8529, -1.5374],
        [ 1.4797,  1.3284,  1.0826, -0.5144,  0.3143]])

In [54]:
rndZrot = RandomZRotate((180, 180))
rndZrot(data_points)

tensor([[ 0.9378,  1.5279,  0.6808,  0.6366,  0.8037],
        [-0.8694, -1.7657,  0.1443,  0.8529,  1.5374],
        [ 1.4797,  1.3284,  1.0826, -0.5144,  0.3143]])

## 1.3 Load dataset ModelNet10 for Point Cloud Classification

### ModelNet10
Batches loaded from this dataset have data of shape `B x 3 x N` and labels of shape `B`.

In [55]:
# It may take some time to download and pre-process the dataset.
train_transform = Compose([RandomZRotate((0, 360)), RandomJitter(0.02)])
train_cls_dataset = dataset.ModelNet(root='./ModelNet10', transform=train_transform, train=True)
test_cls_dataset = dataset.ModelNet(root='./ModelNet10', train=False)
train_cls_loader = data.DataLoader(
    train_cls_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=1,
)
test_cls_loader = data.DataLoader(
    test_cls_dataset,
    batch_size=1,
    shuffle=False,
    num_workers=1,
)

Downloading http://vision.princeton.edu/projects/2014/3DShapeNets/ModelNet10.zip
Extracting ./ModelNet10/ModelNet10.zip
Done!
Processing...
Done!


In [56]:
print(train_cls_dataset.num_classes)

10


## ShapeNet
Batches loaded from this dataset have data of shape `B x 3 x N` and per-point targets of shape `B x N`.

Here is the list of categories:
['Airplane', 'Bag', 'Cap', 'Car', 'Chair', 'Earphone', 'Guitar', 'Knife', 'Lamp', 'Laptop', 'Motorbike', 'Mug', 'Pistol', 'Rocket', 'Skateboard', 'Table']

In [57]:
## Here as an example, we choose the category 'Airplane'
category = 'Airplane'
train_seg_dataset = dataset.ShapeNet(root='./ShapeNet', category=category, train=True)
test_seg_dataset = dataset.ShapeNet(root='./ShapeNet', category=category, train=False)
train_seg_loader = data.DataLoader(
    train_seg_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=1,
)
test_seg_loader = data.DataLoader(
    test_seg_dataset,
    batch_size=1,
    shuffle=False,
    num_workers=1,
)

Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/train_data.zip
Extracting ./ShapeNet/raw/train_data.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/train_label.zip
Extracting ./ShapeNet/raw/train_label.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/val_data.zip
Extracting ./ShapeNet/raw/val_data.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/val_label.zip
Extracting ./ShapeNet/raw/val_label.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/test_data.zip
Extracting ./ShapeNet/raw/test_data.zip
Downloading https://shapenet.cs.stanford.edu/iccv17/partseg/test_label.zip
Extracting ./ShapeNet/raw/test_label.zip
Done!
Processing...
Done.


In [58]:
print(train_seg_dataset.num_classes)

5


# 2 PointNet Architecture (Read Section 4.2 and Appendix C)
In this section, you will implement the classification and segmentation networks step by step.
![pointnet](img/pointnet.jpg)

## 2.1 Joint Alignment Network 
This mini-network takes as input a matrix of size $N \times K$, and outputs a transformation matrix of size $K \times K$. 

In programming, the input size of this module is `B x K x N` and output size is `B x K x K`.

For the shared MLP, use structure like this `(FC(64), BN, ReLU, FC(128), BN, ReLU, FC(1024), BN, ReLU)`.

For the MLP after global max pooling, use structure like this `(FC(512), BN, ReLU, FC(256), BN, ReLU, FC(K*K))`.
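
A "shared" FC layer applies the same weights to every point independently, which is what `nn.Conv1d` with kernel size 1 does on a `B x K x N` tensor. The sketch below checks this numerically by copying the convolution weights into an `nn.Linear` layer; the sizes `B, K, N, C` are arbitrary values chosen for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, K, N, C = 2, 3, 7, 64

conv = nn.Conv1d(K, C, kernel_size=1)
fc = nn.Linear(K, C)

# Copy the convolution weights (C x K x 1) into the linear layer (C x K).
with torch.no_grad():
    fc.weight.copy_(conv.weight.squeeze(-1))
    fc.bias.copy_(conv.bias)

x = torch.randn(B, K, N)
out_conv = conv(x)                               # B x C x N
out_fc = fc(x.transpose(1, 2)).transpose(1, 2)   # same FC applied to each point

print(out_conv.shape)
print(torch.allclose(out_conv, out_fc, atol=1e-5))
```

Using the convolution avoids the two transposes and keeps the channel axis where `nn.BatchNorm1d` expects it.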


In [136]:
def shared_fc(in_dimension, out_dimension, batch_norm=True):
    layers = [nn.Conv1d(in_dimension, out_dimension, 1)]
    
    if batch_norm:
        layers.append(nn.BatchNorm1d(out_dimension))

    return nn.Sequential(*layers)

In [74]:
def final_fc(in_dimension, out_dimension, batch_norm=True):
    layers = [nn.Linear(in_dimension, out_dimension)]
    
    if batch_norm:
        layers.append(nn.BatchNorm1d(out_dimension))
        
    return nn.Sequential(*layers)

In [93]:
class Transformation(nn.Module):
    def __init__(self, k=3):
        super(Transformation, self).__init__()
        
        self.k = k
        self.shared_mlp_s1 = 64
        self.shared_mlp_s2 = 128
        self.shared_mlp_s3 = 1024
        
        self.final_mlp_s1 = 512
        self.final_mlp_s2 = 256
        
        ## TASK 2.1
        
        ## define your network layers here
        ## shared mlp
        ## input size: B x K x N
        ## output size: B x 1024 x N
        ## hint: you may want to use `nn.Conv1d` here. Why?
        
        self.shared_fc1 = shared_fc(self.k, self.shared_mlp_s1)
        self.shared_fc2 = shared_fc(self.shared_mlp_s1, self.shared_mlp_s2)
        self.shared_fc3 = shared_fc(self.shared_mlp_s2, self.shared_mlp_s3)

        ## define your network layers here
        ## mlp
        ## input size: B x 1024
        ## output size: B x (K*K)
        
        self.final_fc1 = final_fc(self.shared_mlp_s3, self.final_mlp_s1)
        self.final_fc2 = final_fc(self.final_mlp_s1, self.final_mlp_s2)
        self.final_fc3 = final_fc(self.final_mlp_s2, self.k * self.k, False)
        
    
    def forward(self, x):
        B, K, N = x.shape # batch-size, dim, number of points
        ## TASK 2.1
        
        assert K == self.k

        ## forward of shared mlp
        # input - B x K x N
        # output - B x 1024 x N
        x = F.relu(self.shared_fc1(x))
        x = F.relu(self.shared_fc2(x))
        x = F.relu(self.shared_fc3(x))
        
        ## global max pooling
        # input - B x 1024 x N
        # output - B x 1024
        
        x = nn.MaxPool1d(N)(x)    
        x = x.view(B, -1)
        
        ## mlp
        # input - B x 1024
        # output - B x (K*K)
        
        x = F.relu(self.final_fc1(x))
        x = F.relu(self.final_fc2(x))
        x = self.final_fc3(x)
        
        
        ## reshape the transformation matrix to B x K x K
        identity = torch.eye(self.k, device=x.device)
        x = x.view(B, self.k, self.k) + identity[None]
        return x

In [121]:
example_transform = Transformation(3)

In [98]:
input = torch.randn(4, 3, 20)
output = example_transform(input)
print(output.size())

torch.Size([4, 3, 3])


In [123]:
example_transform = example_transform.cuda()
input = torch.randn(4, 3, 20).cuda()
output = example_transform(input)
print(output.size())

torch.Size([4, 3, 3])


## 2.2 Regularization Loss
$$L_{reg}=\|I-TT^\intercal\|^2_F$$

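The output of the `Transformation` network has size `B x K x K`. The `OrthoLoss` module has no trainable parameters; it only evaluates this penalty for each matrix and averages over the batch. Note that the implementation below takes the Frobenius norm without squaring it.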
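
To build intuition for the formula, the following sketch evaluates the squared Frobenius norm exactly as written in the equation, using a hypothetical helper `ortho_penalty`. An orthogonal matrix (here a rotation) should give a penalty of about zero, while $2I$ in three dimensions gives $\|I-4I\|_F^2 = 3\cdot 9 = 27$:

```python
import math
import torch

def ortho_penalty(t):
    """Squared Frobenius norm of I - T T^T for each matrix in a B x K x K batch."""
    k = t.shape[1]
    identity = torch.eye(k, device=t.device).unsqueeze(0)
    diff = identity - torch.bmm(t, t.transpose(1, 2))
    return diff.pow(2).sum(dim=(1, 2))

c, s = math.cos(0.3), math.sin(0.3)
rotation = torch.tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
scaled = 2.0 * torch.eye(3)

batch = torch.stack([rotation, scaled])  # 2 x 3 x 3
penalty = ortho_penalty(batch)
print(penalty)  # approximately [0, 27]
```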

In [100]:
class OrthoLoss(nn.Module):
    def __init__(self):
        super(OrthoLoss, self).__init__()
        
    def forward(self, x):
        ## hint: useful function `torch.bmm` and `torch.matmul`
        
        ## TASK 2.2
        ## compute the matrix product
        prod = torch.bmm(x, x.permute(0, 2, 1))

        norm = torch.norm(prod - torch.eye(x.shape[1], device=x.device)[None], dim=(1,2))
        return norm.mean()

In [102]:
## generate random data and test this network
example_ortholoss = OrthoLoss()

In [103]:
input = torch.randn(5, 4, 4)
output = example_ortholoss(input)
print(output)

tensor(9.6585)


## 2.3 Feature Network
In this subsection, you will be asked to implement the feature network (the top branch).

Local features are a matrix of size `B x 64 x N`, which will be used in the segmentation task.

Global features are a matrix of size `B x 1024`, which will be used in the classification task.
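
The global feature comes from a max over the point axis, so it does not depend on the order in which the points are listed. A minimal sketch with random stand-in features (the tensor `features` is made up for illustration and is not produced by the network):

```python
import torch

torch.manual_seed(0)
features = torch.randn(4, 1024, 50)  # B x 1024 x N per-point features

# Global max pooling over the point axis gives one B x 1024 descriptor per cloud.
global_feature = features.max(dim=2).values

# Shuffle the points; the pooled descriptor does not change.
perm = torch.randperm(features.shape[2])
shuffled = features[:, :, perm]

print(global_feature.shape)
print(torch.equal(global_feature, shuffled.max(dim=2).values))
```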

In [115]:
class Feature(nn.Module):
    def __init__(self, alignment=False):
        super(Feature, self).__init__()
        
        self.alignment = alignment
        self.input_shared_mlp_s1 = 64
        self.input_shared_mlp_s2 = 64
        self.feature_shared_mlp_s1 = 64
        self.feature_shared_mlp_s2 = 128
        self.feature_shared_mlp_s3 = 1024
        
        ## `input_transform` calculates the input transform matrix of size `3 x 3`
        if self.alignment:
            self.input_transform = Transformation(3)
        
        ## TASK 2.3
        ## define your network layers here
        ## local feature
        ## shared mlp
        ## input size: B x 3 x N
        ## output size: B x 64 x N
        ## hint: you may want to use `nn.Conv1d` here.
        
        self.input_shared_fc1 = shared_fc(3, self.input_shared_mlp_s1)
        self.input_shared_fc2 = shared_fc(self.input_shared_mlp_s1, self.input_shared_mlp_s2)
        
        
        ## `feature_transform` calculates the feature transform matrix of size `64 x 64`
        if self.alignment:
            self.feature_transform = Transformation(64)
        
        ## TASK 2.4
        ## define your network layers here
        ## global feature
        ## shared mlp
        ## input size: B x 64 x N
        ## output size: B x 1024 x N  
        
        self.feature_shared_fc1 = shared_fc(self.input_shared_mlp_s2, self.feature_shared_mlp_s1)
        self.feature_shared_fc2 = shared_fc(self.feature_shared_mlp_s1, self.feature_shared_mlp_s2)
        self.feature_shared_fc3 = shared_fc(self.feature_shared_mlp_s2, self.feature_shared_mlp_s3)
    
    
    def forward(self, x):
        
        B, K, N = x.shape
        
        ## apply the input transform
        if self.alignment:
            transform = self.input_transform(x)
            ## TASK 2.5
            ## apply the input transform
            x = torch.bmm(transform, x)

        ## TASK 2.3
        ## forward of shared mlp
        # input - B x K x N
        # output - B x 64 x N
        x = F.relu(self.input_shared_fc1(x))
        x = F.relu(self.input_shared_fc2(x))
        
        if self.alignment:
            transform = self.feature_transform(x)
            ## TASK 2.5
            ## apply the feature transform
            x = torch.bmm(transform, x)
        else:
            ## do not modify this line
            transform = None
        
        local_feature = x
        
        ## TASK 2.4
        ## forward of shared mlp
        # input - B x 64 x N
        # output - B x 1024 x N
        x = F.relu(self.feature_shared_fc1(x))
        x = F.relu(self.feature_shared_fc2(x))
        x = F.relu(self.feature_shared_fc3(x))
        
        ## TASK 2.4
        ## global max pooling
        # input - B x 1024 x N
        # output - B x 1024
        x = nn.MaxPool1d(N)(x)
        x = x.squeeze(2)
        
        global_feature = x
        
        ## summary:
        ## global_feature: B x 1024
        ## local_feature: B x 64 x N
        ## transform: B x K x K
        return global_feature, local_feature, transform

In [116]:
example_feature_net = Feature(alignment=True)
input = torch.randn(20, 3, 100)
global_feature, local_feature, transform = example_feature_net(input)
print(global_feature.size())
print(local_feature.size())
print(transform.size())

torch.Size([20, 1024])
torch.Size([20, 64, 100])
torch.Size([20, 64, 64])


In [118]:
example_feature_net = Feature()
input = torch.randn(20, 3, 100)
global_feature, local_feature, transform = example_feature_net(input)
print(global_feature.size())
print(local_feature.size())
if transform is not None:
    print(transform.size())
else:
    print(transform)

torch.Size([20, 1024])
torch.Size([20, 64, 100])
None


## 2.4 Classification Network
In this network, you will use the global features generated by the `Feature` network defined above.

In [128]:
class Classification(nn.Module):
    def __init__(self, num_classes, alignment=False):
        super(Classification, self).__init__()
                
        self.feature = Feature(alignment=alignment)
        
        self.input_dimension = 1024
        self.fc_s1 = 512
        self.fc_s2 = 256
        self.num_classes = num_classes
        self.dropout = nn.Dropout(0.2)
        
        ## TASK 2.6
        ## define your network layers here
        ## mlp
        ## input size: B x 1024
        ## output size: B x num_classes
        
        self.classification_fc1 = final_fc(self.input_dimension, self.fc_s1)
        self.classification_fc2 = final_fc(self.fc_s1, self.fc_s2)
        self.classification_fc3 = final_fc(self.fc_s2, self.num_classes, batch_norm=False)
        
    def forward(self, x):
        # x is the global feature matrix
        # here we don't use local feature matrix
        x, _, trans = self.feature(x)
        
        ## TASK 2.6
        ## forward of mlp
        # input - B x 1024
        # output - B x num_classes        
        x = F.relu(self.classification_fc1(x))
        x = F.relu(self.classification_fc2(x))
        x = self.classification_fc3(self.dropout(x))
        
        ## x: B x num_classes
        ## trans: B x K x K
        return x, trans

In [129]:
## generate random data and test this network
example_classification = Classification(10, alignment=True).to(torch.device("cuda"))
input = torch.randn(5, 3, 30).cuda()
scores, trans = example_classification(input)
print(scores.size())
if trans is not None:
    print(trans.size())
else:
    print(trans)

torch.Size([5, 10])
torch.Size([5, 64, 64])


### 2.4.1 Train this network on ModelNet10

In [134]:
# main train function for classification
def train_cls(train_loader, test_loader, network, optimizer, epochs, scheduler):
    reg = OrthoLoss()
    criterion = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        print('Epoch:[{:02d}/{:02d}]'.format(epoch+1, epochs))
        print('Training...')
        network.train()
        train_loss = 0
        correct = 0
        for batch, (pos, label) in enumerate(train_loader):
            network.zero_grad()
            pos, label = pos.cuda(), label.cuda()
            
            ## TASK 2.7
            ## forward propagation
            output, trans = network(pos)
            loss = criterion(output, label)
            ##########
            
            ## regularizer
            if trans is not None:
                loss += reg(trans) * 0.001

            pred = output.max(1)[1]
            correct += pred.eq(label).sum().item()

            loss.backward()
            optimizer.step()
            train_loss += loss.item()
            print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(train_loader), loss.item()), end='', flush=True)
        
        scheduler.step()
        print('\nAverage Train Loss: {:.4f}; Train Acc: {:.4f}'.format(train_loss/len(train_loader), correct/len(train_loader.dataset) * 100))
        
        print('\nTesting...')
        with torch.no_grad():
            network.eval()
            test_loss = 0
            correct = 0
            for batch, (pos, label) in enumerate(test_loader):
                pos, label = pos.cuda(), label.cuda()
    
                ## TASK 2.7
                ## forward propagation
                output, trans = network(pos)
                loss = criterion(output, label)
                ##########

                if trans is not None:
                    loss += reg(trans) * 0.001

                pred = output.max(1)[1]
                correct += pred.eq(label).sum().item()

                test_loss += loss.item()
                print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(test_loader), loss.item()), end='', flush=True)

            print('\nAverage Test Loss: {:.4f}; Test Acc: {:.4f}'.format(test_loss/len(test_loader), correct/len(test_loader.dataset) * 100))
        print('-------------------------------------------')


In [135]:
network = Classification(10, alignment=True).cuda()
epochs = 200 # you can change the value to a small number for debugging

## TASK 2.8
# see Appendix C
# choose an optimizer and an initial learning rate
optimizer = torch.optim.Adam(network.parameters(), lr=0.001)
# choose a lr scheduler
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 20, 0.5)
#######

# start training
train_cls(train_cls_loader, test_cls_loader, network, optimizer, epochs, scheduler)

Epoch:[01/200]
Training...
Iter: [250/250] Loss: 1.7127
Average Train Loss: 1.4186; Train Acc: 52.9692

Testing...
Iter: [908/908] Loss: 1.2730
Average Test Loss: 1.2000; Test Acc: 55.9471
-------------------------------------------
Epoch:[02/200]
Training...
Iter: [250/250] Loss: 2.7369
Average Train Loss: 1.0605; Train Acc: 65.7980

Testing...
Iter: [908/908] Loss: 0.83239
Average Test Loss: 0.9119; Test Acc: 70.9251
-------------------------------------------
Epoch:[03/200]
Training...
Iter: [250/250] Loss: 0.6264
Average Train Loss: 0.9249; Train Acc: 70.7592

Testing...
Iter: [908/908] Loss: 1.6432
Average Test Loss: 0.9759; Test Acc: 65.6388
-------------------------------------------
Epoch:[04/200]
Training...
Iter: [250/250] Loss: 2.0572
Average Train Loss: 0.8619; Train Acc: 71.8867

Testing...
Iter: [908/908] Loss: 0.32447
Average Test Loss: 0.8946; Test Acc: 70.9251
-------------------------------------------
Epoch:[05/200]
Training...
Iter: [250/250] Loss: 0.6128
...

Iter: [250/250] Loss: 0.2438
Average Train Loss: 0.2425; Train Acc: 91.8316

Testing...
Iter: [908/908] Loss: 0.0054
Average Test Loss: 0.5068; Test Acc: 82.4890
-------------------------------------------
Epoch:[37/200]
Training...
Iter: [250/250] Loss: 0.3414
Average Train Loss: 0.2459; Train Acc: 91.4057

Testing...
Iter: [908/908] Loss: 0.05285
Average Test Loss: 0.5456; Test Acc: 83.2599
-------------------------------------------
Epoch:[38/200]
Training...
Iter: [250/250] Loss: 0.0375
Average Train Loss: 0.2115; Train Acc: 92.7587

Testing...
Iter: [908/908] Loss: 0.00549
Average Test Loss: 0.4256; Test Acc: 85.3524
-------------------------------------------
Epoch:[39/200]
Training...
Iter: [250/250] Loss: 0.0818
Average Train Loss: 0.2386; Train Acc: 92.1824

Testing...
Iter: [908/908] Loss: 0.00681
Average Test Loss: 0.3893; Test Acc: 87.4449
-------------------------------------------
Epoch:[40/200]
Training...
Iter: [250/250] Loss: 0.0539
Average Train Loss: 0.2131; ...

Iter: [250/250] Loss: 0.0427
Average Train Loss: 0.0884; Train Acc: 96.5923

Testing...
Iter: [908/908] Loss: 0.00304
Average Test Loss: 0.3228; Test Acc: 90.0881
-------------------------------------------
Epoch:[107/200]
Training...
Iter: [250/250] Loss: 0.0079
Average Train Loss: 0.0851; Train Acc: 96.5923

Testing...
Iter: [908/908] Loss: 0.00227
Average Test Loss: 0.3131; Test Acc: 90.3084
-------------------------------------------
Epoch:[108/200]
Training...
Iter: [250/250] Loss: 0.0263
Average Train Loss: 0.0920; Train Acc: 96.2917

Testing...
Iter: [908/908] Loss: 0.00350
Average Test Loss: 0.3159; Test Acc: 90.5286
-------------------------------------------
Epoch:[109/200]
Training...
Iter: [250/250] Loss: 0.0504
Average Train Loss: 0.0879; Train Acc: 96.3919

Testing...
Iter: [908/908] Loss: 0.00283
Average Test Loss: 0.3383; Test Acc: 90.3084
-------------------------------------------
Epoch:[110/200]
Training...
Iter: [250/250] Loss: 0.0324
Average Train Loss: 0.0835; ...

Iter: [250/250] Loss: 0.0541
Average Train Loss: 0.0809; Train Acc: 96.8930

Testing...
Iter: [908/908] Loss: 0.00326
Average Test Loss: 0.3238; Test Acc: 89.9780
-------------------------------------------
Epoch:[142/200]
Training...
Iter: [250/250] Loss: 0.0131
Average Train Loss: 0.0708; Train Acc: 97.2187

Testing...
Iter: [908/908] Loss: 0.01129
Average Test Loss: 0.3099; Test Acc: 90.4185
-------------------------------------------
Epoch:[143/200]
Training...
Iter: [250/250] Loss: 0.6013
Average Train Loss: 0.0786; Train Acc: 96.9682

Testing...
Iter: [908/908] Loss: 0.01336
Average Test Loss: 0.3263; Test Acc: 90.6388
-------------------------------------------
Epoch:[144/200]
Training...
Iter: [250/250] Loss: 0.0032
Average Train Loss: 0.0766; Train Acc: 97.2438

Testing...
Iter: [908/908] Loss: 0.01301
Average Test Loss: 0.3090; Test Acc: 90.9692
-------------------------------------------
Epoch:[145/200]
Training...
Iter: [250/250] Loss: 1.9950
Average Train Loss: 0.0831; ...

Iter: [250/250] Loss: 0.2564
Average Train Loss: 0.0796; Train Acc: 97.1937

Testing...
Iter: [908/908] Loss: 0.00422
Average Test Loss: 0.2979; Test Acc: 90.9692
-------------------------------------------
Epoch:[177/200]
Training...
Iter: [250/250] Loss: 0.1255
Average Train Loss: 0.0783; Train Acc: 97.0433

Testing...
Iter: [908/908] Loss: 0.00300
Average Test Loss: 0.3094; Test Acc: 90.9692
-------------------------------------------
Epoch:[178/200]
Training...
Iter: [250/250] Loss: 0.1562
Average Train Loss: 0.0709; Train Acc: 97.3691

Testing...
Iter: [908/908] Loss: 0.00278
Average Test Loss: 0.3143; Test Acc: 90.6388
-------------------------------------------
Epoch:[179/200]
Training...
Iter: [250/250] Loss: 0.1138
Average Train Loss: 0.0782; Train Acc: 97.2438

Testing...
Iter: [908/908] Loss: 0.00346
Average Test Loss: 0.3098; Test Acc: 91.5198
-------------------------------------------
Epoch:[180/200]
Training...
Iter: [250/250] Loss: 0.0038
Average Train Loss: 0.0758; ...

### Report the best test accuracy you can get.

The best test accuracy is 91.5198%.
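
For context on the run above: with `StepLR(optimizer, 20, 0.5)` stepped once per epoch, the learning rate is halved every 20 epochs. The sketch below reproduces that schedule in closed form with plain Python; `lr_at_epoch` is a hypothetical helper, not part of PyTorch:

```python
initial_lr = 0.001
step_size = 20
gamma = 0.5

def lr_at_epoch(epoch):
    """Learning rate during 0-indexed `epoch` when the scheduler is stepped once per epoch."""
    return initial_lr * gamma ** (epoch // step_size)

for epoch in (0, 19, 20, 40, 199):
    print(epoch, lr_at_epoch(epoch))
```

By the last of the 200 epochs the rate has been halved nine times, which may explain why the test accuracy changes little late in training.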

## 2.5 Segmentation Network
In this network, you will use the global features and local features generated by the `Feature` network defined above.

The global feature matrix is of size `B x 1024` and the local feature matrix is of size `B x 64 x N`.

They need to be concatenated into a new matrix of size `B x 1088 x N` (How?). 
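
One possible way to do this, shown as a sketch with random stand-in tensors, is to repeat the global vector along a new point axis and concatenate along the channel axis. The names `global_feat` and `local_feat` are illustrative only:

```python
import torch

B, N = 2, 5
global_feat = torch.randn(B, 1024)     # global features
local_feat = torch.randn(B, 64, N)     # local features

# Repeat the global vector for every point; expand creates a view without copying.
global_tiled = global_feat.unsqueeze(2).expand(-1, -1, N)   # B x 1024 x N

# Concatenate along the channel axis: 64 + 1024 = 1088 channels per point.
x = torch.cat((local_feat, global_tiled), dim=1)            # B x 1088 x N

print(x.shape)
```

Every point then carries its own 64 local channels followed by the same 1024 global channels.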

In [151]:
# segmentation network
class Segmentation(nn.Module):
    def __init__(self, num_classes, alignment=False):
        super(Segmentation, self).__init__()
               
        self.feature = Feature(alignment=alignment)
        
        self.in_dimension = 1088
        self.shared_fc_s1 = 512
        self.shared_fc_s2 = 256
        self.shared_fc_s3 = 128
        self.shared_fc_s4 = 128
        self.num_classes = num_classes

        ## TASK 2.9
        ## shared mlp
        ## input size: B x 1088 x N
        ## output size: B x num_classes x N
        
        self.shared_fc1 = shared_fc(self.in_dimension, self.shared_fc_s1)
        self.shared_fc2 = shared_fc(self.shared_fc_s1, self.shared_fc_s2)
        self.shared_fc3 = shared_fc(self.shared_fc_s2, self.shared_fc_s3)
        self.shared_fc4 = shared_fc(self.shared_fc_s3, self.shared_fc_s4)
        self.shared_fc5 = shared_fc(self.shared_fc_s4, self.num_classes, batch_norm=False)
        
    def forward(self, x):
        g, l, trans = self.feature(x)
        
        B, _, N = l.shape
        
        ## TASK 2.10
        # concat global features and local features to a single matrix
        # g - B x 1024, global features
        # l - B x 64 x N, local features
        # x - B x 1088 x N, concatenated features
        g = g.unsqueeze(2)
        y = torch.cat(N * [g], dim=2)
        x = torch.cat((l, y), dim=1)
            
        ## TASK 2.9
        ## forward of shared mlp
        # input - B x 1088 x N
        # output - B x num_classes x N  
        x = F.relu(self.shared_fc1(x))
        x = F.relu(self.shared_fc2(x))
        x = F.relu(self.shared_fc3(x))
        x = F.relu(self.shared_fc4(x))
        x = self.shared_fc5(x)
            
        return x, trans

In [152]:
## generate random data and test this network
example_segmentation = Segmentation(10, alignment=True)
input = torch.randn(10, 3, 40)
scores, trans = example_segmentation(input)
print(scores.size())

torch.Size([10, 10, 40])


### 2.5.1 Calculating Intersection over Union (IoU) 
For 2D images, the IoU is calculated as follows:
![iou](img/iou.png)

How is it used in the literature of point clouds?
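
One common way to carry the definition over to point clouds, given here as a sketch rather than the only convention, replaces pixel regions with sets of points. For class $c$, let $P_c$ be the set of points predicted as $c$ and $G_c$ the set of points labeled as $c$ (this notation is introduced here for illustration). Then
$$\mathrm{IoU}_c=\frac{|P_c\cap G_c|}{|P_c\cup G_c|},\qquad \mathrm{mIoU}=\frac{1}{C}\sum_{c=1}^{C}\mathrm{IoU}_c,$$
where $C$ is the number of classes. When a class is absent from both the prediction and the ground truth, the union is empty and the ratio is undefined; the PointNet paper counts such a part IoU as 1.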

In [203]:
## TASK 2.11
# implement the helper functions to calculate the IoU
def get_i_and_u(pred, target, num_classes):
    """Calculate intersection and union between pred and target.
    
    pred -- B x N matrix
    target -- B x N matrix
    num_classes -- number of classes
    
    return i, u
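    i -- B x num_classes matrix, intersection, i[b, c] is the number of points in cloud b that are both predicted and labeled as class c (true positives).
    u -- B x num_classes matrix, union, u[b, c] is the number of points in cloud b that are predicted or labeled as class c; it equals 0 if and only if class c is absent from both.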
    """
    ## TASK 2.11
    ## calculate i and u here
    ## hint: useful function `F.one_hot`    
    ## hint: use element-wise logical tensor operation (`&` and `|`)
    pred_one_hot = F.one_hot(pred, num_classes)
    target_one_hot = F.one_hot(target, num_classes)
    i = torch.sum(pred_one_hot & target_one_hot, dim=1)
    u = torch.sum(pred_one_hot | target_one_hot, dim=1)

    return i, u

def get_iou(pred, target, num_classes):
    """Calculate IoU
    pred -- B x N matrix
    target -- B x N matrix
    num_classes -- number of classes
    
    return iou
    iou -- B matrix, iou[b] is the IoU of b-th point cloud in this batch
    """
    
    ## use the helper function `get_i_and_u` defined above
    i, u = get_i_and_u(pred, target, num_classes)
    
    ## TASK 2.11
    ## calculate iou
    iou_by_classes = i.double() / u.double()
    # NaN arises from 0/0 when a class is absent from both pred and target;
    # replace it with one rather than zero, so that mIoU equals one when all points are predicted correctly
    iou_by_classes[torch.isnan(iou_by_classes)] = 1 
    iou = torch.mean(iou_by_classes, dim=1)
    
    return iou

In [204]:
num_classes = 10
target_example = torch.randint(low = 0, high = num_classes, size=(5, 30))
pred_example = target_example

In [205]:
i, u = get_i_and_u(pred_example, target_example, num_classes)
i

tensor([[2, 2, 1, 4, 3, 2, 3, 3, 4, 6],
        [4, 4, 3, 1, 6, 2, 2, 3, 4, 1],
        [2, 4, 4, 5, 2, 3, 3, 4, 2, 1],
        [3, 2, 4, 5, 4, 0, 4, 4, 2, 2],
        [2, 5, 4, 1, 3, 1, 2, 4, 4, 4]])

In [206]:
u

tensor([[2, 2, 1, 4, 3, 2, 3, 3, 4, 6],
        [4, 4, 3, 1, 6, 2, 2, 3, 4, 1],
        [2, 4, 4, 5, 2, 3, 3, 4, 2, 1],
        [3, 2, 4, 5, 4, 0, 4, 4, 2, 2],
        [2, 5, 4, 1, 3, 1, 2, 4, 4, 4]])

In [207]:
iou = get_iou(pred_example, target_example, num_classes)
iou

tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
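
The test above only covers a perfect prediction. The sketch below works through an imperfect one with a class that never occurs, using a self-contained hypothetical helper `mean_iou` that follows the same convention (an empty class counts as 1). By hand: class 0 gives 1/2, class 1 gives 2/3, and class 2 is empty and counts as 1, so the mean is 13/18, about 0.7222.

```python
import torch
import torch.nn.functional as F

def mean_iou(pred, target, num_classes):
    """Per-cloud mean IoU for B x N integer label tensors; empty classes count as 1."""
    p = F.one_hot(pred, num_classes).bool()      # B x N x C
    t = F.one_hot(target, num_classes).bool()
    inter = (p & t).sum(dim=1).double()          # B x C
    union = (p | t).sum(dim=1).double()
    # Avoid 0/0 by clamping the denominator, then set empty classes to 1.
    iou = torch.where(union > 0, inter / union.clamp(min=1), torch.ones_like(union))
    return iou.mean(dim=1)

target = torch.tensor([[0, 0, 1, 1]])
pred = torch.tensor([[0, 1, 1, 1]])
result = mean_iou(pred, target, num_classes=3)
print(result)
```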

### 2.5.2 Train this network on ShapeNet
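
For segmentation, the network outputs per-point scores of shape `B x num_classes x N` and the targets have shape `B x N`. `nn.CrossEntropyLoss` accepts this pair directly and averages over all points. The sketch below, with made-up sizes and random data, checks that this equals flattening the points into the batch dimension:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
B, C, N = 2, 5, 8
logits = torch.randn(B, C, N)           # per-point class scores
labels = torch.randint(0, C, (B, N))    # per-point part labels

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)        # averaged over all B * N points

# Same value obtained by treating every point as its own sample.
flat = F.cross_entropy(logits.transpose(1, 2).reshape(-1, C), labels.reshape(-1))
print(torch.allclose(loss, flat, atol=1e-6))
```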

In [208]:
# main train function for segmentation
def train_seg(train_loader, test_loader, network, optimizer, epochs, scheduler):  
    reg = OrthoLoss()
    criterion = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        print('Epoch:[{:02d}/{:02d}]'.format(epoch+1, epochs))
        print('Training...')
        network.train()
        train_loss = 0
        correct = 0
        total = 0
        ious = []
        for batch, (pos, label) in enumerate(train_loader):
            network.zero_grad()
            pos, label = pos.cuda(), label.cuda()
            ## TASK 2.12
            ## forward propagation
            output, trans = network(pos)
            loss = criterion(output, label)
            ##########
            if trans is not None:
                loss += reg(trans) * 0.001        

            pred = output.max(1)[1]
            correct += pred.eq(label).sum().item()
            total += label.numel()

            loss.backward()
            optimizer.step()
            train_loss += loss.item()

            ious += [get_iou(pred, label, train_loader.dataset.num_classes)]
            print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(train_loader), loss.item()), end='', flush=True)
        
        scheduler.step()
        print('\nAverage Train Loss: {:.4f}; Train Acc: {:.4f}; Train mean IoU: {:.4f}'.format(train_loss/len(train_loader), correct/total * 100, torch.cat(ious, dim=0).mean().item()))

        print('\nTesting...')
        with torch.no_grad():
            network.eval()
            test_loss = 0
            correct = 0
            total = 0
            ious = []
            for batch, (pos, label) in enumerate(test_loader):
                pos, label = pos.cuda(), label.cuda()
                
                ## TASK 2.12
                ## forward propagation
                output, trans = network(pos)
                loss = criterion(output, label)
                ##########
                
                if trans is not None:
                    loss += reg(trans) * 0.001   

                pred = output.max(1)[1]
                correct += pred.eq(label).sum().item()
                total += label.numel()

                test_loss += loss.item()

                ious += [get_iou(pred, label, test_loader.dataset.num_classes)]
                print('\rIter: [{:03d}/{:03d}] Loss: {:.4f}'.format(batch+1, len(test_loader), loss.item()), end='', flush=True)

            print('\nAverage Test Loss: {:.4f}; Test Acc: {:.4f}; Test mean IoU: {:.4f}'.format(test_loss/len(test_loader), correct/total * 100, torch.cat(ious, dim=0).mean().item()))
        print('-------------------------------------------')
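The term `reg(trans) * 0.001` in `train_seg` penalizes alignment matrices that drift away from orthogonality. The PointNet paper writes this regularizer as $L_{reg} = \|I - AA^T\|_F^2$. Below is a minimal sketch of that formula, assuming `trans` is a batch of square matrices of shape `B x K x K`. The function `ortho_penalty` is hypothetical and may differ from the `OrthoLoss` defined earlier in this notebook, for example in how it reduces over the batch.

```python
import torch

def ortho_penalty(trans: torch.Tensor) -> torch.Tensor:
    """Squared Frobenius norm of (A A^T - I), averaged over a batch of K x K matrices."""
    k = trans.size(-1)
    identity = torch.eye(k, dtype=trans.dtype, device=trans.device)
    gram = torch.bmm(trans, trans.transpose(1, 2))  # B x K x K
    return ((gram - identity) ** 2).sum(dim=(1, 2)).mean()

# A rotation is orthogonal, so it incurs (almost) no penalty; a scaled rotation does.
theta = torch.tensor(0.3)
rot = torch.stack([
    torch.stack([torch.cos(theta), -torch.sin(theta)]),
    torch.stack([torch.sin(theta), torch.cos(theta)]),
]).unsqueeze(0)  # 1 x 2 x 2

print(ortho_penalty(rot).item())        # close to 0
print(ortho_penalty(2.0 * rot).item())  # about 18, since A A^T - I = 3 I for a 2 x 2 matrix
```

Averaging over the batch is a design choice made in this sketch so that the penalty does not grow with the batch size.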

In [209]:
network = Segmentation(train_seg_dataset.num_classes, alignment=True).cuda()
epochs = 200 # you can change the value to a small number for debugging

## TASK 2.13
# see Appendix C of the PointNet paper for the training details
# choose an optimizer and an initial learning rate
optimizer = torch.optim.Adam(network.parameters(), lr=0.001)
# choose a lr scheduler
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
#######

train_seg(train_seg_loader, test_seg_loader, network, optimizer, epochs, scheduler)

Epoch:[01/200]
Training...
Iter: [147/147] Loss: 0.3915
Average Train Loss: 0.6008; Train Acc: 80.4071; Train mean IoU: 0.7121

Testing...
Iter: [341/341] Loss: 0.3996
Average Test Loss: 0.5209; Test Acc: 82.4000; Test mean IoU: 0.7125
-------------------------------------------
Epoch:[02/200]
Training...
Iter: [147/147] Loss: 0.3090
Average Train Loss: 0.3779; Train Acc: 87.3634; Train mean IoU: 0.7934

Testing...
Iter: [341/341] Loss: 0.3030
Average Test Loss: 0.4238; Test Acc: 85.4502; Test mean IoU: 0.7749
-------------------------------------------
Epoch:[03/200]
Training...
Iter: [147/147] Loss: 0.3116
Average Train Loss: 0.3309; Train Acc: 88.4364; Train mean IoU: 0.8145

Testing...
Iter: [341/341] Loss: 0.3244
Average Test Loss: 0.3821; Test Acc: 86.3150; Test mean IoU: 0.7735
-------------------------------------------
Epoch:[04/200]
Training...
Iter: [147/147] Loss: 0.3136
Average Train Loss: 0.3074; Train Acc: 89.0043; Train mean IoU: 0.8224

Testing...
[...]
Iter: [341/341] Loss: 0.2322
Average Test Loss: 0.3848; Test Acc: 86.4736; Test mean IoU: 0.8033
-------------------------------------------
Epoch:[31/200]
Training...
Iter: [147/147] Loss: 0.1861
Average Train Loss: 0.1619; Train Acc: 93.5973; Train mean IoU: 0.8928

Testing...
Iter: [341/341] Loss: 0.1821
Average Test Loss: 0.2714; Test Acc: 89.9032; Test mean IoU: 0.8366
-------------------------------------------
Epoch:[32/200]
Training...
Iter: [147/147] Loss: 0.1913
Average Train Loss: 0.1605; Train Acc: 93.6273; Train mean IoU: 0.8933

Testing...
Iter: [341/341] Loss: 0.2979
Average Test Loss: 0.4015; Test Acc: 86.9112; Test mean IoU: 0.8068
-------------------------------------------
Epoch:[33/200]
Training...
Iter: [147/147] Loss: 0.2458
Average Train Loss: 0.1551; Train Acc: 93.8823; Train mean IoU: 0.8978

Testing...
Iter: [341/341] Loss: 0.2571
Average Test Loss: 0.3026; Test Acc: 89.3207; Test mean IoU: 0.8235
-------------------------------------------
Epoch:[34/200]
[...]
Iter: [147/147] Loss: 0.1291
Average Train Loss: 0.1194; Train Acc: 95.1806; Train mean IoU: 0.9168

Testing...
Iter: [341/341] Loss: 0.1678
Average Test Loss: 0.3682; Test Acc: 87.1244; Test mean IoU: 0.8149
-------------------------------------------
Epoch:[61/200]
Training...
Iter: [147/147] Loss: 0.1505
Average Train Loss: 0.1138; Train Acc: 95.4177; Train mean IoU: 0.9206

Testing...
Iter: [341/341] Loss: 0.1456
Average Test Loss: 0.3088; Test Acc: 90.2360; Test mean IoU: 0.8396
-------------------------------------------
Epoch:[62/200]
Training...
Iter: [147/147] Loss: 0.1248
Average Train Loss: 0.1111; Train Acc: 95.5234; Train mean IoU: 0.9216

Testing...
Iter: [341/341] Loss: 0.1508
Average Test Loss: 0.2894; Test Acc: 91.0390; Test mean IoU: 0.8618
-------------------------------------------
Epoch:[63/200]
Training...
Iter: [147/147] Loss: 0.1024
Average Train Loss: 0.1105; Train Acc: 95.5582; Train mean IoU: 0.9225

Testing...
Iter: [341/341] Loss: 0.2054
[...]
Iter: [341/341] Loss: 0.1716
Average Test Loss: 0.3010; Test Acc: 91.4290; Test mean IoU: 0.8630
-------------------------------------------
Epoch:[90/200]
Training...
Iter: [147/147] Loss: 0.0998
Average Train Loss: 0.0988; Train Acc: 95.9943; Train mean IoU: 0.9298

Testing...
Iter: [341/341] Loss: 0.2125
Average Test Loss: 0.3142; Test Acc: 90.9751; Test mean IoU: 0.8561
-------------------------------------------
Epoch:[91/200]
Training...
Iter: [147/147] Loss: 0.1033
Average Train Loss: 0.0981; Train Acc: 96.0231; Train mean IoU: 0.9302

Testing...
Iter: [341/341] Loss: 0.2113
Average Test Loss: 0.3108; Test Acc: 91.0272; Test mean IoU: 0.8587
-------------------------------------------
Epoch:[92/200]
Training...
Iter: [147/147] Loss: 0.0963
Average Train Loss: 0.0981; Train Acc: 96.0238; Train mean IoU: 0.9302

Testing...
Iter: [341/341] Loss: 0.1644
Average Test Loss: 0.3084; Test Acc: 91.2957; Test mean IoU: 0.8605
-------------------------------------------
Epoch:[93/200]
[...]
Training...
Iter: [147/147] Loss: 0.1165
Average Train Loss: 0.0924; Train Acc: 96.2367; Train mean IoU: 0.9337

Testing...
Iter: [341/341] Loss: 0.1439
Average Test Loss: 0.3183; Test Acc: 91.4607; Test mean IoU: 0.8625
-------------------------------------------
Epoch:[120/200]
Training...
Iter: [147/147] Loss: 0.1064
Average Train Loss: 0.0922; Train Acc: 96.2302; Train mean IoU: 0.9336

Testing...
Iter: [341/341] Loss: 0.1957
Average Test Loss: 0.3210; Test Acc: 91.1192; Test mean IoU: 0.8573
-------------------------------------------
Epoch:[121/200]
Training...
Iter: [147/147] Loss: 0.1083
Average Train Loss: 0.0917; Train Acc: 96.2607; Train mean IoU: 0.9339

Testing...
Iter: [341/341] Loss: 0.1421
Average Test Loss: 0.3102; Test Acc: 91.5824; Test mean IoU: 0.8621
-------------------------------------------
Epoch:[122/200]
Training...
Iter: [147/147] Loss: 0.1086
Average Train Loss: 0.0911; Train Acc: 96.2865; Train mean IoU: 0.9348

Testing...
Iter: [341/341] Loss: 0.1469
[...]
Iter: [341/341] Loss: 0.1399
Average Test Loss: 0.3189; Test Acc: 91.5546; Test mean IoU: 0.8623
-------------------------------------------
Epoch:[149/200]
Training...
Iter: [147/147] Loss: 0.0854
Average Train Loss: 0.0887; Train Acc: 96.3850; Train mean IoU: 0.9361

Testing...
Iter: [341/341] Loss: 0.1280
Average Test Loss: 0.3275; Test Acc: 91.4020; Test mean IoU: 0.8613
-------------------------------------------
Epoch:[150/200]
Training...
Iter: [147/147] Loss: 0.0889
Average Train Loss: 0.0888; Train Acc: 96.3859; Train mean IoU: 0.9363

Testing...
Iter: [341/341] Loss: 0.1368
Average Test Loss: 0.3229; Test Acc: 91.4140; Test mean IoU: 0.8606
-------------------------------------------
Epoch:[151/200]
Training...
Iter: [147/147] Loss: 0.0767
Average Train Loss: 0.0881; Train Acc: 96.4035; Train mean IoU: 0.9369

Testing...
Iter: [341/341] Loss: 0.1483
Average Test Loss: 0.3187; Test Acc: 91.5338; Test mean IoU: 0.8601
-------------------------------------------
Epoch:[152/200]
[...]
Iter: [147/147] Loss: 0.0759
Average Train Loss: 0.0885; Train Acc: 96.3747; Train mean IoU: 0.9357

Testing...
Iter: [341/341] Loss: 0.1567
Average Test Loss: 0.3203; Test Acc: 91.5114; Test mean IoU: 0.8609
-------------------------------------------
Epoch:[179/200]
Training...
Iter: [147/147] Loss: 0.0826
Average Train Loss: 0.0871; Train Acc: 96.4518; Train mean IoU: 0.9374

Testing...
Iter: [341/341] Loss: 0.1595
Average Test Loss: 0.3239; Test Acc: 91.4711; Test mean IoU: 0.8612
-------------------------------------------
Epoch:[180/200]
Training...
Iter: [147/147] Loss: 0.0733
Average Train Loss: 0.0878; Train Acc: 96.4042; Train mean IoU: 0.9363

Testing...
Iter: [341/341] Loss: 0.1600
Average Test Loss: 0.3203; Test Acc: 91.5330; Test mean IoU: 0.8619
-------------------------------------------
Epoch:[181/200]
Training...
Iter: [147/147] Loss: 0.0822
Average Train Loss: 0.0879; Train Acc: 96.4199; Train mean IoU: 0.9367

Testing...
Iter: [341/341] Loss: 0.1446
[...]

### Report the best test mIoU you can get.

The highest test mIoU in the training log above is 0.8630 (epoch 89).
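Reading the best value off a 200-epoch log by eye is error-prone. A minimal sketch of tracking it programmatically is shown below. It assumes the per-epoch test mIoU is made available to the caller, whereas `train_seg` as written only prints it. `BestMetric` is a hypothetical helper, not part of this notebook, and the three values fed to it are the test mIoUs of the first three epochs in the log.

```python
class BestMetric:
    """Keeps the largest value seen so far and the epoch in which it occurred."""

    def __init__(self):
        self.value = float('-inf')
        self.epoch = None

    def update(self, value, epoch):
        """Record `value` if it beats the current best; return True when it does."""
        if value > self.value:
            self.value, self.epoch = value, epoch
            return True
        return False

best = BestMetric()
for epoch, miou in enumerate([0.7125, 0.7749, 0.7735], start=1):
    best.update(miou, epoch)

print('Best test mIoU: {:.4f} (epoch {})'.format(best.value, best.epoch))
# Best test mIoU: 0.7749 (epoch 2)
```

The boolean returned by `update` could also be used to decide when to save a checkpoint of the network.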