## Training an Object Classification Model (PointNet) on 3D Data

In this exercise we train PointNet ([PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation](https://arxiv.org/abs/1612.00593), CVPR 2017),<br/>
a model for classification and segmentation on 3D point clouds that reported strong results when it was published.<br/>

In [1]:
from __future__ import print_function
import argparse
import os
import random
import numpy as np
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.optim as optim
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
from datasets import PartDataset
from pointnet import PointNetCls
from utils import Timer
import torch.nn.functional as F

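## Loading the PointNet Model and the ShapeNetCore Dataset
Define the hyperparameters needed for training, then load the PointNet model and the ShapeNetCore dataset. The copy used here is the part-annotation segmentation benchmark subset (see `DB_NAME`), which the loader reports as having 16 object categories.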

+ PointNet (http://stanford.edu/~rqi/pointnet/)
    - Concept<br>
        <img src="misc/pointnet_teaser.jpg" width="300"><br>
    - Architecture<br>
    <img src="misc/pointnet_architecture.jpg" width="300"><br>
    
+ ShapeNetCore Dataset (https://www.shapenet.org/)
    - ShapeNetCore is a subset of the full ShapeNet dataset with single clean 3D models and manually verified category and alignment annotations. It covers 55 common object categories with about 51,300 unique 3D models. The 12 object categories of PASCAL 3D+, a popular computer vision 3D benchmark dataset, are all covered by ShapeNetCore.
    - View examples: https://www.shapenet.org/taxonomy-viewer <br>
    <img src="misc/shapenetcore_example.PNG" width="400">
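
Point clouds in the dataset do not all contain the same number of points, but a batch needs samples of equal size, which is why `PartDataset` is given `npoints=num_points`. The sketch below shows one common way to get a fixed size: drawing points with replacement. It is an assumption about how such resampling can be done, not a copy of the `PartDataset` implementation, and `sample_fixed_size` is a hypothetical helper name.

```python
import random

def sample_fixed_size(points, npoints, rng=random):
    """Return exactly `npoints` points drawn from `points` with replacement.

    Sampling with replacement works for clouds that are smaller or larger
    than `npoints`; small clouds simply contain repeated points afterwards.
    """
    if not points:
        raise ValueError("cannot sample from an empty point cloud")
    return [points[rng.randrange(len(points))] for _ in range(npoints)]

# Two toy clouds of different sizes (hypothetical data)
rng = random.Random(0)
small_cloud = [(float(i), 0.0, 0.0) for i in range(100)]
large_cloud = [(float(i), 1.0, 0.0) for i in range(5000)]

# After resampling, both clouds have 2500 points and can share a batch
batch = [sample_fixed_size(cloud, 2500, rng) for cloud in (small_cloud, large_cloud)]
print([len(cloud) for cloud in batch])
```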
    
    

In [2]:
DB_NAME = '../dataset/shapenetcore_partanno_segmentation_benchmark_v0'

batch_size     = 32
num_points     = 2500
workers        = 4
max_epochs     = 25
save_dir       = 'cls'
resume         = ''

lr             = 1e-2
momentum       = 0.9
weight_decay   = 5e-4
lr_schedule    = [int(max_epochs*0.5)]
num_classes    = 21    # placeholder; overwritten below with len(train_dataset.classes)

log_interval   = 10

blue           = lambda x:'\033[94m' + x + '\033[0m'    # pretty log

# Manual random seed
manualSeed     = random.randint(1, 10000) # random seed; set a constant here for reproducible runs
random.seed(manualSeed)
torch.manual_seed(manualSeed)
print("Random Seed: ", manualSeed)

# Load datasets
train_dataset = PartDataset(root=DB_NAME, classification=True, npoints=num_points)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=workers)

test_dataset = PartDataset(root=DB_NAME, classification=True, train=False, npoints=num_points)
test_dataloader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=workers)

num_classes = len(train_dataset.classes)

# Note: len(dataset) is the number of samples, not the number of batches
print('[Load datasets] # of batches in train_dataset: {:}, test_dataset: {:}'.format(len(train_dataset), len(test_dataset)))
print('# of classes', num_classes)

if not os.path.exists(save_dir): 
    os.makedirs(save_dir)
    
# Load model    
classifier = PointNetCls(k=num_classes)
if resume != '':
    classifier.load_state_dict(torch.load(resume))
cudnn.benchmark = True

if torch.cuda.is_available():
    classifier = classifier.cuda()
    
# Optimizer
optimizer = optim.SGD(classifier.parameters(), lr=lr, momentum=momentum, weight_decay=weight_decay)
optim_scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=lr_schedule, gamma=0.1)

print( 'Model: {}\n'.format( classifier.__class__.__name__ ) )
print( classifier )
print( 'Optimizer: {}\n'.format( optimizer.__class__.__name__ ) )

Random Seed:  594
{'Airplane': 0, 'Bag': 1, 'Cap': 2, 'Car': 3, 'Chair': 4, 'Earphone': 5, 'Guitar': 6, 'Knife': 7, 'Lamp': 8, 'Laptop': 9, 'Motorbike': 10, 'Mug': 11, 'Pistol': 12, 'Rocket': 13, 'Skateboard': 14, 'Table': 15}
{'Airplane': 0, 'Bag': 1, 'Cap': 2, 'Car': 3, 'Chair': 4, 'Earphone': 5, 'Guitar': 6, 'Knife': 7, 'Lamp': 8, 'Laptop': 9, 'Motorbike': 10, 'Mug': 11, 'Pistol': 12, 'Rocket': 13, 'Skateboard': 14, 'Table': 15}
[Load datasets] # of batches in train_dataset: 15990, test_dataset: 1785
# of classes 16
Model: PointNetCls

PointNetCls(
  (feat): PointNetfeat(
    (stn): STN3d(
      (conv1): Conv1d(3, 64, kernel_size=(1,), stride=(1,))
      (conv2): Conv1d(64, 128, kernel_size=(1,), stride=(1,))
      (conv3): Conv1d(128, 1024, kernel_size=(1,), stride=(1,))
      (fc1): Linear(in_features=1024, out_features=512, bias=True)
      (fc2): Linear(in_features=512, out_features=256, bias=True)
      (fc3): Linear(in_features=256, out_features=9, bias=True)
      (relu): ReL

## Training the Model
The code below computes a loss from the target and the prediction (the network output), 
computes the gradients that reduce that loss, and updates the model with them. 

1. Load data 
2. Forward propagation 
3. Compute the loss
4. Backward propagation (compute gradients)
5. Model update
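
To make the loop concrete without depending on the PointNet code, here is a minimal pure-Python sketch of one training step (forward pass, loss, backward pass, update) for a tiny linear classifier. It uses the same kind of loss as the notebook, negative log-likelihood on log-probabilities, which is the input `F.nll_loss` expects. The toy data, the `train_step` helper, and the learning rate are illustrative assumptions.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def train_step(weights, x, target, lr):
    """One SGD step on a single sample for a bias-free linear classifier.

    `weights` holds one weight vector per class and is updated in place.
    Returns the loss measured before the update.
    """
    # Forward propagation: logits, then log-probabilities
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    log_probs = log_softmax(logits)
    # Loss: negative log-likelihood of the target class
    loss = -log_probs[target]
    # Backward propagation: d(loss)/d(logit_k) = softmax_k - [k == target]
    grad_logits = [math.exp(lp) - (1.0 if k == target else 0.0)
                   for k, lp in enumerate(log_probs)]
    # Model update: w <- w - lr * gradient
    for k, w in enumerate(weights):
        for i in range(len(w)):
            w[i] -= lr * grad_logits[k] * x[i]
    return loss

# Data: two toy 2-D samples, one per class (hypothetical values)
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]

losses = []
for epoch in range(50):
    epoch_loss = 0.0
    for x, target in data:
        epoch_loss += train_step(weights, x, target, lr=0.5)
    losses.append(epoch_loss / len(data))

print('first epoch loss: {:.4f}, last epoch loss: {:.4f}'.format(losses[0], losses[-1]))
```

In the notebook, autograd (`loss.backward()`) and the optimizer (`optimizer.step()`) take the place of the hand-written gradient and update.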

In [3]:
def train(epoch, net, dataloader, optimizer, cur_lr):
    
    net.train()
    
    # loss counters
    total_loss = 0    
    sum_loss = 0
    
    # timers
    _t = {'forward': Timer(), 'backward': Timer()}    
    
    # load train data
    for batch_idx, (points, target) in enumerate(dataloader):
        
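        # (batch, N, 3) -> (batch, 3, N): Conv1d expects channels before length
        points = points.transpose(2,1)
        # keep the first column so target is a 1-D tensor of class indices
        target = target[:,0]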
        
        if torch.cuda.is_available():
            points = points.cuda()            
            target = target.cuda()                            
        
        _t['forward'].tic()        
        pred, _ = net(points)
        forward_time = _t['forward'].toc(average=True)
                
        _t['backward'].tic()        
        optimizer.zero_grad()       
                
        loss = F.nll_loss(pred, target)
        loss.backward()
        optimizer.step()
        backward_time = _t['backward'].toc(average=True)
                
        sum_loss += loss.item()
        total_loss += loss.item()
        
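        # index of the highest-scoring class for each sample
        pred_choice = pred.data.max(1)[1]
        # number of correct predictions in this batch (a count, logged under 'Accuracy')
        correct = pred_choice.eq(target.data).cpu().sum()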

        if (batch_idx+1) % log_interval == 0:            
            print('[Train][Epoch {:3d}][iter {:5d}/{:5d}] Loss: {:7.4f} || Accuracy: {:5.2f} || forward {:4.2f}s, backward {:4.2f}s || lr: {:.6f}'.format(
                epoch, batch_idx, len(dataloader), \
                sum_loss/log_interval, correct.item(), \
                forward_time, backward_time, \
                cur_lr
                )
            )               
            sum_loss = 0
            
    return total_loss

In [4]:
def test(epoch, net, dataloader):
    
    net.eval()
    
    sum_loss = 0
    total_loss = 0
    
    # timers
    _t = {'forward': Timer()}    
    
    for batch_idx, (points, target) in enumerate(dataloader):
        
        points = points.transpose(2,1)
        target = target[:,0]
        
        if torch.cuda.is_available():
            points = points.cuda()            
            target = target.cuda()                
        
        _t['forward'].tic()
        with torch.no_grad():            
            pred, _ = net(points)
        forward_time = _t['forward'].toc(average=True)
    
        loss = F.nll_loss(pred, target)        
        
        sum_loss += loss.item()
        total_loss += loss.item()
        
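        # index of the highest-scoring class for each sample
        pred_choice = pred.data.max(1)[1]
        # number of correct predictions in this batch (a count, logged under 'Accuracy')
        correct = pred_choice.eq(target.data).cpu().sum()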
    
        if (batch_idx+1) % log_interval == 0:            
            print('[{:s}][Epoch {:3d}][iter {:5d}/{:5d}] Loss: {:7.4f} || Accuracy: {:5.2f} || forward {:4.2f}s'.format(
                blue('Test'), epoch, batch_idx, len(dataloader), \
                sum_loss/log_interval, correct.item(), \
                forward_time                
                )
            )               
            sum_loss = 0
    return total_loss
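
In both `train` and `test`, `correct` is the number of correct predictions in the current batch, so the value logged under "Accuracy" is a count out of `batch_size` rather than a percentage. The sketch below shows how a count and a fraction can be derived from per-class log-probabilities; it is written in plain Python with made-up scores, and `batch_accuracy` is a hypothetical helper, not part of the notebook.

```python
def argmax(values):
    """Index of the largest value in a non-empty sequence."""
    return max(range(len(values)), key=values.__getitem__)

def batch_accuracy(log_probs, targets):
    """Return (number correct, fraction correct) for one batch."""
    correct = sum(1 for row, t in zip(log_probs, targets) if argmax(row) == t)
    return correct, correct / float(len(targets))

# Hypothetical log-probabilities for a batch of 4 samples and 3 classes
log_probs = [
    [-0.1, -2.5, -3.0],
    [-1.9, -0.3, -2.2],
    [-0.2, -2.0, -2.9],
    [-2.4, -2.1, -0.2],
]
targets = [0, 1, 2, 2]

correct, accuracy = batch_accuracy(log_probs, targets)
print('correct: {:d}/{:d}, accuracy: {:.2f}%'.format(correct, len(targets), 100.0 * accuracy))
```

For an accuracy over the whole test set, the counts would be summed across batches and divided by the total number of samples, since the last batch can be smaller than `batch_size`.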

In [6]:
## DEBUG
points, target = next(iter(train_dataloader))

In [8]:
points[0]


tensor([[ 0.0384,  0.2304,  0.1898],
        [-0.1580,  0.2304,  0.1600],
        [ 0.1970,  0.2304, -0.1321],
        ...,
        [ 0.0948,  0.2304, -0.2326],
        [-0.2657,  0.0437,  0.2506],
        [-0.2313,  0.2142,  0.3044]])
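
As the output above shows, each sample is stored with one row per point and three columns (x, y, z). `nn.Conv1d` expects input laid out as (batch, channels, length), which is why `train` and `test` call `points.transpose(2,1)` before the forward pass. A minimal pure-Python sketch of the same layout change on a single cloud, using the first three printed points (`to_channels_first` is a hypothetical helper):

```python
def to_channels_first(cloud):
    """Convert an N x 3 list of [x, y, z] points into a 3 x N list (one row per coordinate)."""
    return [list(axis) for axis in zip(*cloud)]

# First three points from the printed sample
cloud = [
    [0.0384, 0.2304, 0.1898],
    [-0.1580, 0.2304, 0.1600],
    [0.1970, 0.2304, -0.1321],
]

channels_first = to_channels_first(cloud)
print(len(cloud), 'x', len(cloud[0]), '->', len(channels_first), 'x', len(channels_first[0]))
```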

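## Start Training
Train the model by looping for `max_epochs` epochs. <br>
Learning rate scheduling, which adjusts the learning rate at a given iteration (inside the train function) or at a given epoch, is also commonly used. Here `MultiStepLR` multiplies the learning rate by 0.1 at the epoch listed in `lr_schedule`.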
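
The schedule configured earlier (`lr = 1e-2`, `lr_schedule = [int(max_epochs*0.5)]`, `gamma=0.1`) can be worked out by hand. The sketch below computes the learning rate per epoch for a multi-step decay, assuming the scheduler is stepped once per epoch so that a milestone `m` takes effect from epoch `m` onward. `multistep_lr` is a hypothetical helper written for illustration, not a PyTorch function.

```python
from bisect import bisect_right

def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate in effect during `epoch` (0-based) under multi-step decay.

    The rate is multiplied by `gamma` once for every milestone <= epoch.
    """
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)

max_epochs = 25
milestones = [int(max_epochs * 0.5)]   # evaluates to [12]

schedule = [multistep_lr(1e-2, milestones, 0.1, epoch) for epoch in range(max_epochs)]
for epoch in (0, 11, 12, 24):
    print('epoch {:2d}: lr = {:.6f}'.format(epoch, schedule[epoch]))
```

Under these assumptions, epochs 0 through 11 train at 0.01 and epochs 12 through 24 at 0.001.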

In [5]:
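for epoch in range(max_epochs):
    # Learning rate currently used by the optimizer (passed to train() for logging only)
    cur_lr = optimizer.param_groups[0]['lr']
    loss_train = train( epoch, classifier, train_dataloader, optimizer, cur_lr)    
    loss_test  = test(  epoch, classifier, test_dataloader)    
    
    # Step the scheduler once per epoch, after the optimizer updates;
    # PyTorch 1.1 and later expect scheduler.step() to follow optimizer.step()
    optim_scheduler.step()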
    
    torch.save(classifier.state_dict(), '{:s}/PointNetCls_Epoch_{:03d}.pth'.format(save_dir, epoch))
    
print('done.')

[Train][Epoch   0][iter     9/  500] Loss:  2.5786 || Accuracy: 20.00                   || forward 0.04s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    19/  500] Loss:  1.9188 || Accuracy: 28.00                   || forward 0.02s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    29/  500] Loss:  1.4295 || Accuracy: 25.00                   || forward 0.02s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    39/  500] Loss:  1.0678 || Accuracy: 26.00                   || forward 0.02s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    49/  500] Loss:  0.9486 || Accuracy: 27.00                   || forward 0.01s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    59/  500] Loss:  0.7454 || Accuracy: 29.00                   || forward 0.01s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    69/  500] Loss:  0.7091 || Accuracy: 28.00                   || forward 0.01s, backward 0.01s || lr: 0.001000
[Train][Epoch   0][iter    79/  500] Loss

KeyboardInterrupt: 