# i3ce 2024: Workshop on Deep Learning Tools for Understanding and Modeling the Built Environment

In this workshop, we will implement a deep learning pipeline to perform semantic segmentation, i.e. point-wise classification, for 3D point clouds of buildings.
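
To make "point-wise classification" concrete, here is a minimal NumPy sketch. The scores are random placeholders rather than network outputs, and the sizes (6 points, 13 classes) are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points, num_class = 6, 13  # illustrative sizes, not taken from the dataset

# One row of class scores per point, shaped like a segmentation network's output
scores = rng.normal(size=(num_points, num_class))

# Semantic segmentation assigns every point its own label:
# pick the highest-scoring class independently for each row
labels = scores.argmax(axis=1)

print(labels.shape)  # (6,)
```

The key point is that the output has one label per input point, in contrast to object classification, which would produce a single label for the whole point cloud.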

The dataset for this workshop is taken from the Stanford 3D Indoor Scene Dataset ([S3DIS](http://buildingparser.stanford.edu/dataset.html)).

The neural network architecture will be based on [PointNet](https://arxiv.org/abs/1612.00593), *Qi et al. (2017) PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation*. The original source code for PointNet can be found [here](https://github.com/charlesq34/pointnet/blob/master/models/pointnet_cls.py).

A basic PyTorch tutorial is available [here](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html).
Working through the tutorial is optional, but it explains many of the concepts that we will cover in this workshop.

# Part 1: Setup

In [2]:
# Install the Open3D library for point cloud processing and visualization
# This step is necessary because Open3D is not included by default on Google Colab
!pip install open3d

Collecting open3d
  Downloading open3d-0.18.0-cp310-cp310-manylinux_2_27_x86_64.whl.metadata (4.2 kB)
Collecting dash>=2.6.0 (from open3d)
  Downloading dash-2.17.1-py3-none-any.whl.metadata (10 kB)
Collecting configargparse (from open3d)
  Downloading ConfigArgParse-1.7-py3-none-any.whl.metadata (23 kB)
Collecting ipywidgets>=8.0.4 (from open3d)
  Downloading ipywidgets-8.1.3-py3-none-any.whl.metadata (2.4 kB)
Collecting addict (from open3d)
  Downloading addict-2.4.0-py3-none-any.whl.metadata (1.0 kB)
Collecting pyquaternion (from open3d)
  Downloading pyquaternion-0.9.9-py3-none-any.whl.metadata (1.4 kB)
Collecting dash-html-components==2.0.0 (from dash>=2.6.0->open3d)
  Downloading dash_html_components-2.0.0-py3-none-any.whl.metadata (3.8 kB)
Collecting dash-core-components==2.0.0 (from dash>=2.6.0->open3d)
  Downloading dash_core_components-2.0.0-py3-none-any.whl.metadata (2.9 kB)
Collecting dash-table==5.0.0 (from dash>=2.6.0->open3d)
  Downloading dash_table-5.0.0-py3-none-any.w

In [1]:
# Mount a Google Drive folder so that the data files can be accessed
from google.colab import drive
from google.colab import files
import sys
drive.mount('/content/drive', force_remount=True)
%cd drive/MyDrive/i3ce 2024 DL Workshop/code/
sys.path.insert(0,'/content/drive/MyDrive/i3ce 2024 DL Workshop/code/')

Mounted at /content/drive
/content/drive/MyDrive/i3ce 2024 DL Workshop/code


In [4]:
# Import the necessary libraries and utility functions
import numpy as np
import torch
import torch.nn.functional as F
import open3d as o3d

# This file contains the model definition for PointNet
from pointnet import PointNet

# This file contains the data loader code for the S3DIS dataset
from dataloader_s3dis import SemSegDataset, class_names, class_colors

# This file contains the code for drawing 3D point clouds in a Python notebook
from utils import draw_geometries

In [5]:
# Define training parameters for deep learning
learning_rate = 2e-4
batch_size = 10
max_epochs = 1000
num_resampled_points = 1024
num_class = len(class_names)

In [6]:
# Allow the model to be trained on GPU (if the CUDA driver is available)
# Google Colab provides a T4 GPU for free accounts, whereas premium GPUs require a subscription
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Using device:', device)

# Create a PointNet model
# This PointNet consists of 5 convolution layers, 4 batch norm layers, and 1 max pooling layer
model = PointNet(num_class = num_class).to(device)
print('PointNet model:')
print(model)

# Create the Adam optimizer, an extension to the stochastic gradient descent algorithm for updating model weights
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

Using device: cuda
PointNet model:
PointNet(
  (conv1): Conv1d(9, 64, kernel_size=(1,), stride=(1,))
  (conv2): Conv1d(64, 128, kernel_size=(1,), stride=(1,))
  (conv3): Conv1d(128, 1024, kernel_size=(1,), stride=(1,))
  (conv4): Conv1d(1088, 256, kernel_size=(1,), stride=(1,))
  (conv5): Conv1d(256, 13, kernel_size=(1,), stride=(1,))
  (bn1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (bn3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (bn4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
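
The layer sizes printed above can be read as a shared per-point network followed by a max pooling step. The NumPy sketch below illustrates that idea under simplifying assumptions: random weights, no batch norm, and a single matrix standing in for `conv2` and `conv3`. It also assumes that the 1088 input channels of `conv4` come from concatenating a 64-dimensional per-point feature with the 1024-dimensional global feature, as in the segmentation network of the PointNet paper. The actual `pointnet.py` may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points = 1024

points = rng.normal(size=(num_points, 9))  # 9 input features per point
w1 = rng.normal(size=(9, 64))              # stands in for conv1 (kernel size 1)
w2 = rng.normal(size=(64, 1024))           # stands in for conv2 + conv3 (simplified)

def extract_features(pts):
    # A kernel-size-1 convolution applies the same weights to every point,
    # which is equivalent to a row-wise matrix multiplication
    local = np.maximum(pts @ w1, 0)
    high = np.maximum(local @ w2, 0)
    # Max pooling over the point axis gives one global feature vector
    return local, high.max(axis=0)

local, global_feat = extract_features(points)

# Shuffling the points does not change the global feature
_, global_shuffled = extract_features(points[rng.permutation(num_points)])
print(np.allclose(global_feat, global_shuffled))  # True

# Each point receives its local feature plus a copy of the global feature
per_point = np.concatenate([local, np.tile(global_feat, (num_points, 1))], axis=1)
print(per_point.shape)  # (1024, 1088)
```

Because the maximum is taken over all points, the global feature does not depend on the order in which the points are listed, which is why PointNet can operate on unordered point sets.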


# Part 2: Data Loading and Visualization

In [9]:
# create data loaders for the S3DIS dataset
train_dataset = SemSegDataset(root='data', area=1, N=num_resampled_points)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=1)
validation_dataset = SemSegDataset(root='data', area=2, N=num_resampled_points)
validation_dataloader = torch.utils.data.DataLoader(validation_dataset, batch_size=batch_size, shuffle=True, num_workers=1)
num_train_batches = int(np.ceil(len(train_dataset) / batch_size))
num_validation_batches = int(np.ceil(len(validation_dataset) / batch_size))

Created dataset from area=1 with 44 rooms
Created dataset from area=2 with 40 rooms
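
As a quick check of the batch counts computed above, the following sketch redoes the arithmetic in plain Python. It assumes that each dataset item corresponds to one room, so that the dataset lengths equal the room counts printed above.

```python
import math

batch_size = 10
rooms = {"train": 44, "validation": 40}  # room counts from the output above

# The last batch may hold fewer than batch_size rooms, so round up
num_batches = {name: math.ceil(n / batch_size) for name, n in rooms.items()}
print(num_batches)  # {'train': 5, 'validation': 4}

# Sizes of the individual training batches
sizes = [min(batch_size, rooms["train"] - start)
         for start in range(0, rooms["train"], batch_size)]
print(sizes)  # [10, 10, 10, 10, 4]
```

Note that the final training batch holds only 4 rooms, so a loss averaged over batches gives those rooms slightly more weight than the others.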


In [None]:
# visualize labels in training dataset by plotting colored point clouds
visualized_objects = []

# only visualize a small number of rooms to prevent memory overflow
num_rooms_to_visualize = 10
#num_rooms_to_visualize = len(train_dataset)

for i in range(num_rooms_to_visualize):
  pcd_object = o3d.geometry.PointCloud()
  pcd_object.points = o3d.utility.Vector3dVector(train_dataset.points[i][:, :3])
  # assign colors based on the ground truth class label for each point
  pcd_object.colors = o3d.utility.Vector3dVector(np.array(class_colors)[train_dataset.labels[i]])
  visualized_objects.append(pcd_object)


# Visualize the point cloud using a 3D web viewer
draw_geometries(visualized_objects, show_axes=True)
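
The coloring step above relies on NumPy integer-array indexing: indexing a palette array with an array of labels returns one color per point. Here is a small sketch with a hypothetical 3-class palette; the real `class_colors` is defined in `dataloader_s3dis` and is not shown here.

```python
import numpy as np

# Hypothetical palette: one RGB color (values in [0, 1]) per class
palette = np.array([[1.0, 0.0, 0.0],   # class 0 -> red
                    [0.0, 1.0, 0.0],   # class 1 -> green
                    [0.0, 0.0, 1.0]])  # class 2 -> blue

# One class label per point
labels = np.array([2, 0, 0, 1])

# Integer-array indexing looks up a palette row for every label
colors = palette[labels]
print(colors.shape)  # (4, 3)
print(colors[0])     # [0. 0. 1.]
```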


# Part 3: Training

In [10]:
# Run the training and evaluation loop for 1000 epochs
# Observe how the training / validation loss and accuracy values change over time
# After every 100 epochs, we will calculate the recall and precision metrics as well

for epoch in range(max_epochs):
    train_loss, train_correct, train_accuracy = 0, 0, 0
    for i, data in enumerate(train_dataloader):
        points, target = data

        # put the model in training mode
        model = model.train()

        # move this batch of data to the GPU if device is cuda
        points, target = points.to(device), target.to(device)

        # TODO: run a forward pass through the neural network and predict the outputs
        pred = model(points)

        # TODO: compare the prediction vs the target labels and determine the negative log-likelihood loss
        pred_1d = pred.view(-1, num_class)
        target_1d = target.view(-1, 1)[:, 0]
        loss = F.nll_loss(pred_1d, target_1d)

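        # TODO: perform backpropagation to update the weights of the network based on the computed loss function
        # clear gradients left over from the previous batch; PyTorch accumulates them by default
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()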

        # keep track of the accuracy of our predictions
        pred_choice = pred_1d.data.max(1)[1]
        train_loss += loss.item()
        train_correct += pred_choice.eq(target_1d).sum().item()

    train_loss /= num_train_batches
    train_accuracy = train_correct / len(train_dataset) / num_resampled_points
    if epoch % 10 == 9:
      print('[Epoch %d] train loss: %.3f accuracy: %.3f' % (epoch, train_loss, train_accuracy))

    if epoch % 100 == 99: # run validation every 100 epochs
        validation_loss, validation_correct, validation_accuracy = 0, 0, 0
        tp_per_class = [0] * num_class
        fp_per_class = [0] * num_class
        fn_per_class = [0] * num_class
        for j, data in enumerate(validation_dataloader):
            points, target = data
            points, target = points.to(device), target.to(device)
            # put the model in evaluation mode
            model = model.eval()
            with torch.no_grad():
                pred = model(points)
                pred_1d = pred.view(-1, num_class)
                target_1d = target.view(-1, 1)[:, 0]
                loss = F.nll_loss(pred_1d, target_1d)
                pred_choice = pred_1d.data.max(1)[1]
                validation_loss += loss.item()
                validation_correct += pred_choice.eq(target_1d).sum().item()
                for i in range(num_class):
                    tp_per_class[i] += ((pred_choice==i) & (target_1d==i)).sum().item()
                    fp_per_class[i] += ((pred_choice==i) & (target_1d!=i)).sum().item()
                    fn_per_class[i] += ((pred_choice!=i) & (target_1d==i)).sum().item()
        validation_loss /= num_validation_batches
        validation_accuracy = validation_correct / len(validation_dataset) / num_resampled_points
        print('[Epoch %d] validation  loss: %.3f accuracy: %.3f' % (epoch, validation_loss, validation_accuracy))
        for i in range(num_class):
            precision = 1.0 * tp_per_class[i] / (tp_per_class[i] + fp_per_class[i] + 1e-6)
            recall = 1.0 * tp_per_class[i] / (tp_per_class[i] + fn_per_class[i] + 1e-6)
            print('%10s: recall %.3f precision %.3f' % (class_names[i], recall, precision))


[Epoch 9] train loss: 2.206 accuracy: 0.294
[Epoch 19] train loss: 2.190 accuracy: 0.295
[Epoch 29] train loss: 2.273 accuracy: 0.213
[Epoch 39] train loss: 2.151 accuracy: 0.279
[Epoch 49] train loss: 2.204 accuracy: 0.297
[Epoch 59] train loss: 2.062 accuracy: 0.321
[Epoch 69] train loss: 2.044 accuracy: 0.289
[Epoch 79] train loss: 1.857 accuracy: 0.369
[Epoch 89] train loss: 1.857 accuracy: 0.402
[Epoch 99] train loss: 1.757 accuracy: 0.461
[Epoch 99] validation  loss: 3.346 accuracy: 0.360
   clutter: recall 0.000 precision 0.000
     board: recall 0.000 precision 0.000
  bookcase: recall 0.000 precision 0.000
      beam: recall 0.000 precision 0.000
     chair: recall 0.000 precision 0.000
    column: recall 0.000 precision 0.000
      door: recall 0.000 precision 0.000
      sofa: recall 0.000 precision 0.000
     table: recall 0.000 precision 0.000
    window: recall 0.000 precision 0.000
   ceiling: recall 0.057 precision 0.646
     floor: recall 0.081 precision 0.536
      wa
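
The quantities reported by the loop above can be reproduced on a toy example. The NumPy sketch below is an illustration rather than the workshop code: `nll_loss` mimics the default mean reduction of `F.nll_loss` on log-probabilities, and `precision_recall` follows the same true positive / false positive / false negative counting as the validation loop. Both function names are introduced here for illustration only.

```python
import numpy as np

def nll_loss(log_probs, targets):
    """Mean negative log-likelihood of the target class for each point."""
    return -log_probs[np.arange(len(targets)), targets].mean()

def precision_recall(pred, target, cls, eps=1e-6):
    """Precision and recall for one class; eps avoids division by zero."""
    tp = np.sum((pred == cls) & (target == cls))
    fp = np.sum((pred == cls) & (target != cls))
    fn = np.sum((pred != cls) & (target == cls))
    return tp / (tp + fp + eps), tp / (tp + fn + eps)

num_class = 13
targets = np.array([0, 1, 1, 2])

# A model that guesses uniformly over 13 classes has loss ln(13)
uniform = np.full((len(targets), num_class), np.log(1.0 / num_class))
print(round(nll_loss(uniform, targets), 3))  # 2.565

# Class 1 is predicted three times, two of which are correct
pred = np.array([1, 1, 1, 2])
precision, recall = precision_recall(pred, targets, cls=1)
print(round(precision, 3), round(recall, 3))  # 0.667 1.0
```

The uniform-guess loss of about 2.565 is a useful reference point: a training loss below that value indicates the model is doing better than guessing uniformly at random.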

In [11]:
# Save the trained model weights in Google Drive so that they can be used later
torch.save(model.state_dict(), 'pointnet.pth')

# Part 4: Testing

In [12]:
# Create data loader for the test set
# Note that the test set is distinct from the training set and the validation set
test_dataset = SemSegDataset(root='data', area=3, N=num_resampled_points)

Created dataset from area=3 with 23 rooms


In [13]:
# Load the previously saved PointNet model

import os
model_path = 'pointnet.pth'
if os.path.exists(model_path):
    print('Loading PointNet model from', model_path)
    # map_location lets weights saved on a GPU be loaded on a CPU-only runtime
    model.load_state_dict(torch.load(model_path, map_location=device))
else:
    print('Failed to load model from', model_path)
    sys.exit(1)
model.eval()

Loading PointNet model from pointnet.pth


PointNet(
  (conv1): Conv1d(9, 64, kernel_size=(1,), stride=(1,))
  (conv2): Conv1d(64, 128, kernel_size=(1,), stride=(1,))
  (conv3): Conv1d(128, 1024, kernel_size=(1,), stride=(1,))
  (conv4): Conv1d(1088, 256, kernel_size=(1,), stride=(1,))
  (conv5): Conv1d(256, 13, kernel_size=(1,), stride=(1,))
  (bn1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (bn3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (bn4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)

In [14]:
# Compute predictions on test dataset

# Iterate through each of the rooms
for i, points in enumerate(test_dataset.normalized_points):
    shuffle_idx = np.arange(len(points))
    np.random.shuffle(shuffle_idx)
    num_batches = int(np.ceil(1.0 * len(points) / num_resampled_points))
    input_points = np.zeros((1, num_resampled_points, 9), dtype=np.float32)
    predicted_labels = np.zeros(len(points), dtype=int)

    print('Processing room %d with %d batches' % (i, num_batches))
    # Iterate through each batch of 1024 points in the room point cloud
    for batch_id in range(num_batches):
        start_idx = batch_id * num_resampled_points
        end_idx = (batch_id + 1) * num_resampled_points
        valid_idx = min(len(points), end_idx)
        if end_idx <= len(points):
            # if there are sufficient points, use all of them
            input_points[0, :valid_idx-start_idx] = points[shuffle_idx[start_idx:valid_idx],:]
        else:
            # if there are insufficient points to make a batch of 1024, resample from the rest of the point cloud
            input_points[0, :valid_idx-start_idx] = points[shuffle_idx[start_idx:valid_idx],:]
            input_points[0, valid_idx-end_idx:] = points[np.random.choice(range(len(points)), end_idx-valid_idx, replace=True),:]

        with torch.no_grad():
            # TODO: run a forward pass through the neural network and predict the outputs
            pred = model(torch.from_numpy(input_points.transpose(0,2,1)).to(device))

            # TODO: determine the output class, which should be the one predicted with maximum probability
            pred = pred[0].data.max(1)[1]
            predicted_labels[shuffle_idx[start_idx:valid_idx]] = pred[:valid_idx-start_idx].cpu().numpy()

    # append the output class predictions to the array of predicted labels
    test_dataset.predicted_labels.append(predicted_labels)

Processing room 0 with 8 batches
Processing room 1 with 46 batches
Processing room 2 with 45 batches
Processing room 3 with 64 batches
Processing room 4 with 49 batches
Processing room 5 with 41 batches
Processing room 6 with 34 batches
Processing room 7 with 9 batches
Processing room 8 with 40 batches
Processing room 9 with 34 batches
Processing room 10 with 84 batches
Processing room 11 with 5 batches
Processing room 12 with 34 batches
Processing room 13 with 56 batches
Processing room 14 with 46 batches
Processing room 15 with 38 batches
Processing room 16 with 36 batches
Processing room 17 with 33 batches
Processing room 18 with 9 batches
Processing room 19 with 50 batches
Processing room 20 with 28 batches
Processing room 21 with 26 batches
Processing room 22 with 5 batches
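
The batching logic above can be summarized as: shuffle the points of a room, feed them to the network in fixed-size chunks, pad the last chunk with randomly re-drawn points, and discard the predictions for the padding. The NumPy sketch below illustrates this with a hypothetical stand-in for the network (`dummy_model`, which labels a point by the sign of its first coordinate). The names `predict_room` and `dummy_model` are introduced here for illustration and are not part of the workshop code.

```python
import numpy as np

def predict_room(points, chunk_size, predict_fn, rng):
    """Label every point exactly once using shuffled fixed-size chunks."""
    n = len(points)
    order = rng.permutation(n)
    labels = np.full(n, -1, dtype=int)  # -1 marks points not yet labeled
    for start in range(0, n, chunk_size):
        idx = order[start:start + chunk_size]
        pad = chunk_size - len(idx)
        # Pad the final chunk with re-drawn points so the input size is fixed
        padded = np.concatenate([idx, rng.integers(0, n, size=pad)])
        out = predict_fn(points[padded])
        # Keep predictions for the real points only; padding is discarded
        labels[idx] = out[:len(idx)]
    return labels

# Hypothetical stand-in for the network: label depends on the first coordinate
def dummy_model(chunk):
    return (chunk[:, 0] > 0).astype(int)

rng = np.random.default_rng(0)
room = rng.normal(size=(2500, 9))
labels = predict_room(room, 1024, dummy_model, rng)

print(int(np.ceil(len(room) / 1024)))  # 3
print(bool((labels >= 0).all()))       # True
```

A room with 2500 points therefore needs 3 chunks of 1024, and the batch counts printed above give the same kind of bound on each room's size: room 0, with 8 batches, has between 7169 and 8192 points.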


In [15]:
# visualize predictions for the test dataset by plotting colored point clouds
visualized_objects = []

# only visualize a small number of rooms to prevent memory overflow
num_rooms_to_visualize = 10
#num_rooms_to_visualize = len(test_dataset)

for i in range(num_rooms_to_visualize):
  pcd_object = o3d.geometry.PointCloud()
  pcd_object.points = o3d.utility.Vector3dVector(test_dataset.points[i][:, :3])
  # assign colors based on the predicted label for each point
  pcd_object.colors = o3d.utility.Vector3dVector(np.array(class_colors)[test_dataset.predicted_labels[i]])
  visualized_objects.append(pcd_object)

# Visualize the point cloud using a 3D web viewer
draw_geometries(visualized_objects, show_axes=True)
