# Training a semantic segmentation model using PyTorch

In this tutorial, we will learn how to train a semantic segmentation model using PyTorch.

Before you begin, ensure that you have *PyTorch* installed. To install a compatible version of PyTorch, use the requirements file:

```sh
pip install -r requirements-torch-cuda.txt
```

At a high level, we will:

- Read a dataset and create a *'training'* split. For this example, we will use the `SemanticKITTI` dataset.
- Train a model. We will train a `RandLANet` model on the *'training'* split.
- Run a test on a *'test'* split to evaluate the model.
- Run inference on a single point cloud using pre-trained weights.


## Reading a dataset

Download scripts for the datasets are available in `Open3D-ML/scripts/download_datasets`.

You can use any dataset in the `ml3d.datasets` namespace. Here, we will use the `SemanticKITTI` dataset (the visualization call is included in the code but commented out). The other datasets are loaded in the same way, but their constructor parameters may differ.

We will read the dataset by specifying its path and then get all splits.

In [1]:
# Training Semantic Segmentation Model using PyTorch

# Import the PyTorch backend of Open3D-ML
import open3d.ml.torch as ml3d

# Read a dataset by specifying its path. We also provide the cache directory and the sequences used for each split.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/root/data/kitti_odometry/',
                                      cache_dir='logs/cache',
                                      training_split=['00'],
                                      validation_split=['01'],
                                      test_split=['01'])

# Get the 'training' split. You can get the other splits by passing 'validation' or 'test'.
train_split = dataset.get_split('training')

# Support for the Open3D-ML visualizer in Jupyter notebooks is in progress.
# To view the frames with the visualizer, uncomment the following lines:
# vis = ml3d.vis.Visualizer()
# vis.visualize_dataset(dataset, 'training', indices=range(len(train_split)))

Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.


INFO - 2022-10-19 03:13:43,009 - semantickitti - Found 4541 pointclouds for training
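
Note that the cell above assigns sequence `'01'` to both the validation and the test split, so the test metrics in this tutorial are not computed on held-out data. As a minimal sketch (the helper `find_split_overlaps` is hypothetical and not part of Open3D-ML), you could check the sequence lists for overlaps before constructing the dataset:

```python
from itertools import combinations


def find_split_overlaps(splits):
    """Return {(split_a, split_b): shared_sequences} for every overlapping pair.

    `splits` maps a split name to a list of sequence IDs. This is a
    hypothetical helper for illustration, not an Open3D-ML function.
    """
    overlaps = {}
    for (name_a, seqs_a), (name_b, seqs_b) in combinations(splits.items(), 2):
        shared = sorted(set(seqs_a) & set(seqs_b))
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# The same sequence lists that are passed to SemanticKITTI in this tutorial.
tutorial_splits = {
    "training": ["00"],
    "validation": ["01"],
    "test": ["01"],
}
overlaps = find_split_overlaps(tutorial_splits)
print(overlaps)  # {('validation', 'test'): ['01']}
```

For a quick walkthrough this overlap is harmless, but for a real evaluation each split should use its own sequences.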


Now that the dataset is loaded, let us train the model.

## Training a model

First, import the desired model from `open3d.ml.torch.models`.

After loading a dataset, you can initialize any model, wrap it in a pipeline, and train it. The following example trains `RandLANet`:

In [2]:
# Training Semantic Segmentation Model using PyTorch

# Import the PyTorch backend of Open3D-ML, the model, and the pipeline used for training
import open3d.ml.torch as ml3d
from open3d.ml.torch.models import RandLANet
from open3d.ml.torch.pipelines import SemanticSegmentation

# Read a dataset by specifying its path. We also provide the cache directory and the sequences used for each split.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/root/data/kitti_odometry/',
                                      cache_dir='logs/cache',
                                      training_split=['00'],
                                      validation_split=['01'],
                                      test_split=['01'])

# Initialize the RandLANet model and the segmentation pipeline.
model = RandLANet(in_channels=3)
pipeline = SemanticSegmentation(model=model,
                                dataset=dataset,
                                max_epoch=3,
                                optimizer={'lr': 0.001},
                                num_workers=0)

# Run the training
pipeline.run_train()

INFO - 2022-10-19 03:13:53,936 - semantic_segmentation - DEVICE : cuda
INFO - 2022-10-19 03:13:53,938 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_torch/log_train_2022-10-19_03:13:53.txt
INFO - 2022-10-19 03:13:53,951 - semantickitti - Found 4541 pointclouds for train
INFO - 2022-10-19 03:13:53,954 - semantickitti - Found 1101 pointclouds for validation
INFO - 2022-10-19 03:13:53,955 - semantic_segmentation - Initializing from scratch.
INFO - 2022-10-19 03:13:53,958 - semantic_segmentation - Writing summary in train_log/00001_RandLANet_SemanticKITTI_torch.
INFO - 2022-10-19 03:13:53,958 - semantic_segmentation - Started training
INFO - 2022-10-19 03:13:53,959 - semantic_segmentation - === EPOCH 0/3 ===
training:  11%|█         | 121/1136 [04:48<40:16,  2.38s/it]


KeyboardInterrupt: 
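
The progress bar above reports 1136 iterations per epoch at roughly 2.38 s each. That count is consistent with 4541 training point clouds grouped into batches of 4. The batch size is an assumption here, since the tutorial does not set it explicitly. A small sketch of the arithmetic:

```python
import math

num_pointclouds = 4541      # from the log: "Found 4541 pointclouds for train"
batch_size = 4              # assumption: not set explicitly in this tutorial
seconds_per_iter = 2.38     # from the progress bar: "2.38s/it"

# The last, partially filled batch still counts as one iteration.
iters_per_epoch = math.ceil(num_pointclouds / batch_size)
minutes_per_epoch = iters_per_epoch * seconds_per_iter / 60

print(f"{iters_per_epoch} iterations, about {minutes_per_epoch:.0f} minutes per epoch")
```

At that rate, the three epochs requested with `max_epoch=3` would take a little over two hours on this hardware, not counting validation, which explains why the run shown here was interrupted early.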

The training checkpoints are saved in `pipeline.main_log_dir` (the default path is `./logs/Model_Dataset/`). You can use them for testing and inference.
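
To reuse a checkpoint later, you need its file path. The sketch below picks the most recent checkpoint under a log directory. It assumes that checkpoints are stored as `.pth` files somewhere below that directory and that their file names sort chronologically (for example, zero-padded epoch numbers); verify both against the contents of your own `pipeline.main_log_dir`. The helper `latest_checkpoint` is hypothetical.

```python
from pathlib import Path


def latest_checkpoint(log_dir):
    """Return the last `.pth` file under `log_dir` by file name, or None.

    Assumption: checkpoint file names sort chronologically. This helper is
    illustrative and not part of Open3D-ML.
    """
    root = Path(log_dir)
    if not root.is_dir():
        return None
    candidates = sorted(root.rglob("*.pth"), key=lambda p: p.name)
    return candidates[-1] if candidates else None


ckpt = latest_checkpoint("./logs")
if ckpt is None:
    print("No checkpoint found under ./logs")
else:
    print(f"Latest checkpoint: {ckpt}")
```

The returned path can then be passed to `pipeline.load_ckpt(ckpt_path=...)`, the call this tutorial uses later to load pre-trained weights.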

## Running a test

Next, we will evaluate the trained model on the test split by calling the `run_test()` method:

In [3]:
pipeline.run_test()

INFO - 2022-10-19 03:18:46,988 - semantic_segmentation - DEVICE : cuda
INFO - 2022-10-19 03:18:46,989 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_torch/log_test_2022-10-19_03:18:46.txt
INFO - 2022-10-19 03:18:46,993 - semantickitti - Found 1101 pointclouds for test
INFO - 2022-10-19 03:22:54,504 - semantic_segmentation - Initializing from scratch.
INFO - 2022-10-19 03:22:54,506 - semantic_segmentation - Started testing
test 0/1101: 100%|██████████| 90404/90404 [00:05<00:00, 14871.08it/s]
INFO - 2022-10-19 03:23:01,693 - semantic_segmentation - Accuracy : [0.00303951367781155, nan, nan, nan, nan, nan, nan, nan, 0.5496127929196419, nan, nan, nan, nan, 0.0, 0.00728126834997064, nan, 0.00018758206715438003, 0.0, 0.0, 0.08001730814493979]
INFO - 2022-10-19 03:23:01,694 - semantic_segmentation - IoU : [6.836204539239814e-05, nan, nan, nan, nan, nan, nan, nan, 0.3575316654454098, nan, 0.0, nan, 0.0, 0.0, 0.00728126834997064, nan, 0.00018508664368507507, 0.0, 0.0,

KeyboardInterrupt: 
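
The `nan` entries in the lists above most likely mark classes for which a metric is undefined in the evaluated frame, for example because the class has no ground-truth points. The last entry of the accuracy list equals the mean of the defined entries. The following pure-Python sketch is not Open3D-ML's implementation; it reproduces this behavior on a small, made-up confusion matrix under the usual definitions (per-class accuracy as TP / (TP + FN), IoU as TP / (TP + FP + FN)).

```python
import math


def per_class_metrics(confusion):
    """Compute per-class accuracy and IoU from a square confusion matrix.

    confusion[i][j] counts points with true class i predicted as class j.
    A metric whose denominator is zero is reported as NaN.
    """
    n = len(confusion)
    accs, ious = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true otherwise
        accs.append(tp / (tp + fn) if tp + fn else math.nan)
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else math.nan)
    return accs, ious


def nanmean(values):
    """Mean over the non-NaN entries, or NaN if there are none."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Made-up counts for three classes. Class 2 has no ground-truth points
# (its row is all zeros) but is predicted once (row 1, column 2).
confusion = [
    [8, 2, 0],
    [1, 4, 1],
    [0, 0, 0],
]
accs, ious = per_class_metrics(confusion)
accs.append(nanmean(accs))  # last entry: mean over the defined classes
ious.append(nanmean(ious))
print("Accuracy:", accs)
print("IoU:", ious)
```

In this toy matrix, class 2 has an accuracy of NaN but an IoU of 0, the same pattern that some classes show in the log above.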

## Running a pre-trained RandLANet (3D Semantic Segmentation)

In [16]:
import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d


cfg_file = "configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.RandLANet(**cfg.model)
cfg.dataset['dataset_path'] = "/root/data/kitti_odometry/"
dataset = ml3d.datasets.SemanticKITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.SemanticSegmentation(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the weights.
ckpt_folder = "pretrained_weights"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "/randlanet_semantickitti_202201071330utc.pth"
randlanet_url = "https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_semantickitti_202201071330utc.pth"
if not os.path.exists(ckpt_path):
    cmd = "wget {} -O {}".format(randlanet_url, ckpt_path)
    os.system(cmd)

# load the parameters.
pipeline.load_ckpt(ckpt_path=ckpt_path)

test_split = dataset.get_split("test")
data = test_split.get_data(0)

# run inference on a single example.
# returns dict with 'predict_labels' and 'predict_scores'.
result = pipeline.run_inference(data)

# evaluate performance on the test set; this will write logs to './logs'.
pipeline.run_test()

INFO - 2022-10-19 03:34:15,721 - semantic_segmentation - Loading checkpoint pretrained_weights/randlanet_semantickitti_202201071330utc.pth
INFO - 2022-10-19 03:34:15,827 - semantickitti - Found 20351 pointclouds for test

  accs.append(np.nanmean(accs))
INFO - 2022-10-19 03:34:25,495 - semantic_segmentation - Accuracy : [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
  ious.append(np.nanmean(ious))
INFO - 2022-10-19 03:34:25,500 - semantic_segmentation - IoU : [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
INFO - 2022-10-19 03:34:25,504 - semantic_segmentation - DEVICE : cuda
INFO - 2022-10-19 03:34:25,505 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_torch/log_test_2022-10-19_03:34:25.txt
INFO - 2022-10-19 03:34:25,552 - semantickitti - Found 20351 pointclouds for test


[A[A

[A[A

[A[A
[A

[A[A

[A[A

[A[A

[A[A

[A[A

[A[A



KeyboardInterrupt: 
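
The cell above downloads the weights by shelling out to `wget`, which is not available on every system. As a possible alternative, the sketch below uses only the Python standard library. The helper name `download_if_missing` is hypothetical and not part of Open3D-ML.

```python
import os
import urllib.request


def download_if_missing(url, path):
    """Download `url` to `path` unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    Hypothetical helper for illustration; not part of Open3D-ML.
    """
    if os.path.exists(path):
        return False
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    # Download to a temporary name first so that an interrupted transfer
    # does not leave a truncated file that looks like a valid checkpoint.
    tmp_path = path + ".part"
    urllib.request.urlretrieve(url, tmp_path)
    os.replace(tmp_path, path)
    return True
```

Calling `download_if_missing(randlanet_url, ckpt_path)` with the variables defined in the cell above would take the place of the `wget` command.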

## Running a pre-trained PointPillars (3D Object Detection)

In [4]:
import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d


cfg_file = "configs/pointpillars_kitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.PointPillars(**cfg.model)
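# NOTE: dataset_path should point to the KITTI 3D object detection data. The path below is the
# odometry data used in the earlier cells, which is likely why 0 test point clouds are found.
cfg.dataset['dataset_path'] = "/root/data/kitti_odometry/"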
dataset = ml3d.datasets.KITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the weights.
ckpt_folder = "pretrained_weights"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "/pointpillars_kitti_202012221652utc.pth"
pointpillar_url = "https://storage.googleapis.com/open3d-releases/model-zoo/pointpillars_kitti_202012221652utc.pth"
if not os.path.exists(ckpt_path):
    cmd = "wget {} -O {}".format(pointpillar_url, ckpt_path)
    os.system(cmd)

# load the parameters.
pipeline.load_ckpt(ckpt_path=ckpt_path)

test_split = dataset.get_split("test")
data = test_split.get_data(0)

# run inference on a single example.
# for object detection, this returns predicted bounding boxes rather than the labels/scores dict of the segmentation pipeline.
result = pipeline.run_inference(data)

# evaluate performance on the test set; this will write logs to './logs'.
pipeline.run_test()

INFO - 2022-10-19 03:46:08,577 - object_detection - Loading checkpoint pretrained_weights/pointpillars_kitti_202012221652utc.pth
INFO - 2022-10-19 03:46:08,616 - kitti - Found 0 pointclouds for test


IndexError: list index out of range
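
The `IndexError` above most likely follows from the preceding log line: the test split contains 0 point clouds, so `get_data(0)` has nothing to return. A defensive check before indexing gives a clearer message. The sketch below uses a stand-in class instead of a real Open3D-ML split, and it assumes that split objects support `len()`, as the `range(len(train_split))` call at the start of this tutorial suggests. `first_sample` is a hypothetical helper.

```python
class FakeSplit:
    """Stand-in for a dataset split, used only to keep this sketch runnable."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def get_data(self, idx):
        return self._items[idx]


def first_sample(split, split_name="test"):
    """Return the first sample of `split`, or raise a descriptive error."""
    if len(split) == 0:
        raise ValueError(
            f"The '{split_name}' split is empty; check that dataset_path "
            "points to the dataset this class expects.")
    return split.get_data(0)


print(first_sample(FakeSplit([{"point": [0.0, 0.0, 0.0]}])))
try:
    first_sample(FakeSplit([]))
except ValueError as err:
    print(f"Error: {err}")
```

With a real split, `data = first_sample(test_split)` would replace `data = test_split.get_data(0)` and fail early with a message that points at the dataset path.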