# Training a Semantic Segmentation Model Using TensorFlow

In this tutorial, we will learn how to train a semantic segmentation model using `TensorFlow` in a Jupyter Notebook. We assume that you are familiar with Jupyter Notebook and have created a *notebooks* folder at a location relative to the *ml3d* folder.

Before you begin, ensure that you have `TensorFlow` installed. To install a compatible version of `TensorFlow`, use the requirements file:

```sh
pip install -r requirements-tensorflow.txt
```

At a high level, we will:

- Read a dataset and get a training split. For this example, we will use the `SemanticKITTI` dataset.
- Choose a model. For this example, we will use the `RandLANet` model.
- Train a model. We will train the `RandLANet` model on the `SemanticKITTI` dataset.
- Run a test and run an inference. A test runs over the entire pre-defined test split. An inference, in contrast, runs on a single point cloud that you pass in; we will take one from the 'test' split and print the result.

## Reading a dataset

You can use any dataset available in the `ml3d.datasets` namespace. For this example, we will use the `SemanticKITTI` dataset. You can load data from any of the other datasets in the same way, but note that the constructor parameters may vary from dataset to dataset.

We will read the dataset by specifying its path and then get the training split.

```python
# Training Semantic Segmentation Model using TensorFlow

# Import the TensorFlow build of Open3D-ML
import open3d.ml.tf as ml3d

# Read a dataset by specifying its path. We also provide the cache directory
# and the sequences that make up the training split.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='./SemanticKITTI/',
                                      cache_dir='./logs/cache',
                                      training_split=['00'])

# Get the 'training' split. Pass 'validation' or 'test' to get the other splits.
train_split = dataset.get_split('training')

# Optionally, view the first frame of the training split in the visualizer
# MyVis = ml3d.vis.Visualizer()
# MyVis.visualize_dataset(dataset, 'training', indices=range(1))
```

```text
Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.

2022-04-07 10:20:31.517091: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-04-07 10:20:31.517124: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-04-07 10:20:33.389558: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-04-07 10:20:33.389589: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO - 2022-04-07 10:20:34,543 - semantickitti - Found 1 pointclouds for training
```


Now that you have loaded the dataset and its training split, let us train the model.

## Training a model

By default, `TensorFlow` maps nearly all of the GPU memory visible to the process. This may result in an out-of-memory error if some ops allocate memory independently of `TensorFlow`. To make `TensorFlow` allocate GPU memory only as the process needs it, use the following code right after importing `tensorflow`:

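```python
import tensorflow as tf

# List the GPUs that are visible to TensorFlow
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Enable memory growth on every GPU so memory is allocated on demand
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before the GPUs have been initialized
        print(e)
```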

```text
2022-04-06 16:31:11.326658: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-04-06 16:31:11.326709: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2022-04-06 16:31:11.326750: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (intel-VirtualBox): /proc/driver/nvidia/version does not exist
```


Refer to the [TensorFlow GPU guide](https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth) for more details.
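
The same guide also describes an environment variable, `TF_FORCE_GPU_ALLOW_GROWTH`, as another way to enable memory growth. The sketch below sets it from Python. Setting it before `tensorflow` is imported is the simplest way to make sure it is in place before the GPUs are initialized.

```python
import os

# Ask TensorFlow to grow GPU memory on demand instead of mapping it all.
# Set this before TensorFlow initializes the GPUs; doing it before the
# import is the simplest way to guarantee that.
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

# import tensorflow as tf  # import TensorFlow only after the variable is set
```

This variant can be convenient when you cannot edit the code that imports `tensorflow`, because the variable can also be exported in the shell that launches the notebook.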

Before you train a model, you must decide which model you want to use. For this example, we will use the `RandLANet` model. To use a model, import it from `open3d.ml.tf.models`.

After you load a dataset, you can initialize any model and then train the model. The following example shows how you can train a model:

```python
# Training Semantic Segmentation Model using TensorFlow

# Import Open3D-ML, the model to train, and the training pipeline
import open3d.ml.tf as ml3d
from open3d.ml.tf.models import RandLANet
from open3d.ml.tf.pipelines import SemanticSegmentation

# Read a dataset by specifying its path. We also provide the cache directory
# and the sequences to use for the training, validation, and test splits.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',
                                      cache_dir='./logs/cache',
                                      training_split=['00'],
                                      validation_split=['01'],
                                      test_split=['01'])

# Initialize the RandLANet model with an input dimension of 3 and no augmentation.
model = RandLANet(dim_input=3, augment={})

# Create a pipeline that trains the model on the dataset for three epochs.
pipeline = SemanticSegmentation(model=model,
                                dataset=dataset,
                                max_epoch=3,
                                optimizer={'learning_rate': 0.001})

# Run the training
pipeline.run_train()
```

```text
INFO - 2022-04-07 10:25:19,140 - semantic_segmentation - <open3d._ml3d.tf.models.randlanet.RandLANet object at 0x7fe900ea0e50>
INFO - 2022-04-07 10:25:19,142 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_tf/log_train_2022-04-07_10:25:19.txt
INFO - 2022-04-07 10:25:19,144 - semantickitti - Found 1 pointclouds for training
INFO - 2022-04-07 10:25:19,308 - semantickitti - Found 1 pointclouds for validation
INFO - 2022-04-07 10:25:19,441 - semantic_segmentation - Writing summary in train_log/00009_RandLANet_SemanticKITTI_tf.
INFO - 2022-04-07 10:25:19,444 - semantic_segmentation - Initializing from scratch.
INFO - 2022-04-07 10:25:19,445 - semantic_segmentation - === EPOCH 0/3 ===
training:   0%|                                           | 0/1 [00:00<?, ?it/s]2022-04-07 10:25:20.646027: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 46137344 exceeds 10% of free system memory.
2022-04-07 10:25:20.787692: W tensorflow/core/framework/cpu_alloc
```
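
When you experiment with different settings, it can help to keep the arguments from the training example in plain dictionaries and build the pipeline from them. The following is a minimal sketch: `build_pipeline` is a hypothetical helper, not part of Open3D-ML, and it uses only the constructor arguments shown in the training example. The Open3D-ML imports sit inside the function so that the settings can be inspected without loading the library.

```python
# Settings copied from the training example
dataset_cfg = {
    'dataset_path': 'SemanticKITTI/',
    'cache_dir': './logs/cache',
    'training_split': ['00'],
    'validation_split': ['01'],
    'test_split': ['01'],
}
model_cfg = {'dim_input': 3, 'augment': {}}
pipeline_cfg = {'max_epoch': 3, 'optimizer': {'learning_rate': 0.001}}


def build_pipeline(dataset_cfg, model_cfg, pipeline_cfg):
    """Hypothetical helper: create dataset, model, and pipeline from dicts."""
    # Imported lazily so that the dictionaries above can be edited and
    # inspected without importing Open3D-ML or TensorFlow.
    import open3d.ml.tf as ml3d
    from open3d.ml.tf.models import RandLANet
    from open3d.ml.tf.pipelines import SemanticSegmentation

    dataset = ml3d.datasets.SemanticKITTI(**dataset_cfg)
    model = RandLANet(**model_cfg)
    return SemanticSegmentation(model=model, dataset=dataset, **pipeline_cfg)


# Usage (requires Open3D-ML and the dataset on disk):
# pipeline = build_pipeline(dataset_cfg, model_cfg, pipeline_cfg)
# pipeline.run_train()
```

Changing the number of epochs or the learning rate then means editing one dictionary entry rather than the call itself.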

## Running a test

Running a test is very similar to training the model.

We can call the `run_test()` method, and it will run inference on the test split.

```python
# Run the test
pipeline.run_test()
```

```text
INFO - 2022-04-07 10:27:15,640 - semantic_segmentation - Restored from ./logs/RandLANet_SemanticKITTI_tf/checkpoint/ckpt-1
INFO - 2022-04-07 10:27:15,642 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_tf/log_test_2022-04-07_10:27:15.txt
INFO - 2022-04-07 10:27:15,651 - semantic_segmentation - Started testing
INFO - 2022-04-07 10:27:15,653 - semantickitti - Found 1 pointclouds for test
test:   0%|                                               | 0/1 [00:00<?, ?it/s]INFO - 2022-04-07 10:27:15,656 - semantic_segmentation - running inference

  0%|                                                 | 0/83500 [00:00<?, ?it/s]
 20%|██████▉                           | 16966/83500 [00:00<00:03, 18266.25it/s]
 47%|███████████████▉                  | 39272/83500 [00:01<00:02, 20370.00it/s]
 61%|████████████████████▋             | 50904/83500 [00:03<00:02, 14856.55it/s]
 87%|█████████████████████████████▋    | 72911/83500 [00:04<00:00, 17843.96it/s]
 90%|███
```

## Running an inference

An inference processes a single point cloud with the trained model and returns the predicted results. For this example, we will use the `RandLANet` model trained above.

This example reuses the pipeline, model, and dataset from our previous training example. It runs the inference on the first point cloud of the "test" split and prints the results.

```python
# Get the first point cloud from the "test" split of the SemanticKITTI dataset
test_split = dataset.get_split("test")
data = test_split.get_data(0)

# Run the inference
results = pipeline.run_inference(data)

# Print the results
print(results)
```

```text
INFO - 2022-04-07 10:29:05,675 - semantickitti - Found 1 pointclouds for test
INFO - 2022-04-07 10:29:05,678 - semantic_segmentation - running inference
100%|███████████████████████████████████| 83500/83500 [00:09<00:00, 9063.39it/s]
INFO - 2022-04-07 10:29:15,823 - semantic_segmentation - Accuracy : [0.0, nan, nan, nan, nan, nan, nan, 0.0, 0.0, 0.0, 0.0, nan, 0.10099061522419187, 0.0, 0.0, 0.0, 0.0, 0.0, 0.95, 0.08758255126868265]
INFO - 2022-04-07 10:29:15,825 - semantic_segmentation - IoU : [0.0, nan, nan, nan, nan, nan, nan, 0.0, 0.0, 0.0, 0.0, nan, 0.04242968544642075, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0012920775246514791, 0.0036434802475893524]

{'predict_labels': array([14, 14, 14, ...,  8,  8,  8]), 'predict_scores': array([[0.004246 , 0.000257 , 0.000623 , ..., 0.0004656, 0.0004363,
        0.002386 ],
       [0.00435  , 0.0002494, 0.0006337, ..., 0.0004673, 0.000422 ,
        0.002426 ],
       [0.004383 , 0.0002475, 0.0006332, ..., 0.000465 , 0.0004222,
        0.00244  ],
       ...,
       [0.02219  , 0.011086 , 0.01326  , ..., 0.01465  , 0.01413  ,
        0.01793  ],
       [0.02217  , 0.01109  , 0.01327  , ..., 0.01464  , 0.01414  ,
        0.01793  ],
       [0.02217  , 0.01109  , 0.01327  , ..., 0.01464  , 0.01414  ,
        0.01793  ]], dtype=float16)}
```
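
The `Accuracy` and `IoU` lines in the log each list 20 values. The numbers are consistent with 19 per-class values followed by their mean, where `nan` entries are skipped. A plausible reading, which the log itself does not confirm, is that `nan` marks a class that does not occur in this point cloud. The sketch below checks the mean for the accuracy values and then counts how often each label is predicted. `nan_mean` and `count_labels` are hypothetical helpers, and `demo_labels` is made-up data standing in for `results['predict_labels']`.

```python
import math
from collections import Counter


def nan_mean(values):
    """Mean of the entries that are not NaN."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid)


nan = float('nan')

# Accuracy values copied from the log above, without the last entry
per_class_acc = [0.0, nan, nan, nan, nan, nan, nan, 0.0, 0.0, 0.0, 0.0, nan,
                 0.10099061522419187, 0.0, 0.0, 0.0, 0.0, 0.0, 0.95]
logged_last_entry = 0.08758255126868265

# The mean over the non-NaN entries matches the last logged value
mean_acc = nan_mean(per_class_acc)
print(math.isclose(mean_acc, logged_last_entry, rel_tol=1e-9))


def count_labels(predict_labels):
    """Count how many points were assigned to each predicted label."""
    return Counter(int(label) for label in predict_labels)


# Hypothetical labels; in the notebook, pass results['predict_labels'] instead
demo_labels = [14, 14, 14, 8, 8, 8]
print(count_labels(demo_labels))
```

After only three epochs on a single point cloud, low values such as the ones in the log are to be expected; the helpers are meant for reading the output, not for judging the model.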
