# Training a semantic segmentation model using TensorFlow

In this tutorial, we will learn how to train a semantic segmentation model using `TensorFlow` in a Jupyter Notebook. We assume that you are familiar with Jupyter Notebook and have created a *notebooks* folder in a location relative to *ml3d*.

Before you begin, ensure that you have `TensorFlow` installed. To install a compatible version of `TensorFlow`, use the requirements file:

```sh
pip install -r requirements-tensorflow.txt
```

At a high level, we will:

- Read a dataset and create a training split. Here, we will use the `SemanticKITTI` dataset.
- Train a model. We will train a `RandLANet` model on the training split.
- Run a test on a *'test'* split to evaluate the model.
- Run inference on a point cloud taken from the dataset.

## Reading a dataset

Dataset download scripts are available in `Open3D-ML/scripts/download_datasets`.

You can use any dataset available in the `ml3d.datasets` namespace. For this example, we will use the `SemanticKITTI` dataset and visualize it. Any of the other datasets can be loaded in the same way, but note that the parameters may vary for each dataset.

We will read the dataset by specifying its path and then get all splits.

```python
# Training a semantic segmentation model using TensorFlow

# Import the TensorFlow build of Open3D-ML
import open3d.ml.tf as ml3d

# Read a dataset by specifying its path. We also provide the cache directory and the training split.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='./SemanticKITTI/',
                                      cache_dir='./logs/cache',
                                      training_split=['00'])

# Get the 'training' split. You can get the other splits by passing 'validation' or 'test'.
train_split = dataset.get_split('training')

# Optionally, view the first frame of the training split using the visualizer
# MyVis = ml3d.vis.Visualizer()
# MyVis.visualize_dataset(dataset, 'training', indices=range(1))
```
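
Before training, it can help to check what a single item of a split looks like. The sketch below assumes that `get_data()` on a split returns a dictionary of array-like entries (in Open3D-ML these are typically `point`, `feat`, and `label`, some of which may be `None`). `summarize_sample` is a hypothetical helper, not part of Open3D-ML, and the sample is a small stand-in rather than real data.

```python
def summarize_sample(data):
    """Map each key of a sample dictionary to the shape of its entry (or None)."""
    summary = {}
    for key, value in data.items():
        if value is None:
            summary[key] = None
        elif hasattr(value, "shape"):
            # NumPy arrays and tensors expose their shape directly
            summary[key] = tuple(value.shape)
        else:
            # Fall back to the length for plain Python sequences
            summary[key] = (len(value),)
    return summary


# Stand-in sample with three points; the key names are an assumption
sample = {
    "point": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [2.0, 1.0, 0.0]],
    "feat": None,
    "label": [1, 1, 4],
}
print(summarize_sample(sample))
```

With the dataset loaded, `summarize_sample(train_split.get_data(0))` would report the shapes of the first training frame, provided the entries follow the layout assumed here.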

Now that you have loaded the dataset (and optionally visualized it), let us train the model.

## Training a model

By default, `TensorFlow` maps nearly all of the GPU memory. This may result in an out-of-memory error if some ops allocate memory independently of `TensorFlow`. To make the process allocate GPU memory only as it is needed, use the following code right after importing `TensorFlow`:

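```python
import tensorflow as tf

# Enable memory growth so that TensorFlow allocates GPU memory as needed
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before the GPUs have been initialized
        print(e)
```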

Refer to the [TensorFlow GPU guide](https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth) for more details.
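
The TensorFlow GPU guide also describes an environment variable, `TF_FORCE_GPU_ALLOW_GROWTH`, as another way to turn on memory growth. A minimal sketch, assuming the variable is set before `TensorFlow` is imported in the process (the guide notes that this option is platform specific):

```python
import os

# Must run before TensorFlow initializes the GPUs, so place it at the very top of the notebook
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import TensorFlow only after the variable is set
```

This is an alternative to calling `set_memory_growth` for each GPU in code.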

First, import the desired model from `open3d.ml.tf.models`.

After you load a dataset, you can initialize any model and then train it. The following example shows how you can train `RandLANet`:

```python
# Training a semantic segmentation model using TensorFlow

# Import the TensorFlow build of Open3D-ML, the model, and the pipeline to use for training
import open3d.ml.tf as ml3d
from open3d.ml.tf.models import RandLANet
from open3d.ml.tf.pipelines import SemanticSegmentation

# Read a dataset by specifying its path. We also provide the cache directory
# and the training, validation, and test splits.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='./SemanticKITTI/',
                                      cache_dir='./logs/cache',
                                      training_split=['00'],
                                      validation_split=['01'],
                                      test_split=['01'])

# Initialize the RandLANet model with an input dimension of 3 and an empty augmentation configuration.
model = RandLANet(dim_input=3, augment={})

# Set up the pipeline to train for three epochs with a learning rate of 0.001.
pipeline = SemanticSegmentation(model=model,
                                dataset=dataset,
                                max_epoch=3,
                                optimizer={'learning_rate': 0.001})

# Run the training
pipeline.run_train()
```

The training checkpoints are saved in `pipeline.main_log_dir` (the default path is `./logs/Model_Dataset/`). You can use them for testing and inference.
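
To see what a training run wrote, you can list the files under the log directory. The sketch below uses only the standard library. `list_log_files` is a hypothetical helper, and nothing is assumed about the layout of the log directory beyond it being a regular folder. The demonstration runs on a temporary directory with placeholder file names rather than real checkpoints.

```python
import tempfile
from pathlib import Path


def list_log_files(log_dir):
    """Return the files below log_dir as sorted paths relative to log_dir."""
    root = Path(log_dir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Demonstration on a temporary directory; the file names are placeholders
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "checkpoint").mkdir()
    (Path(tmp) / "checkpoint" / "example.index").touch()
    (Path(tmp) / "log_train.txt").touch()
    demo_files = list_log_files(tmp)

print(demo_files)
```

After training, `list_log_files(pipeline.main_log_dir)` would list the files written by the run.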

## Running a test

Running a test is very similar to training the model: call the `run_test()` method, and it will run inference on the test split.

```python
# Run the test
pipeline.run_test()
```
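
Semantic segmentation results are commonly summarized with the per-class intersection over union (IoU). The following is a generic, pure-Python sketch of that metric on small label lists. It is not the code that `run_test()` uses internally, `per_class_iou` is a hypothetical helper, and the labels are made up for illustration.

```python
def per_class_iou(true_labels, pred_labels, num_classes):
    """Return the IoU of each class, or None for classes absent from both inputs."""
    ious = []
    for cls in range(num_classes):
        # Points where both the ground truth and the prediction are this class
        intersection = sum(
            1 for t, p in zip(true_labels, pred_labels) if t == cls and p == cls
        )
        # Points where either the ground truth or the prediction is this class
        union = sum(
            1 for t, p in zip(true_labels, pred_labels) if t == cls or p == cls
        )
        ious.append(intersection / union if union else None)
    return ious


true_labels = [0, 0, 1, 1, 2]
pred_labels = [0, 1, 1, 1, 2]
ious = per_class_iou(true_labels, pred_labels, num_classes=3)
print(ious)
```

Averaging the values that are not `None` gives the mean IoU over the classes that occur.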

## Running an inference

Inference processes a point cloud with the trained model and returns the results. For this example, we will use the `RandLANet` model trained above.

This example reuses the pipeline, model, and dataset from our previous training example. It runs inference on the first item of the "test" split and prints the results.

```python
# Get data from the SemanticKITTI dataset using the "test" split
test_split = dataset.get_split("test")
data = test_split.get_data(0)

# Run the inference
results = pipeline.run_inference(data)

# Print the results
print(results)
```
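
Printing the raw results can produce a lot of output for a full point cloud. A more compact view is to count how many points were assigned to each class. This sketch assumes that the results dictionary holds the per-point class indices under a `predict_labels` key. `label_histogram` is a hypothetical helper, and the stand-in results below are made up for illustration.

```python
from collections import Counter


def label_histogram(predict_labels):
    """Count how many points were assigned to each class index."""
    counts = Counter(int(label) for label in predict_labels)
    return dict(sorted(counts.items()))


# Stand-in for the output of the inference; the key name is an assumption
fake_results = {"predict_labels": [3, 3, 0, 7, 3, 0]}
histogram = label_histogram(fake_results["predict_labels"])
print(histogram)
```

With real results, `label_histogram(results['predict_labels'])` would give the class distribution of the frame, provided that key exists in the returned dictionary.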