# Config files

This tutorial shows basic usage of config files.

## Basic format
We use [YAML format](https://en.wikipedia.org/wiki/YAML) for defining our config files. A config file in Open3D-ML always has the following parameters:

1. `dataset`: Contains key-value pairs related to the training, validation and test datasets.
2. `model`: Contains key-value pairs related to the model architecture.
3. `pipeline`: Contains key-value pairs related to the training, testing and inference pipeline.

For example, the `randlanet_semantickitti.yml` config file uses the RandLA-Net *model* architecture, the SemanticKITTI *dataset* and the semantic segmentation *pipeline*. Let us have a look at its contents as a raw text file.

In [1]:
cfg_file = "../../../ml3d/configs/randlanet_semantickitti.yml"
%pycat {cfg_file}

<div class="alert alert-info">
**Note:** You may find such config files at <a href="https://github.com/isl-org/Open3D-ML/tree/master/ml3d/configs">https://github.com/isl-org/Open3D-ML/tree/master/ml3d/configs</a>
</div>
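
The three top-level sections listed above can also be checked programmatically once a config has been parsed into a dictionary. The snippet below is only a sketch: `missing_sections` is a hypothetical helper that is not part of Open3D-ML, and `example_cfg` is a hand-written stand-in with illustrative values rather than the contents of a real config file.

```python
# Hypothetical helper, not part of Open3D-ML.
REQUIRED_SECTIONS = ("dataset", "model", "pipeline")


def missing_sections(cfg_dict):
    """Return the required top-level sections that are absent from cfg_dict."""
    return [name for name in REQUIRED_SECTIONS if name not in cfg_dict]


# Hand-written stand-in for a parsed config; the values are illustrative only.
example_cfg = {
    "dataset": {"name": "SemanticKITTI", "use_cache": True},
    "model": {"name": "RandLANet"},
}

print(missing_sections(example_cfg))  # ['pipeline']
```

An empty result means that all three sections are present.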

### Reading a config file

To access the key-value pairs, we must first load the file into memory as a `Config` object. A `Config` object is used in much the same way as a standard Python dictionary (`dict`).

In [2]:
import open3d.ml as _ml3d

cfg = _ml3d.utils.Config.load_from_file(cfg_file)

Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.


`_ml3d.utils.Config.load_from_file(cfg_file)` takes the path to a config file as input and returns a `Config` object.

<div class="alert alert-info">
**Note:**  To avoid `FileNotFoundError`, always make sure that the config file exists at the given location. The config file must be a valid YAML file!  </div>
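
One way to follow the advice in the note is to check the path before loading. The sketch below uses only the standard library; `resolve_config_path` is a hypothetical helper name, and the file name in the demonstration is made up and assumed not to exist.

```python
from pathlib import Path


def resolve_config_path(path):
    """Return the absolute path, or raise FileNotFoundError with a clear message."""
    p = Path(path).expanduser().resolve()
    if not p.is_file():
        raise FileNotFoundError(f"Config file not found: {p}")
    return p


# Demonstration with a made-up file name that is assumed not to exist.
try:
    resolve_config_path("no_such_config_file.yml")
except FileNotFoundError as err:
    message = str(err)
    print(message)
```

The error message contains the absolute path, which helps when a relative path such as the one in cell `In [1]` is resolved from an unexpected working directory.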

## Accessing parameters

The built-in function `vars` returns all the attributes of an object as a dictionary:

In [3]:
vars(cfg)

{'_cfg_dict': {'dataset': {'name': 'SemanticKITTI',
   'dataset_path': None,
   'cache_dir': './logs/cache',
   'class_weights': [55437630,
    320797,
    541736,
    2578735,
    3274484,
    552662,
    184064,
    78858,
    240942562,
    17294618,
    170599734,
    6369672,
    230413074,
    101130274,
    476491114,
    9833174,
    129609852,
    4506626,
    1168181],
   'test_result_folder': './test',
   'test_split': ['11',
    '12',
    '13',
    '14',
    '15',
    '16',
    '17',
    '18',
    '19',
    '20',
    '21'],
   'training_split': ['00',
    '01',
    '02',
    '03',
    '04',
    '05',
    '06',
    '07',
    '09',
    '10'],
   'all_split': ['00',
    '01',
    '02',
    '03',
    '04',
    '05',
    '06',
    '07',
    '09',
    '08',
    '10',
    '11',
    '12',
    '13',
    '14',
    '15',
    '16',
    '17',
    '18',
    '19',
    '20',
    '21'],
   'validation_split': ['08'],
   'use_cache': True,
   'sampler': {'name': 'SemSegRandomSampler'}},
  'm

You can list all the keys at the topmost level using:

In [4]:
cfg.keys()

dict_keys(['dataset', 'model', 'pipeline'])

Here, you can see the three essential components.

You can access configuration values as object attributes (`cfg.{property_name}`) or dictionary key values (`cfg['{property_name}']`).

For example, the `dataset` dictionary can be accessed using the following code (the same can be done for `model` and `pipeline`):

In [5]:
cfg.dataset

{'name': 'SemanticKITTI',
 'dataset_path': None,
 'cache_dir': './logs/cache',
 'class_weights': [55437630,
  320797,
  541736,
  2578735,
  3274484,
  552662,
  184064,
  78858,
  240942562,
  17294618,
  170599734,
  6369672,
  230413074,
  101130274,
  476491114,
  9833174,
  129609852,
  4506626,
  1168181],
 'test_result_folder': './test',
 'test_split': ['11',
  '12',
  '13',
  '14',
  '15',
  '16',
  '17',
  '18',
  '19',
  '20',
  '21'],
 'training_split': ['00',
  '01',
  '02',
  '03',
  '04',
  '05',
  '06',
  '07',
  '09',
  '10'],
 'all_split': ['00',
  '01',
  '02',
  '03',
  '04',
  '05',
  '06',
  '07',
  '09',
  '08',
  '10',
  '11',
  '12',
  '13',
  '14',
  '15',
  '16',
  '17',
  '18',
  '19',
  '20',
  '21'],
 'validation_split': ['08'],
 'use_cache': True,
 'sampler': {'name': 'SemSegRandomSampler'}}

Another way to access the `dataset` parameters is to index the config like a built-in `dict`:

In [6]:
cfg['dataset']

{'name': 'SemanticKITTI',
 'dataset_path': None,
 'cache_dir': './logs/cache',
 'class_weights': [55437630,
  320797,
  541736,
  2578735,
  3274484,
  552662,
  184064,
  78858,
  240942562,
  17294618,
  170599734,
  6369672,
  230413074,
  101130274,
  476491114,
  9833174,
  129609852,
  4506626,
  1168181],
 'test_result_folder': './test',
 'test_split': ['11',
  '12',
  '13',
  '14',
  '15',
  '16',
  '17',
  '18',
  '19',
  '20',
  '21'],
 'training_split': ['00',
  '01',
  '02',
  '03',
  '04',
  '05',
  '06',
  '07',
  '09',
  '10'],
 'all_split': ['00',
  '01',
  '02',
  '03',
  '04',
  '05',
  '06',
  '07',
  '09',
  '08',
  '10',
  '11',
  '12',
  '13',
  '14',
  '15',
  '16',
  '17',
  '18',
  '19',
  '20',
  '21'],
 'validation_split': ['08'],
 'use_cache': True,
 'sampler': {'name': 'SemSegRandomSampler'}}

Accessing individual parameters can be done with either `cfg.{property_name}.{property_name}` or `cfg['{property_name}']['{property_name}']` syntax. Inner levels can be accessed using the same idea.

Let's try to access `dataset -> sampler`:

In [7]:
cfg.dataset.sampler

{'name': 'SemSegRandomSampler'}

Another approach:

In [8]:
cfg['dataset']['sampler']

{'name': 'SemSegRandomSampler'}
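
To make the two access styles used in this section more concrete, here is a minimal sketch of a dictionary that also supports attribute access. `AttrDict` is a hypothetical class written for illustration; it is not how the Open3D-ML `Config` class is implemented, and `demo_cfg` holds only a small hand-copied part of the config.

```python
class AttrDict(dict):
    """Hypothetical dict subclass that allows d.key as well as d['key']."""

    def __init__(self, mapping=()):
        super().__init__()
        # Convert nested dictionaries so that dotted access works at every level.
        for key, value in dict(mapping).items():
            self[key] = AttrDict(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dictionary item.
        self[name] = value


demo_cfg = AttrDict(
    {"dataset": {"name": "SemanticKITTI", "sampler": {"name": "SemSegRandomSampler"}}}
)

print(demo_cfg.dataset.sampler)        # {'name': 'SemSegRandomSampler'}
print(demo_cfg["dataset"]["sampler"])  # {'name': 'SemSegRandomSampler'}
```

Both lines read the same underlying object, which is why the two notations can be mixed freely.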

## Mutating Parameters

If you want to change the keys or values of a loaded config, you may use dictionary syntax as shown below:

In [9]:
cfg['dataset']['sampler'] = "NewSampler"

print(cfg['dataset']['sampler'])

NewSampler


We may use object dot notation too. Let us change `dataset_path` using it.

In [10]:
cfg.dataset.dataset_path = "./SemanticKITTI"

print(cfg.dataset.dataset_path)

./SemanticKITTI


<div class="alert alert-info">
    **Note:** Original YAML file on disk remains unchanged!
</div>
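
The point of the note can be illustrated with the standard library alone. In the sketch below, JSON is used instead of YAML only so that no third-party parser is needed, and the file name `demo_config.json` is made up for the demonstration; the same reasoning applies to any file that is parsed into an in-memory object.

```python
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_config.json")

    # Write a tiny config file to disk.
    with open(path, "w") as f:
        json.dump({"dataset": {"dataset_path": None}}, f)

    # Load it into memory and change the in-memory copy only.
    with open(path) as f:
        in_memory = json.load(f)
    in_memory["dataset"]["dataset_path"] = "./SemanticKITTI"

    # Read the file again: the stored value is still the original one.
    with open(path) as f:
        on_disk = json.load(f)

print(in_memory["dataset"]["dataset_path"])  # ./SemanticKITTI
print(on_disk["dataset"]["dataset_path"])    # None
```

Changes become permanent only if the modified object is explicitly written back to a file.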

## Building Dataset Component

Look at the code snippet below:

```py
import open3d.ml.torch as ml3d

# Read a dataset by specifying the path, cache directory, training split etc.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',
                                      cache_dir='./logs/cache',
                                      training_split=['00'],
                                      validation_split=['01'],
                                      test_split=['01'])
```

The `dataset` object is created by explicitly passing **dataset-specific parameters** to the constructor of the `ml3d.datasets.SemanticKITTI` class. Instead of passing these parameters manually one by one, we may use the config file as shown below:

In [11]:
import open3d.ml.torch as ml3d

dataset = ml3d.datasets.SemanticKITTI(**cfg.dataset)


--------------------------------------------------------------------------------

 Using the Open3D PyTorch ops with CUDA 11 may have stability issues!

 We recommend to compile PyTorch from source with compile flags
   '-Xcompiler -fno-gnu-unique'

 or use the PyTorch wheels at
   https://github.com/isl-org/open3d_downloads/releases/tag/torch1.8.2


 Ignore this message if PyTorch has been compiled with the aforementioned
 flags.

 See https://github.com/isl-org/Open3D/issues/3324 and
 https://github.com/pytorch/pytorch/issues/52663 for more information on this
 problem.

--------------------------------------------------------------------------------
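
The `**cfg.dataset` expression above relies on standard Python keyword-argument unpacking: every key of the mapping becomes a keyword argument of the call. The sketch below shows the mechanism with a hypothetical `DemoDataset` class, which is not an Open3D-ML class, and a hand-written `dataset_cfg` dictionary.

```python
class DemoDataset:
    """Hypothetical stand-in for a dataset class; not an Open3D-ML class."""

    def __init__(self, dataset_path, cache_dir="./logs/cache", use_cache=False, **kwargs):
        self.dataset_path = dataset_path
        self.cache_dir = cache_dir
        self.use_cache = use_cache
        # Keys without a matching named parameter are collected here.
        self.extra = kwargs


dataset_cfg = {
    "dataset_path": "./SemanticKITTI",
    "cache_dir": "./logs/cache",
    "use_cache": True,
    "test_result_folder": "./test",
}

# The two calls below are equivalent.
a = DemoDataset(**dataset_cfg)
b = DemoDataset(
    dataset_path="./SemanticKITTI",
    cache_dir="./logs/cache",
    use_cache=True,
    test_result_folder="./test",
)

print(vars(a) == vars(b))  # True
```

Without the `**kwargs` parameter, a key such as `test_result_folder` that has no matching named parameter would raise a `TypeError`.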



Look at what properties the newly-created `dataset` object exposes with the Python `vars()` function:

In [12]:
vars(dataset)

{'cfg': <open3d._ml3d.utils.config.Config at 0x7fbbe9f96110>,
 'name': 'SemanticKITTI',
 'rng': Generator(PCG64) at 0x7FBBE9D225F0,
 'label_to_names': {0: 'unlabeled',
  1: 'car',
  2: 'bicycle',
  3: 'motorcycle',
  4: 'truck',
  5: 'other-vehicle',
  6: 'person',
  7: 'bicyclist',
  8: 'motorcyclist',
  9: 'road',
  10: 'parking',
  11: 'sidewalk',
  12: 'other-ground',
  13: 'building',
  14: 'fence',
  15: 'vegetation',
  16: 'trunk',
  17: 'terrain',
  18: 'pole',
  19: 'traffic-sign'},
 'num_classes': 20,
 'remap_lut_val': array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  0,  5,  0,  3,  5,
         0,  4,  0,  5,  0,  0,  0,  0,  0,  0,  0,  0,  0,  6,  7,  8,  0,
         0,  0,  0,  0,  0,  0,  9,  0,  0,  0, 10,  0,  0,  0, 11, 12, 13,
        14,  0,  0,  0,  0,  0,  0,  0,  0,  9,  0,  0,  0,  0,  0,  0,  0,
         0,  0, 15, 16, 17,  0,  0,  0,  0,  0,  0,  0, 18, 19,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0

Any of these properties can be accessed directly on the `dataset` object. For example, to see what the `label_to_names` property holds, we use:

In [13]:
dataset.label_to_names

{0: 'unlabeled',
 1: 'car',
 2: 'bicycle',
 3: 'motorcycle',
 4: 'truck',
 5: 'other-vehicle',
 6: 'person',
 7: 'bicyclist',
 8: 'motorcyclist',
 9: 'road',
 10: 'parking',
 11: 'sidewalk',
 12: 'other-ground',
 13: 'building',
 14: 'fence',
 15: 'vegetation',
 16: 'trunk',
 17: 'terrain',
 18: 'pole',
 19: 'traffic-sign'}
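
A mapping like the one printed above can be used, for instance, to turn integer class labels into readable names. The snippet below is a small sketch: the dictionary is a hand-copied subset of the output above, and `predicted_labels` is a made-up list standing in for per-point predictions.

```python
# Hand-copied subset of the `label_to_names` mapping printed above.
label_to_names = {0: "unlabeled", 1: "car", 9: "road", 15: "vegetation"}

# Made-up integer labels standing in for per-point predictions.
predicted_labels = [9, 9, 1, 15, 0]

predicted_names = [label_to_names[label] for label in predicted_labels]
print(predicted_names)  # ['road', 'road', 'car', 'vegetation', 'unlabeled']
```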

The **model and pipeline components** can be built in the same way as the dataset, using `cfg.model` and `cfg.pipeline` respectively. Have a look at the training tutorials for code examples.