# Config Files & How to Use Them 🚀🌍💫

Look at the code snippet below:

```py
import open3d.ml.torch as ml3d

# Read a dataset by specifying its path. Other arguments, such as the cache
# directory and the data splits, can be passed as well.
dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',
                                      cache_dir='./logs/cache',
                                      training_split=['00'],
                                      validation_split=['01'],
                                      test_split=['01'])
```

The `dataset` object is created by explicitly passing dataset-specific parameters to the constructor of the `ml3d.datasets.SemanticKITTI` class. Instead of passing these parameters manually one by one, we can store them in config files and load them from there. In general, each config file in Open3D-ML contains parameters (i.e., key-value pairs) for the `dataset`, `model` and `pipeline`.
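To make the idea of grouped key-value pairs concrete, here is a minimal sketch of how such a config can be pictured as plain Python data. The `config` dictionary below is a hypothetical, trimmed-down illustration (only a few keys per section are shown), not the object that Open3D-ML builds internally.

```python
# Hypothetical, trimmed-down config expressed as nested Python dictionaries.
# Each top-level key is one section; each section holds key-value pairs.
config = {
    "dataset": {
        "name": "SemanticKITTI",
        "cache_dir": "./logs/cache",
        "training_split": ["00"],
    },
    "model": {
        "name": "RandLANet",
        "num_classes": 19,
    },
    "pipeline": {
        "name": "SemanticSegmentation",
        "batch_size": 4,
    },
}

# Walk over the sections and list the parameter names each one defines.
for section, params in config.items():
    print(section, "->", sorted(params))
```

Keeping the three sections separate means the dataset, model and pipeline parameters can each be handed to the component that needs them.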


> 📝 **Note:** We use [YAML format](https://en.wikipedia.org/wiki/YAML) for defining our config files.


In this tutorial, we will learn how to:

- Load a config file into a `Config` object.
- Parse data dictionaries from the loaded `Config` object.
- Access individual dictionaries in the `Config` object.
- Access individual elements within the dictionaries.

## ⏬ Necessary Imports

```py
from pprint import pprint
from open3d.ml import utils
import open3d.ml.torch as ml3d
```

Here, we import two modules from Open3D (plus `pprint` from the Python standard library, for readable printing):

   1. `utils`: Open3D-ML utilities, used in this tutorial for reading the config file.
   2. `ml3d`: the Open3D-ML PyTorch API, used for building datasets, models and pipelines.

## 📖 Loading YAML Config File

`cfg_file` holds the relative or absolute path to our config file. First, let us look at the contents of the config file as raw text.

```py
cfg_file = "../../../ml3d/configs/randlanet_semantickitti.yml"
!cat {cfg_file}
```

```yaml
dataset:
  name: SemanticKITTI
  dataset_path:  # path/to/your/dataset
  cache_dir: ./logs/cache
  class_weights: [55437630, 320797, 541736, 2578735, 3274484, 552662, 184064,
    78858, 240942562, 17294618, 170599734, 6369672, 230413074, 101130274,
    476491114, 9833174, 129609852, 4506626, 1168181]
  test_result_folder: ./test
  test_split: ['11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21']
  training_split: ['00', '01', '02', '03', '04', '05', '06', '07', '09', '10']
  all_split: ['00', '01', '02', '03', '04', '05', '06', '07', '09',
  '08', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21']
  validation_split: ['08']
  use_cache: true
  sampler:
    name: 'SemSegRandomSampler'
model:
  name: RandLANet
  batcher: DefaultBatcher
  ckpt_path: # path/to/your/checkpoint
  num_neighbors: 16
  num_layers: 4
  num_points: 45056
  num_classes: 19
  ignored_label_inds: [0]
  sub_sampling_ratio: [4, 4, 4, 4]
  in_channels: 3
  dim_features: 8
  dim_output:
```

Here, we can see that the file is divided into three parts: `dataset`, `model` and `pipeline`.

Let us load the config file as a `Config` object. This can be done with `utils.Config.load_from_file`, which takes the path of the config file as input.

```py
cfg = utils.Config.load_from_file(cfg_file)
```

## 🩺 Examining Dataset Dictionaries

Let us try to access the contents of the `cfg` object.

```py
pprint(vars(cfg))
```

```py
cfg.keys()
```

```
dict_keys(['dataset', 'model', 'pipeline'])
```

`cfg` has three dictionaries: `dataset`, `model` and `pipeline` (as we saw in the raw YAML file).
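To build some intuition for how an object can expose its sections both through `keys()` and through attribute access such as `cfg.dataset`, here is a minimal sketch. `MiniConfig` is a hypothetical stand-in written for this tutorial; it is not the Open3D-ML `Config` implementation and only mimics the behaviour seen above.

```python
class MiniConfig:
    """Hypothetical stand-in, NOT the Open3D-ML `Config` class."""

    def __init__(self, data):
        # Keep a shallow copy of the top-level sections.
        self._data = dict(data)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so `mini_cfg.dataset` falls through to the stored dictionary.
        if name == "_data":
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def keys(self):
        return self._data.keys()


mini_cfg = MiniConfig({
    "dataset": {"cache_dir": "./logs/cache"},
    "model": {"name": "RandLANet"},
    "pipeline": {"batch_size": 4},
})

print(list(mini_cfg.keys()))
print(mini_cfg.dataset["cache_dir"])
```

Each section returned this way is an ordinary dictionary, which is why the usual `dict` operations work on it.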

Let us explore them.


### 🔎 Accessing Individual Dictionaries

The first one is the `cfg.dataset` dictionary:

```py
pprint(cfg.dataset)
```

Similarly, let us access the `model` and `pipeline` dictionaries:

```py
pprint(cfg.model)
```

```
{'augment': {'recenter': {'dim': [0, 1]}},
 'batcher': 'DefaultBatcher',
 'ckpt_path': None,
 'dim_features': 8,
 'dim_output': [16, 64, 128, 256],
 'grid_size': 0.06,
 'ignored_label_inds': [0],
 'in_channels': 3,
 'name': 'RandLANet',
 'num_classes': 19,
 'num_layers': 4,
 'num_neighbors': 16,
 'num_points': 45056,
 'sub_sampling_ratio': [4, 4, 4, 4]}
```


```py
pprint(cfg.pipeline)
```

```
{'batch_size': 4,
 'main_log_dir': './logs',
 'max_epoch': 100,
 'name': 'SemanticSegmentation',
 'optimizer': {'lr': 0.001},
 'save_ckpt_freq': 5,
 'scheduler_gamma': 0.9886,
 'summary': {'max_outputs': 1,
             'max_pts': None,
             'record_for': [],
             'use_reference': False},
 'test_batch_size': 1,
 'train_sum_dir': 'train_log',
 'val_batch_size': 2}
```


### 🔭 Accessing Individual Elements within the Dictionaries

> 📝 **Note:** The items within a `Config` object can be viewed and updated just like those of a standard Python dictionary; the object is mutable.

List all the keys available in the `dataset` dictionary, just as with the built-in `dict` type, using:

```py
cfg.dataset.keys()
```

```
dict_keys(['name', 'dataset_path', 'cache_dir', 'class_weights', 'test_result_folder', 'test_split', 'training_split', 'all_split', 'validation_split', 'use_cache', 'sampler'])
```

We may access any of the available keys and even update their values as shown below:

```py
cfg.dataset['cache_dir']  # Access individual element
```

```
'./logs/cache'
```
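Reading individual values is also handy for sanity checks. As a small sketch, the snippet below reads the split lists from a dataset dictionary and reports any sequence ID that appears in more than one split. The helper `find_split_overlaps` is a hypothetical name introduced here, not an Open3D-ML function, and the plain dictionaries stand in for `cfg.dataset`.

```python
from itertools import combinations

SPLIT_KEYS = ("training_split", "validation_split", "test_split")


def find_split_overlaps(dataset_cfg):
    """Return {(key_a, key_b): shared_ids} for every pair of overlapping splits."""
    overlaps = {}
    for key_a, key_b in combinations(SPLIT_KEYS, 2):
        shared = set(dataset_cfg.get(key_a, [])) & set(dataset_cfg.get(key_b, []))
        if shared:
            overlaps[(key_a, key_b)] = shared
    return overlaps


# Split lists copied from the YAML file shown earlier.
yaml_splits = {
    "training_split": ['00', '01', '02', '03', '04', '05', '06', '07', '09', '10'],
    "validation_split": ['08'],
    "test_split": ['11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21'],
}

# Split lists from the very first snippet of this tutorial.
intro_splits = {
    "training_split": ['00'],
    "validation_split": ['01'],
    "test_split": ['01'],
}

print(find_split_overlaps(yaml_splits))
print(find_split_overlaps(intro_splits))
```

The splits in the YAML file are disjoint, whereas the introductory snippet reuses sequence `'01'` for both validation and testing.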

```py
# Update individual element
cfg.dataset['cache_dir'] = './logs/new_cache'
cfg.dataset['cache_dir']
```

```
'./logs/new_cache'
```

We may do the same for any of the individual elements of `cfg.model` and `cfg.pipeline`. Try it yourself!
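Because these dictionaries are mutable, an in-place update changes the config for everything that uses it afterwards. If we would rather keep the loaded values intact, one option is to work on a copy. The sketch below uses a plain dictionary in place of `cfg.dataset`, and `with_overrides` is a hypothetical helper, not part of Open3D-ML.

```python
import copy


def with_overrides(section, **overrides):
    """Return a deep copy of `section` with the given keys replaced."""
    updated = copy.deepcopy(section)
    updated.update(overrides)
    return updated


# Plain-dict stand-in for a few entries of `cfg.dataset`.
base_dataset_cfg = {
    "name": "SemanticKITTI",
    "cache_dir": "./logs/cache",
    "use_cache": True,
    "sampler": {"name": "SemSegRandomSampler"},
}

experiment_cfg = with_overrides(base_dataset_cfg, cache_dir="./logs/new_cache")

print(base_dataset_cfg["cache_dir"])
print(experiment_cfg["cache_dir"])
```

A deep copy is used so that nested entries such as `sampler` are not shared between the original and the modified dictionary.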

## 🏗️ Initializing Dataset From a Config File

We saw how to load, inspect and modify a config in the examples above.

Let us now create a `dataset` object that holds all the information from the `cfg.dataset` dictionary.


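```py
# Unpack the dataset dictionary into keyword arguments of the constructor
# (dataset_path, cache_dir, training_split, ...).
dataset = ml3d.datasets.SemanticKITTI(**cfg.dataset)
```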
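Note that `dataset_path` is left empty in the YAML file above, so it loads as `None` until we fill it in. The following sketch shows one way to set the path and then hand a whole dictionary to a constructor through keyword unpacking. `FakeDataset` is a hypothetical stand-in class, and its `ValueError` check is an assumption made for this illustration; it does not describe how `SemanticKITTI` itself reacts to a missing path.

```python
class FakeDataset:
    """Hypothetical stand-in for a dataset class; not part of Open3D-ML."""

    def __init__(self, dataset_path, name="Fake", cache_dir="./logs/cache", **kwargs):
        # Assumption of this sketch: refuse to build without a dataset path.
        if dataset_path is None:
            raise ValueError("dataset_path must be set before building the dataset")
        self.dataset_path = dataset_path
        self.name = name
        self.cache_dir = cache_dir
        self.extra = kwargs  # any remaining key-value pairs


# Plain-dict stand-in for `cfg.dataset`; the path is empty, as in the YAML file.
dataset_cfg = {
    "name": "SemanticKITTI",
    "dataset_path": None,
    "cache_dir": "./logs/cache",
    "use_cache": True,
}

# Fill in the missing path, then unpack the dictionary into keyword arguments.
dataset_cfg["dataset_path"] = "SemanticKITTI/"
fake_dataset = FakeDataset(**dataset_cfg)

print(fake_dataset.dataset_path, fake_dataset.cache_dir, fake_dataset.extra)
```

Keys that the constructor does not name explicitly (here `use_cache`) are collected in `**kwargs`, so the dictionary can carry more entries than the signature lists.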

Properties exposed by the newly-created `dataset` can be accessed using the `vars` built-in function.

```py
vars(dataset)
```

```
{'cfg': <open3d._ml3d.utils.config.Config at 0x7f588ac3b390>,
 'label_to_names': {0: 'unlabeled',
  1: 'car',
  2: 'bicycle',
  3: 'motorcycle',
  4: 'truck',
  5: 'other-vehicle',
  6: 'person',
  7: 'bicyclist',
  8: 'motorcyclist',
  9: 'road',
  10: 'parking',
  11: 'sidewalk',
  12: 'other-ground',
  13: 'building',
  14: 'fence',
  15: 'vegetation',
  16: 'trunk',
  17: 'terrain',
  18: 'pole',
  19: 'traffic-sign'},
 'name': 'SemanticKITTI',
 'num_classes': 20,
 'remap_lut': array([ 0, 10, 11, 15, 18, 20, 30, 31, 32, 40, 44, 48, 49, 50, 51, 70, 71,
        72, 80, 81,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0
```

We may reference any property of the `dataset` object using the `{object_name}.{property_name}` syntax. For example, the `num_classes` property can be accessed using:

```py
dataset.num_classes
```

```
20
```
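For ordinary Python objects, dot access, the built-in `getattr` and a lookup in the dictionary returned by `vars` all reach the same instance attribute. The sketch below demonstrates this with `TinyDataset`, a hypothetical stand-in class that only stores two attributes.

```python
class TinyDataset:
    """Hypothetical stand-in class used only for this illustration."""

    def __init__(self, name, num_classes):
        self.name = name
        self.num_classes = num_classes


tiny = TinyDataset(name="SemanticKITTI", num_classes=20)

via_dot = tiny.num_classes                 # {object_name}.{property_name}
via_getattr = getattr(tiny, "num_classes") # attribute name given as a string
via_vars = vars(tiny)["num_classes"]       # lookup in the instance dictionary

print(via_dot, via_getattr, via_vars)  # 20 20 20
```

The string-based forms are convenient when the property name is only known at run time, for example when looping over several names.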

Likewise, to extract information from the `label_to_names` property (which maps class label IDs to class label names), we can use:

```py
dataset.label_to_names
```

```
{0: 'unlabeled',
 1: 'car',
 2: 'bicycle',
 3: 'motorcycle',
 4: 'truck',
 5: 'other-vehicle',
 6: 'person',
 7: 'bicyclist',
 8: 'motorcyclist',
 9: 'road',
 10: 'parking',
 11: 'sidewalk',
 12: 'other-ground',
 13: 'building',
 14: 'fence',
 15: 'vegetation',
 16: 'trunk',
 17: 'terrain',
 18: 'pole',
 19: 'traffic-sign'}
```
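As a short usage sketch, a mapping like this can translate numeric labels into readable names, and it can be inverted to look up an ID by name. The dictionary below is copied from the output above; the `predicted_ids` list is made-up example data, not output from a model.

```python
# Mapping copied from the `label_to_names` output above.
label_to_names = {
    0: 'unlabeled', 1: 'car', 2: 'bicycle', 3: 'motorcycle', 4: 'truck',
    5: 'other-vehicle', 6: 'person', 7: 'bicyclist', 8: 'motorcyclist',
    9: 'road', 10: 'parking', 11: 'sidewalk', 12: 'other-ground',
    13: 'building', 14: 'fence', 15: 'vegetation', 16: 'trunk',
    17: 'terrain', 18: 'pole', 19: 'traffic-sign',
}

# Invert the mapping to go from a class name back to its label ID.
names_to_label = {name: label for label, name in label_to_names.items()}

# Made-up label IDs, used only to illustrate the lookup.
predicted_ids = [1, 9, 15, 0]
predicted_names = [label_to_names[i] for i in predicted_ids]

print(predicted_names)         # ['car', 'road', 'vegetation', 'unlabeled']
print(names_to_label['pole'])  # 18
```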

Experiment with other `dataset` properties to see how convenient it is to reference them!