# Config File Structure

In [1]:
import tensorflow as tf
import json

The hyper-parameter dictionary used for model tuning, fitting, and data structure has the following form:
```python3
hyper = {
    "info":{ 
        # General information for training run
        "kgcnn_version": "1.1.0", # Version 
        "postfix": "" # postfix for output folder
    },
    "model": { 
        # Model specific parameter, see kgcnn.literature
    },
    "data": { 
        # Dataset specific parameter
    },
    "training": {
        "fit": { 
            # Keras fit arguments, serialized
        },
        "compile": { 
            # Keras compile arguments serialized
        },
        "Kfold": {
            # kwargs unpacked in the scikit-learn KFold class
        }
    }
}
```
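Since the comments above describe the Keras arguments as serialized, a dictionary of this form can presumably be stored as plain JSON. The following is a minimal sketch using only the standard library; the file name `hyper.json` and the concrete values under `fit` and `Kfold` are placeholders chosen for illustration, not defaults of the package.

```python
import json
import os
import tempfile

# Illustrative values only; the keys follow the structure shown above.
hyper = {
    "info": {"kgcnn_version": "1.1.0", "postfix": ""},
    "model": {},
    "data": {},
    "training": {
        "fit": {"batch_size": 32, "epochs": 10},  # placeholder fit arguments
        "compile": {},
        "Kfold": {"n_splits": 5, "shuffle": True},  # placeholder KFold kwargs
    },
}

# Write the dictionary to disk and read it back.
path = os.path.join(tempfile.mkdtemp(), "hyper.json")  # hypothetical file name
with open(path, "w") as f:
    json.dump(hyper, f, indent=4)
with open(path) as f:
    loaded = json.load(f)

# The round trip is lossless as long as all values are JSON types.
round_trip_ok = loaded == hyper
```

Note that JSON has no tuple type, so a shape written as a tuple would come back as a list.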
The following sections explain each block.


## Model

The model parameters can be reviewed from the default values in ``kgcnn.literature``. In most cases, only the model input and output have to be matched to the data representation, that is, to the type and shape of each input. An input-type checker will be added in the future to adapt the model input automatically, but for now the input shape has to be chosen by hand. ``inputs`` must be a list of kwargs dictionaries, each of which is unpacked in the corresponding ``tf.keras.layers.Input``. The order of the list is model dependent.

Moreover, the name of each model input is used to link the corresponding tensor property of the dataset to that input.

In [2]:
model_kwargs = {
    "inputs": [
        {"shape": [None, 100], "name": "node_attributes", "dtype": "float32", "ragged": True},
        {"shape": [None, 2], "name": "edge_indices", "dtype": "int64", "ragged": True}],
    # More model specific kwargs, like:
    # "depth": 5,
    "output_mlp": {"use_bias": [True, True, False], "units": [140, 70, 70],
                   "activation": ["relu", "relu", "softmax"]},
}

Here, the training script will provide ``dataset.edge_indices`` of shape `(batch, None, 2)` and ``dataset.node_attributes`` of shape `(batch, None, 100)`. Note that the shape given in ``inputs`` must match the actual shape in the dataset.
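The link between input names and dataset properties can be made explicit with a small consistency check. The sketch below is not part of kgcnn: `shape_matches` and `graph_shapes` are hypothetical names, and the per-graph shapes are made-up values for a single graph with 7 nodes and 12 edges. As with ``tf.keras.layers.Input``, the batch dimension is not part of ``shape``, and ``None`` stands for a dimension of variable size.

```python
def shape_matches(spec_shape, actual_shape):
    """Return True if actual_shape fits spec_shape, where None matches any size."""
    if len(spec_shape) != len(actual_shape):
        return False
    return all(s is None or s == a for s, a in zip(spec_shape, actual_shape))


inputs = [
    {"shape": [None, 100], "name": "node_attributes", "dtype": "float32", "ragged": True},
    {"shape": [None, 2], "name": "edge_indices", "dtype": "int64", "ragged": True},
]

# Hypothetical shapes of the properties of one graph (no batch dimension).
graph_shapes = {"node_attributes": (7, 100), "edge_indices": (12, 2)}

# Look up each dataset property by the input's name and compare shapes.
report = {
    spec["name"]: shape_matches(spec["shape"], graph_shapes[spec["name"]])
    for spec in inputs
}
```

A mismatch in the fixed dimension, for example node attributes of width 50 instead of 100, would make the corresponding entry in `report` false.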

For the output, most models simply have an MLP at the end. Its activation and final output dimension can be chosen via the kwargs ``output_mlp`` (unpacked in MLP), where the last entries of ``units`` and ``activation`` define the last layer. The last number in ``units`` must match the number of labels or classes of the target. This is mostly ``dataset.graph_labels``, but depends on the dataset and the classification task, either graph or node classification.
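A quick sanity check of ``output_mlp`` against the targets might look as follows. This is a sketch under assumptions: `check_output_mlp` is a hypothetical helper, not a kgcnn function, and the number of classes (70) is simply chosen to agree with the example above.

```python
def check_output_mlp(output_mlp, num_targets):
    """Check that all per-layer lists have equal length and the last layer fits the target."""
    lengths = {len(values) for values in output_mlp.values()}
    if len(lengths) != 1:
        raise ValueError("All entries of output_mlp need one value per layer.")
    if output_mlp["units"][-1] != num_targets:
        raise ValueError(
            f"Last layer has {output_mlp['units'][-1]} units, "
            f"but the target has {num_targets} labels or classes."
        )
    return True


output_mlp = {"use_bias": [True, True, False], "units": [140, 70, 70],
              "activation": ["relu", "relu", "softmax"]}

# Assumed number of classes in the target; adjust to the actual dataset.
ok = check_output_mlp(output_mlp, num_targets=70)
```

Running such a check before building the model surfaces a mismatch early, rather than as a shape error during fitting.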

## Data