# Literature

A set of popular graph network architectures is already implemented in `kgcnn` and can be found in `kgcnn.literature`. Most models are set up with the functional ``keras`` API. Information on hyperparameters, training and benchmarking can be found below.

* **[GCN](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/GCN)**: [Semi-Supervised Classification with Graph Convolutional Networks](https://arxiv.org/abs/1609.02907) by Kipf et al. (2016)
* **[Schnet](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/Schnet)**: [SchNet – A deep learning architecture for molecules and materials](https://aip.scitation.org/doi/10.1063/1.5019779) by Schütt et al. (2017)
* **[GAT](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/GAT)**: [Graph Attention Networks](https://arxiv.org/abs/1710.10903) by Veličković et al. (2018)
* **[GraphSAGE](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/GraphSAGE)**: [Inductive Representation Learning on Large Graphs](http://arxiv.org/abs/1706.02216) by Hamilton et al. (2017)
* **[GNNExplainer](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/GNNExplain)**: [GNNExplainer: Generating Explanations for Graph Neural Networks](https://arxiv.org/abs/1903.03894) by Ying et al. (2019)
* **[AttentiveFP](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/AttentiveFP)**: [Pushing the Boundaries of Molecular Representation for Drug Discovery with the Graph Attention Mechanism](https://pubs.acs.org/doi/10.1021/acs.jmedchem.9b00959) by Xiong et al. (2019)
* **[GATv2](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/GATv2)**: [How Attentive are Graph Attention Networks?](https://arxiv.org/abs/2105.14491) by Brody et al. (2021)
* **[GIN](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/GIN)**: [How Powerful are Graph Neural Networks?](https://arxiv.org/abs/1810.00826) by Xu et al. (2019)
* **[PAiNN](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/PAiNN)**: [Equivariant message passing for the prediction of tensorial properties and molecular spectra](https://arxiv.org/pdf/2102.03150.pdf) by Schütt et al. (2021)
* **[DMPNN](https://github.com/aimat-lab/gcnn_keras/blob/master/kgcnn/literature/DMPNN)**: [Analyzing Learned Molecular Representations for Property Prediction](https://pubs.acs.org/doi/abs/10.1021/acs.jcim.9b00237) by Yang et al. (2019)

## Training Scripts

Currently there are training scripts *train_graph.py*, *train_node.py*, *train_force.py*.

> **NOTE**: The scripts are tightly integrated with ``kgcnn`` models and datasets, so a custom training script can be preferable for models that are not in `kgcnn.literature`.

Training scripts can be started with:

```bash
python3 train_node.py --hyper hyper/hyper_cora.py --category GCN
python3 train_graph.py --hyper hyper/hyper_esol.py --category GIN
```

Here `hyper_esol.py` stores the hyperparameters. The value of `--hyper` must be either a file in the same folder or a path to a `.py` file.

In principle, training can be fully configured with a serialized hyperparameter file such as `hyper.json` or `hyper.yaml`.
If a Python file `hyper.py` is used, then `hyper = {...}` must be set in the script; in that case the items do not necessarily need to be in serialized form.

```python
hyper = {
    "info": {
        # General information for training run
        "kgcnn_version": "4.0.0",  # Version
        "postfix": ""  # Postfix for output folder.
    },
    "model": {
        # Model specific parameter, see kgcnn.literature.
    },
    "data": {
        # Data specific parameters.
    },
    "dataset": {
        # Dataset specific parameters.
    },
    "training": {
        "fit": {
            # serialized keras fit arguments.
        },
        "compile": {
            # serialized keras compile arguments.
        },
        "cross_validation": {
            # serialized parameters for cross-validation.
        },
        "scaler": {
            # serialized parameters for scaler.
            # Only add when training for regression.
        }
    }
}
```
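To make the two file options concrete, here is a minimal sketch of how a hyperparameter file could be read with the standard library only. The helper `load_hyper` is hypothetical and is not the loader used by the `kgcnn` training scripts; YAML is left out because it would need an extra dependency.

```python
import json
import runpy
from pathlib import Path


def load_hyper(path):
    """Load a hyperparameter dict from a .json or .py file.

    Illustrative helper only, not part of kgcnn.
    """
    path = Path(path)
    if path.suffix == ".json":
        # A JSON file is already in serialized form.
        with open(path, "r") as f:
            return json.load(f)
    if path.suffix == ".py":
        # Execute the script and pick up the module-level `hyper` variable.
        namespace = runpy.run_path(str(path))
        if "hyper" not in namespace:
            raise KeyError(f"{path} does not define `hyper = {{...}}`.")
        return namespace["hyper"]
    raise ValueError(f"Unsupported hyperparameter file type: {path.suffix}")
```

With a Python file, the returned dictionary may hold live objects, whereas the JSON route can only return plain dictionaries, lists, strings, numbers and booleans.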

#### Data hyperparameter

The kwargs for the dataset are not fully identical and vary a little depending on the dataset. However, the most common ones are listed below.

```python
hyper.update({
    "data": {
        # Other optional entries (depends on the training script)
        "data_unit": "mol/L",
    },
    "dataset": {
        "class_name": "QM9Dataset",  # Name of the dataset
        "module_name": "kgcnn.data.datasets.QM9Dataset",

        # Config like filepath etc., leave empty for pre-defined datasets
        "config": {},

        # Methods to run on dataset, i.e. the list of graphs
        "methods": [
            {"prepare_data": {}},  # Used for cache and pre-compute data, leave out for pre-defined datasets
            {"read_in_memory": {}},  # Used for reading into memory, leave out for pre-defined datasets

            # Example method to run over each graph in the list using `map_list` method.
            # The string 'set_range' refers to a preprocessor. Legacy short access to graph preprocessors.
            {"map_list": {"method": "set_range", "max_distance": 4, "max_neighbours": 30}},
            {"map_list": {"method": "count_nodes_and_edges", "total_edges": "total_edges",
                          "count_edges": "edge_indices", "count_nodes": "node_attributes", "total_nodes": "total_nodes"}},
        ]
    }
})
```
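The `"methods"` list reads as an ordered sequence of `{method_name: kwargs}` entries that are called on the dataset. The sketch below illustrates that idea with a toy dataset. `ToyDataset`, `TOY_PREPROCESSORS` and `apply_methods` are hypothetical names, and the dispatch logic is an assumption about the general pattern rather than the actual `kgcnn` implementation.

```python
class ToyDataset(list):
    """Hypothetical stand-in for a kgcnn dataset: a list of graph dicts."""

    def map_list(self, method, **kwargs):
        # Look up a per-graph function by name and apply it to every graph.
        func = TOY_PREPROCESSORS[method]
        for graph in self:
            func(graph, **kwargs)
        return self


def count_nodes_and_edges(graph, total_edges, count_edges, count_nodes, total_nodes):
    # Store the number of rows of the named properties under new keys.
    graph[total_nodes] = len(graph[count_nodes])
    graph[total_edges] = len(graph[count_edges])


TOY_PREPROCESSORS = {"count_nodes_and_edges": count_nodes_and_edges}


def apply_methods(dataset, methods):
    """Call each {method_name: kwargs} entry on the dataset, in order."""
    for entry in methods:
        for name, kwargs in entry.items():
            getattr(dataset, name)(**kwargs)
    return dataset


dataset = ToyDataset([
    {"node_attributes": [[1.0], [2.0], [3.0]], "edge_indices": [[0, 1], [1, 2]]},
])
apply_methods(dataset, [
    {"map_list": {"method": "count_nodes_and_edges", "total_edges": "total_edges",
                  "count_edges": "edge_indices", "count_nodes": "node_attributes",
                  "total_nodes": "total_nodes"}},
])
print(dataset[0]["total_nodes"], dataset[0]["total_edges"])  # prints: 3 2
```

Because the entries are applied in order, a method that consumes a property (here the counts) has to come after the method that creates it.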

#### Model hyperparameter

The model parameters can be reviewed from the default values in ``kgcnn.literature``. Mostly, the model input and output have to be matched to the data representation, that is, the type of each input and its shape. An input-type checker is available as `assert_valid_model_input` of `kgcnn.data.base.MemoryGraphDataset`. In ``inputs``, a list of kwargs must be given, each of which is unpacked into the corresponding ``tf.keras.layers.Input``. The order matters and is model dependent.

Moreover, the names of the model inputs are used to link the tensor properties of the dataset with the model inputs. The output dimension of either the node or the graph embedding can be set for most models with the "output_mlp" argument.

```python
hyper.update({
    "model": {
        "module_name": "kgcnn.literature.GCN",
        "class_name": "make_model",
        "config": {
            "inputs": [
                {"shape": [None, 100], "name": "node_attributes", "dtype": "float32"},
                {"shape": [None, 2], "name": "edge_indices", "dtype": "int64"},
                {"shape": (), "name": "total_nodes", "dtype": "int64"},
                {"shape": (), "name": "total_edges", "dtype": "int64"}
            ],
            # More model specific kwargs, like:
            "depth": 5,
            # Output part defining model output
            "output_embedding": "graph",
            "output_mlp": {"use_bias": [True, True, False], "units": [140, 70, 70],
                           "activation": ["relu", "relu", "softmax"]}
        }
    }
})
```
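Since input names link the model to dataset properties, a quick sanity check before training can catch a missing property early. The following is a plain-Python sketch under the assumption that each graph behaves like a dictionary of named properties. The helpers `missing_model_inputs` and `mlp_config_is_consistent` are hypothetical and do not replace `assert_valid_model_input`.

```python
def missing_model_inputs(inputs, graph):
    """Return names of model inputs that are not properties of the graph dict."""
    return [kw["name"] for kw in inputs if kw["name"] not in graph]


def mlp_config_is_consistent(mlp):
    """Check that all per-layer lists in an MLP config have the same length."""
    lengths = {len(v) for v in mlp.values() if isinstance(v, (list, tuple))}
    return len(lengths) <= 1


inputs = [
    {"shape": [None, 100], "name": "node_attributes", "dtype": "float32"},
    {"shape": [None, 2], "name": "edge_indices", "dtype": "int64"},
    {"shape": (), "name": "total_nodes", "dtype": "int64"},
    {"shape": (), "name": "total_edges", "dtype": "int64"},
]
output_mlp = {"use_bias": [True, True, False], "units": [140, 70, 70],
              "activation": ["relu", "relu", "softmax"]}

# Hypothetical graph that lacks the two count properties.
graph = {"node_attributes": [[0.0] * 100], "edge_indices": [[0, 0]]}
print(missing_model_inputs(inputs, graph))   # ['total_nodes', 'total_edges']
print(mlp_config_is_consistent(output_mlp))  # True
```

In this toy case the missing names point to a dataset method, such as the counting step in the data section, that has not been run yet.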

#### Training hyperparameter

The kwargs for training simply set the arguments for ``model.compile(**kwargs)`` and ``model.fit(**kwargs)``, matching the keras arguments, as well as the arguments for the k-fold split from scikit-learn. The kwargs are expected to be fully serialized if the hyperparameters are supposed to be saved to JSON.

```python
import keras_core as ks
hyper.update({
    "training": {
        # Cross-validation of the data
        "cross_validation": {
            "class_name": "KFold",
            "config": {"n_splits": 5, "random_state": 42, "shuffle": True}
        },
        # Standard scaler for regression targets
        "scaler": {
            "class_name": "StandardScaler",
            "module_name": "kgcnn.data.transform.scaler.standard",
            "config": {"with_std": True, "with_mean": True, "copy": True}
        },
        # Keras model compile and fit
        "compile": {
            "loss": "categorical_crossentropy",
            "optimizer": ks.saving.serialize_keras_object(
                ks.optimizers.Adam(learning_rate=0.001))
        },
        "fit": {
            "batch_size": 32, "epochs": 800, "verbose": 2,
            "callbacks": []
        }
    }
})
```
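Whether a configuration is "fully serialized" can be checked before saving. The sketch below walks a nested dictionary and reports every value that `json.dumps` rejects. `find_unserializable` and `FakeOptimizer` are hypothetical names used only for illustration, and only values (not dictionary keys) are checked.

```python
import json


def find_unserializable(obj, path="hyper"):
    """Return paths of all values that `json.dumps` cannot encode."""
    if isinstance(obj, dict):
        found = []
        for key, value in obj.items():
            found += find_unserializable(value, f"{path}[{key!r}]")
        return found
    if isinstance(obj, (list, tuple)):
        found = []
        for i, value in enumerate(obj):
            found += find_unserializable(value, f"{path}[{i}]")
        return found
    try:
        json.dumps(obj)
        return []
    except TypeError:
        return [path]


class FakeOptimizer:
    """Hypothetical stand-in for a live (non-serialized) optimizer object."""


training = {
    "compile": {"loss": "categorical_crossentropy", "optimizer": FakeOptimizer()},
    "fit": {"batch_size": 32, "epochs": 800, "verbose": 2, "callbacks": []},
}
print(find_unserializable(training, "training"))
# ["training['compile']['optimizer']"]
```

An empty result means the dictionary can be written with `json.dump`; a non-empty one lists the entries that still need to be converted to a serialized form.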


#### Info

Some general information on the training, such as the kgcnn version used or a postfix for the output files.

```python
hyper.update({
    "info": {  # General information
        "postfix": "_v1",  # Appends _v1 to output folder
        "postfix_file": "_run2",  # Appends _run2 to info files
        "kgcnn_version": "4.0.0"
    }
})
```

## Benchmarks

### Summary of Benchmark Training

Note that these are the results for the models as implemented in `kgcnn`, and that training is not always done with optimal hyperparameters or splits, which should be kept in mind when comparing with the literature.
This table is generated automatically from keras history logs.
Model weights and training statistics plots are not uploaded on [github](https://github.com/aimat-lab/gcnn_keras/tree/master/training/results) due to their file size.

*Max.* or *Min.* denotes the best test error observed for any epoch during training.
To show the overall best test error, run ``python3 summary.py --min_max True``.
If not noted otherwise, we use a (fixed) random k-fold split for validation errors.
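The `mean ± std` entries in the tables below aggregate one value per fold. As a rough sketch of that aggregation, the snippet compares taking the value at the last epoch with taking the minimum over all epochs. The numbers are made up, `summarize` is a hypothetical helper, and the use of the population standard deviation is an assumption, not necessarily what `summary.py` computes.

```python
import statistics

# Made-up per-epoch validation MAE for three folds (illustration only).
val_mae_per_fold = [
    [0.9, 0.6, 0.5, 0.55],
    [1.0, 0.7, 0.45, 0.5],
    [0.8, 0.65, 0.6, 0.5],
]


def summarize(histories, use_min=False):
    """Mean and standard deviation over folds of the final (or minimum) value."""
    per_fold = [min(h) if use_min else h[-1] for h in histories]
    return statistics.mean(per_fold), statistics.pstdev(per_fold)


mean_last, std_last = summarize(val_mae_per_fold)
mean_min, std_min = summarize(val_mae_per_fold, use_min=True)
print(f"last epoch: {mean_last:.4f} ± {std_last:.4f}")
print(f"min.:       {mean_min:.4f} ± {std_min:.4f}")
```

The per-epoch minimum can never be worse than the last-epoch value for an error metric, which is why the two kinds of numbers should not be mixed when comparing models.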

#### CoraLuDataset

The Cora dataset after Lu et al. (2003) consists of 2708 publications with 1433 sparse attributes and 7 node classes. Here we use random 5-fold cross-validation on nodes.

| model     | kgcnn   |   epochs | Categorical accuracy   |
|:----------|:--------|---------:|:-----------------------|
| GAT       | 4.0.0   |      250 | 0.8464 &pm; 0.0105     |
| GATv2     | 4.0.0   |      250 | 0.8331 &pm; 0.0104     |
| GCN       | 4.0.0   |      300 | 0.8072 &pm; 0.0109     |
| GIN       | 4.0.0   |      500 | 0.8279 &pm; 0.0170     |
| GraphSAGE | 4.0.0   |      500 | **0.8497 &pm; 0.0100** |

#### ESOLDataset

ESOL consists of 1128 compounds given as SMILES and their corresponding water solubility in log10(mol/L). We use random 5-fold cross-validation.

| model   | kgcnn   |   epochs | MAE [log mol/L]        | RMSE [log mol/L]       |
|:--------|:--------|---------:|:-----------------------|:-----------------------|
| GAT     | 4.0.0   |      500 | 0.4826 &pm; 0.0255     | 0.6903 &pm; 0.0705     |
| GCN     | 4.0.0   |      800 | **0.4623 &pm; 0.0224** | **0.6567 &pm; 0.0456** |
| Schnet  | 4.0.0   |      800 | 0.4678 &pm; 0.0227     | 0.6662 &pm; 0.0629     |



> **NOTE**: You can find this page as a Jupyter notebook in https://github.com/aimat-lab/gcnn_keras/tree/master/docs/source