# A tutorial for *PersLay: A Simple and Versatile Neural Network Layer for Persistence Diagrams*

# Introduction

Printing the current version of Python.

In [1]:
import sys
print("Current version of your system: (we recommend Python 3.6)")
print(sys.version)

Current version of your system: (we recommend Python 3.6)
3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) 
[GCC 7.3.0]


## Imports

In [1]:
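import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.variable_scope), used by the model below

from examples.utils import generate, visualization, single_run
from archi import perslay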

## Outline:
In this notebook:
- First, we select a dataset. Two types of datasets are provided by default: synthetic orbits from dynamical systems and real-life graph datasets.
- Then, we generate the persistence diagrams (and other useful information such as labels) for the chosen dataset.
- Optionally, we visualize the generated diagrams.
- We define a neural network that uses PersLay channels as its first layers to handle persistence diagrams. This can serve as a guideline for using PersLay in your own experiments.
- We show how to train this neural network on the chosen dataset.

# Choose the dataset

We start by choosing the dataset we want to run the experiments on.

We suggest starting with `"MUTAG"`, as this dataset is reasonably small (188 graphs with 18 nodes on average). Note that its small size implies high variability in test results.

Available options are:

- Orbit datasets: `"ORBIT5K"`, `"ORBIT100K"`.

- Graphs datasets: `"MUTAG"`, `"BZR"`, `"COX2"`, `"DHFR"`, `"PROTEINS"`, `"NCI1"`, `"NCI109"`, `"FRANKENSTEIN"`,  `"IMDB-BINARY"`, `"IMDB-MULTI"`.

__Important note:__ `"COLLAB"`, `"REDDIT5K"` and `"REDDIT12K"` are not available yet (see README.md). Contact the authors for more information.

Beware that for the larger datasets (`"COLLAB"`, `"REDDIT5K"`, `"REDDIT12K"`, `"ORBIT100K"`), the required files can be quite large (e.g. 3 GB for `"ORBIT100K"`), so RAM can be a limiting factor, and generating the diagrams and running the experiments can take a long time depending on the available hardware. The datasets are described in Section B of the supplementary material of the article.

In [None]:
# Choose your configuration using one of the dataset names listed above.
dataset = "MUTAG"
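
Since `generate` can be slow, it may be worth checking the dataset name before going further. The sketch below is only an illustration: the helper `check_dataset` is hypothetical (it is not part of the PersLay code), and the two sets simply copy the names listed above.

```python
# Dataset names copied from the lists above.
ORBIT_DATASETS = {"ORBIT5K", "ORBIT100K"}
GRAPH_DATASETS = {
    "MUTAG", "BZR", "COX2", "DHFR", "PROTEINS", "NCI1",
    "NCI109", "FRANKENSTEIN", "IMDB-BINARY", "IMDB-MULTI",
}


def check_dataset(name):
    """Hypothetical helper: return "orbit" or "graph", or raise on an unknown name."""
    if name in ORBIT_DATASETS:
        return "orbit"
    if name in GRAPH_DATASETS:
        return "graph"
    raise ValueError(
        "Unknown or unavailable dataset: {!r}. Available: {}".format(
            name, sorted(ORBIT_DATASETS | GRAPH_DATASETS)
        )
    )


print(check_dataset("MUTAG"))
```

A name such as `"COLLAB"`, which is not available yet, raises a `ValueError` instead of failing later during generation.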

# Building persistence diagrams and additional features

Here, we implicitly load our data (saved as `.mat` files for graph datasets, and generated on the fly for orbit datasets, which can take some time, especially for `ORBIT100K`), and then compute the persistence diagrams that will be used in the classification experiment (this requires `gudhi` to be installed). For graph datasets, we also generate a series of additional features (see [1]).

Running `generate` will store diagrams, features and labels. Therefore, it is sufficient to run it once per dataset.

Note that for bigger datasets, computing these diagrams can take a long time.

In [None]:
generate(dataset)
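
To give an idea of what generating an orbit on the fly can look like, here is a minimal sketch. It assumes the orbits follow the linked twist map commonly used for this kind of benchmark; the function name `generate_orbit`, the value of `r` and the number of points are illustrative assumptions, not taken from the PersLay code, and the persistence diagram computation itself (done with `gudhi`) is not shown.

```python
import random


def generate_orbit(r, num_points=1000, seed=None):
    """Illustrative sketch: iterate a linked twist map on the unit square.

    Assumed update rule (not taken from the PersLay code):
        x_{n+1} = x_n + r * y_n * (1 - y_n)          mod 1
        y_{n+1} = y_n + r * x_{n+1} * (1 - x_{n+1})  mod 1
    """
    rng = random.Random(seed)
    x, y = rng.random(), rng.random()  # random starting point in [0, 1)^2
    points = []
    for _ in range(num_points):
        x = (x + r * y * (1.0 - y)) % 1.0
        y = (y + r * x * (1.0 - x)) % 1.0
        points.append((x, y))
    return points


orbit = generate_orbit(r=4.0, num_points=1000, seed=0)
print(len(orbit), orbit[0])
```

Each orbit is a point cloud in the unit square, from which persistence diagrams can then be computed.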

### Visualization (optional)

In [None]:
# Run this cell to visualize some examples of the generated diagrams.
# Requires matplotlib.
visualization(dataset)

# Using PersLay in a neural network

We define a PersLay layer, and then a (very simple) neural network architecture that uses PersLay. This can be used as a template to build your own architecture using PersLay.

### Set the hyper-parameters

Layer type, must be one of (see [1] for details):
- `"im"` for a persistence image layer.
- `"pm"` for a permutation equivariant layer (as in [2]).
- `"gs"` for a Gaussian layer.
- `"ls"` for a landscape layer.

In [2]:
layer_type = "im"
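
As a rough illustration of what a persistence image layer computes, the NumPy sketch below places a Gaussian bump on a grid for every diagram point and sums the bumps. It is not the code in `archi.py`: the function name `persistence_image`, the `bounds` and `sigma` parameters, the (birth, persistence) change of coordinates and the fixed bandwidth are all assumptions made for this example.

```python
import numpy as np


def persistence_image(diagram, image_size=(10, 10),
                      bounds=((0.0, 1.0), (0.0, 1.0)), sigma=0.1):
    """Illustrative sketch: sum of Gaussian bumps evaluated on a regular grid."""
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    birth = diagram[:, 0]
    pers = diagram[:, 1] - diagram[:, 0]  # persistence = death - birth

    # Regular grid of pixel centers, shape (image_size[0], image_size[1]).
    xs = np.linspace(bounds[0][0], bounds[0][1], image_size[0])
    ys = np.linspace(bounds[1][0], bounds[1][1], image_size[1])
    gx, gy = np.meshgrid(xs, ys, indexing="ij")

    # Squared distance from every point to every pixel: (n_points, H, W).
    sq_dist = ((gx[None] - birth[:, None, None]) ** 2
               + (gy[None] - pers[:, None, None]) ** 2)
    bumps = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # Summing over points makes the result independent of the point order.
    return bumps.sum(axis=0)


diag = np.array([[0.1, 0.5], [0.2, 0.9], [0.3, 0.4]])
print(persistence_image(diag).shape)
```

Because the bumps are summed over the points, shuffling the points of the diagram leaves the image unchanged.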

Permutation invariant operator, must be one of:
- `"sum"`.
- `"topk"`, which selects the $k$ highest values, with $k$ specified in `keep`.

In [None]:
perm_op = "sum"
keep = 5  # only useful if perm_op = "topk"
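
The difference between the two operators can be sketched in NumPy as follows. This is an illustration only, not the PersLay implementation: the function `aggregate` is hypothetical, it assumes each diagram point has already been mapped to a feature vector, and padding with zeros when a diagram has fewer than `keep` points is an assumption of this example.

```python
import numpy as np


def aggregate(point_features, perm_op="sum", keep=5):
    """Illustrative sketch: turn per-point features (n_points, n_features) into one vector."""
    feats = np.asarray(point_features, dtype=float)
    if perm_op == "sum":
        # One value per feature: the sum over all points.
        return feats.sum(axis=0)
    if perm_op == "topk":
        # For each feature, keep the `keep` largest values over the points.
        sorted_desc = -np.sort(-feats, axis=0)
        top = sorted_desc[:keep]
        if top.shape[0] < keep:
            # Assumption: pad with zeros when there are fewer than `keep` points.
            pad = np.zeros((keep - top.shape[0], feats.shape[1]))
            top = np.vstack([top, pad])
        # Flatten feature by feature: (n_features * keep,).
        return top.T.reshape(-1)
    raise ValueError("perm_op must be 'sum' or 'topk'")


feats = np.array([[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]])
print(aggregate(feats, "sum"))
print(aggregate(feats, "topk", keep=2))
```

Both operators ignore the order of the points, which is what makes the output well defined on a diagram.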

Weight, must be one of:
- `"grid"`, in which case one must pick a `grid_size`.
- `"linear"`.

In [3]:
weight = "grid"
grid_size = [10, 10]  # Only used if weight == "grid"
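
One plausible reading of the two weight types is sketched below: a grid weight looks up a value in a `grid_size` array according to where the point falls, and a linear weight is proportional to the distance to the diagonal. The functions `grid_weight` and `linear_weight`, the `bounds` argument and the fixed arrays are assumptions for illustration; in a real layer these values would be trainable, and the actual behavior should be checked in `archi.py`.

```python
import numpy as np


def grid_weight(diagram, grid, bounds=((0.0, 1.0), (0.0, 1.0))):
    """Illustrative sketch: weight of each point = value of the grid cell it falls in."""
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    n_rows, n_cols = grid.shape
    (x_min, x_max), (y_min, y_max) = bounds

    # Map coordinates to cell indices, clipping so points on the border stay inside.
    rows = ((diagram[:, 0] - x_min) / (x_max - x_min) * n_rows).astype(int)
    cols = ((diagram[:, 1] - y_min) / (y_max - y_min) * n_cols).astype(int)
    rows = np.clip(rows, 0, n_rows - 1)
    cols = np.clip(cols, 0, n_cols - 1)
    return grid[rows, cols]


def linear_weight(diagram, coefficient=1.0):
    """Illustrative sketch: weight proportional to |death - birth|."""
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    return coefficient * np.abs(diagram[:, 1] - diagram[:, 0])


grid = np.arange(100, dtype=float).reshape(10, 10)  # stand-in for learned weights
diag = np.array([[0.05, 0.95], [0.55, 0.25]])
print(grid_weight(diag, grid))
print(linear_weight(diag))
```

With `grid_size = [10, 10]`, the grid version has 100 weight values, one per cell.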

Now, there are some hyper-parameters that are specific to the different layer types.

In [None]:
# Parameter specific to layer_type="im"
image_size = [10, 10]
# Parameter specific to layer_type="gs"
num_gaussians = 50
# Parameter specific to layer_type="pm"
d = 50  # Output dimension
# Parameter specific to layer_type="ls"
num_samples = 50  # the model below reads `num_samples`

__Note:__ There are some other parameters available to tune PersLay that are not detailed here. This will be updated later. Feel free to check the implementation provided in `archi.py`.

### Designing the network

In the template below, we define a very simple `model` that encodes a network architecture. In this model, we define one PersLay layer for each type of diagram used as input, but all these layers share the same hyper-parameters (as in [1]).

Any additional features are simply concatenated with the output of these PersLay layers, and a fully connected layer is then used to make the prediction.

In [None]:
def model(feats, diags, parameters, num_filts, num_labels):
    # Note: `parameters` is not used in this simple template.
    list_v = []
    for i in range(num_filts):
        # A PersLay channel must be defined for each type of diagram.
        # Here, they all share the same hyper-parameters.
        perslay(output=list_v,  # list that collects the output of every PersLay channel
                name="perslay-" + str(i),  # name of this layer
                diag=diags[i],  # this layer handles the i-th type of diagram
                layer=layer_type,
                perm_op=perm_op,
                keep=keep,
                persistence_weight=weight,
                grid_size=grid_size,
                image_size=image_size,
                num_gaussians=num_gaussians,
                num_samples=num_samples,
                peq=[(d, None)]
                )

    # Concatenate all channels (once, after the loop)
    vector = tf.concat(list_v, 1)

    # Normalize the additional features and append them
    with tf.variable_scope("norm_feat"):
        feat = tf.layers.batch_normalization(feats)

    vector = tf.concat([vector, feat], 1)

    # Final layer to make predictions.
    # `_post_processing` is not imported in this notebook; it is assumed
    # to be provided by the accompanying code.
    with tf.variable_scope("final-dense-3"):
        vector = _post_processing(tf.layers.dense(vector, num_labels), "")

    return vector

## Training the network

We can now train our network, following the architecture illustrated in the article (Figure 3).

Like any neural-network framework, PersLay benefits from the use of GPU(s). If a GPU is available (and `tensorflow-gpu` is installed), the computations should use it. Otherwise, they will run on the CPU.

## Running the experiments

### Single run:

We suggest starting with a single run using the `single_run` function, which trains the network once and reports the performance (classification accuracy) on the test set.
- For orbit datasets, the train-test split is 70-30 (to be consistent with [LY18]).
- For graph datasets, the train-test split is 90-10 (to be consistent with [ZWX+18]).

The `single_run` function will load (and print) the network parameters as described in Table 5: PersLay hyper-parameters (choice of $\phi$, $w$...), optimizer settings (number of epochs, learning rate...), etc.

It then takes the diagrams (and any additional features) generated by `generate(dataset)`, randomly splits them into train and test sets, and feeds them to the network.

Train and test accuracies are printed every 10 epochs during training.

Note that, especially on small datasets such as `MUTAG` and `COX2`, the accuracy reported by different calls to `single_run` can vary substantially.
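
The random split described above can be sketched as follows. This is a minimal illustration, not the code used by `single_run`: the function `train_test_split_indices`, its rounding rule and the `seed` argument are assumptions made for this example.

```python
import random


def train_test_split_indices(num_items, test_fraction, seed=None):
    """Illustrative sketch: shuffle the indices and cut them into train and test parts."""
    rng = random.Random(seed)
    indices = list(range(num_items))
    rng.shuffle(indices)
    num_test = int(round(num_items * test_fraction))  # assumed rounding rule
    return indices[num_test:], indices[:num_test]


# A 90-10 split on a dataset with 188 items (the size of MUTAG).
train_idx, test_idx = train_test_split_indices(188, test_fraction=0.1, seed=0)
print(len(train_idx), len(test_idx))
```

With so few test items on a small dataset, a different random split can noticeably change the reported accuracy, which is consistent with the variability mentioned above.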

In [None]:
single_run(dataset)

# Bibliography

[1] _PersLay: A Simple and Versatile Neural Network Layer for Persistence Diagrams._
Mathieu Carrière, Frederic Chazal, Yuichi Ike, Théo Lacombe, Martin Royer, Yuhei Umeda.

[2] _Deep Sets._
Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan R. Salakhutdinov, Alexander J. Smola.
_Advances in Neural Information Processing Systems 30 (NIPS 2017)_