# Graph-to-Expander-Hypergraph Lifting Tutorial

***
This notebook shows how to import a dataset, with the desired lifting, and how to run a neural network using the loaded data.

The notebook is divided into sections:

- [Loading the dataset](#loading-the-dataset) loads the config files for the data and the desired transformation, creates a dataset object and visualizes it.
- [Loading and applying the lifting](#loading-and-applying-the-lifting) loads the lifting config, applies it to the dataset, and checks that the lifting creates the expected incidence matrix.
- [Create and run a simplicial nn model](#create-and-run-a-simplicial-nn-model) creates a simple hypergraph neural network and runs a forward pass to check that everything works as expected.

***

Note that, for simplicity, the notebook is set up to use a simple graph. However, there is a set of available datasets that you can play with.

To switch to one of the available datasets, simply change the *dataset_name* variable in [Dataset config](#dataset-config) to one of the following names:

* cocitation_cora
* cocitation_citeseer
* cocitation_pubmed
* MUTAG
* NCI1
* NCI109
* PROTEINS_TU
* AQSOL
* ZINC
***

### Imports and utilities

In [None]:
# With this cell any imported module is reloaded before each cell execution
%load_ext autoreload
%autoreload 2
from modules.data.load.loaders import GraphLoader
from modules.data.preprocess.preprocessor import PreProcessor
from modules.utils.utils import (
    describe_data,
    load_dataset_config,
    load_model_config,
    load_transform_config,
)

## Loading the Dataset

Here we only need to specify the name of the available dataset that we want to load. First, the dataset config is read from the corresponding yaml file (located in the `/configs/datasets/` directory), and then the data is loaded via the implemented `Loaders`.


In [None]:
dataset_name = "manual_dataset"
dataset_config = load_dataset_config(dataset_name)
loader = GraphLoader(dataset_config)

We can then access the data through the `load()` method:

In [None]:
dataset = loader.load()
describe_data(dataset)

## Loading and Applying the Lifting

In this section we will instantiate the expander graph lifting, inspired by recent interest in expander graphs in the context of machine learning [1,2]. This lifting generates a random Ramanujan graph, which is a powerful spectral expander, using [the implementation](https://networkx.org/documentation/stable/reference/generated/networkx.generators.expanders.random_regular_expander_graph.html#random-regular-expander-graph) of `networkx`. This expander graph shares the same nodes as the source graph but is connected in a way that guarantees favourable properties w.r.t. node connectivity.

***
[1] [Andreea Deac, Marc Lackenby, Petar Velickovic: Expander Graph Propagation. LoG 2022: 38.](https://arxiv.org/abs/2210.02997)\
[2] [Hamed Shirzad, Ameya Velingker, Balaji Venkatachalam, Danica J. Sutherland, Ali Kemal Sinop: Exphormer: Sparse Transformers for Graphs. ICML 2023: 31613-31632.](https://arxiv.org/abs/2303.06147)
***
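
For reference, the spectral condition behind the term "Ramanujan" can be stated as follows. This is the standard textbook definition, not something taken from the lifting's code. For a connected $d$-regular graph $G$ on $n$ nodes with adjacency eigenvalues $d = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge -d$, the graph is Ramanujan if

$$
\lambda(G) = \max_{i \,:\, |\lambda_i| < d} |\lambda_i| \;\le\; 2\sqrt{d-1}.
$$

For $d = 4$, for instance, the bound is $2\sqrt{3} \approx 3.46$. The smaller $\lambda(G)$ is relative to $d$, the better the graph's expansion.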

For hypergraphs, creating a lifting involves creating the `incidence_hyperedges` matrix.
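
To make the shape of such a matrix concrete, here is a minimal pure-Python sketch that turns the edges of a regular graph into a node-by-hyperedge incidence matrix. It assumes that each edge of the expander becomes one hyperedge, which is an assumption about the lifting rather than its actual code. The helpers `circulant_edges` and `incidence_from_edges` are hypothetical, and a deterministic circulant graph stands in for the random expander that `networkx` would generate.

```python
def circulant_edges(num_nodes, offsets):
    """Edges of a circulant graph: node i is joined to (i + k) mod n for each offset k.

    Stand-in only: this graph is regular, but it is not random and is not
    guaranteed to be a good expander.
    """
    edges = set()
    for i in range(num_nodes):
        for k in offsets:
            j = (i + k) % num_nodes
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def incidence_from_edges(num_nodes, edges):
    """Dense node-by-hyperedge incidence matrix; each edge becomes one hyperedge."""
    incidence = [[0] * len(edges) for _ in range(num_nodes)]
    for col, (u, v) in enumerate(edges):
        incidence[u][col] = 1
        incidence[v][col] = 1
    return incidence


# 4-regular graph on 8 nodes: every node is linked to its neighbours at distance 1 and 2.
edges = circulant_edges(num_nodes=8, offsets=(1, 2))
incidence = incidence_from_edges(8, edges)
row_sums = [sum(row) for row in incidence]
print(len(edges), row_sums)  # 16 [4, 4, 4, 4, 4, 4, 4, 4]
```

Each row sum equals the node degree and each column contains exactly two ones, because every hyperedge in this sketch joins two nodes.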

As before, we specify the transformation to apply through its type and id; the corresponding config files are located at `/configs/transforms`.

Note that the *transform_config* dictionary generated below can contain a sequence of transforms if needed.

This can also be used to explore liftings from one topological domain to another. For example, chaining two liftings makes it possible to achieve a sequence such as graph -> simplicial complex -> hypergraph.

In [None]:
# Define transformation type and id
transform_type = "liftings"
# If the transform is a topological lifting, it should include both the type of the lifting and the identifier
transform_id = "graph2hypergraph/expander_graph_lifting"

# Read yaml file
transform_config = {
    "lifting": load_transform_config(transform_type, transform_id)
    # other transforms (e.g. data manipulations, feature liftings) can be added here
}

We then apply the transform (with the default node degree of two) via our `PreProcessor`:

In [None]:
lifted_dataset = PreProcessor(dataset, transform_config, loader.data_dir)
describe_data(lifted_dataset)

We can see that by choosing `node_degree=2`, each node is incident to two hyperedges. Let's see what happens if we choose a higher degree.

In [None]:
transform_config["lifting"]["node_degree"] = 4
lifted_dataset = PreProcessor(dataset, transform_config, loader.data_dir)

print(
    f"Incidence matrix for hyperedges:\n{lifted_dataset[0].incidence_hyperedges.to_dense()}"
)

This is the incidence matrix, which assigns each node (row) its incident hyperedges (columns). Summing the elements of each row shows that each node is now incident to four hyperedges.
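
The row-sum check can also be written down explicitly. The sketch below assumes, as a labeled assumption, that every hyperedge comes from a single edge of a regular graph, so each column holds two ones and there are `num_nodes * node_degree / 2` hyperedges. The helper names and the toy matrix are hypothetical and are not part of the repository.

```python
def expected_incidence_shape(num_nodes, node_degree):
    """Shape (nodes, hyperedges) when hyperedges are the edges of a regular graph."""
    if (num_nodes * node_degree) % 2 != 0:
        raise ValueError("num_nodes * node_degree must be even for a regular graph")
    return num_nodes, num_nodes * node_degree // 2


def is_regular_incidence(incidence, node_degree):
    """True if every row sums to node_degree and every column sums to 2."""
    rows_ok = all(sum(row) == node_degree for row in incidence)
    cols_ok = all(sum(col) == 2 for col in zip(*incidence))
    return rows_ok and cols_ok


# Toy incidence matrix of a 4-cycle with edges (0,1), (1,2), (2,3), (0,3).
toy_incidence = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

print(expected_incidence_shape(4, 2))          # (4, 4)
print(is_regular_incidence(toy_incidence, 2))  # True
```

If the dense matrix printed above is a PyTorch tensor, the same row check could be done with `.sum(dim=1)` on it.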

## Create and Run a Simplicial NN Model

In this section, a simple model is created to test that the lifting works as intended. The model used here, `UniGCNModel`, reads the `incidence_hyperedges` matrix, so the lifting must add it to the data.

In [None]:
from modules.models.hypergraph.unigcn import UniGCNModel

model_type = "hypergraph"
model_id = "unigcn"
model_config = load_model_config(model_type, model_id)

model = UniGCNModel(model_config, dataset_config)

In [None]:
y_hat = model(lifted_dataset.get(0))

If everything is correct, the cell above should execute without errors.
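
To illustrate why a hypergraph model needs the incidence matrix, the following sketch runs one round of weight-free message passing: node features are averaged into hyperedges, then hyperedge features are averaged back into nodes. It is a simplified illustration under stated assumptions, not the `UniGCNModel` implementation. The function `hypergraph_propagate` and the toy data are hypothetical.

```python
def hypergraph_propagate(features, incidence):
    """One round of node -> hyperedge -> node mean aggregation (no learned weights)."""
    num_nodes = len(incidence)
    num_hyperedges = len(incidence[0])
    dim = len(features[0])

    # Stage 1: each hyperedge takes the mean of its member nodes' features.
    edge_feats = []
    for e in range(num_hyperedges):
        members = [i for i in range(num_nodes) if incidence[i][e]]
        size = max(len(members), 1)  # avoid division by zero for empty hyperedges
        edge_feats.append([sum(features[i][k] for i in members) / size for k in range(dim)])

    # Stage 2: each node takes the mean of its incident hyperedges' features.
    out = []
    for i in range(num_nodes):
        incident = [e for e in range(num_hyperedges) if incidence[i][e]]
        count = max(len(incident), 1)  # avoid division by zero for isolated nodes
        out.append([sum(edge_feats[e][k] for e in incident) / count for k in range(dim)])
    return out


# 4-cycle incidence matrix (rows: nodes, columns: hyperedges) and a one-hot feature on node 0.
cycle_incidence = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
node_features = [[1.0], [0.0], [0.0], [0.0]]

propagated = hypergraph_propagate(node_features, cycle_incidence)
print(propagated)  # [[0.5], [0.25], [0.0], [0.25]]
```

After one round, the feature from node 0 has reached exactly the nodes that share a hyperedge with it, so the connectivity chosen by the lifting directly determines how information spreads.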