# GE-LSPE: Geometrically Enhanced Learnable Structural and Positional Encodings 

**Authors**: Veljko Kovac, Gerard Planella, Adam Valin and Luca Pantea \
**Course**: Deep Learning 2, University of Amsterdam \
**Course Year**: 2023 \
**Course Website**: https://uvadl2c.github.io/ 

---

This notebook showcases the main experiments our group carried out during the Deep Learning 2 course. Our goal is to develop a generic method that combines geometric and topological information by improving upon the established LSPE framework. By combining the two, we seek to leverage their complementary nature in capturing complex graph relationships and to enhance the discriminative capabilities of GNN models.

In [4]:
# Standard imports
import os
import torch

from IPython.display import display, HTML

script_dir = os.path.abspath('')

# Set this to "false" to see wandb output
os.environ["WANDB_SILENT"] = "true"

# Set this to "online" to log metrics to the cloud
!wandb offline

W&B offline. Running your script from this directory will only write metadata locally. Use wandb disabled to completely turn off W&B.


## About this notebook

This notebook showcases the main experiments that we base our paper on. \
By integrating the **geometrical features** of the graph (node distances in the case of the QM9 dataset) with **topological features** given by PEs, we seek to achieve more expressive node attributes.

We thus divide our approach into **two main directions**:

**i)** **GeTo-MPNN**: A method that **builds on LSPE** while also making use of the **geometrical information found in EGNNs**, by taking the absolute distances between nodes into account in the message function, and

**ii)** **A study of MPNN Architectures**: For both _standard_ and _isotropic_ MPNNs, we conduct experiments with the following configurations: Baseline*, Geometry only, PE only, Geometry and PE, LSPE only, and LSPE and Geometry. The corresponding formulas are given in the table.

*Baseline refers to the bare-bones implementation of the MPNN models, i.e. with no added topological, geometrical or learnt features.
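To make the idea behind direction **i)** more concrete, the snippet below sketches how a distance could be fed into a message function. It is an illustrative sketch only, not the implementation used in our experiments: the function names (`pairwise_distance`, `message_inputs`), the use of the Euclidean distance, and the concatenation order are all assumptions made for the example. In a real model, each resulting vector would then be passed through a learnable network (e.g. an MLP).

```python
import math

def pairwise_distance(pos_i, pos_j):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pos_i, pos_j)))

def message_inputs(h, pos, edges):
    """Build the input of a (hypothetical) message function for each directed edge.

    h:     list of node feature vectors (lists of floats)
    pos:   list of node coordinates
    edges: list of (source, target) index pairs
    """
    inputs = []
    for src, dst in edges:
        dist = pairwise_distance(pos[src], pos[dst])
        # Concatenate receiver features, sender features and the scalar distance.
        inputs.append(h[dst] + h[src] + [dist])
    return inputs

# Toy graph with three nodes and two undirected edges (stored in both directions).
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pos = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [3.0, 4.0, 12.0]]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]

inputs = message_inputs(h, pos, edges)
print(inputs[0])  # features of node 1, features of node 0, distance between them
```

Because only the distance enters the message, the resulting input is unchanged under rotations and translations of the coordinates.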

We acknowledge that not everyone has access to the computational resources required to train each of the models we tested, so we provide the saved model weights in the HuggingFace repository [here](https://huggingface.co/datasets/lucapantea/egnn-lspe/tree/main). The following cell downloads these weights into a local `saved_models` directory.

In [None]:
# Create the saved_models directory if it does not exist yet
saved_models_dir = os.path.join(os.path.dirname(script_dir), 'saved_models')
os.makedirs(saved_models_dir, exist_ok=True)

# Download the model weights from Hugging Face, but only if the directory is still empty
saved_models_dir_git = '"{}"'.format(saved_models_dir)  # quote the path, since it may contain spaces
if len(os.listdir(saved_models_dir)) == 0:
    !git clone https://huggingface.co/datasets/lucapantea/egnn-lspe {saved_models_dir_git}
else:
    print('Model weights already initialized.')

Cloning into '/Users/luca/Documents/Masters/Deep Learning 2/LSPE-EGNN/saved_models'...
remote: Enumerating objects: 60, done.
remote: Counting objects: 100% (60/60), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 60 (delta 4), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (60/60), 8.87 KiB | 168.00 KiB/s, done.
Filtering content:  53% (24/45), 352.79 MiB | 8.04 MiB/s 

## Results and Analysis

This section of the notebook corresponds to the section of the same name in the blogpost/report. We first examine how infusing the models with implicit topological information in the form of Random Walk PEs (RWPE) affects their performance on the QM9 dataset, in both a fully connected (FC) and a non-fully connected (NFC) setting. We then demonstrate how geometrical information, namely the absolute distance between nodes, can be used effectively to learn better node embeddings.
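For readers unfamiliar with RWPE, the snippet below gives a minimal pure-Python sketch of the usual definition: for each node, the encoding collects the probabilities that a random walk returns to that node after 1, 2, ..., k steps. The function names (`matmul`, `random_walk_pe`) are hypothetical and the code is meant for illustration on tiny graphs only; it is not the implementation used in our experiments. The parameter `k` plays the role of the PE dimension (`--pe_dim` in the commands further down).

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def random_walk_pe(adj, k):
    """Return, for each node, its return probabilities after 1..k random walk steps.

    adj: dense adjacency matrix (list of lists of 0/1)
    k:   number of random walk steps, i.e. the PE dimension
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Row-normalised transition matrix: P[i][j] = A[i][j] / deg(i).
    # Isolated nodes (degree 0) get an all-zero row.
    P = [[adj[i][j] / deg[i] if deg[i] else 0.0 for j in range(n)]
         for i in range(n)]
    power = [row[:] for row in P]
    pe = [[] for _ in range(n)]
    for _ in range(k):
        for i in range(n):
            pe[i].append(power[i][i])  # probability of being back at node i
        power = matmul(power, P)
    return pe

# Path graph 0 - 1 - 2
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
pe = random_walk_pe(adj, k=3)
print(pe)
```

On this path graph, the two end nodes receive the same encoding while the middle node receives a different one, which is the kind of structural signal a PE is meant to provide.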

We make the run configurations, training and test metrics, and visualisations accessible via a WandB report [here](https://api.wandb.ai/links/dl2-gnn-lspe/uotynqoo). The report is also displayed below via an embedded HTML element.

In [None]:
wandb_visualizations_code = r'<iframe src="https://wandb.ai/dl2-gnn-lspe/dl2-modularized-exp/reports/EGNN-LSPE-Experiments--Vmlldzo0NDAyMjQ0" style="border:none;height:1024px;width:100%"></iframe>'
display(HTML(wandb_visualizations_code))

To run evaluations for any of the individual experiments, follow the run argument pattern outlined below, specifying the model path under the `--evaluate` argument. We showcase two examples below.

In [None]:
# Standard MPNN, using PE and LSPE, yet without conditioning on distance.
!python -W ignore ../main.py --evaluate "mpnn_qm9_rw24_yes-lspe_no-dist_no-reduced_epochs-1000_num_layers-7_in_c-11_h_c-128_o_c-1_bs-96_lr-0.0005.pt" \
                            --dataset "qm9" \
                            --pe "rw" \
                            --pe_dim 24 \
                            --lspe

In [None]:
# Standard MPNN operating on a fully connected dataset, without PE or LSPE, now with conditioning on distance.
!python -W ignore ../main.py --evaluate "mpnn_qm9_fc_nope_no-lspe_yes-dist_no-reduced_no-update_with_pe_epochs-1000_num_layers-7_in_c-11_h_c-128_o_c-1_bs-96_lr-0.0005.pt" \
                            --dataset "qm9_fc" \
                            --pe "nope" \
                            --include_dist
