You can use r-neighborhoods in your project by importing the following modules and modifying them according to your needs:

```
src/
├── data/
│   └── custom_collate.py    # collate function that accounts for r-neighborhoods
├── transforms/
│   └── r_neighborhood.py    # builds the r-neighborhood of each node in the graph
└── nn/
    └── loopy.py             # class definition of loopy layers that process r-neighborhoods
```
Given an input graph, we build the r-neighborhood of each node. The computation of r-neighborhoods uses the `networkx.simple_cycles` function, which returns the simple cycles of the input graph.
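As a small illustration (not the repo's code), `networkx.simple_cycles` enumerates the simple cycles that r-neighborhoods are built from; note that undirected input requires networkx ≥ 3.1:

```python
import networkx as nx

# Build a small undirected graph: a 4-cycle with a chord.
G = nx.cycle_graph(4)   # edges 0-1, 1-2, 2-3, 3-0
G.add_edge(0, 2)        # chord splits the square into two triangles

# simple_cycles yields every simple cycle exactly once: the two triangles
# and the square (undirected graphs are supported from networkx 3.1 on).
cycles = sorted(sorted(c) for c in nx.simple_cycles(G))
print(cycles)  # [[0, 1, 2], [0, 1, 2, 3], [0, 2, 3]]
```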
Paths are usually computed in the preprocessing step. However, this can lead to memory overload, especially on large datasets, so we provide a `--lazy` flag, which postpones the computation of cyclic permutations to the forward step. In this way, we don't store all paths for each graph in the dataset but compute them on the fly.
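The idea behind `--lazy` can be sketched as follows (the class and function names here are hypothetical, not the repo's actual API): wrap the expensive path computation in a cached property that only runs on first access instead of during preprocessing:

```python
import functools

calls = {"n": 0}

def compute_paths(graph):
    # Stand-in for the real cyclic-permutation / path enumeration.
    calls["n"] += 1
    return [graph]

class LazyPaths:
    """Defers path computation from preprocessing to first use."""
    def __init__(self, graph):
        self.graph = graph

    @functools.cached_property
    def paths(self):
        # Computed once, on demand, instead of being stored for every
        # graph in the dataset during preprocessing.
        return compute_paths(self.graph)

sample = LazyPaths("toy-graph")
assert calls["n"] == 0      # nothing computed at construction time
_ = sample.paths            # first access triggers the computation
_ = sample.paths            # second access reuses the cached result
print(calls["n"])           # 1
```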
Our code uses GIN layers to process paths, since GIN is simple yet maximally expressive on paths; you can choose a different neural architecture. Note that messages on paths are transmitted via `torch.nn.functional.conv3d` with an appropriate kernel.
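As a shape-only sketch (the tensor layout and kernel size here are assumptions, not the repo's exact choices), sliding a length-2 kernel along the path axis with `conv3d` lets each position aggregate a message from its neighbor on the path:

```python
import torch
import torch.nn.functional as F

batch, channels, path_len = 8, 16, 5
x = torch.randn(batch, channels, 1, 1, path_len)   # paths along the last axis
w = torch.randn(channels, channels, 1, 1, 2)       # length-2 kernel on that axis

out = F.conv3d(x, w)   # each output position mixes two adjacent path nodes
print(out.shape)       # torch.Size([8, 16, 1, 1, 4])
```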
To limit the number of learnable parameters, we provide a `--shared` flag: it shares weights among paths of different lengths within each loopy layer.
To implement the pooling operations, we use `segment_csr` instead of `scatter`: the former is fully deterministic, as noted in its documentation. When indices are not unique, the behavior of `scatter` is non-deterministic: one of the values from `src` is picked arbitrarily, and the gradient is propagated to all elements with the same index, resulting in an incorrect gradient computation.
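To illustrate the semantics (a minimal NumPy sketch, not the actual `torch_scatter` call): `segment_csr`-style pooling reduces contiguous segments delimited by an `indptr` array, so the reduction order is fixed and the result deterministic:

```python
import numpy as np

def segment_sum_csr(src, indptr):
    # Each segment [indptr[i], indptr[i + 1]) is reduced in a fixed order,
    # unlike scatter-based pooling, which relies on arbitrary index order.
    return np.array([src[s:e].sum() for s, e in zip(indptr[:-1], indptr[1:])])

src = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indptr = np.array([0, 2, 5])          # two segments: src[0:2] and src[2:5]
print(segment_sum_csr(src, indptr))   # [ 3. 12.]
```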
The hyperparameters used for the experiments can be retrieved from the `scripts/` folder.
You can reproduce the results by typing

```bash
bash scripts/<dataset_name>.sh
```

or you can specify your own configuration, e.g.,

```bash
python run_model.py --dataset zinc_subset --r 5
```
For `subgraphcount` [2], you need to specify the target motif `n`:
| `n` | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| F | *(motif images omitted)* |||||||||
| target | hom(F, G) ||| sub(F, G) ||||||
by typing

```bash
python run_model.py --dataset subgraphcount_2 --r 1
```
The first three motifs are used to test against homomorphism-counts, the latter six against subgraph-counts. The preprocessing of the dataset is done following the official repo of [3].
Similarly, you can specify the regression target of `qm9` via `qm9_<n>`, where `<n>` is the index of the target property.
For `brec` [1], you need to specify the name of the raw file, i.e., `brec_<name>`, where `name` is one among `basic`, `extension`, `regular`, `4vtx` (4-vertex condition), `dr` (distance regular), `str` (strongly regular), and `cfi`.
Moreover, `exp_iso` is the name given to `exp` when the task is to count the number of indistinguishable pairs.
[1] Yanbo Wang et al. "Towards Better Evaluation of GNN Expressiveness with BREC Dataset." arXiv preprint arXiv:2304.07702 (2023).

[2] Lingxiao Zhao et al. "From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness." In International Conference on Learning Representations, 2022.

[3] Bohang Zhang et al. "Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness." In International Conference on Learning Representations, 2024.