
Graph DNA

An implementation of an alternative embedding algorithm for graph data, based on the paper Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering (Liwei Wu et al.).

The Goal

The goal of Graph DNA is to provide a memory- and time-efficient way to embed graph data into a format that is more easily consumed by Graph Neural Networks, while preserving as much structural information as possible.

Existing methods localize vertices with message passing, using adjacency-matrix multiplication plus loss-function minimization, but they struggle to capture structure beyond 1-2 hops without running into exponential time and space costs.
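To make that cost concrete, here is a minimal sketch (ours, not code from this repo) of gathering k-hop neighbourhoods by repeated adjacency-matrix multiplication. On real-world graphs the reachability matrix densifies quickly as k grows, which is exactly the blow-up Graph DNA sidesteps:

import numpy as np

def k_hop_neighbourhoods(adj: np.ndarray, k: int) -> np.ndarray:
    """0/1 matrix whose (i, j) entry is 1 iff j is within k hops of i."""
    reach = np.eye(adj.shape[0], dtype=np.int64)
    for _ in range(k):
        # one more round of message passing; the product densifies
        # rapidly on real graphs, so time and memory grow with each hop
        reach = np.minimum(reach + reach @ adj, 1)
    return reach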


Graph DNA uses Bloom filters and gossip algorithms to gain deeper neighbourhood insight at a much lower cost. The Bloom filters let vertex embeddings inherently encode which other vertices influence them. From these filters we can build a new bipartite graph G', where each vertex v of G is adjacent to a meta-vertex if and only if the corresponding bit in the Bloom filter for v has been activated; see the diagram below for a visual representation. We can then run standard filtering algorithms on G' to obtain deep neighbourhood-aware feature aggregations at a fraction of the cost. Finally, when making predictions, the meta-vertices are simply disregarded. Again, all theory is from the paper (Liwei Wu et al.).

[Figure: DNA encoding and the meta bipartite graph G' (https://arxiv.org/pdf/1905.12217.pdf, Liwei Wu et al.)]
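As a rough sketch of the gossip mechanism (the function and parameter names here are illustrative, not this repo's API, and the repo's filter_capacity cap is omitted): every vertex seeds a Bloom filter with its own id, then for depth rounds each vertex ORs its neighbours' filters into its own.

import hashlib

def bloom_bits(item, size, num_hashes=2):
    """Bit positions that `item` activates in a size-bit Bloom filter."""
    return {
        int(hashlib.sha256(f"{item}:{i}".encode()).hexdigest(), 16) % size
        for i in range(num_hashes)
    }

def gossip_dna(num_vertices, edges, depth, size=16, num_hashes=2):
    neighbours = {v: set() for v in range(num_vertices)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    # each vertex starts with a filter containing only its own id
    filters = [bloom_bits(v, size, num_hashes) for v in range(num_vertices)]
    for _ in range(depth):
        # synchronous gossip round: OR (set-union) neighbours' filters in
        filters = [
            filters[v].union(*(filters[u] for u in neighbours[v]))
            for v in range(num_vertices)
        ]
    # materialize each filter as a 0/1 bit vector
    return [[1 if b in f else 0 for b in range(size)] for f in filters]

Calling gossip_dna(5, [[0,1], [1,2], [2,3], [2,4]], depth=2, size=16) produces 16-bit encodings analogous to the example below, though the exact bit patterns depend on the hash functions used.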

The Bigger Picture

Graph Convolutional Networks are extremely powerful tools in biology, social networks, and eCommerce, with applications including drug-interaction prediction and advanced recommendation systems.

Demonstrating more time- and memory-efficient methods for creating graph embeddings is essential to pushing these advances forward and to making graph data more easily consumed by neural networks.

Examples

As seen in examples.py, we can recreate the graph from page 5 of the paper and generate its DNA encoding as follows:

# build the Figure 1 graph: 5 vertices and an edge list
figure_1 = Graph(5, [[0,1], [1,2], [2,3], [2,4]])

# generate dna encoding and display results for each vertex
dna_encoding = figure_1.generate_dna_encoding(depth=2, filter_capacity=3)
print(dna_encoding)

>>> [[1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1], 
     [1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1], 
     [1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1], 
     [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1], 
     [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1]]
     
# construct our meta bipartite graph and display the set of edges
meta_graph = figure_1.generate_meta_graph(dna_encoding)
print(meta_graph.edges)
>>> [[0, 5], [0, 6], [0, 7], [0, 10], [0, 11], [0, 12], [0, 13], [0, 14], [0, 16], [0, 17], [0, 18], [0, 20], 
    [1, 5], [1, 6], [1, 7], [1, 8], [1, 10], [1, 11], [1, 12], [1, 13], [1, 14], [1, 15], [1, 16], [1, 17], [1, 18], [1, 20], 
    [2, 5], [2, 6], [2, 7], [2, 8], [2, 10], [2, 11], [2, 12], [2, 13], [2, 14], [2, 15], [2, 16], [2, 17], [2, 18], [2, 20], 
    [3, 5], [3, 7], [3, 8], [3, 10], [3, 11], [3, 12], [3, 13], [3, 14], [3, 15], [3, 17], [3, 18], [3, 20], 
    [4, 5], [4, 7], [4, 8], [4, 10], [4, 11], [4, 12], [4, 13], [4, 14], [4, 15], [4, 17], [4, 18], [4, 20]]

The first output is the generated embedding, one 16-bit Bloom filter, for each vertex. Note that vertices 3 and 4, and likewise vertices 1 and 2, receive identical embeddings at depth d=2, just as depicted in Figure 1 of the paper. In the meta graph, indices 0-4 are the original vertices and indices 5-20 are the 16 meta-vertices, one per filter bit.
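For intuition, here is a minimal, illustrative reconstruction of the meta-edge list from the printed encodings (the helper name is ours, not the repo's internals):

def meta_edges(dna_encoding):
    """Connect original vertex v to meta-vertex n + j for each set bit j."""
    n = len(dna_encoding)
    return [
        [v, n + j]
        for v, bits in enumerate(dna_encoding)
        for j, bit in enumerate(bits)
        if bit == 1
    ]

Running meta_edges(dna_encoding) on the output above reproduces the edge list printed by meta_graph.edges.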
