Explore node2vec via stress minimization graph layout

Stress minimization is family of graph layout algorithms that produces nice looking two dimensional visualizations of graphs by minimizing difference between "ideal" desired distance and actual distance of nodes.

For example, given graph of city roads, such algorithm could use ground truth about distances of road segments, and converge on globally optimal solution, producing layouts that match exactly what we have on a map:

For abstract graphs we can assign distance between two nodes based on shortest path length between them (this distance is also called "graph theoretical distance"):

node2vec is an algorithm that assigns a high dimensional vector to each node of a graph. The algorithm tries to preserve similarity between nodes on the graph (similar nodes of the graph are close in the high dimensional space).

Since node2vec gives us vectors, we can always measure distance between nodes in such space and use it as "ideal" desired distance of stress minimization algorithm. This repository explores node2vec graph embeddings with stress minimization.

Why this could be important? The answer is twofold:

Stress minimization requeres graph theoretical distance between every single node in the graph. In most cases this requeres at least O(n^2) memory and O(n * (m + n)) preprocessing time. Having nice node embeddings could potentially speed up stress minimization driven graph layouts.
node2vec embedding algorithms have a lot of hyperparemeters. Stress minimization driven graph layout algorithm can help researchers judge quality of selected embedding by visually exploring produced layouts and quantitatively observing final stress cost function value.

Test setup

This repository uses two graphs: a primitive 10x10 grid graph, and miserables graph.

It performs various embeddings with node2vec and then performs graph layout, following Graph Drawing by Stochastic Gradient Descent paper. Isntead of graph-theoretical distance we take distance between nodes in embedded space.

Results

For miserables graph, embeddings in very high dimensional space produced results very similar to ideal, graph-theoretical distances. Image below used 4,048 dimensions in the embedded space:

However going to smaller number of dimensions results in poor layouts:

For 10x10 grid I wasn't able to find a good embedding, no matter how many dimensions I used. Image below used 1,048 dimensions for embedding:

Feedback

If you find a bug in the code, please don't hesitate to ping me.

Usage

To setup code make sure you have downloaded and compiled node2vec tool:

git clone https://github.com/snap-stanford/snap/
cd snap/examples/node2vec
make

This should create an executable node2vec file. Remember path to it, as it will be needed later.

Now, clone this repository and install its dependencies:

git clone https://github.com/anvaka/explore_node2vec/
cd explore_node2vec
npm install

Open sequence.sh and modify node_to_vec variable to point to your compiled node2vec.

Create test graph with node createTestGraphs.js, and launch sequence.sh to produce layouts into svg folder

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
emb		emb
src		src
svg		svg
.gitignore		.gitignore
README.md		README.md
clean.sh		clean.sh
createTestGraphs.js		createTestGraphs.js
distanceCompare.js		distanceCompare.js
n2vlayout.js		n2vlayout.js
package-lock.json		package-lock.json
package.json		package.json
saveGraphToEdgeList.js		saveGraphToEdgeList.js
sequence.sh		sequence.sh
sgdLayout.js		sgdLayout.js

anvaka/explore_node2vec

Folders and files

Latest commit

History

Repository files navigation

Explore node2vec via stress minimization graph layout

Test setup

Results

Feedback

Usage

About

Resources

Stars

Watchers

Forks

Languages