Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The implementation of _fruchterman_reingold significantly deviates from the original algorithm with adverse effects for every graph that is not fully symmetric #4885

Open
paulbrodersen opened this issue Jun 9, 2021 · 13 comments · May be fixed by #6400
Labels
Visualization Related to graph visualization or layout

Comments

@paulbrodersen
Copy link
Contributor

paulbrodersen commented Jun 9, 2021

The implementation of _fruchterman_reingold significantly deviates from the original algorithm with adverse effects for every graph that is not fully symmetric

Preamble

A pdf of the original paper can be found here. There is a listing or pseudocode of the algorithm on page 1132. I believe that the implementation in networkx is largely based on this pseudocode. Before anyone raises objections to my points below, I would like to point out that the text of the paper significantly deviates from this pseudocode, and that I assume that the more detailed text -- not the pseudocode -- describes the intended algorithm, and thus constitutes ground truth.

Summary of the algorithm

The aims of the Fruchterman-Reingold algorithm are (1) to distribute vertices evenly within a given frame, while (2) having uniform edge lengths. It achieves this goal by simulating two forces between nodes:

  1. a repulsive force, that acts between all vertices:
fr = k^2 / distance,
  1. an attractive force, that only acts between connected vertices:
fa = distance^2 / k.

For small distances (distance < k), the repulsive force is stronger than any attraction; for large distances, attraction dominates.

In the absence of connections, vertices only repel each other. In the presence of a frame, the system reaches equilibrium when all vertices are maximally far apart from their nearest neighbours (as only short distances matter for repulsion). This occurs when the vertices are distributed uniformly (aim (1)) over the area of the frame. In this case, the average distance between them will be Area / sqrt(|V|) (assuming rectangular frames). If two vertices are connected, the repulsive and attractive forces balance when the two vertices are k apart. This promotes uniform edge lengths (aim (2)). If k is further approximately Area / sqrt(|V|), then both forces act in concert such all neighbouring nodes will be approximately k apart. If k is smaller, the average distance between nearest neighbour nodes will be between k and Area / sqrt(|V|), depending on connection density.

Significant deviations

The implementation in networkx differs in two, seemingly innocuous but significant details:

1. Application of the temperature

As in many other annealing algorithms, the Fruchterman-Reingold algorithm has a temperature parameter that influences the magnitude of displacements and decreases over time. This is supposed to dampen oscillations, and to eventually lock a single solution into place. Specifically, and from the paper:

"The idea is that the displacement of a vertex is limited to some maximum value, and that this maximum value decreases over time [...]."

However, in the implementation in networkx, all displacement magnitudes are set to the current temperature t, and the previously computed displacements are only used to determine the direction:

length = np.linalg.norm(displacement, axis=-1)
delta_pos = np.einsum("ij,i->ij", displacement, t / length)

Dividing the displacement vector by its length results in a unit vector. All of these unit vectors are then rescaled by t.

Furthermore, the cooling scheme in the paper is not linear, but 2-part: first, a rapid decrease in temperature ("quenching") followed by further updates at low but constant temperature ("simmering"). However, in my own experiments, the cooling scheme does not have a large effect (whereas how the temperature is applied does matter).

2. Enforcement of the frame

In the paper, on each iteration, for vertices with displacements that pushes them outside the frame, the frame negates all components of displacements normal to the frame (see Fig. 6 for a much better and easier to parse pictorial explanation). In my experiments, this approach actually results in issues as the vertices then often slide into the corners. When multiple vertices slide into the same corner, the repulsion terms lead to a numerically unstable behaviour. Personally, I much prefer the approach in Fig. 3, inelastic collision, where vertices colliding with the frame simply do not move until all components of the displacement are pushing the vertex off the frame.

However, the implementation in networkx simply ignores the frame (given by the scale and origin parameters) until the very end, when the positions are rescaled to be within the frame iff there are no fixed nodes.

When does this matter?

The spring layout in networkx often results in node placements that seem extremely plausible:

import networkx as nx
nx.draw(nx.cubical_graph())

Figure_1

Graph is a cube. The plot shows a cube. All is well that ends well. Case closed.

However, there are counter-examples that clearly show that there is something rotten in the state of Denmark:

import networkx as nx
g = nx.Graph()
g.add_nodes_from(range(100))
nx.draw(g)

Figure_2

The Fruchterman-Reingold is a force directed algorithm. All nodes repel each other proportional to the inverse distance. As outlined above, for a fully unconnected graph, the equilibrium configuration should be attained when the nodes are evenly distributed within the frame. Clearly, this is not the case in networkx. Notably, other implementations of the FR algorithm show a different behaviour, e.g. see this SO post.

So what is happening here? The net force acting on each vertex pushes it away from the centroid of all other vertices. If the number of vertices is large enough, all of these centroids are approximately in the same location. As the displacement length is the same for all nodes, namely the current temperature, this results in a uniform push outwards, and the ring structure seen above. Crucially, the rings are fairly symmetric such that the centroids remain in place. On subsequent iterations, the ring hence simply expands.

Apart from this clearly pathological case, and more generally, I think these two issues affect every graph that is not symmetric with uniform degree distribution. For example, observe the layout of a ball-and-chain graph:

import networkx as nx
g = nx.complete_graph(30)
g.add_edges_from(zip(range(0, -5, -1), range(-1, -6, -1)))
nx.draw(g)

Figure_3

Repulsion from the 5 nodes in the chain should not be sufficient to push all 30 nodes of the ball into one corner. However, as the absolute magnitude of the repulsion is irrelevant (temperature), and expansion is ultimately limitless (no frame), that is exactly what happens.

I am still struggling to find good example graphs that are sufficiently asymmetric to show the effect while being simple enough such that anyone can have some intuition what the graph should look like given a force directed layout (within a frame !). However, I hope that these examples can get the ball rolling.


I have a different implementation of the FR algorithm here. It differs from the original algorithm in the exact manner how the frame is applied but follows the paper otherwise closely (it also handles fixed nodes a bit more efficiently but that is another conversation).
Using my implementation, I get the following layout:

Figure_4

from netgraph import Graph # pip install netgraph
Graph(g)
@rossbar
Copy link
Contributor

rossbar commented Jun 27, 2022

Thanks @paulbrodersen . This hasn't garnered any discussion, likely because there's a lot to digest! Any chance you'd be able to open a PR with suggested changes and/or some of the above examples as test cases?

@rossbar rossbar added the Visualization Related to graph visualization or layout label Jun 27, 2022
@paulbrodersen
Copy link
Contributor Author

Sorry, I now have a newborn (5 weeks) and won't have time for this for the foreseeable future. You can have a look at my implementation of the FR algorithm in netgraph here.

@991jo
Copy link

991jo commented Jan 25, 2023

As mentioned on the mailing list recently I will take over this issue.

@paulbrodersen
Copy link
Contributor Author

@991jo Feel free to tag me here or shoot me an email if you have any questions.

@991jo
Copy link

991jo commented Jan 27, 2023

Okay, quick progress report and a couple of observations:

I implemented the FR algorithm with the forces as intended by FR, handled nodes in the same position by slightly randomizing their positions and added a simple frame implementation.

Regarding the example with an the completely unconnected graph:
The paper also mentions the issue with disconnected parts of the graph getting pushed to the frame. This is a general problem of this algorithm, the 100 disconnected nodes produce an extreme example for this.
The behavior shown in the SO post with the implementation in gephi is the result of a gravity parameter in gephi that pulls all nodes to the center of the image. That is also why this implementation results in a circle instead of filling the whole rectangle.
FR wrote in their paper, that the grid variant of the algorithm produces better results, because the repulsive forces are limited in range so once disconnected parts of a graph are separated they no longer push each other further away.
Or to put it in an other way: This example is one where the algorithm is expected to fail, others have implemented variations that worked around this limitation. But at a certain point you are deviating to much and are no longer implementing the FR algorithm.

I will come up with several other examples instead and I will make a comparison between the different frame implementations.

@991jo
Copy link

991jo commented Jan 27, 2023

A more general issue with the border is the combination of the value of k.
Currently the networkx implementation simply uses sqrt(1/nnodes) by assuming an area of size 1. The same formula is used in the netgraph package.

However FR write that they used k = C * sqrt(area/nnodes) where

the constant C is found experimentally

So basically both implementations assume C=1.

When we are doing this and apply a boundary the graph usually is to big for the frame.
graph_stuck_at_boundary
The graph in the middle is a 5 by 5 2D grid. In theory this is the poster child example for the edge-length formula and should fill the frame perfectly, in practice it ends up as shown in this picture. The tree in the third image has the same problem.

If I reduce the edge-length by a factor of e.g. 0.2 the results are small enough to fit into the frame and the results are as expected.
graph_0 2_k
FR experienced the same problem an wrote:

In practice, we often choose the strength of the forces so that the resulting graph is small enough that it never nears the borders, then apply a filter to enlarge the graph to fill the frame.

Choosing the strength of the forces means reducing k. But if we reduce k so much the graph so much, that it almost never touches the frame, then the only influence the frame can have is on the aspect ratio of the graph.
However there is currently no way to specify one and matplotlib stretches the image to whatever size fits the screen anyway.

An other example for large values of k with a boundary is this long graph with 21 nodes, connected into one chain.
This picture is taken with C=1.
snake_k_1

If we set C=0.2 we get this picture.
snake_k_0 2

For the chain to straighten out we can remove the boundary and give it ~500 iterations.
snake_np_boundary_500_it

The netgraph package also has no boundary. @paulbrodersen : Did you come to the same conclusion as I did, that a boundary is mostly useless when the graph has to be small enough to almost never touch it? If so I would propose to just ignore the boundary, set C=1 and implement the forces and temperature as intended by FR.

@paulbrodersen
Copy link
Contributor Author

@991jo

There is a (inelastic) boundary in netgraph: https://github.com/paulbrodersen/netgraph/blob/master/netgraph/_node_layout.py#L469

Basically, on each iteration, I compute candidate node positions and but only update positions that result in a valid positions as defined by the origin and scale parameters (i.e. the frame).

When I last looked at the implementation in networkx, it did not have a frame. The origin/scale parameters are only used to rescale the final node layout. As I explain above, I think that is a flaw, as for asymmetric graphs like my ball-and-chain example, this results in solutions that are far from the intended optimum (i.e. an even distribution of nodes on the canvas within the constraints imposed by the graph structure). I like your current test cases but I would urge you to include some highly asymmetric graphs as well.

@991jo 991jo linked a pull request Jan 31, 2023 that will close this issue
9 tasks
@991jo
Copy link

991jo commented Jan 31, 2023

I just made a draft PR for the code I changed.

One open question that should be discussed is the sparse implementation that is used for graphs with more than 500 nodes.
I could simply adopt that one as well, but does this even make sense? The FR algorithm often produces bad results and is rather slow for such large graphs. These are the primary reasons why people designed multilevel approaches.

An other question is how to handle fixed positions and the random generation of the positions of the remaining vertices.
Since the fixed positions can have arbitrary values and the frame has to contain those fixed positions I think it is not a good idea to
still place all non-fixed vertices in a 1x1 square.
An when generating the frame we probably don't want to place the fixed positions directly at the boundary. So starting with a frame that encloses the fixed positions and then expanding each dimension by e.g. 50% of the length of that dimension in each direction would give us some space. Does anyone have experiences/opinions regarding that?

@paulbrodersen
Copy link
Contributor Author

I would raise an error whenever fixed positions fall outside the dimensions specified by origin/scale.

@991jo
Copy link

991jo commented Feb 1, 2023

I misread the description of the fixed vertices. The fixed vertices don't get initial positions by the user, they are randomly generated at the beginning but stay fixed in that position.

@dschult
Copy link
Member

dschult commented Feb 1, 2023

The docs say that the input fixed states which nodes should be fixed, but that an exception is raised if fixed is provided and pos is not. So the pos input provides the initial position of the fixed nodes... They should not have random initial positions. At least that is my understanding...

@991jo
Copy link

991jo commented Feb 23, 2023

Okay, I have spent a lot of time looking at pictures, tinkering with settings and extracting values from different layouts.

I will start with the ball and chain example from above. The fundamental problem of the example is not the size of the ball in the layout, it is the size of the chain which is pushed very far away. Since the relative size of ball and chain are very different the distribution of nodes in the frame gets very uneven. By adding a frame we simply limit how far the chain can get away. This significantly improves the issue for the chain and ball graph.
Take a look at this picture that shows 10 examples of the ball-and-chain graph. These are 10 random examples, the temperature and forces are as described by FR, the boundary is implemented as an inelastic collision that does not move a node if it would leave the frame (as done by netgraph). C is fixed to 1. The nodes of the chain are colored from purple over blue to green and finally yellow so that one can see in which order they should be.

ball_and_chain_inelastic_boundary

Generally the ball has a size that is roughly equal in all drawings (with slight differences depending on the scaling). The deviation of the ball-size before scaling is usually less then 5% of its average size.
The placement of the chain around the ball is primarily dependent on the random initial placement, because due to the way the inelastic collision works, the nodes usually just move outwards until they hit the boundary and then are never able to move, because the ball always ends up in the center and pushes the remaining nodes into the boundary.
An other problem is that the nodes of the chain are not placed at the boundary in the order they are in the chain. An extreme example is image 10, however most of the others show a similar behavior.

These layouts look very similar to the ones generated by netgraph, however the ball in the netgraph layouts is larger. This is the result of a difference in how the edge length is calculated. In networkx each node is just a point. Each node in netgraph is a circle and therefore has a radius/diameter. The radius is subtracted from the distances between the center points of those circles. This has an influence on the size of the ball as can be seen in the following layouts which increase the node size from 1.0 up to 5.5 (4.0 is the default).
netgraph_node_size_comparison_2

Regarding the ball and chain example my conclusion for far is: Yes, adding a boundary can improve the uneven node distribution, however it sacrifices the (previously good) layout of the chain part.

The consequences of adding a frame for other graphs vary. If we simply set C=1 and add a frame with inelastic collisions many layouts suffer.
fr_boundary_c_1

Here we can see a set of example graphs, many of which which get pushed into the boundary. These graphs are clearly to big for the frame. To eliminate this problem we have to adjust C, and as we can see in the following examples with c=0.5 and c=0.2 the results get a lot better in general.

(c=0.5)
fr_boundary_c_0 5

(c=0.2)
fr_boundary_c_0 2

However this reintroduces the problem of the ball-and-chain example. Due to the smaller optimal edge length the ball as well as the chain are smaller, giving the chain more space in the frame to be straightened. Effectively we have removed the frame by making the graph so small that it never touches the frame.

So we are in a bit of a dilemma here: If we have no frame, then examples like the ball and chain are not good. If we add a frame, then the ball and chain can be good. However this makes many other layouts bad. To fix that we have to reduce the optimal edge length (e.g. by reducing C), however this reintroduces the original issue.

So in this case I would propose to do the following:

  • implement a frame with inelastic boundaries that is optional and disabled by default.
  • add the constant C as a possible parameter, defaulting to 1.
  • keep the force-calculation and temperature as it was in the original implementation.

The result of these changes would be, that by default the layouts stay unchanged. However by adding the option for a boundary and the constant C we give the users the possibility to adjust the layout to their needs. I think this is a practical solution, that also comes with a lot less changes to the inner loops of the current code.

If we want to continue this way I will prepare a new MR/update the old one accordingly.

@dschult
Copy link
Member

dschult commented Feb 27, 2023

Wow -- Thanks for this gallery of examples and proposal for moving forward.

I think the proposed approach would allow people to stay with the current function, but offer an option that might help a few graphs with long strands of hair. The doc_string description should explain when someone might want to use this.

To reduce keyword input pollution, could we combine the boundary and C values into a single keyword argument that is None by default (indicating no boundary) and turns on the boundary with C-value given when the input is not None. Are there parts of this I'm missing?

I'm not actually sure how much better the layouts are with the restricted domain. The chain of the ball and chain is bent more as C increases. Does it also show more of the ball? I can't see that very well... Maybe I'm not looking for the right thing. I guess I usually don't expect a picture of a ball to show much useful information... :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Visualization Related to graph visualization or layout
Development

Successfully merging a pull request may close this issue.

4 participants