Fix more spelling mistakes
paulbrodersen committed Jun 28, 2023
1 parent e25ba46 commit 2f66d5d
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions publication/paper.md
@@ -23,7 +23,7 @@ bibliography: paper.bib

# Statement of need

-The empirical study and scholarly analysis of networks has increased manyfold in recent decades, fuelled by the new prominence of network structures in our lives (the web, social networks, artificial neural networks, ecological networks, etc.) and the data available on them. While there are several comprehensive Python libraries for network analysis such as NetworkX [@Hagberg:2008], igraph [@Csardi:2006], and graph-tool [@Peixoto:2014], their inbuilt visualisation capabilities lag behind specialised software solutions such as Graphviz [@Ellson:2002], Cytoscape [@Shannon:2003], or Gephi [@Bastian:2009]. However, although Python bindings for these applications exist in the form of PyGraphviz, py4cytoscape, and GephiStreamer, respectively, their outputs are not manipulable Python objects, which restricts customisation, limits their extensibility, and prevents a seamless integration within a wider Python application.
+The empirical study and scholarly analysis of networks has increased manifold in recent decades, fuelled by the new prominence of network structures in our lives (the web, social networks, artificial neural networks, ecological networks, etc.) and the data available on them. While there are several comprehensive Python libraries for network analysis such as NetworkX [@Hagberg:2008], igraph [@Csardi:2006], and graph-tool [@Peixoto:2014], their inbuilt visualisation capabilities lag behind specialised software solutions such as Graphviz [@Ellson:2002], Cytoscape [@Shannon:2003], or Gephi [@Bastian:2009]. However, although Python bindings for these applications exist in the form of PyGraphviz, py4cytoscape, and GephiStreamer, respectively, their outputs are not manipulable Python objects, which restricts customisation, limits their extensibility, and prevents a seamless integration within a wider Python application.

# Summary

@@ -77,11 +77,11 @@ Graph(
plt.show()
```

-# Interoperability & Customizability
+# Interoperability & Customisability

-Netgraph can be easily integrated into existing network analysis workflows as it accepts a variety of graph structures. The example below uses a NetworkX `Graph` object, but igraph and graph-tool objects are also valid inputs, as are plain edge lists and full-rank adjacency matrices. The output visualizations are created using Matplotlib and can hence form subplots in larger Matplotlib figures.
+Netgraph can be easily integrated into existing network analysis workflows as it accepts a variety of graph structures. The example below uses a NetworkX `Graph` object, but igraph and graph-tool objects are also valid inputs, as are plain edge lists and full-rank adjacency matrices. The output visualisations are created using Matplotlib and can hence form subplots in larger Matplotlib figures.

-Each visualization can be customized in various ways. Most parameters can be set using a scalar or string. In this case, the value is applied to all relevant artists. To style different artists differently, supply a dictionary instead. Furthermore, node and edge artists are derived from `matplotlib.patches.PathPatch`; node and edge labels are `matplotlib.text.Text` instances. Hence all node artists, edge artists, and labels can be manipulated using standard matplotlib syntax after the initial draw.
+Each visualisation can be customised in various ways. Most parameters can be set using a scalar or string. In this case, the value is applied to all relevant artists. To style different artists differently, supply a dictionary instead. Furthermore, node and edge artists are derived from `matplotlib.patches.PathPatch`; node and edge labels are `matplotlib.text.Text` instances. Hence all node artists, edge artists, and labels can be manipulated using standard matplotlib syntax after the initial draw.
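
Editorial note on the paragraph above: the scalar-or-dictionary convention it describes can be sketched in a few lines of plain Python. The helper name `expand_parameter` is hypothetical and is not taken from Netgraph; it only illustrates the described behaviour, under the assumption that a dictionary must supply a value for every artist.

```python
def expand_parameter(value, keys):
    """Map every key (e.g. a node or an edge) to a style value.

    Hypothetical helper for illustration only, not Netgraph's implementation.
    A dictionary is interpreted as per-artist values; anything else is
    treated as a single value shared by all artists.
    """
    keys = list(keys)
    if isinstance(value, dict):
        missing = [key for key in keys if key not in value]
        if missing:
            raise KeyError(f"no value given for: {missing}")
        return {key: value[key] for key in keys}
    return {key: value for key in keys}


nodes = [0, 1, 2]

# A single string is applied to all nodes.
uniform = expand_parameter("tab:blue", nodes)

# A dictionary styles each node individually.
mixed = expand_parameter({0: "tab:red", 1: "tab:blue", 2: "tab:blue"}, nodes)

print(uniform)
print(mixed)
```

Normalising both input forms to one dictionary up front means the drawing code only ever has to handle the per-artist case.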

![Advanced example output](advanced_example.png){width=50%}

@@ -98,7 +98,7 @@ fig, ax = plt.subplots(figsize=(6,6))
# initialize the graph structure
balanced_tree = nx.balanced_tree(3, 3)

-# initialize the visualization
+# initialize the visualisation
g = Graph(
balanced_tree,
node_layout='radial',
@@ -133,11 +133,11 @@ fig.canvas.draw()

# Key Design Decisions

-The creation of Netgraph was motivated by the desire to make high quality, easily customizable, and reproducible network visualizations, whilst maintaining an extensible code base. To that end, a key design decision was to have a single reference frame for all node artist and edge artist attributes that determine their extent (e.g. in the case of a circular node artist, its position and its radius).
+The creation of Netgraph was motivated by the desire to make high quality, easily customisable, and reproducible network visualisations, whilst maintaining an extensible code base. To that end, a key design decision was to have a single reference frame for all node artist and edge artist attributes that determine their extent (e.g. in the case of a circular node artist, its position and its radius).

-Good data visualizations are both accurate and legible. The legibility of a visualization is influenced predominantly by the size of the plot elements, and occlusions between them. However, there is often a tension between these two requirements, as larger plot elements are more visible but also more likely to cause overlaps with other plot elements. Most data visualization tools focus on accuracy and visibility. To that end, they operate in two reference frames: a data-derived reference frame and a display-derived reference frame. For example, in a standard line-plot, the data-derived reference frame determines the x and y values of the line. The thickness of the line, however, scales with the size of the display, and its width (measured in pixels) remains constant across different figure sizes and aspect ratios. Having two reference frames ensures that the line (1) is an accurate representation of the data, and (2) is visible and discernible independent of figure dimensions. The trade-off of this setup is that (1) the precise extents of plot elements can only be computed after the figure is initialized, and (2) occlusions are not managed and hence common, for example, if multiple lines are plotted in the same figure. Nevertheless, most network visualization tools follow this standard. For example, NetworkX specifies node positions and edge paths in data coordinates, but uses display units for node sizes and edge widths.
+Good data visualisations are both accurate and legible. The legibility of a visualisation is influenced predominantly by the size of the plot elements, and occlusions between them. However, there is often a tension between these two requirements, as larger plot elements are more visible but also more likely to cause overlaps with other plot elements. Most data visualisation tools focus on accuracy and visibility. To that end, they operate in two reference frames: a data-derived reference frame and a display-derived reference frame. For example, in a standard line-plot, the data-derived reference frame determines the x and y values of the line. The thickness of the line, however, scales with the size of the display, and its width (measured in pixels) remains constant across different figure sizes and aspect ratios. Having two reference frames ensures that the line (1) is an accurate representation of the data, and (2) is visible and discernible independent of figure dimensions. The trade-off of this setup is that (1) the precise extents of plot elements can only be computed after the figure is initialised, and (2) occlusions are not managed and hence common, for example, if multiple lines are plotted in the same figure. Nevertheless, most network visualisation tools follow this standard. For example, NetworkX specifies node positions and edge paths in data coordinates, but uses display units for node sizes and edge widths.
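
Editorial note on the paragraph above: the first trade-off it names (extents are only known once the figure exists) can be made concrete with a little arithmetic. The sketch below assumes a width given in typographic points (1/72 inch, the unit Matplotlib uses for line widths) and a hypothetical helper `points_to_data_units`; the axis lengths and data range are made-up illustration values.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 of an inch


def points_to_data_units(width_pt, axis_length_in, data_range):
    """Convert a width in points into data units along one axis.

    Hypothetical helper for illustration. The result depends on the
    physical length of the axis (in inches), which is only known once
    the figure has been created.
    """
    width_in = width_pt / POINTS_PER_INCH
    return width_in / axis_length_in * data_range


# The same 1.5 pt line on a 3 inch wide and on a 12 inch wide axis,
# both spanning a data range of 1.0.
small_figure = points_to_data_units(1.5, axis_length_in=3.0, data_range=1.0)
large_figure = points_to_data_units(1.5, axis_length_in=12.0, data_range=1.0)

print(small_figure, large_figure)
```

The line covers four times as much of the data range in the smaller figure, so any overlap computed for one figure size does not carry over to another.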

-However, network visualizations differ from other data visualizations in two aspects: (1) the precise positions of nodes and the precise paths of edges often carry no inherent meaning, and (2) most figures contain a multitude of node and edge artists instead of just the few lines typically present in a line-plot. As a consequence, a common goal of most algorithms for node layout, edge routing, and label placement is to minimize occlusions between different plot elements, as they reduce the ease with which a visualization is interpreted. To that end, precise knowledge of the extents of all plot elements is paramount, motivating the use of a single reference frame. In Netgraph, this reference frame derives from the data. Specifically, node positions and edge paths are specified in data units, and node sizes and edge widths are specified in 1/100s of data units (as this makes the node sizes and edge widths more comparable to typical values in NetworkX, igraph, and graph-tool). This decouples layout computations from rendering the figure, simplifies computing the extent of the different plot elements, facilitates the reduction of overlaps, and makes it possible to create pixel-perfect reproductions independent of display parameters.
+However, network visualisations differ from other data visualisations in two aspects: (1) the precise positions of nodes and the precise paths of edges often carry no inherent meaning, and (2) most figures contain a multitude of node and edge artists instead of just the few lines typically present in a line-plot. As a consequence, a common goal of most algorithms for node layout, edge routing, and label placement is to minimize occlusions between different plot elements, as they reduce the ease with which a visualisation is interpreted. To that end, precise knowledge of the extents of all plot elements is paramount, motivating the use of a single reference frame. In Netgraph, this reference frame derives from the data. Specifically, node positions and edge paths are specified in data units, and node sizes and edge widths are specified in 1/100s of data units (as this makes the node sizes and edge widths more comparable to typical values in NetworkX, igraph, and graph-tool). This decouples layout computations from rendering the figure, simplifies computing the extent of the different plot elements, facilitates the reduction of overlaps, and makes it possible to create pixel-perfect reproductions independent of display parameters.
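
Editorial note on the paragraph above: with a single data-derived reference frame, overlaps between circular nodes can be detected before any figure exists. The following is a minimal sketch, not Netgraph's code. It assumes that a node size denotes the radius of a circular node in 1/100s of data units, and the function name `find_overlaps` and the example coordinates are invented for illustration.

```python
import math
from itertools import combinations

SIZE_SCALE = 1 / 100  # per the text: sizes are given in 1/100s of data units


def find_overlaps(positions, sizes):
    """Return all pairs of circular nodes whose discs intersect.

    Illustrative sketch: positions are (x, y) tuples in data units and
    sizes are assumed to be radii in 1/100s of data units.
    """
    overlaps = []
    for a, b in combinations(sorted(positions), 2):
        distance = math.dist(positions[a], positions[b])
        # Two discs overlap if their centres are closer than the sum of radii.
        if distance < (sizes[a] + sizes[b]) * SIZE_SCALE:
            overlaps.append((a, b))
    return overlaps


positions = {0: (0.1, 0.1), 1: (0.15, 0.1), 2: (0.9, 0.9)}
sizes = {0: 3, 1: 3, 2: 3}

overlaps = find_overlaps(positions, sizes)
print(overlaps)
```

No display parameter (figure size, resolution, aspect ratio) enters the computation, which is the decoupling of layout from rendering that the paragraph describes.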

# Acknowledgements
