Graph/network representation in Altair? #340

Closed
linclelinkpart5 opened this issue Jun 5, 2017 · 12 comments

Comments

@linclelinkpart5

Hello,

I'm quite new to Vega(-Lite) and Altair, having only heard about them at Pycon 2017. I was wondering if it was possible to represent an interactive graph/network using Altair. My use case involves being able to view, zoom, pan, drag, drop, and edit the nodes and edge properties of a directed acyclic graph.

The Vega examples show something that could be modified to fit my use case (https://vega.github.io/vega/examples/airport-connections), but it seems that Altair specifically deals with Vega-Lite, and thus is mainly statistically-focused?

@kanitw
Member

kanitw commented Jun 6, 2017

it seems that Altair specifically deals with Vega-Lite, and thus is mainly statistically-focused?

Yes, we don't plan to extend Vega-Lite to support graphs/networks in the short term.

Thanks for asking :)

@luerhard

luerhard commented Jun 6, 2017

For interactive networks, I can recommend Bokeh, as in this example: https://gist.github.com/habet700/daa537362eef802f8d808782fc962bf2

@BrennanBarker

This doesn't directly address the OP's question, but for people who end up here looking for help visualizing networks in Altair, it's probably useful to point to the nx_altair package, which focuses on providing an API similar to NetworkX's, but with all the goodness of Altair charts' interactivity, etc.

For those interested in leaning into Altair's lovely Grammar of Graphics (GoG)-focused API, below is one way I've gone about it.

A GoG'y way to think about what you're doing when you make a typical network visualization (the drawings with dots and lines between them) is that you're laying out point marks (representing your nodes) spatially on an X/Y plane and drawing line marks between the nodes where edges exist. While your nodes and edges do have intrinsic properties (names, weights, ranks in a hierarchy, etc.), the trick is to realize that (by definition) your graph doesn't include information about how to position nodes spatially. You have to provide that information, either by specifying node positions arbitrarily or by using a layout algorithm. There are lots of layout algorithms ('force-directed' being one category of layouts; there are others), and there are great packages that calculate graph layouts, but Altair (I think wisely) leaves it to those other packages rather than trying to replicate that functionality itself.
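To make the "a graph carries no positions" point concrete, here is a minimal standard-library sketch. The `circular_layout` helper is hypothetical (written for this illustration, not taken from any package); it only shows that a layout is a mapping from node to (x, y) that you compute separately from the graph itself.

```python
import math

def circular_layout(nodes, radius=1.0):
    """Hypothetical helper: place nodes evenly on a circle, returning {node: (x, y)}."""
    nodes = list(nodes)
    n = len(nodes)
    return {
        node: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }

# The graph itself: only nodes and edges, no coordinates anywhere.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
nodes = sorted({node for edge in edges for node in edge})

# The layout is extra information that we choose to add.
pos = circular_layout(nodes)
```

Any other function that returns the same kind of mapping (a force-directed layout, a Graphviz layout, hand-picked coordinates) could be swapped in without changing the charting step.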

Here's an example where I use NetworkX to build a graph and calculate some properties as well as layout information before feeding it into Altair to actually build the graphic:

from itertools import chain
import pandas as pd
import networkx as nx
import altair as alt

# Step 1: Prepare an example graph (this all happens outside of Altair)

# example graph 
r,h = 3,3
G = nx.generators.classic.balanced_tree(r,h)

# calculate rank of a given node and assign it as data
for rank in range(0,h+1):
    nodes_in_rank = nx.descendants_at_distance(G, 0, rank)
    for node in nodes_in_rank: 
        G.nodes[node]['rank'] = rank

# calculate layout positions, for example using Graphviz's 'twopi' algorithm, calculated via networkx's API.  
pos = nx.drawing.nx_agraph.graphviz_layout(G, prog='twopi')


# Step 2: Convert graph data from NetworkX's format to the pandas DataFrames expected by Altair

pos_df = pd.DataFrame.from_records(dict(node_id=k, x=x, y=y) for k,(x,y) in pos.items())
node_df = pd.DataFrame.from_records(dict(data, **{'node_id': n}) for n,data in G.nodes.data())
edge_data = ((dict(d, **{'edge_id':i, 'end':'source', 'node_id':s}),
              dict(d, **{'edge_id':i, 'end':'target', 'node_id':t}))
             for i,(s,t,d) in enumerate(G.edges.data()))
edge_df = pd.DataFrame.from_records(chain.from_iterable(edge_data))


# Step 3:  Use Altair to encode the graph data as marks in a visualization
x,y = alt.X('x:Q', axis=None), alt.Y('y:Q', axis=None)
# use a lookup to tie position data to the other graph data
node_position_lookup = {
    'lookup': 'node_id', 
    'from_': alt.LookupData(data=pos_df, key='node_id', fields=['x', 'y'])
}
nodes = (
    alt.Chart(node_df)
    .mark_circle(size=300, opacity=1)
    .encode(x=x, y=y, color=alt.Color('rank:N', legend=None))
    .transform_lookup(**node_position_lookup)
)
edges = (
    alt.Chart(edge_df)
    .mark_line(color='gray')
    .encode(x=x, y=y, detail='edge_id:N')  # `detail` gives one line per edge
    .transform_lookup(**node_position_lookup)
)
chart = (
    (edges+nodes)
    .properties(width=500, height=500,)
    .configure_view(strokeWidth=0)
)
chart

visualization-6

The nice thing about keeping things in terms of Altair's API is that it's trivial at this point to add additional encodings, layers, interactivity, tooltips, etc.

All this said, there are a couple of things that Altair doesn't quite appear to support yet, although I'd be happy to be corrected:

  • An easy way to make marks to represent directed edges (this can be done poorly with some effort by making a layer of arrowhead marks, rotated to match the angle of the edge)
  • An easy way to rotate text, which is the only thing I can't figure out how to replicate from this Vega example
  • An easy way to make edge line-marks other than straight lines, such as the Bezier curves or orthogonal splits in the above Vega example's 'links' options.
  • The OP's requests for 'drag', 'drop' and 'edit' capabilities -- I think of these more as data editing vs data visualization, so I wouldn't be surprised if the Altair team considers them out-of-scope.
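For the arrowhead workaround in the first bullet, the data preparation might look roughly like the sketch below. It uses only the standard library; `arrowhead_rows` and its `setback` parameter are names invented for this illustration. How the resulting `angle` value maps onto a mark's rotation (zero direction, clockwise vs. counter-clockwise) depends on the charting library and version, so treat that part as an assumption to verify.

```python
import math

def arrowhead_rows(edges, pos, setback=0.1):
    """Hypothetical helper: one row per directed edge saying where an arrowhead goes.

    edges:   iterable of (source, target) pairs
    pos:     {node: (x, y)} from any layout
    setback: fraction of the edge length to pull the arrowhead back from the target
    """
    rows = []
    for source, target in edges:
        (x1, y1), (x2, y2) = pos[source], pos[target]
        dx, dy = x2 - x1, y2 - y1
        rows.append({
            "source": source,
            "target": target,
            # sit slightly before the target so the node mark does not hide the arrowhead
            "x": x2 - setback * dx,
            "y": y2 - setback * dy,
            # direction of the edge, counter-clockwise from the +x axis, in degrees
            "angle": math.degrees(math.atan2(dy, dx)),
        })
    return rows

pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0)}
rows = arrowhead_rows([("a", "b"), ("b", "c")], pos)
```

The rows can then be loaded into a DataFrame and drawn as an extra layer of point marks on top of the edge and node layers.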

@BradKML

BradKML commented Dec 10, 2021

Apologies for asking, but the graph above is not quite a network graph, and I can't seem to find a fitting example. Alternative sources of ideas:

@BrennanBarker

@BrandonKMLee the examples you pointed to all incorporate the same visual elements as the plot above: some marks arranged arbitrarily in X and Y space, with lines between some of them, and optionally some additional encodings for more information, such as node color, or size.

Here's another example of applying the same technique as above, this time using the classic "Karate Club" social network dataset, calculating the nodes' X and Y positions using NetworkX's spring layout function, and doing the data transformation in pandas before passing it to Altair. I also include a classic measure of network centrality (calculated with NetworkX), encoded as node color.

import altair as alt
import networkx as nx
import pandas as pd

# networkx for example graph data, and layout/SNA calculations
g = nx.karate_club_graph()
positions = nx.spring_layout(g)
betweenness = nx.centrality.betweenness_centrality(g)

# munging into an Altair-friendly format in pandas
nodes_data = (
    pd.DataFrame
    .from_dict(positions, orient='index')
    .rename(columns={0:'x', 1:'y'})
    .assign(betweenness=lambda df:df.index.map(betweenness))
)
edges_data = (
    nx.to_pandas_edgelist(g)
    .assign(
        x=lambda df:df.source.map(nodes_data['x']),
        y=lambda df:df.source.map(nodes_data['y']),
        x2=lambda df:df.target.map(nodes_data['x']),
        y2=lambda df:df.target.map(nodes_data['y']),
    )
)

# Chart building in altair
nodes = (
    alt.Chart(nodes_data)
    .mark_circle(size=300, opacity=1)
    .encode(x=alt.X('x', axis=None), y=alt.Y('y', axis=None), color='betweenness')
)
edges = alt.Chart(edges_data).mark_rule().encode(x='x', y='y', x2='x2', y2='y2')
chart = (
    (edges + nodes)
    .properties(width=500,height=500)
    .configure_view(strokeWidth=0)
    .configure_axis(grid=False)
)

visualization-7

@BradKML

BradKML commented Dec 11, 2021

@BrennanBarker

some marks arranged arbitrarily in X and Y space

That is somewhat concerning, as I would like to play with weighted graphs, and KarateClub does not have edge weights.

@BrennanBarker

There's no cause for concern when it comes to visualizing edge weights in Altair. Simply link the weight data column to an encoding for the line marks. Below I visualize the classic (weighted) Les Miserables graph, encoding the edge weights as opacity, with one small change to the chart specification code from my last comment:

import altair as alt
import networkx as nx
import pandas as pd

# networkx for example graph data, and layout/SNA calculations
- g = nx.karate_club_graph()
+ g = nx.les_miserables_graph()
positions = nx.spring_layout(g)
betweenness = nx.centrality.betweenness_centrality(g)

# munging into an Altair-friendly format in pandas
nodes_data = (
    pd.DataFrame
    .from_dict(positions, orient='index')
    .rename(columns={0:'x', 1:'y'})
    .assign(betweenness=lambda df:df.index.map(betweenness))
)
edges_data = (
    nx.to_pandas_edgelist(g)
    .assign(
        x=lambda df:df.source.map(nodes_data['x']),
        y=lambda df:df.source.map(nodes_data['y']),
        x2=lambda df:df.target.map(nodes_data['x']),
        y2=lambda df:df.target.map(nodes_data['y']),
    )
)

# Chart building in altair
nodes = (
    alt.Chart(nodes_data)
    .mark_circle(size=300, opacity=1)
    .encode(x=alt.X('x', axis=None), y=alt.Y('y', axis=None), color='betweenness')
)
- edges = alt.Chart(edges_data).mark_rule().encode(x='x', y='y', x2='x2', y2='y2')
+ edges = alt.Chart(edges_data).mark_rule().encode(x='x', y='y', x2='x2', y2='y2', opacity='weight')
chart = (
    (edges + nodes)
    .properties(width=500,height=500)
    .configure_view(strokeWidth=0)
    .configure_axis(grid=False)
)

visualization-8

@BradKML

BradKML commented Dec 12, 2021

This looks like the Kamada-Kawai layout in https://towardsdatascience.com/visualizing-networks-in-python-d70f4cbeb259

But there is an aesthetic alternative like Fruchterman–Reingold https://www.researchgate.net/publication/221157852_Summarization_Meets_Visualization_on_Online_Social_Networks

Oddly enough, the "spring" layout in the code you provided is the latter. I don't know why, but that seems odd to me.
https://sci-hub.se/10.1109/TVCG.2019.2934802

@BrennanBarker

NetworkX has several graph layout choices, as do many other dedicated graph analysis libraries. I think Altair (really Vega-Lite) does the right thing by not trying to implement any graph layout algorithms, instead leaving it to users to calculate layout points themselves using whichever of the many possible tools and algorithms best suits their needs. The calculated layouts are then passed to Altair as X and Y encodings, along with any other data encodings needed to tell the data visualization story.
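As a rough, library-free illustration of that separation: the sketch below takes a `{node: (x, y)}` mapping from any layout tool and builds edge rows with the same `x`, `y`, `x2`, `y2` and `weight` columns used in the charts above. The `edge_rows` helper and the two toy layouts are made up for this example.

```python
def edge_rows(weighted_edges, pos):
    """Hypothetical helper: one row per edge, carrying both endpoints' coordinates."""
    rows = []
    for source, target, weight in weighted_edges:
        x, y = pos[source]
        x2, y2 = pos[target]
        rows.append({
            "source": source, "target": target, "weight": weight,
            "x": x, "y": y, "x2": x2, "y2": y2,
        })
    return rows

weighted_edges = [("a", "b", 3), ("b", "c", 1)]

# Two interchangeable layouts for the same graph; only the positions differ.
grid_pos = {"a": (0, 0), "b": (1, 0), "c": (1, 1)}
line_pos = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}

grid_rows = edge_rows(weighted_edges, grid_pos)
line_rows = edge_rows(weighted_edges, line_pos)
```

Swapping the layout changes only the coordinate columns; the graph data and the chart specification stay the same.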

@BradKML

BradKML commented Dec 15, 2021

@BrennanBarker Understandable, but is this based on NetworkX? If so, alternative layouts can be constructed using KarateClub's proxemic node embedding.

@BradKML

BradKML commented Dec 15, 2021

Currently, I am observing that

The graph is constructed like this:

import altair as alt
import networkx as nx
import numpy as np
import pandas as pd

# assuming we already have a DataFrame `df`
corr = df.corr()

# rescale a correlation value while keeping its sign
def scale(x):
    return np.sign(x) * np.abs(np.exp(np.tanh((x - 0.6) / 0.6)))

print(list(map(scale, [-1, -0.9, -0.6, -0.3, 0, 0.3, 0.6, 0.9, 1])))

# build a weighted graph from the scaled correlation matrix
# (nx.from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it)
g = nx.from_numpy_array(np.vectorize(scale)(corr.to_numpy()))

# relabel the nodes to match the item (column) names
g = nx.relabel_nodes(g, lambda x: df.columns[x])

@iridazzle

I found a code example of the Airport Connections chart in the altair documentation. Maybe this could help.

7 participants