## D3 graphs (adapting some ideas from the IPython cookbook)

What we're mostly doing in this notebook is extending some ideas sketched in an IPython cookbook recipe for using D3 graphs inline in a notebook.  It's pretty easy to adapt what's shown below to code that directly generates HTML pages that display the graphs, though that is not done here.  The example code assumes you have a GML version of Knuth's graph of *Anna Karenina*, available at http://www-rohan.sdsu.edu/~gawron/python_for_ss/course_core/assignments/anna.gml, saved locally as `anna.gml`.

In [1]:
import json
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
%matplotlib inline
from networkx.readwrite import json_graph



We'll make the json file we use for the `d3` display from a `networkx` graph.  The code in the next cell saves a minimal json version of the *Anna Karenina* graph.  For example, it does not let us include any node label information in the display.  Below we experiment with richer versions of the graph.

In [19]:
g.nodes[0]

{'club': 'Mr. Hi'}

In [56]:
import networkx as nx
import readwrite_gml
import json
from networkx.readwrite import json_graph

ak = readwrite_gml.read_gml('anna.gml',relabel=False)
# Prepare the graph to have the attributes d3 knows
for n in ak:
   # We delete the 'id' attribute because the json graph link fn 
   # wants to compute a unique id itself and breaks if there already is one.
   del ak.nodes[n]['attr_dict']['id']
   nm = ak.nodes[n]['attr_dict']['desc'].split(',')[0]
   # Use informative names as value of  node 'name'  attribute
   ak.nodes[n]['name'] = nm
   
# For now let's screen out this data, the edge weights
#for (s,e,att_dict) in ak.edges(data=True):
#  del att_dict['attr_dict']['value']

# Convert the graph to node-link form and save it as json for d3
data = json_graph.node_link_data(ak)
with open('ak_graph1.json', 'w') as f:
    json.dump(data, f, indent=4)
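For orientation, here is a hand-written miniature of the kind of dictionary that ends up in the json file. It is only a sketch: the three nodes and two links are made up for illustration, and the layout is inferred from what the javascript below reads (`graph.nodes`, `graph.links`, and the `source`/`target` fields of each link). The real output of `node_link_data` also carries a few graph-level keys.

```python
import json

# Hypothetical miniature of a node-link dictionary (made-up links,
# not the real Anna Karenina data).
data = {
    "nodes": [
        {"id": 0, "name": "Anna"},
        {"id": 1, "name": "Vronsky"},
        {"id": 2, "name": "Karenin"},
    ],
    "links": [
        {"source": 0, "target": 1},
        {"source": 0, "target": 2},
    ],
}

# Round-trip through json text, as json.dump and d3.json would.
graph = json.loads(json.dumps(data, indent=4))

# Sanity check: every link endpoint should name an existing node.
node_ids = {node["id"] for node in graph["nodes"]}
dangling = [link for link in graph["links"]
            if link["source"] not in node_ids or link["target"] not in node_ids]

print(sorted(graph))  # the top-level keys
print(dangling)       # links that point at missing nodes
```

A check like this can be handy when the javascript draws nodes but no edges. It may also be worth confirming which key your `networkx` version uses for the edge list, since recent releases appear to be moving from `links` toward `edges`.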

An alternative graph to draw.  Just replace the reference to 'ak_graph1.json' in the main body of the javascript code with 'kn_graph1.json'.

In [63]:
g = nx.karate_club_graph()
for n in g:
    g.nodes[n]['name'] = n
data = json_graph.node_link_data(g)
with open('kn_graph1.json', 'w') as f:
    json.dump(data, f, indent=4)

What we're going to do now is display the *Anna Karenina* graph using [the D3 javascript
library](http://d3js.org/), which is a very innovative set of visualization tools that will work in
most any browser.  In particular, we're going to display the graph in the web page you're looking at, which happens to be an IPython notebook.

We first create an HTML notebook cell using the IPython magic command `%%html`. We'll call this **the HTML cell** below. HTML cells are a powerful feature of IPython notebooks. You can insert HTML code into an HTML cell, and when the cell is executed, the HTML code in the cell is passed directly to your browser to be displayed. The HTML cell below defines an HTML page division (a `div`) with the id `d3-example`, in which our graph will be displayed, but not until after we execute the cell below the HTML cell, which contains javascript code specifying what to draw in the `d3-example` division.  We'll call that the JAVASCRIPT cell. So execute the HTML cell and then the JAVASCRIPT cell, then scroll back to look at the *Anna Karenina* graph displayed in the HTML cell.
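Outside a notebook, the same two pieces (the `div` with its style rules, and the script) would sit together in one HTML page. The following is only a rough sketch of that idea, not something this notebook does: the `PAGE` template and the `build_page` helper are hypothetical names, the graph is inlined as a javascript variable rather than loaded with `d3.json`, and the force-layout code itself is left as a placeholder comment.

```python
import json
from string import Template

# Hypothetical page template; the style rules and the div id mirror
# the HTML cell used in this notebook.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
.node {stroke: #fff; stroke-width: 1.5px;}
.link {stroke: #999; stroke-opacity: .6;}
</style>
</head>
<body>
<div id="$div_id"></div>
<script src="http://d3js.org/d3.v3.min.js"></script>
<script>
// Assumption: the data is inlined here instead of fetched with d3.json.
var graph = $graph_json;
// ... the force-layout code from the javascript cell would go here ...
</script>
</body>
</html>
""")

def build_page(graph, div_id="d3-example"):
    """Fill the template with a div id and the graph serialized as json."""
    return PAGE.substitute(div_id=div_id, graph_json=json.dumps(graph))

# Made-up two-node graph, just to exercise the helper.
html = build_page({"nodes": [{"name": "a"}, {"name": "b"}],
                   "links": [{"source": 0, "target": 1}]})
print(html[:15])
```

Writing `html` to a file would give a page that can be opened in a browser once the placeholder is filled in with the layout code.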

In [66]:
%%html
<div id="d3-example"></div>
<style>
.node {stroke: #fff; stroke-width: 1.5px;}
.link {stroke: #999; stroke-opacity: .6;}
</style>

In [67]:
%%javascript
// We load the d3.js library from the Web.
require.config({paths: {d3: "http://d3js.org/d3.v3.min"}});
require(["d3"], function(d3) {
    // The code in this block is executed when the 
    // d3.js library has been loaded.
    
    // First, we specify the size of the canvas containing
    // the visualization (size of the <div> element).
    var width = 500,
        height = 525;

    // We create a color scale.
    var color = d3.scale.category10();

    // We create a force-directed dynamic graph layout.
    var force = d3.layout.force()
        .charge(-120)
        .linkDistance(50)
        .size([width, height]);

    // In the <div> element, we create a <svg> graphic
    // that will contain our interactive visualization.
    var svg = d3.select("#d3-example").select("svg")
    if (svg.empty()) {
        svg = d3.select("#d3-example").append("svg")
                    .attr("width", width)
                    .attr("height", height);
    }
        
    // We load the JSON file.
    d3.json("ak_graph1.json", function(error, graph) {
        // In this block, the file has been loaded
        // and the 'graph' object contains our graph.
        
        // We load the nodes and links in the force-directed
        // graph.
        force.nodes(graph.nodes)
            .links(graph.links)
            .start();

        // We create a <line> SVG element for each link
        // in the graph.
        var link = svg.selectAll(".link")
            .data(graph.links)
            .enter().append("line")
            .attr("class", "link")
            .style("stroke-width", function(d) { return Math.sqrt( d.value); });

        // We create a <circle> SVG element for each node
        // in the graph, and we specify a few attributes.
        var node = svg.selectAll(".node")
            .data(graph.nodes)
            .enter().append("circle")
            .attr("class", "node")
            .attr("r", 5)  // radius
            .style("fill", function(d) {
                // The node color depends on the 'group' attribute.
                // Nodes without one all get the same color.
                return color(d.group); 
            })
            .call(force.drag);

        // The tooltip of each node shows its 'name' attribute.
        node.append("title")
            .text(function(d) { return d.name; });

        // We bind the positions of the SVG elements
        // to the positions of the dynamic force-directed graph,
        // at each time step.
        force.on("tick", function() {
            link.attr("x1", function(d) { return d.source.x; })
                .attr("y1", function(d) { return d.source.y; })
                .attr("x2", function(d) { return d.target.x; })
                .attr("y2", function(d) { return d.target.y; });

            node.attr("cx", function(d) { return d.x; })
                .attr("cy", function(d) { return d.y; });
        });
    });
});


## Community discovery

We now attempt to partition the characters of the novel into communities, using Newman's greedy modularity maximization algorithm, which has a Python implementation in the `networkx` module `community.modularity_max`.  Below, we'll use the `group` attribute to represent the community of a character in the D3 graph.  If you scroll through the javascript code you'll see that the D3 coloring of a node is already defined via its `group` attribute.  The D3 color function implements conventions for assigning colors to integers, so that's what we'll pass as values of the `group` attribute. A default color is assigned when a node has no `group`, which is why the javascript code didn't break when we passed it graphs with no `group` attribute.  The `greedy_modularity_communities` function returns a list of node groupings (a list of frozensets, one per community).  We turn that into a dictionary whose keys are nodes and whose values are their communities, represented as integers, so it's quite easy to link each community to a D3 color.
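The step that turns the list of groupings into a dictionary can be tried on its own. This is a small sketch with made-up communities standing in for the output of `greedy_modularity_communities`; the node ids are hypothetical.

```python
# Hypothetical communities (made-up node ids), in the same shape the
# community function returns: one frozenset of nodes per community.
partitions = [frozenset({0, 1, 4}), frozenset({2, 3}), frozenset({5})]

# Map each node to the index of the community that contains it.
partition_dict = {n: i for i, p in enumerate(partitions) for n in p}

print(sorted(partition_dict.items()))
# [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0), (5, 2)]
```

Each node appears in exactly one community, so the dictionary ends up with one entry per node.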

In [69]:
#import community
from networkx.algorithms.community import modularity_max
import readwrite_gml
ak = readwrite_gml.read_gml('anna.gml',relabel=False)
partitions = modularity_max.greedy_modularity_communities(ak.to_undirected())
partition_dict = dict((n,i) for (i,p) in enumerate(partitions) for n in p)

# Prepare the graph to have the attributes d3 knows
for n in ak:
   # We delete the 'id' attribute because the json graph link fn 
   # wants to compute a unique id itself and breaks if there already is one.
   del ak.nodes[n]['attr_dict']['id']
   nm = ak.nodes[n]['attr_dict']['desc'].split(',')[0]
   ak.nodes[n]['name'] = nm
   ak.nodes[n]['group'] = partition_dict[n]

# For now let's screen out this data, the edge weights
for (s,e,att_dict) in ak.edges(data=True):
   del att_dict['attr_dict']['value']

data = json_graph.node_link_data(ak)
with open('ak_graph2.json', 'w') as f:
    json.dump(data, f, indent=4)

Finally we'll try keeping the edge weights and inspecting the results in the final graph.  If you look at the javascript, you'll see that a link has a style which in turn has a stroke-width that is a function of the `value` attribute of the link.
Up until now, we've been omitting that information from the `ak` graph.  Now we'll include it.  Check out the thickness of the edges linking major characters like Anna, Vronsky, Karenin, and Stiva.  Think of the links as the springs that are anchoring a character to their position in the network, and the thickness of the link as the strength of that spring.  Now check out the strength of the links holding `Levin` in place.  For those who know the novel, contemplate that.
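To get a feel for how a weight turns into a line thickness, here is the same square-root mapping the javascript applies (`Math.sqrt(d.value)`), written in Python. The weights are made up for illustration and are not taken from the real graph.

```python
import math

# Hypothetical edge weights (the 'value' attribute of a link).
weights = [1, 4, 16, 25]

# Same mapping as the javascript: stroke-width = Math.sqrt(d.value).
widths = [math.sqrt(w) for w in weights]

print(widths)  # [1.0, 2.0, 4.0, 5.0]
```

The square root compresses the range: an edge 25 times heavier is drawn only 5 times thicker, which keeps the heaviest links from swamping the picture.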

In [70]:

#import community
from networkx.algorithms.community import modularity_max
import readwrite_gml
ak = readwrite_gml.read_gml('anna.gml',relabel=False)
partitions = modularity_max.greedy_modularity_communities(ak.to_undirected())
partition_dict = dict((n,i) for (i,p) in enumerate(partitions) for n in p)
#partition = community.best_partition(ak.to_undirected())
# Prepare the graph to have the attributes d3 knows
for n in ak:
   # Unlike the earlier cells, we leave the 'id' attribute in place here
   # (the deletion below is commented out).
   #del ak.nodes[n]['attr_dict']['id']
   nm = ak.nodes[n]['attr_dict']['desc'].split(',')[0]
   #dd = dict()
   #ak.nodes[n]['attr_dict'] = dd
   ak.nodes[n]['name'] = nm
   ak.nodes[n]['group'] = partition_dict[n]

# This time we keep the edge weights, so the loop that removed
# them stays commented out.
#for (s,e,att_dict) in ak.edges(data=True):
#   del att_dict['value']

data = json_graph.node_link_data(ak)
with open('ak_graph3.json', 'w') as f:
    json.dump(data, f, indent=4)

## Using json file generated from Drew Conway's code 

In this example, the json data file was generated by the code above, so it can be loaded directly by the javascript.  Because the file includes community and name information for each node, the name associated with a node appears when you hover the mouse over it, and the colors reflect community assignments.  In addition, edge thickness reflects edge weights.

In [92]:
%%html
<div id="d3-example5"></div>
<style>
.node {stroke: #fff; stroke-width: 1.5px;}
.link {stroke: #999; stroke-opacity: .6;}
</style>

The graph we read in has communities encoded in a `group` attribute.  We modify the javascript code to look there for color.
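How `color(d.group)` picks a color can be imitated in Python. This is a sketch of the behavior as we understand it for D3's ordinal scales, not D3's own code: each new value gets the next palette color in order of first appearance, and the palette repeats after ten values. The `make_color_scale` helper is a hypothetical name; the hex codes are the `category10` palette.

```python
# The ten colors of D3's category10 palette.
CATEGORY10 = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
              "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"]

def make_color_scale(palette):
    """Return a function that gives each new value the next palette color."""
    seen = {}
    def color(value):
        if value not in seen:
            seen[value] = len(seen)  # position in order of first appearance
        return palette[seen[value] % len(palette)]
    return color

color = make_color_scale(CATEGORY10)

# Made-up group values; None stands in for a node with no 'group'.
groups = [0, 1, 0, 2, None, None]
colors = [color(g) for g in groups]
print(colors)
```

Two things follow from this sketch. The color depends on the order in which groups are first seen rather than on the integer itself, and nodes with no `group` all share a single color, which matches what we saw with the earlier graphs.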

In [93]:
%%javascript
// We load the d3.js library from the Web.
require.config({paths: {d3: "http://d3js.org/d3.v3.min"}});
require(["d3"], function(d3) {
    // The code in this block is executed when the 
    // d3.js library has been loaded.
    
    // First, we specify the size of the canvas containing
    // the visualization (size of the <div> element).
    var width = 800,
        height = 500;

    // We create a color scale.
    var color = d3.scale.category10();

    // We create a force-directed dynamic graph layout.
    var force = d3.layout.force()
        .charge(-170)
        .linkDistance(20)
        .size([width, height]);

    // In the <div> element, we create a <svg> graphic
    // that will contain our interactive visualization.
    var svg = d3.select("#d3-example5").select("svg")
    if (svg.empty()) {
        svg = d3.select("#d3-example5").append("svg")
                    .attr("width", width)
                    .attr("height", height);
    }
   
// Load the version of the graph that keeps the edge weights.
d3.json('ak_graph3.json', function(error, graph) {
  force
      .nodes(graph.nodes)
      .links(graph.links)
      .start();

  var link = svg.selectAll(".link")
      .data(graph.links)
    .enter().append("line")
      .attr("class", "link")
      .style("stroke-width", function(d) { return Math.sqrt( d.value); });

  var node = svg.selectAll(".node")
      .data(graph.nodes)
    .enter().append("circle")
      .attr("class", "node")
      .attr("r", 5)
      .style("fill", function(d) { return color(d.group); })
      .call(force.drag);

  node.append("title")
      .text(function(d) { return d.name; });

  force.on("tick", function() {
    link.attr("x1", function(d) { return d.source.x; })
        .attr("y1", function(d) { return d.source.y; })
        .attr("x2", function(d) { return d.target.x; })
        .attr("y2", function(d) { return d.target.y; });

    node.attr("cx", function(d) { return d.x; })
        .attr("cy", function(d) { return d.y; });
  });
});

});
