# Label Propagation
_Label Propagation_ (LPA) is a fast algorithm for finding communities in a graph.
It detects these communities using network structure alone as its guide and doesn't require a pre-defined objective function or prior information about the communities.

One interesting feature of LPA is that we can give nodes preliminary labels to narrow down the range of solutions generated.
This means that it can be used as a semi-supervised way of finding communities, where we hand-pick some initial communities.

First we'll import the Neo4j driver and pandas, along with `os` for reading connection settings from the environment:


In [1]:
from neo4j import GraphDatabase
import pandas as pd
import os

Next let's create an instance of the Neo4j driver which we'll use to execute our queries.


In [2]:
host = os.environ.get("NEO4J_HOST", "bolt://localhost") 
user = os.environ.get("NEO4J_USER", "neo4j")
password = os.environ.get("NEO4J_PASSWORD", "neo")
driver = GraphDatabase.driver(host, auth=(user, password))

Now let's create a sample graph that we'll run the algorithm against.


In [3]:
create_graph_query = '''
MERGE (nAlice:User {id:'Alice'}) SET nAlice.predefined_label=52
MERGE (nBridget:User {id:'Bridget'}) SET nBridget.predefined_label=21
MERGE (nCharles:User {id:'Charles'}) SET nCharles.predefined_label=43
MERGE (nDoug:User {id:'Doug'}) SET nDoug.predefined_label=21
MERGE (nMark:User {id:'Mark'}) SET nMark.predefined_label=19
MERGE (nMichael:User {id:'Michael'}) SET nMichael.predefined_label=52

MERGE (nAlice)-[:FOLLOW]->(nBridget)
MERGE (nAlice)-[:FOLLOW]->(nCharles)
MERGE (nMark)-[:FOLLOW]->(nDoug)
MERGE (nBridget)-[:FOLLOW]->(nMichael)
MERGE (nDoug)-[:FOLLOW]->(nMark)
MERGE (nMichael)-[:FOLLOW]->(nAlice)
MERGE (nAlice)-[:FOLLOW]->(nMichael)
MERGE (nBridget)-[:FOLLOW]->(nAlice)
MERGE (nMichael)-[:FOLLOW]->(nBridget)
MERGE (nCharles)-[:FOLLOW]->(nDoug);
'''

with driver.session() as session:
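    # Consume the result inside the transaction function: newer driver
    # versions may refuse to read a result once its transaction has closed.
    summary = session.write_transaction(lambda tx: tx.run(create_graph_query).consume())
    print("Stats: " + str(summary.metadata.get("stats", {})))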

Stats: {'labels-added': 6, 'relationships-created': 10, 'nodes-created': 6, 'properties-set': 12}


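Now we can run the algorithm by executing the following query, which projects the `User` nodes and `FOLLOW` relationships and streams back a community ID for each node after at most 10 iterations: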


In [4]:
streaming_query = """
CALL gds.labelPropagation.stream({
  nodeProjection:"User", 
  relationshipProjection:"FOLLOW",
  maxIterations: 10}) 
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).id as user, communityId
"""

with driver.session() as session:
    result = session.run(streaming_query)     
    df = pd.DataFrame([dict(r) for r in result])

df

| | user | communityId |
|---|---|---|
| 0 | Alice | 7 |
| 1 | Bridget | 7 |
| 2 | Charles | 10 |
| 3 | Doug | 10 |
| 4 | Mark | 10 |
| 5 | Michael | 7 |


Our algorithm found two communities with 3 members each.
Michael, Bridget, and Alice clearly belong together because they all follow one another, as do Doug and Mark.
Charles is the only one who doesn't strongly fit either side: Alice follows him and he follows Doug, so he has one tie to each group. He ends up with Doug and Mark.
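
To make the propagation step concrete, here is a minimal pure-Python sketch of label propagation on the same six users. It is not the GDS implementation: it ignores relationship direction, counts a mutual follow as two votes, visits nodes in alphabetical order, and breaks ties by keeping the current label or else taking the smallest one. The names `FOLLOWS`, `PREDEFINED_LABELS`, and `propagate_labels` are made up for this illustration.

```python
from collections import Counter, defaultdict

# The FOLLOW relationships from the sample graph, as (source, target) pairs.
FOLLOWS = [
    ("Alice", "Bridget"), ("Alice", "Charles"), ("Mark", "Doug"),
    ("Bridget", "Michael"), ("Doug", "Mark"), ("Michael", "Alice"),
    ("Alice", "Michael"), ("Bridget", "Alice"), ("Michael", "Bridget"),
    ("Charles", "Doug"),
]

# The predefined_label values set on the sample nodes.
PREDEFINED_LABELS = {
    "Alice": 52, "Bridget": 21, "Charles": 43,
    "Doug": 21, "Mark": 19, "Michael": 52,
}


def propagate_labels(edges, seeds=None, max_iterations=10):
    """Toy label propagation (an illustrative assumption, not GDS's algorithm)."""
    # Treat every relationship as undirected; a mutual follow adds two votes.
    neighbours = defaultdict(list)
    for source, target in edges:
        neighbours[source].append(target)
        neighbours[target].append(source)

    nodes = sorted(neighbours)
    # Every node starts with its own label unless a seed label is supplied.
    labels = {node: index for index, node in enumerate(nodes)}
    if seeds:
        labels.update(seeds)

    for _ in range(max_iterations):
        changed = False
        for node in nodes:
            votes = Counter(labels[n] for n in neighbours[node])
            top = max(votes.values())
            candidates = [label for label, count in votes.items() if count == top]
            # Keep the current label on a tie; otherwise take the smallest winner.
            if labels[node] in candidates:
                continue
            labels[node] = min(candidates)
            changed = True
        if not changed:
            break  # No node changed its label, so the labelling is stable.
    return labels


print(propagate_labels(FOLLOWS, seeds=PREDEFINED_LABELS))
```

Under these assumptions, starting from the `predefined_label` values groups Alice, Bridget, and Michael together and puts Charles with Doug and Mark, while starting from unique labels leaves Charles with Alice instead. Charles receives one vote from each side, so his final community comes down to the tie-break, which is why he is the borderline case.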

We can also call a version of the algorithm that stores the result as a property on each node.
This is useful if we want to run later queries that use the result.

In [5]:
write_query = """
CALL gds.labelPropagation.write({
  nodeProjection:'User', 
  relationshipProjection:'FOLLOW',
  maxIterations:10,
  writeProperty:'partition'})
YIELD createMillis, computeMillis, writeMillis;
"""

with driver.session() as session:
    session.write_transaction(lambda tx: tx.run(write_query))
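
The sample graph stores a `predefined_label` on every user, but the queries above never read it. As a hedged sketch of the semi-supervised mode mentioned in the introduction, the query below passes that property as a seed. It assumes the GDS 1.x anonymous-projection syntax used in this notebook, where `nodeProperties` loads the property into the projection and `seedProperty` tells the algorithm to start from it; check both option names against the documentation for your installed GDS version. `seeded_query` and `stream_seeded_communities` are names invented here.

```python
# Assumption: GDS 1.x anonymous projection, as in the other queries in this notebook.
seeded_query = """
CALL gds.labelPropagation.stream({
  nodeProjection: 'User',
  relationshipProjection: 'FOLLOW',
  nodeProperties: 'predefined_label',
  seedProperty: 'predefined_label',
  maxIterations: 10})
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).id AS user, communityId
"""


def stream_seeded_communities(driver):
    """Run the seeded query and return one dict per user (hypothetical helper)."""
    with driver.session() as session:
        return [dict(record) for record in session.run(seeded_query)]
```

With the `driver` created earlier, `pd.DataFrame(stream_seeded_communities(driver))` would give a table in the same shape as the unseeded run, with community IDs drawn from the seed values.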

## Graph Visualisation

Sometimes a picture can tell more than a table of results and this is often the case with graph algorithms. 
Let's see how to create a graph visualization using neovis.js.

First we'll create a div into which we will generate the visualisation.

In [57]:
%%html
<style type="text/css">                
.output_wrapper, .output {
    height:auto !important;
    max-height:600px;
}
.output_scroll {
    box-shadow:none !important;
    -webkit-box-shadow:none !important;
}

#viz {
    width: 300px;
    height: 350px;
    font: 22pt arial;
}
</style>  
<div id="viz"></div>

Next we need to define the query that the visualization will be generated from, along with config 
that describes which properties will be used for node size, node colour, and relationship width. 

We'll then define a JavaScript variable that contains all our parameters.

In [58]:
from IPython.display import Javascript
import json
from scripts.algo import viz_config, render_image

config = viz_config("Label Propagation")
query = config["query"]
labels_json = config["labels_json"]
relationships_json = config["relationships_json"]

json_graph = {
    "query": query,
    "labels": labels_json,
    "relationships": relationships_json,
    "host": host,
    "user": user,
    "password": password
}

Javascript("""window.jsonGraph={};""".format(json.dumps(json_graph)))

Now we're ready to call neovis.js and generate our graph visualisation. 
The following code will render an interactive graph in the div defined above.
It will also extract an image representation of the graph and display that in the cell below.

In [59]:
%%javascript
var output_area = this;
requirejs(['neovis.js'], function(NeoVis){    
    var config = {
      container_id: "viz",
      server_url: window.jsonGraph.host,
      server_user: window.jsonGraph.user,
      server_password: window.jsonGraph.password,
      labels: window.jsonGraph.labels,
      relationships: window.jsonGraph.relationships,
      initial_cypher: window.jsonGraph.query
    };
        
    let viz = new NeoVis.default(config);
    viz.render();
    
    viz.onVisualizationRendered(function(ctx) {
      let imageSrc = ctx.canvas.toDataURL();
      let kernel = IPython.notebook.kernel;
      let command = "image_src = '" + imageSrc + "'";
      kernel.execute(command);
      
      var cell_element = output_area.element.parents('.cell');
      var cell_idx = Jupyter.notebook.get_cell_elements().index(cell_element);
      var cell = Jupyter.notebook.get_cell(cell_idx+1);
      cell.set_text("render_image(image_src)")
      cell.execute();
    });
});

In [62]:
render_image(image_src)