This notebook will be collected automatically at **6pm on Monday** from the `/home/data_scientist/assignments/Week13` directory on the course JupyterHub server. If you work on this assignment on the course JupyterHub server, just make sure that you save your work; instructors will pull your notebooks automatically after the deadline. If you work on this assignment locally, you must still submit it via JupyterHub: place the notebook file in the correct directory with the correct file name before the deadline.

1. Make sure everything runs as expected. First, restart the kernel (in the menubar, select `Kernel` → `Restart`) and then run all cells (in the menubar, select `Cell` → `Run All`).
2. Make sure you fill in any place that says `YOUR CODE HERE`. Do not write your answer anywhere other than where it says `YOUR CODE HERE`. Anything you write elsewhere will be removed by the autograder.
3. Do not change the file path or the file name of this notebook.
4. Make sure that you save your work (in the menubar, select `File` → `Save and Checkpoint`).

## Problem 13.3. Neo4J

In this problem, we persist a NetworkX graph in Neo4J.

In [None]:
import networkx as nx
from py2neo import authenticate, Graph, Node, Relationship
from py2neo.database import cypher

from nose.tools import assert_equal, assert_true, assert_is_instance

We use the [Zachary's Karate Club](https://en.wikipedia.org/wiki/Zachary%27s_karate_club) data set. For more information, see [Week 10 Problem 2](https://github.com/UI-DataScience/info490-sp16/blob/master/Week10/assignments/w10p2.ipynb).

In [None]:
karate_club = nx.karate_club_graph()

In the following code cell, we read the current user's netid and use it as a unique database name for this notebook.

In [None]:
# Filename containing user's netid
fname = '/home/data_scientist/users.txt'
with open(fname, 'r') as fin:
    netid = fin.readline().rstrip()

# Use the netid as the database name so that each user gets a separate database.
dbname = '{0}'.format(netid)

host_ip = '65.52.38.138:7474'
username = 'neo4j'
password = 'Lcdm#info490'

# First we authenticate
authenticate(host_port=host_ip, user=username, password=password)

# Now create database URL
db_url = 'http://{0}/db/{1}'.format(host_ip, dbname)

print('Creating connection to {0}'.format(db_url))
graph = Graph(db_url)

version = graph.dbms.kernel_version
print('Neo4J Kernel version {0}.{1}.{2}'.format(version[0], version[1], version[2]))

The following code cell removes all existing nodes and relationships from the graph database.

In [None]:
# Clean out graph database
graph.delete_all()

## Persisting Graphs

- Recreate Zachary's Karate club graph in Neo4J.
- Assign the label `"members"` to all nodes.
- Create a relationship of type `"friend of"` for every edge in `karate_club`.
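
The grading cells further down read each node's `name` property as a string that converts back to an integer, and they compare each relationship's start and end node against the first and second element of the matching NetworkX edge. The sketch below shows only that data-shaping step, assuming integer node ids as in `karate_club`. It does not touch Neo4J. The helper `to_neo4j_records` and the three-node sample graph are hypothetical names introduced for illustration, not part of the assignment.

```python
def to_neo4j_records(nodes, edges):
    """Convert integer node ids and edge tuples into string-named records."""
    # One name per node: the node id rendered as a string.
    node_names = [str(n) for n in nodes]
    # One (start, end) pair per edge, keeping the order of each edge tuple.
    edge_pairs = [(str(u), str(v)) for u, v in edges]
    return node_names, edge_pairs


# Hypothetical three-node graph standing in for karate_club.
sample_nodes = [0, 1, 2]
sample_edges = [(0, 1), (0, 2), (1, 2)]

names, pairs = to_neo4j_records(sample_nodes, sample_edges)
print(names)  # ['0', '1', '2']
print(pairs)  # [('0', '1'), ('0', '2'), ('1', '2')]
```

With a NetworkX graph, the same helper could be given `nx_graph.nodes()` and `nx_graph.edges()`. Creating the labeled nodes and the `"friend of"` relationships from these records is left to `persist_graph`.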

In [None]:
def persist_graph(neo_graph, nx_graph):
    '''
    Persists a NetworkX graph in Neo4J.
    All nodes are labeled "members".
    All edges have connection type "friend of".
    
    Parameters
    ----------
    neo_graph: A py2neo.database.Graph instance.
    nx_graph: A networkx.Graph instance.
    '''
    
    # YOUR CODE HERE
    
    return None

In [None]:
persist_graph(graph, karate_club)

In [None]:
assert_true(all(isinstance(n['name'], str) for n in graph.find('members')))
node_names = [int(n['name']) for n in graph.find('members')]
assert_equal(len(node_names), len(karate_club.nodes()))
assert_equal(set(node_names), set(karate_club.nodes()))

In [None]:
edges = [e for e in graph.match(rel_type='friend of')]
start_nodes = [int(e.start_node()['name']) for e in edges]
end_nodes = [int(e.end_node()['name']) for e in edges]

assert_equal(len(edges), len(karate_club.edges()))
assert_equal(set(start_nodes), {e[0] for e in karate_club.edges()})
assert_equal(set(end_nodes), {e[1] for e in karate_club.edges()})

## Cleanup

In [None]:
# Clean out graph database
graph.delete_all()