### Details

This document contains the details of Task 2 for ICS2205. The task is marked out of 100% but is equivalent to 10% of the total mark for this unit. <br> 
While discussions between individual students are considered healthy, the final deliverable must be your own work and **not plagiarised** in any way. The **deadline** to submit this task is **12:00pm Monday 29th November 2021**.<br>
Compile your answer to the task described below in this same notebook. Then upload it, together with a duly filled plagiarism form, to the appropriate space on the VLE. Deliverables submitted late will be penalised or may not be accepted.

### Interfacing NetworkX with Neo4j

Neo4j is an important graph platform and is more than persistent storage for graph data. It provides graph algorithms that are scalable and production-ready. In this task you will need to combine Neo4j with NetworkX. To do this you need to use the **nxneo4j** Python library.


Install the latest version of nxneo4j as follows:

In [1]:
! pip install git+https://github.com/ybaktir/networkx-neo4j

Collecting git+https://github.com/ybaktir/networkx-neo4j
  Cloning https://github.com/ybaktir/networkx-neo4j to c:\users\natha\appdata\local\temp\pip-req-build-olc3n2fs
Collecting neo4j-driver
  Downloading neo4j-driver-4.4.0.tar.gz (89 kB)
Building wheels for collected packages: networkx-neo4j, neo4j-driver
  Building wheel for networkx-neo4j (setup.py): started
  Building wheel for networkx-neo4j (setup.py): finished with status 'done'
  Created wheel for networkx-neo4j: filename=networkx_neo4j-0.0.3-py3-none-any.whl size=13779 sha256=c953ab8c120dfe9ce860e26f4b9af5cac0eca2fc45d87ead8b07c3e8ab698a1b
  Stored in directory: C:\Users\natha\AppData\Local\Temp\pip-ephem-wheel-cache-504i9_y2\wheels\cc\60\dd\5b2ed5bf4ba4f10077be9331dc8273624c116a5e506feaa70e
  Building wheel for neo4j-driver (setup.py): started
  Building wheel for neo4j-driver (setup.py): finished with status 'done'
  Created wheel for neo4j-driver: filename=neo4j_driver-4.4.0-py3-none-any.whl size=114955 sha256=b13b976861f

#### Connect to Neo4j

In [1]:
from neo4j import GraphDatabase, basic_auth

For this task you will be using a [Neo4j blank sandbox](https://neo4j.com/sandbox/). Once the instance has started check the connection details tab to find the **Bolt URL** and the **password**. By default the user name is **neo4j**. Update the code below with the details to connect to Neo4j sandbox.

In [2]:
graph = GraphDatabase.driver("bolt://3.227.247.8:7687", auth=basic_auth("neo4j","vices-shop-functions"))
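
As an alternative to typing the sandbox details into the cell, the connection details can be read from environment variables so that the password does not end up in the submitted notebook. The following is only a sketch: the variable names `NEO4J_BOLT_URL`, `NEO4J_USER` and `NEO4J_PASSWORD` and the helper names `sandbox_settings` and `connect` are hypothetical and are not part of nxneo4j or the Neo4j driver.

```python
import os


def sandbox_settings(env=None):
    """Collect the Bolt URL, user name and password from a mapping.

    The variable names are assumptions made for this sketch.
    The user name falls back to the sandbox default, "neo4j".
    """
    env = os.environ if env is None else env
    return {
        "uri": env["NEO4J_BOLT_URL"],
        "user": env.get("NEO4J_USER", "neo4j"),
        "password": env["NEO4J_PASSWORD"],
    }


def connect(settings):
    """Create a driver from the settings returned by sandbox_settings()."""
    # Imported here so that sandbox_settings() works without the driver installed.
    from neo4j import GraphDatabase, basic_auth

    return GraphDatabase.driver(
        settings["uri"],
        auth=basic_auth(settings["user"], settings["password"]),
    )


# Usage (needs a running sandbox and the variables set beforehand):
# graph = connect(sandbox_settings())
```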

Access the Neo4j sandbox and inspect the database by opening it with the browser.

In [3]:
import nxneo4j as nx #using nxneo4j

In [4]:
G = nx.Graph(graph) #create the empty graph

#### Analyse the Game of Thrones dataset

nxneo4j contains a number of built-in datasets. One of these datasets is built around the popular TV series Game of Thrones. It is based on the dataset created by [Andrew Beveridge](https://networkofthrones.wordpress.com/) and contains the interactions between the characters. The nodes are labelled "Character" while the relationships include "INTERACTS1", "INTERACTS2", "INTERACTS3" and "INTERACTS45", which represent the interactions between the characters across the various books (1 to 5).

Load the dataset

In [5]:
G.load_got()

Draw the graph using nxneo4j

In [6]:
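G.identifier_property = 'name'  # node property used to identify each node
G.relationship_type = '*'       # '*' as in the nxneo4j tutorial, to include all relationship types
G.node_label = 'Character'      # only work with Character nodes
nx.draw(G)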

Find how many nodes the graph contains

In [7]:
numberOfNodes = len(G)
print("The number of nodes that the graph contains is: ", numberOfNodes)

The number of nodes that the graph contains is:  796


Compute PageRank, sort the results and print out the first 5 results

In [8]:
got_pr = nx.pagerank(G)
pr_sorted = sorted(got_pr.items(), key=lambda x: x[1], reverse=True)
for character, score in pr_sorted[:5]:
    print(character, score)

Jon-Snow 17.596878939369677
Tyrion-Lannister 17.56810234895046
Jaime-Lannister 13.92547230647357
Cersei-Lannister 13.402357010904186
Daenerys-Targaryen 12.499194895817286
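
The "sort, then take the first 5" step can also be written with `heapq.nlargest`, which avoids sorting the whole dictionary when only a few entries are needed. This is a small sketch, not the approach the task requires: the helper name `top_k` is hypothetical, and the sample dictionary simply reuses the five scores printed above rather than calling Neo4j.

```python
import heapq


def top_k(scores, k=5):
    """Return the k (name, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Sample values copied from the PageRank output above.
sample_pr = {
    "Daenerys-Targaryen": 12.499194895817286,
    "Jon-Snow": 17.596878939369677,
    "Cersei-Lannister": 13.402357010904186,
    "Tyrion-Lannister": 17.56810234895046,
    "Jaime-Lannister": 13.92547230647357,
}

for character, score in top_k(sample_pr, k=3):
    print(character, score)
```

The same helper would work unchanged on `got_pr` or `got_bc`, since both are dictionaries mapping character names to scores.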


Compute Betweenness Centrality, sort the results and print out the first 5 results

In [9]:
got_bc = nx.betweenness_centrality(G)
bc_sorted = sorted(got_bc.items(), key=lambda x: x[1], reverse=True)
for character, score in bc_sorted[:5]:
    print(character, score)

Jon-Snow 65395.267871654345
Tyrion-Lannister 50202.17398521848
Daenerys-Targaryen 39636.77718662114
Stannis-Baratheon 35984.21182863314
Theon-Greyjoy 35436.85268519103


Now switch to the Neo4j sandbox and access the database through the browser. Query the database directly using Cypher to find out the following:

1. Count the number of edges.
2. Display the graph based on the relationships of the character with the highest PageRank (from above).
3. Degree centrality is simply the number of connections that a node has in the network. In this context the degree centrality of a character is the number of other characters that interacted with it. Compute the degree centrality by considering **only** the **INTERACTS2** relation.
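
To make the definition of degree centrality in item 3 concrete, here is a minimal pure-Python sketch that counts, for each character, the number of distinct other characters it is connected to. The edge list is a hypothetical toy example, not data taken from the dataset, and `degree_centrality` is a helper written for this illustration. Note that it returns raw counts, as in the definition above, whereas NetworkX's own `degree_centrality` normalises by the number of other nodes.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct neighbours per node, treating edges as undirected."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a self-loop is not an interaction with another character
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(others) for node, others in neighbours.items()}


# Hypothetical toy interactions; the third pair repeats the first in reverse.
toy_edges = [
    ("Jon-Snow", "Samwell-Tarly"),
    ("Jon-Snow", "Stannis-Baratheon"),
    ("Samwell-Tarly", "Jon-Snow"),
    ("Tyrion-Lannister", "Cersei-Lannister"),
]

degrees = degree_centrality(toy_edges)
for name, degree in sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0])):
    print(name, degree)
```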

Add the Cypher queries below:

MATCH ()-[r]->() RETURN count(r) as count

MATCH (n:Character {name: "Jon-Snow"})-[r]-(m) RETURN n, r, m

MATCH (c:Character)-[:INTERACTS2]-(other:Character) RETURN c.name AS name, count(DISTINCT other) AS degree ORDER BY degree DESC LIMIT 10
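
The edge-count query can also be run from the notebook through the Neo4j Python driver instead of the browser. The sketch below assumes a driver object like the `graph` created earlier; `count_relationships` and `COUNT_EDGES_QUERY` are hypothetical names introduced for illustration only.

```python
COUNT_EDGES_QUERY = "MATCH ()-[r]->() RETURN count(r) AS count"


def count_relationships(driver):
    """Run the edge-count query and return the single integer it produces."""
    with driver.session() as session:
        record = session.run(COUNT_EDGES_QUERY).single()
        return record["count"]


# Usage with the driver created earlier (needs a running sandbox):
# print(count_relationships(graph))
```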

#### References

1. NetworkX-Neo4j example: https://github.com/ybaktir/networkx-neo4j/blob/master/examples/nxneo4j_tutorial_latest.ipynb
2. Further information on how to use Neo4j from Python: https://neo4j.com/developer/python/