### Details

This document contains the details of Task 2 for ICS2205. The task is marked out of 100% but is worth 10% of the total mark for this unit. <br> 
Discussion between students is healthy, but the final deliverable must be your own work and **not plagiarised** in any way. The **deadline** to submit this task is **12:00pm Monday 28th November 2022**.<br>
Complete your answer to the task described below in this same notebook, then upload it, together with a duly completed plagiarism form, to the appropriate space on the VLE. Deliverables submitted late will be penalised or may not be accepted.

### Interfacing NetworkX with Neo4j

Neo4j is an important graph platform and is more than persistent storage for graph data. It provides graph algorithms that are scalable and production-ready. In this task you will need to combine Neo4j with NetworkX. To do this you need to use the **nxneo4j** Python library.


Install the latest version of nxneo4j as follows:

In [1]:
pip install networkx-neo4j

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install git+https://github.com/ybaktir/networkx-neo4j

Collecting git+https://github.com/ybaktir/networkx-neo4j
  Cloning https://github.com/ybaktir/networkx-neo4j to c:\users\markd\appdata\local\temp\pip-req-build-8w6w0ka0
  Resolved https://github.com/ybaktir/networkx-neo4j to commit 97dc9563bf992ea9714cbdb99cb9e6a41c7cce65
Note: you may need to restart the kernel to use updated packages.


  Running command git clone -q https://github.com/ybaktir/networkx-neo4j 'C:\Users\markd\AppData\Local\Temp\pip-req-build-8w6w0ka0'


#### Connect to Neo4j

In [3]:
from neo4j import GraphDatabase, basic_auth

For this task you can use a [Neo4j blank sandbox](https://neo4j.com/sandbox/). Once the instance has started, check the connection details tab to find the **Bolt URL** and the **password**. By default the user name is **neo4j**. Update the code below with these details to connect to the Neo4j sandbox. You can also use the Neo4j desktop version.

In [4]:
graph = GraphDatabase.driver("bolt://localhost:7687", auth=basic_auth("neo4j","79598856"))

Access the Neo4j sandbox and inspect the database by opening it in the browser.

In [5]:
import nxneo4j as nx #using nxneo4j

In [6]:
G = nx.Graph(graph) #create the empty graph

#### Analyse the Game of Thrones dataset

nxneo4j contains a number of built-in datasets. One of these is built around the popular TV series Game of Thrones. It is based on the dataset created by [Andrew Beveridge](https://networkofthrones.wordpress.com/) and contains the interactions between the characters. The nodes are labelled "Character" while the relationships include "INTERACTS1", "INTERACTS2", "INTERACTS3" and "INTERACTS45", which represent the interactions between the characters across the various books (1 to 5).

Load the dataset and draw the graph using nxneo4j **(5 marks)**

In [11]:
#add code here

# Load the dataset
# Clear any existing data from the database
G.delete_all()
# Load the built-in Game of Thrones dataset
G.load_got()
# Node property used to identify each node
G.identifier_property = 'name'
# Relationship type to include; '*' selects all types
G.relationship_type = '*'
# Node label to include
G.node_label = 'Character'
# Draw the graph
nx.draw(G)

Find how many nodes the graph contains **(5 marks)**

In [8]:
#add code here

# Finds the number of nodes
len(G)

796

Compute PageRank, sort the results and print out the first 5 results **(20 marks)**

In [13]:
#add code here

# Calculating the pagerank
response = nx.pagerank(G)

# Sort by score, highest first, and print the top 5 results
sorted_pagerank = sorted(response.items(), key=lambda x: x[1], reverse=True)
for character, score in sorted_pagerank[:5]:
    # Print results
    print(character, score)

Jon-Snow 17.596878939369677
Tyrion-Lannister 17.56810234895046
Jaime-Lannister 13.92547230647357
Cersei-Lannister 13.402357010904186
Daenerys-Targaryen 12.499194895817286
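
The scores above are larger than 1, so they are evidently not normalised to sum to 1 as the results of NetworkX's own `pagerank` are. The following is a minimal sketch of how unnormalised scores of this kind can arise, assuming the formulation PR(v) = (1 - d) + d * Σ PR(u) / deg(u) with damping d = 0.85 on an undirected graph. The function name `pagerank_sketch` and the toy edge list are hypothetical, and this is not claimed to be how nxneo4j or Neo4j compute the values.

```python
from collections import defaultdict


def pagerank_sketch(edges, damping=0.85, iterations=20):
    """Unnormalised PageRank on an undirected edge list (illustrative only)."""
    # Build an undirected adjacency map from the edge list.
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    neighbours = dict(neighbours)

    # Every node starts with a score of 1 and is updated a fixed number of times.
    scores = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[n] / len(neighbours[n]) for n in neighbours[node])
            for node in neighbours
        }
    return scores


# Hypothetical toy interactions, not taken from the dataset.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
ranking = sorted(pagerank_sketch(toy_edges).items(), key=lambda x: x[1], reverse=True)
for name, score in ranking:
    print(name, round(score, 3))
```

In this formulation the scores of a connected undirected graph sum to the number of nodes rather than to 1, which is consistent with individual scores well above 1 in a graph of several hundred nodes.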


Compute Betweenness Centrality. Sort the results and print out the first 5 results. **(20 marks)**

In [14]:
#add code here

# Calculating the Betweenness Centrality
response = nx.betweenness_centrality(G)

# Sort by score, highest first, and print the top 5 results
sorted_bw = sorted(response.items(), key=lambda x: x[1], reverse=True)
for character, score in sorted_bw[:5]:
    # Print results
    print(character, score)

Jon-Snow 65395.26787165435
Tyrion-Lannister 50202.17398521848
Daenerys-Targaryen 39636.777186621155
Stannis-Baratheon 35984.21182863313
Theon-Greyjoy 35436.852685191036
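
Betweenness centrality measures, for each node, how many shortest paths between other pairs of nodes pass through it. The sketch below illustrates the idea using Brandes' algorithm on an unweighted, undirected toy graph. The function name `betweenness_sketch` and the edge list are hypothetical; this is not the implementation behind `nx.betweenness_centrality` above, whose absolute values depend on the backend's conventions.

```python
from collections import defaultdict, deque


def betweenness_sketch(edges):
    """Unnormalised betweenness centrality for an undirected, unweighted edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    neighbours = dict(neighbours)

    centrality = {node: 0.0 for node in neighbours}
    for source in neighbours:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {node: [] for node in neighbours}
        path_count = {node: 0 for node in neighbours}
        path_count[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in neighbours[node]:
                if nxt not in distance:
                    distance[nxt] = distance[node] + 1
                    queue.append(nxt)
                if distance[nxt] == distance[node] + 1:
                    path_count[nxt] += path_count[node]
                    predecessors[nxt].append(node)

        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in neighbours}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]

    # Each undirected pair was visited from both ends, so halve the totals.
    return {node: score / 2 for node, score in centrality.items()}


# Hypothetical toy graph: a path A-B-C-D with an extra leaf E attached to B.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
scores = betweenness_sketch(toy_edges)
for name, score in sorted(scores.items(), key=lambda x: x[1], reverse=True):
    print(name, score)
```

Nodes that bridge otherwise separate parts of the graph score highest, which is why a character can rank differently here than under PageRank.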


Now switch to the Neo4j sandbox (or your desktop version) and access the database through the browser. Query the database directly using Cypher to find the following:

1. Count the number of edges. **(10 marks)**
2. Display the graph based on the relationships of the character with the highest PageRank (from above). **(20 marks)**
3. Degree centrality is the number of connections that a node has in the network. In this context, the degree centrality of a character is the number of other characters that interacted with it. Compute the degree centrality by considering **only** the **INTERACTS2** relationship. **(20 marks)**

**Add the Cypher queries below:**

Cypher query (1)

MATCH (n)-[r]->() RETURN COUNT(r)
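
The same count can also be read back into the notebook through the Python driver. This is a minimal sketch: `count_relationships` is a hypothetical helper name, and it assumes `driver` is a Neo4j driver object such as the `graph` object created earlier in this notebook.

```python
def count_relationships(driver):
    """Return the number of relationships in the database (illustrative helper)."""
    query = "MATCH ()-[r]->() RETURN count(r) AS edges"
    # Open a session, run the query, and read the single result row.
    with driver.session() as session:
        record = session.run(query).single()
        return record["edges"]


# Usage, assuming `graph` is the driver created earlier in this notebook:
# print(count_relationships(graph))
```

Matching on a directed pattern counts each stored relationship exactly once, whereas an undirected pattern such as `()-[r]-()` would count each one twice.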

Cypher query (2)

CALL gds.graph.create('getGraph', 'Character', ["INTERACTS1", "INTERACTS2", "INTERACTS3", "INTERACTS45"])<br>

CALL gds.pageRank.stream('getGraph')<br>
YIELD nodeId, score<br>
WITH gds.util.asNode(nodeId).name AS name, score<br>
WITH apoc.agg.maxItems(name, score) as maxData<br>
MATCH path = (top:Character {name: maxData.items[0]})--(other)<br>
RETURN path<br>

Cypher query (3)

CALL gds.degree.stream({<br>
        nodeProjection: "Character",<br>
        relationshipProjection: {<br>
            relType: {<br>
                type: "INTERACTS2",<br>
                orientation: "UNDIRECTED",<br>
                properties: {}<br>
                }<br>
            }<br>
        })<br>
    YIELD nodeId, score<br>
    RETURN gds.util.asNode(nodeId).name AS name, score<br>
    ORDER BY score DESC<br>
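
As a cross-check on the definition of degree centrality given in the task, the count can be sketched in plain Python. The sketch assumes relationships are available as `(source, target, type)` tuples and treats every interaction as undirected, which is an assumption about how the task intends "connections" to be counted. The function name and the toy data are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def degree_centrality_sketch(relationships, rel_type):
    """Count distinct neighbours per node, using only relationships of `rel_type`."""
    neighbours = defaultdict(set)
    for source, target, kind in relationships:
        if kind == rel_type:
            # Treat the interaction as undirected: both ends gain a neighbour.
            neighbours[source].add(target)
            neighbours[target].add(source)
    return {name: len(others) for name, others in neighbours.items()}


# Hypothetical toy relationships, not taken from the dataset.
toy_relationships = [
    ("A", "B", "INTERACTS1"),
    ("A", "B", "INTERACTS2"),
    ("A", "C", "INTERACTS2"),
    ("C", "D", "INTERACTS2"),
    ("B", "D", "INTERACTS3"),
]
degrees = degree_centrality_sketch(toy_relationships, "INTERACTS2")
for name, degree in sorted(degrees.items(), key=lambda x: x[1], reverse=True):
    print(name, degree)
```

Relationships of other types are ignored entirely, so a character that only appears in `INTERACTS1` or `INTERACTS3` relationships does not appear in the result.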

#### References

1. Further information on using Neo4j from Python: https://neo4j.com/developer/python/