In [None]:
!pip install ipython-cypher -q

In [1]:
%load_ext cypher

## The Graph of Thrones

<center><img src="images/got-community.jpeg" align="center"/></center>

**Character Interaction Networks for George R. R. Martin's "A Song of Ice and Fire" saga**

These networks were created by connecting two characters whenever their names (or nicknames) appeared within **15 words** of one another in one of the books in "A Song of Ice and Fire."  
The edge weight corresponds to the number of interactions.
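The construction rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: names are single tokens, nicknames are ignored, and every pair of mentions within the window counts as one interaction. The function name `cooccurrence_edges` and the sample sentence are hypothetical and are not taken from the original pipeline.

```python
from collections import Counter

def cooccurrence_edges(tokens, names, window=15):
    """Count pairs of distinct names that appear within `window` words of each other."""
    # Positions of every token that is a known character name.
    hits = [(i, tok) for i, tok in enumerate(tokens) if tok in names]
    weights = Counter()
    for a in range(len(hits)):
        pos_a, name_a = hits[a]
        for b in range(a + 1, len(hits)):
            pos_b, name_b = hits[b]
            if pos_b - pos_a > window:
                break  # hits are sorted by position, so later ones are even farther away
            if name_a != name_b:
                # Undirected edge: store the pair in sorted order.
                weights[tuple(sorted((name_a, name_b)))] += 1
    return weights

# Hypothetical sentence and a small window, just to make the effect visible.
text = "Arya glared at Sansa while Jon laughed and Arya smiled"
edges = cooccurrence_edges(text.split(), {"Arya", "Sansa", "Jon"}, window=4)
print(edges)
```

Each key is an unordered pair of characters and each value plays the role of the edge weight.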

This is the data for the work presented here: https://networkofthrones.wordpress.com by Andrew Beveridge.

### **Data**

All of the network data is available on GitHub. Have fun!

[Network Data for the Books - https://github.com/mathbeveridge/asoiaf](https://github.com/mathbeveridge/asoiaf)
    
[Network Data for the Series - https://github.com/mathbeveridge/gameofthrones](https://github.com/mathbeveridge/gameofthrones)

#### Clean the database

In [2]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (n) DETACH DELETE n

187 nodes deleted.
684 relationship deleted.


#### **Load Data**

In [3]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CREATE CONSTRAINT ON (c:Character) ASSERT c.name IS UNIQUE

0 rows affected.


##### **Book I**

In [4]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book1-edges.csv" AS row
with row limit 10 return row

10 rows affected.


row
"{'Target': 'Jaime-Lannister', 'Type': 'Undirected', 'book': '1', 'weight': '3', 'Source': 'Addam-Marbrand'}"
"{'Target': 'Tywin-Lannister', 'Type': 'Undirected', 'book': '1', 'weight': '6', 'Source': 'Addam-Marbrand'}"
"{'Target': 'Daenerys-Targaryen', 'Type': 'Undirected', 'book': '1', 'weight': '5', 'Source': 'Aegon-I-Targaryen'}"
"{'Target': 'Eddard-Stark', 'Type': 'Undirected', 'book': '1', 'weight': '4', 'Source': 'Aegon-I-Targaryen'}"
"{'Target': 'Alliser-Thorne', 'Type': 'Undirected', 'book': '1', 'weight': '4', 'Source': 'Aemon-Targaryen-(Maester-Aemon)'}"
"{'Target': 'Bowen-Marsh', 'Type': 'Undirected', 'book': '1', 'weight': '4', 'Source': 'Aemon-Targaryen-(Maester-Aemon)'}"
"{'Target': 'Chett', 'Type': 'Undirected', 'book': '1', 'weight': '9', 'Source': 'Aemon-Targaryen-(Maester-Aemon)'}"
"{'Target': 'Clydas', 'Type': 'Undirected', 'book': '1', 'weight': '5', 'Source': 'Aemon-Targaryen-(Maester-Aemon)'}"
"{'Target': 'Jeor-Mormont', 'Type': 'Undirected', 'book': '1', 'weight': '13', 'Source': 'Aemon-Targaryen-(Maester-Aemon)'}"
"{'Target': 'Jon-Snow', 'Type': 'Undirected', 'book': '1', 'weight': '34', 'Source': 'Aemon-Targaryen-(Maester-Aemon)'}"


In [5]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book1-edges.csv" AS row
MERGE (src:Character {name: row.Source})
MERGE (tgt:Character {name: row.Target})
MERGE (src)-[r:INTERACTS1]->(tgt)
ON CREATE SET r.weight = toInt(row.weight), r.book=1

187 nodes created.
1555 properties set.
684 relationships created.
187 labels added.
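For readers less familiar with `MERGE` and `ON CREATE SET`, the load step above behaves roughly like an "insert only if missing" operation. The sketch below mimics that behaviour in plain Python. The three rows are copied from the preview above, the column order of the CSV is an assumption, `load_edges` is a hypothetical helper, and the sketch reads `book` from the row whereas the query sets it to a constant.

```python
import csv
import io

# Three rows copied from the preview above; column order is assumed.
SAMPLE = """Source,Target,Type,weight,book
Addam-Marbrand,Jaime-Lannister,Undirected,3,1
Addam-Marbrand,Tywin-Lannister,Undirected,6,1
Aegon-I-Targaryen,Daenerys-Targaryen,Undirected,5,1
"""

def load_edges(csv_text):
    nodes = set()  # a set cannot hold duplicates, like the uniqueness constraint on name
    rels = {}      # (source, target) -> properties
    for row in csv.DictReader(io.StringIO(csv_text)):
        nodes.add(row["Source"])  # like MERGE: nothing happens if already present
        nodes.add(row["Target"])
        key = (row["Source"], row["Target"])
        if key not in rels:       # like ON CREATE SET: properties set only on creation
            rels[key] = {"weight": int(row["weight"]), "book": int(row["book"])}
    return nodes, rels

nodes, rels = load_edges(SAMPLE)
print(len(nodes), "nodes,", len(rels), "relationships")
```

Characters that appear in several rows, such as `Addam-Marbrand`, are stored once, which is why the node count reported by each load cell is smaller than twice the number of relationships.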


##### **Book II**

In [6]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book2-edges.csv" AS row
MERGE (src:Character {name: row.Source})
MERGE (tgt:Character {name: row.Target})
MERGE (src)-[r:INTERACTS2]->(tgt)
ON CREATE SET r.weight = toInt(row.weight), r.book=2

169 nodes created.
1719 properties set.
775 relationships created.
169 labels added.


##### **Book III**

In [7]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book3-edges.csv" AS row
MERGE (src:Character {name: row.Source})
MERGE (tgt:Character {name: row.Target})
MERGE (src)-[r:INTERACTS3]->(tgt)
ON CREATE SET r.weight = toInt(row.weight), r.book=3

142 nodes created.
2158 properties set.
1008 relationships created.
142 labels added.


##### **Books IV and V**

In [8]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book45-edges.csv" AS row
MERGE (src:Character {name: row.Source})
MERGE (tgt:Character {name: row.Target})
MERGE (src)-[r:INTERACTS45]->(tgt)
ON CREATE SET r.weight = toInt(row.weight), r.book=45

298 nodes created.
2956 properties set.
1329 relationships created.
298 labels added.


#### **DB SCHEMA**

<center><img src="images/graph-schema.png" align="center"/></center>

### How many characters do we have?

In [9]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character)
RETURN count(c) as nr_personagens

1 rows affected.


nr_personagens
796


### How many interactions are there in each book?

In [10]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH ()-[r]->()
RETURN r.book as book, count(r) as interacoes
ORDER BY book

4 rows affected.


book,interacoes
1,684
2,775
3,1008
45,1329


### Number of interactions for Arya-Stark across the books

In [11]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character {name:'Arya-Stark'})-[r]->()
RETURN r.book as book, count(r) as interacoes
ORDER BY book

4 rows affected.


book,interacoes
1,27
2,37
3,36
45,24


### Characters Arya-Stark interacts with the most

In [12]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character {name:'Arya-Stark'})-[r]->(c2:Character)
RETURN sum(r.weight) as interacoes, c2.name as personagem, r.book
ORDER BY interacoes desc limit 10

10 rows affected.


interacoes,personagem,r.book
104,Sansa-Stark,1
51,Hot-Pie,2
48,Sandor-Clegane,3
46,Gendry,3
39,Mordane,1
39,Joffrey-Baratheon,1
37,Jon-Snow,1
36,Gendry,2
31,Lommy-Greenhands,2
30,Eddard-Stark,1


### Interaction between Arya-Stark and Sansa-Stark across the books

In [13]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character {name:'Arya-Stark'})-[r]->(c2:Character {name:'Sansa-Stark'}) 
return c.name , r.weight as interacoes, c2.name

4 rows affected.


c.name,interacoes,c2.name
Arya-Stark,104,Sansa-Stark
Arya-Stark,20,Sansa-Stark
Arya-Stark,23,Sansa-Stark
Arya-Stark,8,Sansa-Stark


--- 

### Network diameter

> The diameter of a network is defined as the length of the longest shortest path (geodesic) in the network.

<center><img src="images/network-diameter.gif" align="center"/></center>
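The definition above can be made concrete with a breadth-first search from every node. This is a minimal sketch for an unweighted, undirected graph stored as an adjacency dict. The toy graph and the helper names `bfs_distances` and `diameter` are hypothetical, and pairs in different components are simply ignored.

```python
from collections import deque

def bfs_distances(graph, start):
    """Hop distance from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def diameter(graph):
    # Longest of all shortest paths; unreachable pairs are not considered.
    return max(max(bfs_distances(graph, n).values()) for n in graph)

# Toy graph: a 5-cycle a-b-c-d-e-a with an extra node f attached to a.
toy = {
    "a": {"b", "e", "f"}, "b": {"a", "c"}, "c": {"b", "d"},
    "d": {"c", "e"}, "e": {"d", "a"}, "f": {"a"},
}
print(diameter(toy))
```

The longest simple path in this toy graph is longer than its diameter, which illustrates that the diameter only looks at shortest paths.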

Longest shortest paths in the network of the second book.

In [14]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (a:Character), (b:Character) WHERE id(a) > id(b)
MATCH p = shortestPath((a)-[:INTERACTS2*]-(b))
WITH length(p) AS len, p
ORDER BY len DESC
LIMIT 5
RETURN nodes(p) AS path, len

5 rows affected.


path,len
"[{'name': 'Steffon-Varner'}, {'name': 'Eldon-Estermont'}, {'name': 'Bryce-Caron'}, {'name': 'Renly-Baratheon'}, {'name': 'Tywin-Lannister'}, {'name': 'Amory-Lorch'}, {'name': 'Gendry'}, {'name': 'Cutjack'}, {'name': 'Tarber'}]",8
"[{'name': 'Murch'}, {'name': 'Gariss'}, {'name': 'Aggar'}, {'name': 'Ramsay-Snow'}, {'name': 'Roose-Bolton'}, {'name': 'Amory-Lorch'}, {'name': 'Gendry'}, {'name': 'Cutjack'}, {'name': 'Kurz'}]",8
"[{'name': 'Steffon-Varner'}, {'name': 'Eldon-Estermont'}, {'name': 'Bryce-Caron'}, {'name': 'Renly-Baratheon'}, {'name': 'Tywin-Lannister'}, {'name': 'Amory-Lorch'}, {'name': 'Gendry'}, {'name': 'Cutjack'}, {'name': 'Kurz'}]",8
"[{'name': 'Murch'}, {'name': 'Gariss'}, {'name': 'Aggar'}, {'name': 'Ramsay-Snow'}, {'name': 'Robb-Stark'}, {'name': 'Renly-Baratheon'}, {'name': 'Bryce-Caron'}, {'name': 'Eldon-Estermont'}, {'name': 'Steffon-Varner'}]",8
"[{'name': 'Murch'}, {'name': 'Gariss'}, {'name': 'Aggar'}, {'name': 'Ramsay-Snow'}, {'name': 'Roose-Bolton'}, {'name': 'Amory-Lorch'}, {'name': 'Gendry'}, {'name': 'Cutjack'}, {'name': 'Tarber'}]",8


--- 

### Centrality measures

<center><img src="images/measures.png" align="center"/></center>

--- 

### Degree Centrality

> The Degree Centrality algorithm **measures the number of incoming and outgoing relationships** from a node, and helps us find the **most popular nodes** in a graph.

The following query finds the most popular characters in the 2nd book (the `INTERACTS2` relationships), based on the weighted number of character interactions:

In [16]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.degree.stream("Character", "INTERACTS2", {
  direction: "BOTH",
  weightProperty: "weight"
})
YIELD nodeId, score
RETURN algo.asNode(nodeId).name AS personagem, score
ORDER BY score DESC
LIMIT 10

10 rows affected.


personagem,score
Tyrion-Lannister,829.0
Joffrey-Baratheon,629.0
Cersei-Lannister,515.0
Bran-Stark,486.0
Stannis-Baratheon,461.0
Arya-Stark,424.0
Jon-Snow,360.0
Renly-Baratheon,357.0
Robb-Stark,344.0
Catelyn-Stark,320.0
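With `direction: "BOTH"` and a `weightProperty`, the score above can be read as the sum of the weights of all relationships touching a node. The following sketch reproduces that reading in plain Python. It is an interpretation rather than the library's implementation, and the edge list is a toy example.

```python
from collections import defaultdict

def weighted_degree(edges):
    """Sum the weights of all edges incident to each node, ignoring direction."""
    score = defaultdict(float)
    for src, tgt, weight in edges:
        score[src] += weight
        score[tgt] += weight
    return dict(score)

# Toy edge list (source, target, weight); values are illustrative only.
toy_edges = [("Arya", "Sansa", 104), ("Arya", "Jon", 37), ("Jon", "Sansa", 5)]
scores = weighted_degree(toy_edges)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(name, score)
```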


---

### Betweenness Centrality 

<center><img src="images/betweenness-centrality.png" align="center"/></center>

> Betweenness centrality **identifies nodes that are strategically positioned in the network**, meaning that information will often travel through that person.  
Such an intermediary position gives that person **power and influence**.

Betweenness centrality counts the shortest paths that pass through a given node. For example, if a node is located on a bottleneck between two large communities, then it will have high betweenness.
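The bottleneck effect can be seen in a small sketch of Brandes' algorithm for an unweighted, undirected graph. The toy graph (two triangles joined through a single node `m`) is hypothetical, and halving the final scores is one common convention for undirected graphs; the library used in this notebook may scale its results differently.

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected adjacency dict."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}

# Two triangles (a, b, c) and (d, e, f) connected only through node m.
toy = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "m"],
    "m": ["c", "d"],
    "d": ["m", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
bc = betweenness(toy)
print(sorted(bc.items(), key=lambda kv: -kv[1]))
```

Node `m` lies on every shortest path between the two triangles, so it receives the highest score even though it has only two neighbours.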

In [17]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.betweenness.stream("Character", "INTERACTS1", {
  direction: "BOTH"
})
YIELD nodeId, centrality
RETURN algo.asNode(nodeId).name as personagem, centrality
ORDER BY centrality DESC
LIMIT 10

10 rows affected.


personagem,centrality
Eddard-Stark,4638.534951255039
Robert-Baratheon,3682.391035767813
Tyrion-Lannister,3272.6060155260348
Jon-Snow,2952.057281565677
Catelyn-Stark,2604.755646755593
Daenerys-Targaryen,1484.2780232288708
Robb-Stark,1255.6896562838224
Drogo,1115.0946392450378
Bran-Stark,960.0319135675136
Sansa-Stark,639.0769144474225


In [18]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.betweenness("Character", "INTERACTS1", {direction: "BOTH", writeProperty: "book1BetweennessCentrality"})

1 rows affected.


loadMillis,computeMillis,writeMillis,nodes,minCentrality,maxCentrality,sumCentrality
5,8,3,796,-1.0,-1.0,-1.0


--- 

### Closeness Centrality

Closeness centrality is a **way of detecting nodes that are able to spread information very efficiently through a graph.**  
The closeness centrality of a node is the **inverse of its average farness (distance) to all other nodes**.

**Nodes with a high closeness score have the shortest distances to all other nodes.**
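As a minimal sketch of this definition, the function below computes closeness as the number of reachable nodes divided by the sum of the hop distances to them, which is the inverse of the average distance. Restricting the computation to reachable nodes is an assumption made here for disconnected graphs, and the five-node path graph is a toy example.

```python
from collections import deque

def closeness(graph, node):
    """Inverse of the average hop distance from `node` to the nodes it can reach."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    # (reachable nodes - 1) / sum of distances == 1 / average distance
    return (len(dist) - 1) / total if total else 0.0

# Toy path graph a - b - c - d - e.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
for n in sorted(path):
    print(n, round(closeness(path, n), 3))
```

The middle node `c` scores highest because its distances to the other nodes are the smallest on average.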

In [20]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.closeness.stream("Character", "INTERACTS2", {
  direction: "BOTH"
})
YIELD nodeId, centrality
RETURN algo.asNode(nodeId).name  as personagem, centrality
ORDER BY centrality DESC
LIMIT 10

10 rows affected.


personagem,centrality
Robb-Stark,0.4777777777777778
Eddard-Stark,0.4574468085106383
Robert-Baratheon,0.448695652173913
Jaime-Lannister,0.4471403812824956
Tyrion-Lannister,0.4440619621342513
Joffrey-Baratheon,0.4387755102040816
Arya-Stark,0.4350758853288364
Catelyn-Stark,0.4336134453781513
Cersei-Lannister,0.4307178631051753
Sansa-Stark,0.4307178631051753


--- 

### PageRank

<center><img src="images/pagerank.svg.png" width="360px" height="360px" align="center"/></center>

> PageRank captures how effectively you are taking advantage of your network contacts. 

In our context, PageRank centrality nicely captures narrative tension.   
Indeed, major developments occur when two important characters interact.

In [23]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.pageRank.stream("Character", "INTERACTS3", {iterations:20, dampingFactor:0.85})
YIELD nodeId, score
RETURN algo.asNode(nodeId).name  as personagem, score
ORDER BY score DESC
LIMIT 10

10 rows affected.


personagem,score
Tyrion-Lannister,5.63370146796147
Tywin-Lannister,4.113908375703381
Varys,3.802156675270728
Stannis-Baratheon,3.042802365479643
Sansa-Stark,2.87449812875204
Walder-Frey,2.5437637083167632
Samwell-Tarly,2.1651436854794675
Robb-Stark,1.9518855025131447
Tommen-Baratheon,1.7722262736362548
Viserys-Targaryen,1.7228457572694054
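The `iterations` and `dampingFactor` parameters in the query above correspond to the two knobs of the classic power-iteration scheme. The sketch below uses the non-normalised form, in which every node receives a base score of `1 - damping`. Both this form and the starting value of 1.0 are assumptions for illustration and are not confirmed details of the library. The toy graph is hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=20):
    """Power iteration over a dict mapping each node to the nodes it links to."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    rank = {n: 1.0 for n in nodes}  # assumed starting value
    for _ in range(iterations):
        new_rank = {n: 1.0 - damping for n in nodes}
        for src, targets in out_links.items():
            if targets:
                # A node splits its damped score evenly among its out-links.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new_rank[t] += share
            # Nodes without out-links pass nothing on in this simple variant.
        rank = new_rank
    return rank

# Toy directed graph: b and c both point to a, and a points back to b.
toy = {"a": ["b"], "b": ["a"], "c": ["a"]}
ranks = pagerank(toy)
for name, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 3))
```

Node `c` has no incoming links, so it keeps only the base score, while `a` and `b` reinforce each other.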


In [24]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.pageRank("Character", "INTERACTS1", {direction: "BOTH", writeProperty:'book1PageRank'})

1 rows affected.


nodes,iterations,loadMillis,computeMillis,writeMillis,dampingFactor,write,writeProperty
796,20,4,6,2,0.85,True,book1PageRank


---

### Community Detection

<center><img src="images/community.png" width="360px" height="360px" align="center"/></center>

In [25]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.labelPropagation(
  'MATCH (c:Character) RETURN id(c) as id',
  'MATCH (c:Character)-[rel]-(c2) RETURN id(c) as source, id(c2) as target, SUM(rel.weight) as weight',
  {graph:'cypher', partitionProperty: 'community'})

1 rows affected.


loadMillis,computeMillis,writeMillis,postProcessingMillis,nodes,communityCount,iterations,didConverge,p1,p5,p10,p25,p50,p75,p90,p95,p99,p100,weightProperty,write,partitionProperty,writeProperty
57,2,0,2,796,127,1,False,1,1,1,1,2,3,8,38,91,117,weight,True,community,community


In [26]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character)
WHERE exists(c.community) return count(distinct c.community)

1 rows affected.


count(distinct c.community)
127
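Label propagation starts with every node in its own community and lets each node repeatedly adopt the label carried by the largest total weight among its neighbours. The sketch below is a deterministic variant written for illustration: nodes are visited in sorted order and ties go to the alphabetically smallest label, whereas real implementations typically randomise both. The toy edge list and the `max_iterations` parameter are hypothetical.

```python
from collections import defaultdict

def label_propagation(weighted_edges, max_iterations=10):
    neighbours = defaultdict(dict)
    for a, b, w in weighted_edges:
        neighbours[a][b] = neighbours[a].get(b, 0) + w
        neighbours[b][a] = neighbours[b].get(a, 0) + w
    label = {n: n for n in neighbours}  # every node starts in its own community
    for _ in range(max_iterations):
        changed = False
        for node in sorted(neighbours):  # fixed order keeps the sketch deterministic
            votes = defaultdict(float)
            for nb, w in neighbours[node].items():
                votes[label[nb]] += w
            # Heaviest label wins; ties go to the alphabetically smallest label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != label[node]:
                label[node] = best
                changed = True
        if not changed:
            break  # converged: no node wants to switch
    return label

# Two tightly knit triangles joined by one weak edge (c, d).
toy_edges = [
    ("a", "b", 5), ("a", "c", 5), ("b", "c", 5),
    ("c", "d", 1),
    ("d", "e", 5), ("d", "f", 5), ("e", "f", 5),
]
labels = label_propagation(toy_edges)
print(labels)
```

Note that the output above reports `iterations` = 1 and `didConverge` = False, so the 127 communities come from a single pass; running more iterations could merge some of them.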


### Querying Communities

In [27]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character)
WHERE exists(c.community)
RETURN c.community as community, count(c) AS count
ORDER BY count DESC limit 10

10 rows affected.


community,count
224,117
279,91
232,59
333,52
222,48
209,43
331,38
269,31
242,28
260,23


> It would be useful to know who the **influential people in each community** are. To find out, we need to calculate a **PageRank score for each character across all the books**:

In [28]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
CALL algo.pageRank('MATCH (c:Character) RETURN id(c) as id', 'MATCH (c:Character)-[rel]-(c2) RETURN id(c) as source, id(c2) as target, SUM(rel.weight) as weight', {graph:'cypher', writeProperty: 'pageRank'})

1 rows affected.


nodes,iterations,loadMillis,computeMillis,writeMillis,dampingFactor,write,writeProperty
796,20,36,8,1,0.85,True,pageRank


In [29]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character)
WHERE exists(c.community)
WITH c ORDER BY c.pageRank DESC
RETURN c.community as cluster, count(*) AS count, collect(c.name)[..10] as influencers
ORDER BY count DESC limit 10

10 rows affected.


cluster,count,influencers
224,117,"['Tyrion-Lannister', 'Jaime-Lannister', 'Cersei-Lannister', 'Sansa-Stark', 'Joffrey-Baratheon', 'Tywin-Lannister', 'Petyr-Baelish', 'Varys', 'Tommen-Baratheon', 'Gregor-Clegane']"
279,91,"['Jon-Snow', 'Samwell-Tarly', 'Mance-Rayder', 'Jeor-Mormont', 'Aemon-Targaryen-(Maester-Aemon)', 'Janos-Slynt', 'Eddison-Tollett', 'Tormund', 'Bowen-Marsh', 'Craster']"
232,59,"['Daenerys-Targaryen', 'Barristan-Selmy', 'Jorah-Mormont', 'Hizdahr-zo-Loraq', 'Quentyn-Martell', 'Drogo', 'Daario-Naharis', 'Rhaegar-Targaryen', 'Irri', 'Viserys-Targaryen']"
333,52,"['Theon-Greyjoy', 'Asha-Greyjoy', 'Ramsay-Snow', 'Roose-Bolton', 'Aeron-Greyjoy', 'Balon-Greyjoy', 'Lorren', 'Jeyne-Poole', 'Hagen', 'Aggar']"
222,48,"['Catelyn-Stark', 'Robb-Stark', 'Edmure-Tully', 'Lysa-Arryn', 'Robert-Arryn', 'Brynden-Tully', 'Jon-Umber-(Greatjon)', 'Ryman-Frey', 'Rickard-Karstark', 'Lothar-Frey']"
209,43,"['Arya-Stark', 'Gendry', 'Yoren', 'Amory-Lorch', 'Beric-Dondarrion', 'Lem', 'Hot-Pie', 'Harwin', 'Anguy', 'Tom-of-Sevenstreams']"
331,38,"['Stannis-Baratheon', 'Davos-Seaworth', 'Renly-Baratheon', 'Melisandre', 'Selyse-Florent', 'Randyll-Tarly', 'Axell-Florent', 'Cressen', 'Mathis-Rowan', 'Godry-Farring']"
269,31,"['Brienne-of-Tarth', 'Loras-Tyrell', 'Aerys-II-Targaryen', 'Addam-Marbrand', 'Vargo-Hoat', 'Cleos-Frey', 'Dick-Crabb', 'Walton', 'Lyle-Crakehall', 'Pia']"
242,28,"['Eddard-Stark', 'Robert-Baratheon', 'Jory-Cassel', 'Jon-Arryn', 'Brandon-Stark', 'Tomard', 'Alyn', 'Lyanna-Stark', 'Vayon-Poole', 'Wyl-(guard)']"
260,23,"['Bran-Stark', 'Rodrik-Cassel', 'Luwin', 'Rickon-Stark', 'Hodor', 'Jojen-Reed', 'Osha', 'Walder-Frey-(son-of-Merrett)', 'Nan', 'Mikken']"
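The aggregation in the query above (sort by PageRank, group by community, keep the first ten names, order groups by size) can be mirrored in plain Python. This is only a sketch over hypothetical in-memory records; `top_influencers` and the sample data are not part of the notebook.

```python
from collections import defaultdict

def top_influencers(characters, k=10):
    groups = defaultdict(list)
    # Sorting first means each group's list is already ordered by PageRank.
    for c in sorted(characters, key=lambda c: c["pageRank"], reverse=True):
        groups[c["community"]].append(c["name"])
    # Largest communities first, keeping only the top-k names of each.
    ranked = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [(community, len(names), names[:k]) for community, names in ranked]

# Hypothetical records with made-up scores.
sample = [
    {"name": "A", "community": 1, "pageRank": 0.5},
    {"name": "B", "community": 1, "pageRank": 2.0},
    {"name": "C", "community": 2, "pageRank": 1.0},
]
result = top_influencers(sample)
print(result)
```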


---

### Intra-community PageRank

> We can also calculate the **PageRank within communities**.   

Run the following query to calculate the PageRank within the **2nd largest community**:

In [30]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character) WHERE EXISTS(c.community)
WITH c.community AS communityId, COUNT(*) AS count
ORDER BY count DESC
SKIP 1 LIMIT 1
CALL apoc.cypher.doIt(
  "CALL algo.pageRank(
    'MATCH (c:Character) WHERE c.community =" + communityId + " RETURN id(c) as id',
    'MATCH (c:Character)-[rel]->(c2) WHERE c.community =" + communityId + " AND c2.community =" + communityId + " RETURN id(c) as source,id(c2) as target, sum(rel.weight) as weight',
    {graph:'cypher', writeProperty: 'communityPageRank'}) YIELD nodes RETURN count(*)", {})
YIELD value
RETURN value

1 rows affected.


value
{'count(*)': 1}


In [31]:
%%cypher http://neo4j:dextra2020@172.19.0.2:7474/db/data
MATCH (c:Character) WHERE exists(c.community)
WITH c.community AS communityId, COUNT(*) AS count
ORDER BY count DESC
SKIP 1 LIMIT 1
MATCH (c:Character) WHERE c.community = communityId
RETURN c.name as personagem, c.communityPageRank as communityPageRank
ORDER BY c.communityPageRank DESC
LIMIT 10

10 rows affected.


personagem,communityPageRank
Jon-Snow,2.251582142415878
Ygritte,1.6258163579338325
Samwell-Tarly,1.587915125876718
Spare-Boot,1.2903841640295468
Tormund,1.0416310369731947
Small-Paul,0.981316454247987
Xhondo,0.8886139258854897
Satin,0.8880798984466348
Pypar,0.6120499401186054
Val,0.602313577589014


---

### References

[Exploratory Data Analysis](http://guides.neo4j.com/data_science/01_eda.html)  
[Graph Algorithms](https://guides.neo4j.com/sandbox/graph-algorithms/)  
[A Song of Ice and Fire - Dataset](https://github.com/mathbeveridge/asoiaf/tree/master/data)  
[Game of Thrones - HBO dataset](https://github.com/mathbeveridge/gameofthrones)  
[The Neo4j Graph Algorithms User Guide v3.5](https://neo4j.com/docs/graph-algorithms/current/)  
[Personalized Product Recommendations with Neo4j](http://guides.neo4j.com/sandbox/recommendations)  
[cypher-refcard](https://neo4j.com/docs/cypher-refcard/current/?ref=browser-guide)  
[Papers With Code](https://paperswithcode.com/area/graphs)  

Andrew Beveridge and Michael Chemers, "The Game of 'The Game of Thrones': Networked Concordances and Fractal Dramaturgy", in: Paola Brembilla and Ilaria De Pacalis (eds.), Reading Contemporary Serial Television Universes: A Narrative Ecosystem Framework, Routledge, 2018.

---

PLAY ON NEO4J :play http://guides.neo4j.com/sandbox/legis-graph