# Neo4j Case Study: Character Relationship Analysis for *Game of Thrones*

**Data source:** *Network of Thrones*, Andrew Beveridge and Jie Shan
- https://www.maa.org/sites/default/files/pdf/Mathhorizons/NetworkofThrones.pdf
- stormofswords.csv

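**Data model:** (:Character {name})-[:INTERACTS {weight}]->(:Character {name})
- Nodes labeled Character represent characters in the novel.
- A directed INTERACTS relationship means that the two characters interacted in the novel.
- The node property name stores the character's name.
- The relationship property weight stores the number of interactions between the two characters.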

## Outline
1. Import the raw data
2. Character network analysis (character count, shortest paths, pivotal nodes, node centrality)
3. Using python-igraph (PageRank, community detection)

In [1]:
!pip install py2neo

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple


In [2]:
from py2neo import Graph

# Connect to the Neo4j database with its address, user name, and password
graph = Graph('http://localhost:7474', auth=('neo4j', 'neo4j'))

## 导入原始数据

In [3]:
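# Delete any existing graph so that we start from an empty database
graph.run('MATCH (n) DETACH DELETE n')
# Drop the uniqueness constraint left over from a previous run
# (this statement raises an error if the constraint does not exist yet)
graph.run('DROP CONSTRAINT ON (c:Character) ASSERT c.name IS UNIQUE')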

In [4]:
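# Create a uniqueness constraint on Character nodes so that c.name is unique, which protects the integrity of the schema
graph.run('CREATE CONSTRAINT ON (c:Character) ASSERT c.name IS UNIQUE')
# Creating the constraint also creates an index, which speeds up lookups by character name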

In [5]:
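# Download the data to the local machine and load it into Neo4j
# MERGE matches a node or relationship; it creates it if it does not exist and returns it if it does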
for record in graph.run('''
LOAD CSV WITH HEADERS FROM "file:/stormofswords.csv" AS row
MERGE (src:Character {name: row.Source})
MERGE (tgt:Character {name: row.Target})
MERGE (src)-[r:INTERACTS]->(tgt)
SET r.weight = toInteger(row.Weight)
RETURN count(*) AS paths_created
'''):
    print(record)

352
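
The get-or-create behaviour of MERGE can be sketched in plain Python. This is only an illustration, not how Neo4j implements it: the three-row `SAMPLE_CSV` is a hypothetical excerpt in the `Source,Target,Weight` layout that the query reads, and `load_edges` is a helper name invented for this sketch.

```python
import csv
import io

# Hypothetical three-row excerpt (assumption); the weights match the ones
# printed for these pairs in the relationship preview of this notebook.
SAMPLE_CSV = """Source,Target,Weight
Aemon,Grenn,5
Aemon,Samwell,31
Aerys,Jaime,18
"""

def load_edges(csv_text):
    """Mimic MERGE: add a node or relationship only if it is not there yet."""
    characters = set()      # plays the role of the unique :Character nodes
    relationships = {}      # (source, target) -> weight, one entry per pair
    for row in csv.DictReader(io.StringIO(csv_text)):
        src, tgt = row["Source"], row["Target"]
        characters.add(src)  # adding an existing name is a no-op, like MERGE
        characters.add(tgt)
        # Same role as SET r.weight = toInteger(row.Weight)
        relationships[(src, tgt)] = int(row["Weight"])
    return characters, relationships

characters, relationships = load_edges(SAMPLE_CSV)
print(len(characters), "characters,", len(relationships), "relationships")
```

Aemon appears in two rows but ends up as a single entry in `characters`, which is the effect the uniqueness constraint and MERGE provide together in the database.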


In [6]:
# Preview the graph: print up to 30 INTERACTS relationships with their weights
for record in graph.run('''
MATCH p=(:Character)-[r:INTERACTS]-(:Character)
RETURN p, r.weight LIMIT 30
'''):
    print(record['p'], 'weight=', record['r.weight'])

(Aemon)<-[:INTERACTS {}]-(Stannis) weight= 4
(Aemon)<-[:INTERACTS {}]-(Jon) weight= 30
(Aemon)<-[:INTERACTS {}]-(Robert) weight= 4
(Aemon)-[:INTERACTS {}]->(Samwell) weight= 31
(Aemon)-[:INTERACTS {}]->(Grenn) weight= 5
(Aerys)-[:INTERACTS {}]->(Tywin) weight= 8
(Aerys)-[:INTERACTS {}]->(Tyrion) weight= 5
(Aerys)-[:INTERACTS {}]->(Robert) weight= 6
(Aerys)-[:INTERACTS {}]->(Jaime) weight= 18
(Alliser)<-[:INTERACTS {}]-(Janos) weight= 9
(Alliser)<-[:INTERACTS {}]-(Jon) weight= 15
(Alliser)-[:INTERACTS {}]->(Mance) weight= 5
(Amory)-[:INTERACTS {}]->(Oberyn) weight= 5
(Arya)<-[:INTERACTS {}]-(Eddard) weight= 18
(Arya)<-[:INTERACTS {}]-(Robb) weight= 15
(Arya)<-[:INTERACTS {}]-(Sansa) weight= 22
(Arya)-[:INTERACTS {}]->(Tyrion) weight= 5
(Arya)-[:INTERACTS {}]->(Thoros) weight= 18
(Arya)-[:INTERACTS {}]->(Sandor) weight= 46
(Arya)-[:INTERACTS {}]->(Roose) weight= 5
(Arya)-[:INTERACTS {}]->(Robert) weight= 4
(Arya)-[:INTERACTS {}]->(Rickon) weight= 8
(Arya)-[:INTERACTS {}]->(Jon) weight= 7

## 人物网络分析

In [7]:
# Number of characters
for record in graph.run('''
MATCH (c:Character) 
RETURN count(c)
'''):
    print(record)

107


In [8]:
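# Count, for each character, the other characters it interacts with
# (the directed pattern below counts outgoing relationships only, so characters without an outgoing relationship are left out)
# Minimum, maximum, mean, and standard deviation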
for record in graph.run('''
MATCH (c:Character)-[:INTERACTS]->()
WITH c, count(*) AS num
RETURN min(num) AS min, max(num) AS max, avg(num) AS avg, stdev(num) AS stdev
'''):
    print('min=', record['min'], 'max=', record['max'], 'avg=', record['avg'], 'stdev=', record['stdev'])

min= 1 max= 24 avg= 4.957746478873241 stdev= 6.2276723918750845


In [9]:
# Network diameter: the longest of the shortest paths between any two nodes
for record in graph.run('''
MATCH (a:Character), (b:Character) WHERE id(a) > id(b)
MATCH p=shortestPath((a)-[:INTERACTS*]-(b))
RETURN length(p) AS len, [x IN nodes(p) | x.name] AS path
ORDER BY len DESC LIMIT 20
'''):
    print('len=', record['len'], 'path=', record['path'])

len= 6 path= ['Cressen', 'Davos', 'Stannis', 'Robert', 'Daenerys', 'Belwas', 'Illyrio']
len= 6 path= ['Illyrio', 'Belwas', 'Daenerys', 'Robert', 'Stannis', 'Davos', 'Shireen']
len= 6 path= ['Karl', 'Craster', 'Jon', 'Robert', 'Daenerys', 'Belwas', 'Illyrio']
len= 6 path= ['Lancel', 'Kevan', 'Tyrion', 'Viserys', 'Daenerys', 'Belwas', 'Illyrio']
len= 6 path= ['Illyrio', 'Belwas', 'Barristan', 'Jaime', 'Robb', 'Hodor', 'Jojen']
len= 6 path= ['Illyrio', 'Belwas', 'Daenerys', 'Robert', 'Tywin', 'Oberyn', 'Amory']
len= 6 path= ['Illyrio', 'Belwas', 'Daenerys', 'Robert', 'Sansa', 'Bran', 'Luwin']
len= 6 path= ['Karl', 'Craster', 'Samwell', 'Stannis', 'Tywin', 'Oberyn', 'Amory']
len= 6 path= ['Salladhor', 'Davos', 'Stannis', 'Robert', 'Daenerys', 'Belwas', 'Illyrio']
len= 6 path= ['Bowen', 'Janos', 'Tyrion', 'Viserys', 'Daenerys', 'Belwas', 'Illyrio']
len= 6 path= ['Nan', 'Bran', 'Sansa', 'Robert', 'Daenerys', 'Belwas', 'Illyrio']
len= 5 path= ['Jorah', 'Rhaegar', 'Robert', 'Sansa', 'Bran', 'J
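
As a rough illustration of what the diameter query computes, the sketch below runs a breadth-first search from every node of a small undirected graph and keeps the largest distance found. `EDGES` is a hypothetical five-edge excerpt (the pairs appear in the paths printed above), and the helper names are made up for this example; the sketch assumes the graph is connected.

```python
from collections import deque

# Hypothetical excerpt of the network (assumption), treated as undirected.
EDGES = [("Illyrio", "Belwas"), ("Belwas", "Daenerys"),
         ("Daenerys", "Robert"), ("Robert", "Stannis"), ("Robert", "Jon")]

def build_adjacency(edges):
    """Map each node to the set of its neighbours, ignoring direction."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def bfs_distances(adj, start):
    """Shortest hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def diameter(adj):
    """Largest shortest-path length over all start nodes."""
    return max(max(bfs_distances(adj, node).values()) for node in adj)

toy_adj = build_adjacency(EDGES)
print("diameter of the toy graph:", diameter(toy_adj))
```

On this excerpt the longest shortest path runs from Illyrio to Stannis or Jon; the full network naturally gives a different value.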

In [10]:
# Any one shortest path between two characters: the shortestPath function
for record in graph.run('''
MATCH (catelyn:Character {name:"Catelyn"}), (drogo:Character {name:"Drogo"})
MATCH p=shortestPath((catelyn)-[:INTERACTS*]-(drogo))
RETURN p
'''):
    print(record['p'])

(Catelyn)-[:INTERACTS {}]->(Jaime)-[:INTERACTS {}]->(Barristan)<-[:INTERACTS {}]-(Jorah)-[:INTERACTS {}]->(Drogo)


In [11]:
# All shortest paths between two characters: the allShortestPaths function
for record in graph.run('''
MATCH (catelyn:Character {name:"Catelyn"}), (drogo:Character {name:"Drogo"})
MATCH p=allShortestPaths((catelyn)-[:INTERACTS*]-(drogo))
RETURN p
'''):
    print(record['p'])

(Catelyn)-[:INTERACTS {}]->(Tyrion)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Tyrion)<-[:INTERACTS {}]-(Viserys)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Sansa)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Jaime)-[:INTERACTS {}]->(Barristan)<-[:INTERACTS {}]-(Jorah)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Jaime)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Stannis)<-[:INTERACTS {}]-(Robert)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)<-[:INTERACTS {}]-(Eddard)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Cersei)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)-[:INTERACTS {}]->(Drogo)
(Catelyn)-[:INTERACTS {}]->(Jaime)-[:INTERACTS {}]->(Barristan)<-[:INTERACTS {}]-

In [12]:
# Pivotal nodes
# Definition: a node is pivotal if it lies on every shortest path between two other nodes in the network

# collect gathers all values into a single list
for record in graph.run('''
MATCH (a:Character), (b:Character) WHERE id(a) > id(b)
MATCH p=allShortestPaths((a)-[:INTERACTS*]-(b)) WITH collect(p) AS paths, a, b  
MATCH (c:Character) WHERE all(x IN paths WHERE c IN nodes(x)) AND NOT c IN [a,b]
RETURN a.name, b.name, c.name AS PivotalNode SKIP 300 LIMIT 30
'''):
    print('a.name:', record['a.name'], 'b.name:', record['b.name'], 'PivotalNode:', record['PivotalNode'])

a.name: Jojen b.name: Daenerys PivotalNode: Robert
a.name: Jojen b.name: Davos PivotalNode: Samwell
a.name: Jojen b.name: Eddard PivotalNode: Bran
a.name: Jojen b.name: Eddison PivotalNode: Samwell
a.name: Jojen b.name: Gendry PivotalNode: Arya
a.name: Jojen b.name: Gendry PivotalNode: Bran
a.name: Jojen b.name: Gilly PivotalNode: Samwell
a.name: Jojen b.name: Gregor PivotalNode: Arya
a.name: Jojen b.name: Gregor PivotalNode: Bran
a.name: Jojen b.name: Hoster PivotalNode: Bran
a.name: Jojen b.name: Hoster PivotalNode: Catelyn
a.name: Jojen b.name: Irri PivotalNode: Daenerys
a.name: Jojen b.name: Irri PivotalNode: Robert
a.name: Jojen b.name: Janos PivotalNode: Samwell
a.name: Jon b.name: Aerys PivotalNode: Robert
a.name: Jon b.name: Amory PivotalNode: Oberyn
a.name: Jon b.name: Belwas PivotalNode: Robert
a.name: Jon b.name: Daario PivotalNode: Daenerys
a.name: Jon b.name: Daario PivotalNode: Robert
a.name: Jon b.name: Daenerys PivotalNode: Robert
a.name: Jon b.name: Edmure PivotalNode:
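
The pivotal-node definition can be sketched in plain Python: enumerate every shortest path between two nodes, then keep the interior nodes that all of those paths share. `EDGES` is a hypothetical six-edge subset of the network and the function names are invented for this illustration, so the result differs from the full-graph query.

```python
from collections import deque

# Hypothetical subset of the network (assumption), treated as undirected.
EDGES = [("Jojen", "Bran"), ("Jojen", "Samwell"), ("Bran", "Jon"),
         ("Samwell", "Jon"), ("Jon", "Robert"), ("Robert", "Daenerys")]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def all_shortest_paths(adj, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:                  # first time we reach nxt
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:    # another equally short route
                parents[nxt].append(node)
    if target not in dist:
        return []

    def walk(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in parents[node] for path in walk(p)]

    return walk(target)

def pivotal_nodes(adj, a, b):
    """Interior nodes that lie on every shortest path between a and b."""
    paths = all_shortest_paths(adj, a, b)
    if not paths:
        return set()
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    return common - {a, b}

toy_adj = build_adjacency(EDGES)
print(sorted(pivotal_nodes(toy_adj, "Jojen", "Daenerys")))
```

In this small subset both Jon and Robert are pivotal for Jojen and Daenerys, because the subset leaves out the alternative routes that exist in the full network.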

In [13]:
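# The result above shows that Robert is a pivotal node for Jojen and Daenerys, meaning that every shortest path between Jojen and Daenerys passes through Robert
# List all shortest paths between Jojen and Daenerys to verify this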
for record in graph.run('''
MATCH (jojen:Character {name:"Jojen"}), (daenerys:Character {name:"Daenerys"})
MATCH p=allShortestPaths((jojen)-[:INTERACTS*]-(daenerys))
RETURN p
'''):
    print(record['p'])

(Jojen)-[:INTERACTS {}]->(Samwell)<-[:INTERACTS {}]-(Aemon)<-[:INTERACTS {}]-(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)<-[:INTERACTS {}]-(Bran)<-[:INTERACTS {}]-(Arya)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)<-[:INTERACTS {}]-(Bran)<-[:INTERACTS {}]-(Sansa)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)-[:INTERACTS {}]->(Samwell)<-[:INTERACTS {}]-(Stannis)<-[:INTERACTS {}]-(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)-[:INTERACTS {}]->(Samwell)<-[:INTERACTS {}]-(Jon)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)<-[:INTERACTS {}]-(Bran)<-[:INTERACTS {}]-(Eddard)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)-[:INTERACTS {}]->(Meera)<-[:INTERACTS {}]-(Jon)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)
(Jojen)<-[:INTERACTS {}]-(Bran)-[:INTERACTS {}]->(Jon)-[:INTERACTS {}]->(Robert)<-[:INTERACTS {}]-(Daenerys)


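**Node centrality (Centrality Measures)**

Node centrality gives a relative measure of how important a node is in the network.

There are many ways to measure centrality, and each one captures a different kind of "importance".
- Degree Centrality
- Weighted Degree Centrality
- Betweenness Centrality
- Closeness Centrality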
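
The first two measures in the list can be sketched in a few lines of plain Python. `EDGES` is a hypothetical four-row sample (the weights are the ones printed for these pairs in the relationship preview), and direction is ignored, as it is in the undirected Cypher patterns used in the cells that follow.

```python
from collections import Counter

# Hypothetical sample of (source, target, weight) rows (assumption).
EDGES = [("Aemon", "Samwell", 31), ("Aemon", "Grenn", 5),
         ("Jon", "Aemon", 30), ("Arya", "Jon", 7)]

degree = Counter()           # number of relationships touching each character
weighted_degree = Counter()  # sum of the weights of those relationships
for src, tgt, weight in EDGES:
    for name in (src, tgt):  # count both endpoints, ignoring direction
        degree[name] += 1
        weighted_degree[name] += weight

for name, deg in degree.most_common():
    print(name, deg, weighted_degree[name])
```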

In [14]:
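# Degree Centrality
# Degree centrality is the simplest measure: the number of connections a node has in the network

# In this case study, a character's degree centrality is the number of other characters it interacts with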
for record in graph.run('''
MATCH (c:Character)-[:INTERACTS]-()
RETURN c.name AS character, count(*) AS degree 
ORDER BY degree DESC
'''):
    print(record['character'], record['degree'])

Tyrion 36
Jon 26
Sansa 26
Robb 25
Jaime 24
Tywin 22
Cersei 20
Arya 19
Catelyn 18
Joffrey 18
Robert 18
Samwell 15
Bran 14
Daenerys 14
Stannis 14
Sandor 13
Eddard 12
Gregor 12
Mance 12
Lysa 10
Loras 9
Brynden 8
Edmure 8
Renly 8
Walder 8
Brienne 7
Meryn 7
Oberyn 7
Varys 7
Petyr 7
Margaery 7
Balon 6
Beric 6
Janos 6
Jorah 6
Kevan 6
Rhaegar 6
Rickon 6
Barristan 6
Ilyn 6
Aemon 5
Craster 5
Davos 5
Lothar 5
Meera 5
Podrick 5
Shae 5
Tommen 5
Thoros 5
Elia 5
Qhorin 5
Aerys 4
Belwas 4
Bronn 4
Daario 4
Gendry 4
Gilly 4
Hodor 4
Irri 4
Jojen 4
Melisandre 4
Myrcella 4
Rattleshirt 4
Roose 4
Val 4
Ygritte 4
Grenn 4
Theon 4
Roslin 4
Pycelle 4
Drogo 4
Alliser 3
Eddison 3
Hoster 3
Robert Arryn 3
Viserys 3
Dalla 3
Marillion 3
Mace 3
Jon Arryn 2
Luwin 2
Missandei 2
Rickard 2
Anguy 2
Nan 2
Jeyne 2
Bowen 2
Styr 2
Olenna 2
Ellaria 2
Chataya 2
Amory 1
Shireen 1
Walton 1
Illyrio 1
Karl 1
Aegon 1
Kraznys 1
Rakharo 1
Worm 1
Cressen 1
Salladhor 1
Qyburn 1
Orell 1
Lancel 1
Ramsay 1
Doran 1


In [15]:
# Weighted Degree Centrality

# A character's weighted degree centrality is the sum of the weights of all its INTERACTS relationships
for record in graph.run('''
MATCH (c:Character)-[r:INTERACTS]-()
RETURN c.name AS character, sum(r.weight) AS weightedDegree
ORDER BY weightedDegree DESC
'''):
    print(record['character'], record['weightedDegree'])

Tyrion 551
Jon 442
Sansa 383
Jaime 372
Bran 344
Robb 342
Samwell 282
Arya 269
Joffrey 255
Daenerys 232
Cersei 226
Tywin 204
Catelyn 184
Hodor 177
Mance 160
Stannis 146
Meera 139
Sandor 137
Robert 128
Jojen 125
Brienne 122
Gregor 117
Eddard 108
Lysa 108
Edmure 98
Margaery 96
Jorah 89
Petyr 89
Davos 87
Walder 87
Ygritte 82
Rickon 81
Grenn 81
Loras 76
Oberyn 76
Beric 75
Craster 75
Aemon 74
Gilly 69
Belwas 67
Podrick 64
Barristan 63
Melisandre 62
Thoros 60
Bronn 59
Gendry 59
Qhorin 59
Brynden 55
Renly 55
Kevan 50
Varys 49
Meryn 47
Shae 45
Rattleshirt 44
Rhaegar 42
Theon 38
Aerys 37
Janos 37
Drogo 35
Lothar 34
Irri 33
Roslin 32
Ilyn 32
Tommen 31
Val 31
Daario 30
Missandei 30
Alliser 29
Balon 29
Elia 29
Jeyne 28
Eddison 24
Hoster 24
Pycelle 24
Styr 23
Marillion 23
Dalla 21
Mace 20
Robert Arryn 19
Viserys 19
Myrcella 18
Nan 18
Roose 17
Salladhor 16
Anguy 15
Worm 14
Olenna 12
Jon Arryn 11
Rickard 11
Qyburn 11
Bowen 11
Walton 10
Illyrio 10
Kraznys 10
Ellaria 10
Chataya 9
Luwin 8
Aegon 8
Rakharo

In [19]:
# Drop any existing graph projection (false: do not fail if it is missing)
for record in graph.run('''
CALL gds.graph.drop('myGraph', false) YIELD graphName
'''):
    print(record)

'myGraph'


In [20]:
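# Betweenness Centrality
# A node's betweenness centrality counts the shortest paths between pairs of other nodes that pass through it; when a pair has several shortest paths, each one contributes a fraction

# Uses the Neo4j Graph Data Science (GDS) library
# First create a graph projection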
for record in graph.run('''
CALL gds.graph.project('myGraph', 'Character', 'INTERACTS')
'''):
    print(record)

{'Character': {'label': 'Character', 'properties': {}}}	{'INTERACTS': {'orientation': 'NATURAL', 'aggregation': 'DEFAULT', 'type': 'INTERACTS', 'properties': {}}}	'myGraph'	107	352	15


In [21]:
# Call the GDS betweenness procedure
for record in graph.run('''
CALL gds.betweenness.stream('myGraph') YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
'''):
    print(record['name'], record['score'])

Tyrion 332.97460317460315
Samwell 244.63571428571433
Stannis 226.20476190476188
Robert 208.6230158730159
Mance 138.66666666666669
Jaime 119.99563492063493
Sandor 114.33333333333333
Jon 111.26666666666667
Janos 90.65
Aemon 64.59761904761905
Davos 54.0
Lysa 50.616666666666674
Tywin 50.471428571428575
Gregor 48.51666666666667
Renly 42.016666666666666
Cersei 38.17460317460317
Craster 35.0
Sansa 32.84285714285714
Joffrey 31.616666666666667
Bran 30.899999999999995
Loras 28.4
Viserys 26.833333333333332
Edmure 22.249999999999996
Robb 19.871825396825393
Kevan 17.7
Beric 16.083333333333332
Arya 15.590079365079365
Varys 11.0
Jorah 10.0
Rhaegar 9.916666666666666
Walder 9.916666666666666
Brynden 9.25
Jojen 8.0
Meera 8.0
Oberyn 7.5
Catelyn 6.567857142857142
Melisandre 4.5
Belwas 4.0
Val 4.0
Brienne 3.9166666666666665
Hoster 3.6666666666666665
Balon 3.0
Daario 3.0
Lothar 1.6666666666666665
Shae 1.45
Podrick 1.25
Tommen 1.15
Irri 1.0
Rickon 0.9166666666666665
Hodor 0.8333333333333333
Bronn 0.5
Meryn 0
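
For intuition about the scores above, here is a minimal sketch of betweenness centrality using Brandes' algorithm on an unweighted, undirected toy graph. It is not the GDS implementation: the GDS call ran on a directed projection (orientation NATURAL), so its numbers are not directly comparable. `EDGES` is a hypothetical three-edge chain taken from one of the printed paths.

```python
from collections import deque

# Hypothetical chain of characters (assumption), treated as undirected.
EDGES = [("Davos", "Stannis"), ("Stannis", "Robert"), ("Robert", "Daenerys")]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve the totals
    return {v: total / 2 for v, total in score.items()}

toy_scores = betweenness(build_adjacency(EDGES))
for name in sorted(toy_scores, key=toy_scores.get, reverse=True):
    print(name, toy_scores[name])
```

In the chain, Stannis and Robert each sit on the shortest paths of two pairs, while the two end nodes sit on none.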

In [22]:
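# Closeness Centrality
# Closeness centrality is the reciprocal of a node's average distance to all other characters in the network

# Uses the Neo4j Graph Data Science (GDS) library
# Reuses the graph projection created above

# Call the GDS closeness procedure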
for record in graph.run('''
CALL gds.beta.closeness.stream('myGraph') YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
'''):
    print(record['name'], record['score'])

Bran 1.0
Catelyn 1.0
Missandei 1.0
Rhaegar 1.0
Robb 1.0
Sansa 1.0
Viserys 1.0
Aegon 1.0
Kraznys 1.0
Rakharo 1.0
Worm 1.0
Jon 0.8571428571428571
Rickon 0.8571428571428571
Arya 0.8
Tyrion 0.8
Jaime 0.7857142857142857
Cersei 0.75
Jorah 0.75
Jeyne 0.75
Lysa 0.7142857142857143
Roose 0.7142857142857143
Robert 0.6764705882352942
Theon 0.6666666666666666
Meera 0.6428571428571429
Tywin 0.6363636363636364
Drogo 0.6363636363636364
Marillion 0.6363636363636364
Joffrey 0.631578947368421
Brienne 0.6190476190476191
Hodor 0.6
Rickard 0.6
Robert Arryn 0.6
Ramsay 0.6
Jojen 0.5833333333333334
Nan 0.5833333333333334
Gregor 0.5806451612903226
Belwas 0.5714285714285714
Daario 0.5714285714285714
Edmure 0.5652173913043478
Tommen 0.56
Sandor 0.5588235294117647
Stannis 0.5581395348837209
Brynden 0.5555555555555556
Meryn 0.5555555555555556
Elia 0.5555555555555556
Hoster 0.5454545454545454
Luwin 0.5454545454545454
Ilyn 0.5384615384615384
Petyr 0.5365853658536586
Walder 0.5161290322580645
Kevan 0.5151515151515151
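
The definition of closeness centrality can be checked by hand on a small example. The sketch below divides the number of reachable nodes by the sum of the distances to them, which equals the reciprocal of the average distance. It treats a hypothetical three-edge chain (`EDGES`) as undirected, so it is an illustration of the formula rather than a reproduction of the GDS scores above.

```python
from collections import deque

# Hypothetical chain of characters (assumption), treated as undirected.
EDGES = [("Davos", "Stannis"), ("Stannis", "Robert"), ("Robert", "Daenerys")]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def closeness(adj, start):
    """Reciprocal of the average BFS distance from start to reachable nodes."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())   # sum of distances to the other nodes
    reachable = len(dist) - 1    # exclude the start node itself
    return reachable / total if total else 0.0

toy_adj = build_adjacency(EDGES)
toy_closeness = {name: closeness(toy_adj, name) for name in toy_adj}
for name in sorted(toy_closeness, key=toy_closeness.get, reverse=True):
    print(name, toy_closeness[name])
```

Stannis is at distances 1, 1, and 2 from the others, giving 3 / 4; Davos is at distances 1, 2, and 3, giving 3 / 6.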


## Using python-igraph

In [23]:
!pip install python-igraph

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple


In [24]:
from igraph import Graph as IGraph

In [25]:
# Build an igraph instance from Neo4j
# Pass the py2neo query result to igraph's TupleList constructor to create the igraph instance

query = '''
MATCH (c1:Character)-[r:INTERACTS]->(c2:Character)
RETURN c1.name, c2.name, r.weight AS weight
'''
ig = IGraph.TupleList(graph.run(query), weights=True)
print(ig)

IGRAPH UNW- 107 352 --
+ attr: name (v), weight (e)
+ edges (vertex names):
       Aemon -- Samwell, Grenn, Robert, Jon, Stannis
     Samwell -- Aemon, Grenn, Mance, Jon, Bran, Meera, Jojen, Stannis,
Craster, Eddison, Gilly, Janos, Bowen, Qhorin, Melisandre
       Grenn -- Aemon, Samwell, Jon, Eddison
       Aerys -- Tywin, Tyrion, Robert, Jaime
       Tywin -- Aerys, Tyrion, Robert, Jaime, Oberyn, Joffrey, Gregor, Cersei,
Brynden, Balon, Podrick, Walder, Stannis, Robb, Petyr, Lysa, Pycelle, Varys,
Tommen, Kevan, Val, Mace
      Tyrion -- Aerys, Tywin, Robert, Jaime, Oberyn, Arya, Sandor, Joffrey,
Gregor, Cersei, Balon, Loras, Bronn, Podrick, Catelyn, Stannis, Robb, Petyr,
Lysa, Sansa, Pycelle, Meryn, Shae, Elia, Varys, Ilyn, Viserys, Renly, Janos,
Margaery, Myrcella, Kevan, Ellaria, Mace, Chataya, Doran
      Robert -- Aemon, Aerys, Tywin, Tyrion, Jaime, Arya, Thoros, Sandor, Jon,
Cersei, Barristan, Stannis, Sansa, Daenerys, Rhaegar, Eddard, Renly, Jon Arryn
       Jaime -- Aerys, Tyw

In [26]:
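# PageRank, a variant of Eigenvector Centrality
# Run PageRank on the igraph instance, then write the results back to Neo4j, storing the value computed by igraph in a pagerank property on each character node

pg = ig.pagerank()
pgvs = []
# ig.vs: the vertex sequence of the graph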
for p in zip(ig.vs, pg):
    print(p)
    pgvs.append({'name':p[0]['name'], 'pg':p[1]})
print(pgvs)

(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 0, {'name': 'Aemon'}), 0.007328980991947572)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 1, {'name': 'Samwell'}), 0.021619725923803457)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 2, {'name': 'Grenn'}), 0.0065125330245105004)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 3, {'name': 'Aerys'}), 0.005477506014302078)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 4, {'name': 'Tywin'}), 0.02570016262642541)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 5, {'name': 'Tyrion'}), 0.0428849819999633)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 6, {'name': 'Robert'}), 0.022292016521362864)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 7, {'name': 'Jaime'}), 0.028727587587471192)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 8, {'name': 'Alliser'}), 0.005162125869510502)
(igraph.Vertex(<igraph.Graph object at 0xfffcf7361a40>, 9, {'name': 'Mance'}), 0.018095
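
To show what a PageRank score means, here is a minimal power-iteration sketch for an undirected, unweighted graph. It is not igraph's implementation: the damping factor 0.85 is the conventional default, the fixed iteration count is an arbitrary choice for this example, and `EDGES` is a hypothetical star-shaped excerpt centred on Robert.

```python
# Hypothetical excerpt of the network (assumption), treated as undirected.
EDGES = [("Robert", "Stannis"), ("Robert", "Daenerys"), ("Robert", "Jon")]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def pagerank(adj, damping=0.85, iterations=100):
    """Power iteration; assumes every node has at least one neighbour."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        # Every node receives the same baseline share of rank
        new_rank = {v: (1.0 - damping) / n for v in adj}
        # Each node splits its damped rank evenly among its neighbours
        for v, neighbours in adj.items():
            share = damping * rank[v] / len(neighbours)
            for w in neighbours:
                new_rank[w] += share
        rank = new_rank
    return rank

toy_ranks = pagerank(build_adjacency(EDGES))
for name in sorted(toy_ranks, key=toy_ranks.get, reverse=True):
    print(name, round(toy_ranks[name], 4))
```

The scores sum to 1, and the centre of the star ranks highest because all three other nodes pass their rank to it.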

In [27]:
# UNWIND iterates over the list, producing one row per element
write_pagerank_query = '''
UNWIND $nodes AS n
MATCH (c:Character) WHERE c.name = n.name
SET c.pagerank = n.pg
'''
graph.run(write_pagerank_query, nodes=pgvs)

In [28]:
# Query Neo4j for the nodes with the highest PageRank values
for record in graph.run('''
MATCH (n:Character)
RETURN n.name AS name, n.pagerank AS pagerank
ORDER BY pagerank DESC LIMIT 30
'''):
    print(record['name'], record['pagerank'])

Tyrion 0.0428849819999633
Jon 0.03582869669163556
Robb 0.030171146655947625
Sansa 0.03000971666010856
Daenerys 0.02881425425830273
Jaime 0.028727587587471192
Tywin 0.02570016262642541
Robert 0.022292016521362864
Cersei 0.0222873275897735
Arya 0.022050209663844467
Samwell 0.021619725923803457
Catelyn 0.02118215995142648
Joffrey 0.020353375452309377
Bran 0.018868841806170256
Mance 0.018095171947238795
Stannis 0.01802013176519559
Sandor 0.014890666079839627
Eddard 0.013969322258977044
Gregor 0.013933847334997949
Davos 0.013406264700364075
Lysa 0.012945163039228616
Jorah 0.011545271681757216
Loras 0.011034754649094819
Oberyn 0.010531487628639497
Edmure 0.010290932718974886
Brynden 0.010232443796594933
Barristan 0.010201851332678225
Walder 0.01003360640273268
Rhaegar 0.010020348713084546
Renly 0.00974333831826283


In [29]:
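# Community detection: finds clusters (communities) in the graph
# Use igraph's random-walk algorithm (walktrap) to find communities of characters who interact frequently with each other and rarely with characters outside the community
# Then import the community detection results into Neo4j, with each character's community represented by an integer

clusters = ig.community_walktrap(weights='weight').as_clustering()
# membership[i] is the community of vertex i, in the same order as ig.vs
nodes = [{'name': v['name'], 'community': m} for v, m in zip(ig.vs, clusters.membership)]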
print(nodes)

[{'name': 'Aemon', 'community': 0}, {'name': 'Samwell', 'community': 0}, {'name': 'Grenn', 'community': 0}, {'name': 'Aerys', 'community': 1}, {'name': 'Tywin', 'community': 1}, {'name': 'Tyrion', 'community': 1}, {'name': 'Robert', 'community': 1}, {'name': 'Jaime', 'community': 1}, {'name': 'Alliser', 'community': 0}, {'name': 'Mance', 'community': 0}, {'name': 'Amory', 'community': 1}, {'name': 'Oberyn', 'community': 1}, {'name': 'Arya', 'community': 2}, {'name': 'Thoros', 'community': 2}, {'name': 'Sandor', 'community': 2}, {'name': 'Roose', 'community': 3}, {'name': 'Rickon', 'community': 4}, {'name': 'Jon', 'community': 0}, {'name': 'Joffrey', 'community': 1}, {'name': 'Gregor', 'community': 1}, {'name': 'Gendry', 'community': 2}, {'name': 'Cersei', 'community': 1}, {'name': 'Brynden', 'community': 3}, {'name': 'Bran', 'community': 4}, {'name': 'Beric', 'community': 2}, {'name': 'Anguy', 'community': 2}, {'name': 'Balon', 'community': 1}, {'name': 'Loras', 'community': 1}, {'name

In [30]:
write_clusters_query = '''
UNWIND $nodes AS n
MATCH (c:Character) WHERE c.name = n.name
SET c.community = toInteger(n.community)
'''
graph.run(write_clusters_query, nodes=nodes)

In [31]:
# Query Neo4j for the communities and the members of each one
for record in graph.run('''
MATCH (c:Character)
WITH c.community AS cluster, collect(c.name) AS members
RETURN cluster, members
ORDER BY cluster ASC
'''):
    print('cluster:', record['cluster'], 'members:', record['members'])

cluster: 0 members: ['Aemon', 'Alliser', 'Craster', 'Eddison', 'Gilly', 'Janos', 'Jon', 'Mance', 'Rattleshirt', 'Samwell', 'Val', 'Ygritte', 'Grenn', 'Karl', 'Bowen', 'Dalla', 'Orell', 'Qhorin', 'Styr']
cluster: 1 members: ['Aerys', 'Amory', 'Balon', 'Brienne', 'Bronn', 'Cersei', 'Gregor', 'Jaime', 'Joffrey', 'Jon Arryn', 'Kevan', 'Loras', 'Lysa', 'Meryn', 'Myrcella', 'Oberyn', 'Podrick', 'Renly', 'Robert', 'Robert Arryn', 'Sansa', 'Shae', 'Tommen', 'Tyrion', 'Tywin', 'Varys', 'Walton', 'Petyr', 'Elia', 'Ilyn', 'Pycelle', 'Qyburn', 'Margaery', 'Olenna', 'Marillion', 'Ellaria', 'Mace', 'Chataya', 'Doran']
cluster: 2 members: ['Arya', 'Beric', 'Eddard', 'Gendry', 'Sandor', 'Anguy', 'Thoros']
cluster: 3 members: ['Brynden', 'Catelyn', 'Edmure', 'Hoster', 'Lothar', 'Rickard', 'Robb', 'Roose', 'Walder', 'Jeyne', 'Roslin', 'Ramsay']
cluster: 4 members: ['Bran', 'Hodor', 'Jojen', 'Luwin', 'Meera', 'Rickon', 'Nan', 'Theon']
cluster: 5 members: ['Belwas', 'Daario', 'Daenerys', 'Irri', 'Jorah', 
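
The grouping that `collect(c.name)` performs in the query can also be done on the Python side before anything is written to Neo4j. The sketch below is one way to do it; `sample_nodes` is a hypothetical five-entry excerpt in the same shape as the `nodes` list, with community numbers copied from the printed output.

```python
from collections import defaultdict

# Hypothetical excerpt of the name/community records (assumption).
sample_nodes = [
    {"name": "Aemon", "community": 0},
    {"name": "Samwell", "community": 0},
    {"name": "Tywin", "community": 1},
    {"name": "Arya", "community": 2},
    {"name": "Tyrion", "community": 1},
]

# Group names by community id, the same role collect(c.name) plays in Cypher
members = defaultdict(list)
for node in sample_nodes:
    members[node["community"]].append(node["name"])

for cluster in sorted(members):
    print("cluster:", cluster, "members:", members[cluster])
```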