### Centrality

Who are the most important nodes in the graph? Four common centrality measures address this question:

- Degree
- Closeness 
- Betweenness
- Eigenvector

In [4]:
import networkx as nx 

In [5]:
import pandas as pd 
fields = ['Source', 'Target']
# DataFrame.append was removed in pandas 2.0; pd.concat builds the same combined frame.
files = ['data/got1.csv', 'data/got2.csv', 'data/got3.csv', 'data/got4.csv', 'data/got5.csv']
got = pd.concat((pd.read_csv(f, usecols=fields) for f in files), ignore_index=True)

In [6]:
GOT = nx.from_pandas_edgelist(got,source='Source',target='Target')
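# nx.info was removed in NetworkX 3.0; on newer versions, print(GOT) gives a one-line summary instead.
print(nx.info(GOT))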

Name: 
Type: Graph
Number of nodes: 796
Number of edges: 2823
Average degree:   7.0930


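### Degree Centrality

The degree of a node is the number of nodes it is directly connected to. Degree centrality normalizes the degree by the maximum possible degree, $n-1$, where $n$ is the number of nodes in the graph:

$C_D(u) = \frac{\deg(u)}{n-1}$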
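To make the formula concrete, here is a minimal sketch that computes degree centrality by hand. The five-node edge list is a hypothetical toy graph, not one of the datasets used in this notebook, and the names `adjacency` and `toy_degree_centrality` are introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical toy graph (not from the notebook's data): a triangle a-b-c
# with a tail a-d-e, given as an undirected edge list.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("d", "e")]

# Build an adjacency map: node -> set of neighbours.
adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

n = len(adjacency)

# C_D(u) = deg(u) / (n - 1)
toy_degree_centrality = {u: len(nbrs) / (n - 1) for u, nbrs in adjacency.items()}

for node, score in sorted(toy_degree_centrality.items()):
    print(node, score)
```

Node `a` touches three of the four other nodes, so it scores 3/4, while the leaf `e` scores 1/4.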

In [10]:
# The results shown below (and the GA.degree() check that follows) come from the GA graph, not GOT.
degree_centrality = nx.degree_centrality(GA)
type(degree_centrality)

dict

In [5]:
sorted(degree_centrality.items(), key= lambda x: x[1], reverse=True)[:5]

[('karev', 0.22580645161290322),
 ('sloan', 0.16129032258064516),
 ('torres', 0.12903225806451613),
 ('grey', 0.12903225806451613),
 ('izzie', 0.12903225806451613)]

In [14]:
sorted(GA.degree(),key = lambda x : x[1], reverse=True)[:5]

[('karev', 7), ('sloan', 5), ('torres', 4), ('grey', 4), ('izzie', 4)]
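The two outputs above agree with the formula $C_D(u) = \deg(u)/(n-1)$. The short check below recomputes the centralities from the raw degrees. It assumes the graph has 32 nodes, a figure inferred from the 32 entries in the betweenness listing later in this section rather than stated anywhere in the notebook.

```python
# Degrees copied from the GA.degree() output above.
top_degrees = {"karev": 7, "sloan": 5, "torres": 4, "grey": 4, "izzie": 4}

# Assumption: 32 nodes, inferred from the per-node listings in this section.
n = 32

# Apply C_D(u) = deg(u) / (n - 1) to each degree.
recomputed = {name: deg / (n - 1) for name, deg in top_degrees.items()}

for name, score in recomputed.items():
    print(name, round(score, 6))
```

With that assumption, 7/31 is about 0.2258, which matches the value reported for `karev`.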

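### Closeness Centrality

Closeness centrality is the reciprocal of the average shortest-path distance from a node to all other nodes:

$C_C(u) = \frac{n-1}{\sum_{v=1}^{n-1} d(v,u)}$

Here $d(v,u)$ is the shortest-path distance between $v$ and $u$. In a connected graph, the reciprocal of $C_C(u)$ is the average distance from $u$ to all other nodes.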
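As an illustration of the formula, the sketch below computes closeness with a breadth-first search on a hypothetical five-node toy graph (not the notebook's data). It assumes a connected, unweighted graph. For disconnected graphs, libraries typically apply an extra scaling that this sketch does not reproduce.

```python
from collections import defaultdict, deque

# Hypothetical toy graph: a triangle a-b-c with a tail a-d-e.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("d", "e")]

adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)


def bfs_distances(source):
    """Return the hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


n = len(adjacency)

# C_C(u) = (n - 1) / (sum of distances from u to every other node).
# Assumes the graph is connected, so every other node is reachable.
toy_closeness = {}
for u in list(adjacency):
    total_distance = sum(bfs_distances(u).values())
    toy_closeness[u] = (n - 1) / total_distance

for node, score in sorted(toy_closeness.items()):
    print(node, round(score, 4), "average distance:", round(1 / score, 4))
```

Node `a` is one hop from three nodes and two hops from `e`, so its average distance is 5/4 and its closeness is 0.8.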

In [15]:
closeness_centrality = nx.closeness_centrality(GA)
closeness_centrality

{'lexi': 0.26253101736972706,
 'sloan': 0.2892290869327502,
 'karev': 0.2892290869327502,
 'owen': 0.19173613628126135,
 'yang': 0.1594814591498342,
 'altman': 0.2337604949182501,
 'torres': 0.29937747594793435,
 'arizona': 0.21600653327888933,
 'derek': 0.2337604949182501,
 'grey': 0.2216170925848345,
 'izzie': 0.24731182795698925,
 "o'malley": 0.2708653353814644,
 'colin': 0.13228307076769194,
 'preston': 0.13228307076769194,
 'kepner': 0.21067303863002787,
 'addison': 0.2892290869327502,
 'nancy': 0.21067303863002787,
 'olivia': 0.2337604949182501,
 'mrs. seabury': 0.21067303863002787,
 'chief': 0.07373271889400922,
 'adele': 0.05161290322580645,
 'ellis grey': 0.08602150537634408,
 'thatch grey': 0.07373271889400922,
 'susan grey': 0.05161290322580645,
 'bailey': 0.06451612903225806,
 'tucker': 0.04301075268817204,
 'hank': 0.18752215526409075,
 'denny': 0.18752215526409075,
 'finn': 0.17236884978820463,
 'steve': 0.17236884978820463,
 'ben': 0.04301075268817204,
 'avery': 0.196143

In [16]:
sorted(closeness_centrality.items(), key= lambda x: x[1], reverse=True)[:5]

[('torres', 0.29937747594793435),
 ('sloan', 0.2892290869327502),
 ('karev', 0.2892290869327502),
 ('addison', 0.2892290869327502),
 ("o'malley", 0.2708653353814644)]

In [17]:
1/closeness_centrality['torres']

3.340264650283554

### Betweenness Centrality

Quantifies how often a node acts as a bridge (or broker) along the shortest paths between two other nodes.

$C_B(v) = \sum_{s,t \in V} \frac{\sigma(s,t|v)}{\sigma(s,t)}$

The denominator $\sigma(s,t)$ is the total number of shortest paths from node $s$ to node $t$, and the numerator $\sigma(s,t|v)$ is the number of those paths that pass through $v$.
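The definition can be checked by brute force on a small graph. The sketch below counts shortest paths with a breadth-first search and sums the fraction passing through each node. The toy edge list and all helper names are hypothetical, and the graph is assumed to be connected, undirected, and unweighted. The final rescaling by $(n-1)(n-2)/2$ reflects my understanding of how NetworkX normalizes undirected graphs by default, so treat it as an assumption.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical toy graph: a triangle a-b-c with a tail a-d-e.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("d", "e")]

adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)


def shortest_path_counts(source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                sigma[nbr] = 0
                queue.append(nbr)
            # nbr lies one step further from the source: extend the path counts.
            if dist[nbr] == dist[node] + 1:
                sigma[nbr] += sigma[node]
    return dist, sigma


nodes = sorted(adjacency)
info = {s: shortest_path_counts(s) for s in nodes}

# Sum sigma(s,t|v) / sigma(s,t) over unordered pairs {s, t} that exclude v.
toy_betweenness = {v: 0.0 for v in nodes}
for s, t in combinations(nodes, 2):
    dist_s, sigma_s = info[s]
    for v in nodes:
        if v in (s, t):
            continue
        dist_v, sigma_v = info[v]
        # v is on a shortest s-t path exactly when the distances add up.
        if dist_s[v] + dist_v[t] == dist_s[t]:
            toy_betweenness[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]

# Assumed normalization for undirected graphs: divide by (n-1)(n-2)/2.
n = len(nodes)
scale = 2 / ((n - 1) * (n - 2))
toy_betweenness_normalized = {v: score * scale for v, score in toy_betweenness.items()}

print(toy_betweenness)
print(toy_betweenness_normalized)
```

In this toy graph only `a` and `d` lie between other nodes: `a` sits on the shortest paths of four pairs and `d` on three, while `b`, `c`, and `e` score zero.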

In [22]:
betweenness_centrality = nx.betweenness_centrality(GA)
betweenness_centrality

{'lexi': 0.07741935483870968,
 'sloan': 0.248100358422939,
 'karev': 0.2048745519713262,
 'owen': 0.12903225806451613,
 'yang': 0.09247311827956989,
 'altman': 0.16344086021505377,
 'torres': 0.14440860215053763,
 'arizona': 0.0,
 'derek': 0.038602150537634404,
 'grey': 0.10078853046594982,
 'izzie': 0.10311827956989246,
 "o'malley": 0.11702508960573477,
 'colin': 0.0,
 'preston': 0.0,
 'kepner': 0.0,
 'addison': 0.09480286738351255,
 'nancy': 0.0,
 'olivia': 0.01064516129032258,
 'mrs. seabury': 0.0,
 'chief': 0.0064516129032258064,
 'adele': 0.0,
 'ellis grey': 0.008602150537634409,
 'thatch grey': 0.0064516129032258064,
 'susan grey': 0.0,
 'bailey': 0.002150537634408602,
 'tucker': 0.0,
 'hank': 0.0,
 'denny': 0.0,
 'finn': 0.0,
 'steve': 0.0,
 'ben': 0.0,
 'avery': 0.0}

In [23]:
sorted(betweenness_centrality.items(), key= lambda x: x[1], reverse=True)[:5]

[('sloan', 0.248100358422939),
 ('karev', 0.2048745519713262),
 ('altman', 0.16344086021505377),
 ('torres', 0.14440860215053763),
 ('owen', 0.12903225806451613)]

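### Eigenvector Centrality

A node with high eigenvector centrality is connected to nodes that themselves have high eigenvector centrality. The measure captures not just the quantity of connections but also their quality, where quality means how well connected the neighbors are.

$x_v = \frac{1}{\lambda} \sum_{t \in M(v)} x_t = \frac{1}{\lambda} \sum_{t \in G} a_{v,t} x_t$

Here $M(v)$ is the set of neighbors of $v$, $a_{v,t}$ is the adjacency-matrix entry (1 if $v$ and $t$ are connected, 0 otherwise), and $\lambda$ is a constant.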
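One way to solve this equation numerically is power iteration: start from equal scores, repeatedly replace each score with a sum over neighbors, and renormalize. The sketch below does this on a hypothetical toy graph (not the notebook's data). It iterates with $x + Ax$ rather than $Ax$ alone, a choice made here so the iteration also settles on bipartite graphs. The 200 iterations and Euclidean normalization are likewise assumptions of this sketch, not a description of any particular library.

```python
import math
from collections import defaultdict

# Hypothetical toy graph: a triangle a-b-c with a tail a-d-e.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("d", "e")]

adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

nodes = sorted(adjacency)

# Start with equal scores and iterate x <- normalize(x + A x).
x = {v: 1.0 for v in nodes}
for _ in range(200):
    x_new = {v: x[v] + sum(x[t] for t in adjacency[v]) for v in nodes}
    norm = math.sqrt(sum(val * val for val in x_new.values()))
    x = {v: val / norm for v, val in x_new.items()}

# Estimate lambda from the defining equation at one node:
# lambda * x_v = sum of x_t over the neighbours t of v.
lam = sum(x[t] for t in adjacency["a"]) / x["a"]

for node in nodes:
    print(node, round(x[node], 4))
print("lambda estimate:", round(lam, 4))
```

Nodes `b` and `d` both have degree 2, yet `b` ends up with the higher score because its neighbors (`a` and `c`) are better connected than those of `d` (`a` and the leaf `e`). That is the "quality, not just quantity" idea in action.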



In [24]:
eigenvector_centrality = nx.eigenvector_centrality(GA)
eigenvector_centrality

{'lexi': 0.26424680180934534,
 'sloan': 0.3226885189834077,
 'karev': 0.502765929935718,
 'owen': 0.0340951785615966,
 'yang': 0.012046878049189772,
 'altman': 0.10443160377884472,
 'torres': 0.3609266644703912,
 'arizona': 0.10564217608811323,
 'derek': 0.12570927644873783,
 'grey': 0.15107970429307171,
 'izzie': 0.2842593314893894,
 "o'malley": 0.3020100450564303,
 'colin': 0.0035273360425288774,
 'preston': 0.0035273360425288774,
 'kepner': 0.14715740970512992,
 'addison': 0.2784034899141397,
 'nancy': 0.09445089593868505,
 'olivia': 0.235554604339933,
 'mrs. seabury': 0.14715740970512992,
 'chief': 1.0663186023876088e-06,
 'adele': 6.156393321318697e-07,
 'ellis grey': 1.2312786642593052e-06,
 'thatch grey': 1.0663186023876088e-06,
 'susan grey': 6.156393321318697e-07,
 'bailey': 4.7944137268011994e-08,
 'tucker': 3.390162458034995e-08,
 'hank': 0.0832011015170876,
 'denny': 0.0832011015170876,
 'finn': 0.044221065458551645,
 'steve': 0.044221065458551645,
 'ben': 3.390162458034995

In [27]:
# Rescale so that the highest-scoring node has value 1.0.
max_value = max(eigenvector_centrality.values())
ec_scaled = {k: v / max_value for k, v in eigenvector_centrality.items()}

sorted(ec_scaled.items(), key=lambda x: x[1], reverse=True)[:5]

[('karev', 1.0),
 ('torres', 0.7178821057276854),
 ('sloan', 0.6418265434665902),
 ("o'malley", 0.6006971178318394),
 ('izzie', 0.5653909991986408)]