### Closeness Centrality

Closeness centrality measures how near a node is (the inverse of how distant it is) to all other nodes in a network. Nodes with a high closeness score have the shortest distances to all other nodes. The measure rests on the idea that nodes a short distance from the others can spread information efficiently through the network, which matters for the availability of knowledge and resources.

To calculate closeness, first compute the length of the shortest path (the geodesic distance) between each pair of nodes in the network. Then, for each node, sum the distances from that node to all other nodes; this sum is the raw closeness centrality. The greater the raw closeness centrality, the longer it takes on average for information originating at random points in the network to arrive at the node. Equivalently, closeness can be read as a node's potential ability to reach all other nodes as quickly as possible. Nodes that are close to all other nodes (a low raw score, a high normalized score) have an advantage in accessing resources in the network and in keeping a good overview of its agents.
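The calculation described above can be sketched in plain Python. This is a minimal illustration, assuming an unweighted, connected graph stored as an adjacency dictionary; the names `shortest_path_lengths`, `raw_closeness`, and `toy_graph` are hypothetical and are not part of the Neo4j procedure used in this notebook.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def raw_closeness(adjacency):
    """Sum of geodesic distances from each node to all other reachable nodes."""
    return {
        node: sum(shortest_path_lengths(adjacency, node).values())
        for node in adjacency
    }


# Hypothetical toy graph: an undirected path a - b - c - d
toy_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}

raw_scores = raw_closeness(toy_graph)
print(raw_scores)
```

In this toy path the two inner nodes get the smaller sums, which is what makes them the more central ones under the raw measure.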

Note that raw closeness centrality is an inverse measure of centrality: the nodes with the smallest scores are the most central. Normalized closeness centrality, on the other hand, is not an inverse measure: nodes with higher scores are more central. The algorithm used here returns the normalized closeness centrality score.
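The two variants can be written out explicitly. The following is a common textbook definition, assumed here for illustration rather than taken from the procedure's documentation, where $d(u, v)$ is the geodesic distance between nodes $u$ and $v$ and $n$ is the number of nodes:

$$
C_{\mathrm{raw}}(u) = \sum_{v \neq u} d(u, v), \qquad
C_{\mathrm{norm}}(u) = \frac{n - 1}{\sum_{v \neq u} d(u, v)}
$$

Under this definition $1 / C_{\mathrm{norm}}(u)$ is the average distance from $u$ to the other nodes. The scores reported later in this section appear consistent with it: for the 133-node graph, the highest score 0.4870848708487085 equals $132 / 271$, that is, a total distance of 271 and an average distance of about 2.05. This is an inference from the numbers, not a documented guarantee.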

In [1]:
#Loading library
import py2neo

In [2]:
#Accessing local Neo4j Server
py2neo.authenticate("localhost:7474", "neo4j", "neo4j")

In [3]:
graph = py2neo.Graph("http://localhost:7474/db/data/")

In [4]:
#Building the query: compute closeness centrality and write the scores back to the graph
query = """
CALL algo.closeness(
'MATCH (u:Person) RETURN id(u) as id',
'MATCH (u1:Person)-[:HAS_CONTACT]->(u2:Person) RETURN id(u1) as source, id(u2) as target',
{graph:'cypher', write: true, concurrency:4});
"""

In [5]:
#Running query 
results = graph.data(query)

In [6]:
results

[{'computeMillis': 24, 'loadMillis': 408, 'nodes': 133, 'writeMillis': 88}]

In [30]:
#Building the query: stream closeness scores per node (results are returned, not written, so no write option)
query = """
CALL algo.closeness.stream(
'MATCH (u:Person) RETURN id(u) as id',
'MATCH (u1:Person)-[:HAS_CONTACT]->(u2:Person) RETURN id(u1) as source, id(u2) as target',
{graph:'cypher', concurrency:4})
YIELD nodeId, centrality;
"""

In [31]:
#Running query 
results = graph.data(query)

In [29]:
results[:10]

[{'centrality': 0.42038216560509556, 'nodeId': 1401},
 {'centrality': 0.4217252396166134, 'nodeId': 1402},
 {'centrality': 0.43564356435643564, 'nodeId': 1403},
 {'centrality': 0.45993031358885017, 'nodeId': 1404},
 {'centrality': 0.4664310954063604, 'nodeId': 1405},
 {'centrality': 0.4074074074074074, 'nodeId': 1406},
 {'centrality': 0.38372093023255816, 'nodeId': 1407},
 {'centrality': 0.45674740484429066, 'nodeId': 1408},
 {'centrality': 0.43137254901960786, 'nodeId': 1412},
 {'centrality': 0.4489795918367347, 'nodeId': 1413}]

In [33]:
#Sorting the list of result dictionaries by centrality, highest first
from operator import itemgetter
newlist = sorted(results, key=itemgetter('centrality'), reverse=True)

In [34]:
#Printing top 10 nodes by closeness centrality
newlist[:10]

[{'centrality': 0.4870848708487085, 'nodeId': 1549},
 {'centrality': 0.48175182481751827, 'nodeId': 1460},
 {'centrality': 0.47653429602888087, 'nodeId': 1476},
 {'centrality': 0.4731182795698925, 'nodeId': 1454},
 {'centrality': 0.4697508896797153, 'nodeId': 1484},
 {'centrality': 0.4697508896797153, 'nodeId': 1525},
 {'centrality': 0.4697508896797153, 'nodeId': 1570},
 {'centrality': 0.4664310954063604, 'nodeId': 1405},
 {'centrality': 0.4664310954063604, 'nodeId': 1441},
 {'centrality': 0.4647887323943662, 'nodeId': 1424}]

In [35]:
#Creating nodeId list (look the key up by name; relying on dictionary value order is fragile)
top_10 = [row['nodeId'] for row in newlist[:10]]

In [36]:
#algo.closeness.stream returns node ids. To get the names of these persons we need to query by node id
query = """
MATCH (person:Person) 
WHERE ID(person) in ["""+str(top_10).strip('[]')+"""] 
RETURN person.name
"""

In [37]:
graph.data(query)

[{'person.name': 'Andy Wachowski'},
 {'person.name': 'J.T. Walsh'},
 {'person.name': 'Jay Mohr'},
 {'person.name': 'Greg Kinnear'},
 {'person.name': 'Robin Williams'},
 {'person.name': 'Victor Garber'},
 {'person.name': 'Bruno Kirby'},
 {'person.name': 'Matthew Fox'},
 {'person.name': 'Jan de Bont'},
 {'person.name': 'James Thompson'}]
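Cypher does not guarantee that rows matched through an `IN` list come back in list order unless the query says so, so the names above need not follow the centrality ranking. A minimal sketch of restoring that order on the Python side, assuming the lookup query is changed to also return the node id (for example `RETURN ID(person) AS id, person.name AS name`). The aliases `id` and `name`, the helper `order_by_rank`, and the placeholder rows below are all hypothetical and are not taken from the graph in this notebook.

```python
def order_by_rank(rows, ranked_ids):
    """Reorder query rows so they follow the order of ranked_ids."""
    position = {node_id: rank for rank, node_id in enumerate(ranked_ids)}
    return sorted(rows, key=lambda row: position[row["id"]])


# Hypothetical placeholder data: ids in ranking order, rows in database order
ranked_ids = [30, 10, 20]
rows = [
    {"id": 10, "name": "Person B"},
    {"id": 20, "name": "Person C"},
    {"id": 30, "name": "Person A"},
]

ordered_names = [row["name"] for row in order_by_rank(rows, ranked_ids)]
print(ordered_names)
```

With the real data, `ranked_ids` would be `top_10` and `rows` would be the result of the modified lookup query.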