# Degree and Closeness Centrality




## Node Importance
![](images/3-1.png)

### Different ways of thinking about "importance"

* Degree: Number of friends
5 most important nodes are: 34, 1, 33, 3, 2

* Average proximity to other nodes.
5 most important nodes are: 1, 3, 34, 32, 9

* Fraction of shortest paths that pass through the node.
5 most important nodes are: 1, 34, 33, 3, 32
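
The three rankings above can be reproduced with NetworkX on the karate club graph. This is a minimal sketch: `top_k` is a hypothetical helper introduced here, and nodes with equal scores may come out in a different order than in the lists above.

```python
import networkx as nx

# Karate club graph, relabeled so nodes run from 1 to 34 as in the figure.
G = nx.convert_node_labels_to_integers(nx.karate_club_graph(), first_label=1)

def top_k(scores, k=5):
    """Return the k nodes with the highest score (hypothetical helper)."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

by_degree = top_k(nx.degree_centrality(G))            # number of friends
by_closeness = top_k(nx.closeness_centrality(G))      # average proximity
by_betweenness = top_k(nx.betweenness_centrality(G))  # shortest paths through node

print("degree:     ", by_degree)
print("closeness:  ", by_closeness)
print("betweenness:", by_betweenness)
```

The three measures agree on some nodes (1 and 34 rank highly everywhere) but not on all of them, which is why the choice of measure matters.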

### Network Centrality

Centrality measures identify the most important nodes in a network:
* Influential nodes in a social network
* Nodes that disseminate information to many nodes or prevent epidemics
* Hubs in a transportation network
* Important pages on the Web
* Nodes that prevent the network from breaking up

### Centrality Measures

* Degree centrality
* Closeness centrality
* Betweenness centrality
* Load centrality
* PageRank
* Katz centrality
* Percolation centrality

### Degree Centrality

**Assumption**: Important nodes have many connections

The most basic measure of centrality: number of neighbors.

**Undirected networks**: use degree

**Directed networks**: use in-degree, out-degree, or a combination of the two

### Degree Centrality - Undirected Networks

$C_{deg}(v)=\frac{d_v}{|N|-1}$, where *N* is the set of nodes in the network and $d_v$ is the degree of node $v$.

If a node is connected to every other node in the network, then $C_{deg} = 1$. If it is not connected to any node in the network, then $C_{deg} = 0$.
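
To make the formula concrete, here is a minimal sketch that computes $C_{deg}$ by hand on a small, made-up undirected graph. The adjacency dictionaries and the function name `degree_centrality` are illustrative assumptions, not part of the original notebook.

```python
def degree_centrality(adj):
    """C_deg(v) = d_v / (|N| - 1) for every node v in an adjacency dict."""
    n = len(adj)
    return {v: len(neighbors) / (n - 1) for v, neighbors in adj.items()}

# Hypothetical star graph: A is connected to every other node.
star = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
star_cent = degree_centrality(star)
print(star_cent["A"])  # 3 / 3 = 1.0

# Same graph plus an isolated node E with no connections.
with_isolated = dict(star, E=set())
cent = degree_centrality(with_isolated)
print(cent["A"])  # 3 / 4 = 0.75
print(cent["E"])  # 0 / 4 = 0.0
```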


In [1]:
import networkx as nx

# Load the karate club graph and relabel its nodes 1..34 to match the figure.
G = nx.karate_club_graph()
G = nx.convert_node_labels_to_integers(G, first_label=1)
degCent = nx.degree_centrality(G)
degCent[34]

0.5151515151515151

Node 34 has 17 connections, and the network has 34 - 1 = 33 other nodes (the -1 excludes node 34 itself), so its degree centrality is 17/33 ≈ 0.515.

### Degree Centrality - Directed Networks (In-Degree)

$C_{indeg}(v)=\frac{d^{in}_v}{|N|-1}$, where *N* is the set of nodes in the network and ${d^{in}_v}$ is the in-degree of node $v$.


In [None]:
# Assumes G is a directed graph (nx.DiGraph) that contains a node labeled 'A'.
indegCent = nx.in_degree_centrality(G)
indegCent['A']

### Degree Centrality - Directed Networks (Out-Degree)

$C_{outdeg}(v)=\frac{d^{out}_v}{|N|-1}$, where *N* is the set of nodes in the network and ${d^{out}_v}$ is the out-degree of node $v$.

In [None]:
# Assumes G is a directed graph (nx.DiGraph) that contains a node labeled 'A'.
outdegCent = nx.out_degree_centrality(G)
outdegCent['A']
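
The in-degree and out-degree cells assume a directed graph with a node labeled `'A'`, which the notebook does not construct. As a self-contained sketch, the following builds a small hypothetical `nx.DiGraph` (the edges are invented for illustration) and computes both measures.

```python
import networkx as nx

# Hypothetical directed graph with 4 nodes; the edges are made up.
D = nx.DiGraph()
D.add_edges_from([("A", "B"), ("A", "C"), ("B", "A"), ("C", "A"), ("D", "A")])

in_cent = nx.in_degree_centrality(D)    # in-degree / (|N| - 1)
out_cent = nx.out_degree_centrality(D)  # out-degree / (|N| - 1)

print(in_cent["A"])   # 3 incoming edges, 3 other nodes: 3/3
print(out_cent["A"])  # 2 outgoing edges, 3 other nodes: 2/3
print(in_cent["D"])   # no incoming edges: 0
```

Calling either function on an undirected graph such as the karate club graph raises an error, so the graph must be an `nx.DiGraph`.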

### Closeness Centrality

**Assumption**: Important nodes are close to other nodes. The distance between two nodes is the length of the shortest path between them.


$$C_{close}(v)=\frac{|N|-1}{\sum_{u\in N\setminus\{v\}} d(v,u)}$$

The closeness centrality of node $v$ is the number of other nodes in the network, $|N|-1$, divided by the sum of the distances from $v$ to each of those nodes.

$d(v,u)$ is the length of the shortest path from $v$ to $u$.
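
As a rough illustration of the definition, the sketch below computes closeness centrality with a breadth-first search on a small made-up graph. The graph `adj` and the helper names `bfs_distances` and `closeness` are assumptions introduced here, and the formula only applies when every node is reachable from `v`.

```python
from collections import deque

# Hypothetical connected, undirected path graph: A - B - C - D.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closeness(adj, v):
    """C_close(v) = (|N| - 1) / sum of d(v, u) over all other nodes u."""
    dist = bfs_distances(adj, v)  # d(v, v) = 0 adds nothing to the sum
    return (len(adj) - 1) / sum(dist.values())

print(closeness(adj, "A"))  # distances 1 + 2 + 3 = 6, so 3 / 6 = 0.5
print(closeness(adj, "B"))  # distances 1 + 1 + 2 = 4, so 3 / 4 = 0.75
```

The end node A is farther from everything than the interior node B, so it gets the lower score.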

In [3]:
closeCent=nx.closeness_centrality(G)
closeCent[32]

0.5409836065573771

How was 0.541 derived?

In [6]:
sum(nx.shortest_path_length(G,32).values())

61

In [9]:
(len(G.nodes())-1)/61

0.5409836065573771

### Disconnected Nodes

How to measure the closeness centrality of a node when it cannot reach all other nodes?

What is the closeness centrality of node L? ![](images/3-2.png)

* Option 1: Consider only nodes that L can reach:

$$C_{close}(L)=\frac{|R(L)|}{\sum_{u\in R(L)} d(L,u)}$$

where $R(L)$ is the set of nodes L can reach.

So $C_{close}(L)=\frac{1}{1}=1$

**Problem**: A centrality of 1 is too high for a node that can only reach one other node!

* Option 2: Consider only nodes that L can reach and **normalize by the fraction of nodes L can reach**:

$$C_{close}(L)=\left[\frac{|R(L)|}{|N|-1}\right]\frac{|R(L)|}{\sum_{u\in R(L)} d(L,u)}$$

$C_{close}(L)=[\frac{1}{14}]\frac{1}{1}=0.071$

In [None]:
# Recent NetworkX releases name this flag `wf_improved`; NetworkX 1.x called it `normalized`.
closeCent = nx.closeness_centrality(G, wf_improved=False)  # option 1
closeCent = nx.closeness_centrality(G, wf_improved=True)   # option 2
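
To check the two options by hand, the following sketch computes both on a hypothetical 15-node directed graph. Only two facts come from the text: $|N|-1 = 14$ and node L reaches exactly one node at distance 1. All other edges, the successor dictionary `succ`, and the function names are assumptions made for illustration.

```python
from collections import deque

nodes = list("ABCDEFGHIJKLMNO")  # 15 nodes, so |N| - 1 = 14 as in the text
succ = {v: [] for v in nodes}    # node -> list of nodes it points to
succ["L"] = ["M"]                # L can reach only M (distance 1)
succ["A"] = ["B", "C"]           # invented edges for the rest of the graph
succ["B"] = ["C"]

def reachable_distances(succ, source):
    """BFS distances from source to the nodes it can reach (source excluded)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in succ[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    del dist[source]
    return dist

def closeness_option1(succ, v):
    """|R(v)| / sum of d(v, u) over the reachable nodes u."""
    dist = reachable_distances(succ, v)
    return len(dist) / sum(dist.values()) if dist else 0.0

def closeness_option2(succ, v):
    """Option 1 scaled by the fraction of nodes v can reach."""
    dist = reachable_distances(succ, v)
    return (len(dist) / (len(succ) - 1)) * closeness_option1(succ, v)

opt1 = closeness_option1(succ, "L")
opt2 = closeness_option2(succ, "L")
print(opt1)            # 1 / 1 = 1.0
print(round(opt2, 3))  # (1/14) * 1 = 0.071
```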