In [3]:
import networkx as nx
import operator

# **Page Rank**

Assigns an importance score to each node. Important nodes are those with many in-links from other important nodes. The algorithm can be applied to any type of network, but it is more useful for directed graphs.

$n$ = number of nodes in the network

$k$ = number of steps

1. Assign all nodes a PageRank of $\frac{1}{n}$
2. Perform the **Basic PageRank Rule** $k$ times

**Basic PageRank Rule**: Each node gives an equal share of its current PageRank to all the nodes it links to. The new PageRank of each node is the sum of all the PageRank it received from other nodes.
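
The two steps and the rule above can be sketched in plain Python. This is a minimal illustration, not the NetworkX implementation: the function name `basic_pagerank`, the dictionary-of-out-links representation, and the three-node `toy_graph` are assumptions made for the example, and every node is assumed to have at least one outgoing edge.

```python
def basic_pagerank(out_links, k):
    """Apply the Basic PageRank Rule k times.

    out_links maps each node to the list of nodes it links to
    (hypothetical representation chosen for this sketch).
    """
    nodes = list(out_links)
    n = len(nodes)
    # Step 1: every node starts with a PageRank of 1/n.
    rank = {node: 1.0 / n for node in nodes}
    # Step 2: apply the rule k times.
    for _ in range(k):
        new_rank = {node: 0.0 for node in nodes}
        for node, targets in out_links.items():
            # Each node splits its current PageRank equally among its out-links.
            share = rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph (assumed for illustration): A -> B, A -> C, B -> C, C -> A
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
one_step = basic_pagerank(toy_graph, k=1)
print(one_step)
```

After one step, $C$ holds $\frac{1}{6}$ from $A$ plus $\frac{1}{3}$ from $B$, that is $\frac{1}{2}$, while $A$ holds $\frac{1}{3}$ and $B$ holds $\frac{1}{6}$. The scores still sum to $1$ because every node passes on all of its PageRank.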


**Interpretation**: The PageRank of a node at step $k$ is the probability that a random walker lands on the node after taking $k$ steps.
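
The random-walk interpretation can be checked with a small simulation. The sketch below is illustrative only: `simulate_walks`, the toy graph, the number of walks, and the seed are all assumptions made for the example. Each walk starts on a uniformly random node and follows $k$ random out-links; the fraction of walks that end on each node estimates its PageRank at step $k$.

```python
import random
from collections import Counter


def simulate_walks(out_links, k, n_walks=20000, seed=0):
    """Estimate where a random walker ends up after k steps."""
    rng = random.Random(seed)
    nodes = list(out_links)
    landings = Counter()
    for _ in range(n_walks):
        # Start on a node chosen uniformly at random.
        current = rng.choice(nodes)
        for _ in range(k):
            # Follow one outgoing edge chosen uniformly at random.
            current = rng.choice(out_links[current])
        landings[current] += 1
    return {node: landings[node] / n_walks for node in nodes}


# Toy graph (assumed for illustration): A -> B, A -> C, B -> C, C -> A
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
freq = simulate_walks(toy_graph, k=3)
print(freq)
```

For this graph, applying the Basic PageRank Rule three times by hand gives $\frac{1}{3}$, $\frac{1}{4}$ and $\frac{5}{12}$ for $A$, $B$ and $C$, so the simulated frequencies should land close to those values, up to sampling noise.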

### **Page Rank Problem**

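In graphs with closed loops (for example, $A$ points only to $B$ and $B$ points only to $A$), a random walker that enters the loop can never leave it, so the nodes in the loop keep absorbing PageRank from every node that can reach them. The fix is to introduce a damping parameter $\alpha$.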

Random walk of $k$ steps with damping parameter $\alpha$: start on a random node. Then:
* With probability $\alpha$: choose an outgoing edge at random and follow it to the next node.
* With probability $1 - \alpha$: choose a node at random and go to it.

Repeat $k$ times. Usually $\alpha$ is between $0.8$ and $0.9$.
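
The damped walk corresponds to a small change in the update rule: each node keeps a baseline of $\frac{1 - \alpha}{n}$ and receives only a fraction $\alpha$ of the shares sent along edges. The sketch below is a minimal illustration under assumptions: the name `damped_pagerank` and the four-node `trap_graph` are made up for the example, every node is assumed to have at least one out-link, and the loop runs a fixed $k$ steps. `nx.pagerank` differs in its details (it iterates until convergence and handles nodes without out-links), so this is not its implementation.

```python
def damped_pagerank(out_links, k, alpha=0.85):
    """Apply the PageRank rule with damping parameter alpha k times."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(k):
        # With probability 1 - alpha the walker jumps to a random node,
        # so every node receives a baseline of (1 - alpha) / n.
        new_rank = {node: (1.0 - alpha) / n for node in nodes}
        for node, targets in out_links.items():
            # With probability alpha the walker follows a random out-link.
            share = alpha * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Trap graph (assumed for illustration): A <-> B form a closed loop,
# C -> A feeds into it, and D -> C.
trap_graph = {"A": ["B"], "B": ["A"], "C": ["A"], "D": ["C"]}

undamped = damped_pagerank(trap_graph, k=10, alpha=1.0)  # same as the basic rule
damped = damped_pagerank(trap_graph, k=10, alpha=0.85)
print(undamped)
print(damped)
```

With $\alpha = 1$ the update reduces to the basic rule, and $A$ and $B$ end up holding all the PageRank while $C$ and $D$ drop to zero. With $\alpha = 0.85$, $C$ and $D$ keep a small positive score because the walker can always jump to them.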

For the graph:

<img src='../assets/directed_graph.png' width=300px>

In [4]:
D = nx.read_adjlist(
    '../assets/directed_graph.txt', 
    nodetype=str,
    create_using=nx.DiGraph()
)

# nx.pagerank applies damping by default (alpha=0.85)
pagerank = nx.pagerank(D)
# Rank nodes from highest to lowest PageRank
sorted(pagerank.items(), key=operator.itemgetter(1), reverse=True)

[('E', 0.12279407788337132),
 ('D', 0.11060435714398248),
 ('C', 0.1033176070454691),
 ('A', 0.08702583777446707),
 ('B', 0.0861715928964544),
 ('M', 0.079558477358766),
 ('G', 0.0673122066572118),
 ('J', 0.06454583255993146),
 ('L', 0.06097461273800988),
 ('O', 0.058586304561140626),
 ('F', 0.04677040884884919),
 ('N', 0.039165637954896804),
 ('K', 0.031108071424217582),
 ('I', 0.022727057514291306),
 ('H', 0.019337917638941077)]

In [5]:
# Damping parameter set explicitly to 0.8 (the default is alpha=0.85)

pagerank = nx.pagerank(D, alpha=0.8)
sorted(pagerank.items(), key=operator.itemgetter(1), reverse=True)

[('E', 0.11321421490944021),
 ('D', 0.10116229095243477),
 ('C', 0.09536395696878604),
 ('A', 0.08501610646428298),
 ('M', 0.08248080082009636),
 ('B', 0.0808676033541354),
 ('G', 0.07284589253716836),
 ('J', 0.06862683666636804),
 ('L', 0.06388914910589583),
 ('O', 0.06134488265253614),
 ('F', 0.05058040840321028),
 ('N', 0.0404033404331596),
 ('K', 0.03409119197565565),
 ('I', 0.026984105237620605),
 ('H', 0.023129219519209793)]