## PageRank

#### Flow Formulation

Consider the following graph

![Graph](SampleGraph.png)

The importance of a node is proportional to the importance of the nodes that link to it: each node splits its own importance equally among its outgoing links.

Thus

$r_y\:=\:\frac{r_y}{2} + \frac{r_a}{2}$

$r_a\:=\:\frac{r_y}{2} + r_m$

$r_m\:=\:\frac{r_a}{2}$

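We can therefore say
$r_j\:=\:\sum_{i \to j}\frac{r_i}{d_i}$

where $d_i$ is the out-degree of node $i$. We add the additional constraint $r_y + r_a + r_m = 1$. The three flow equations are linearly dependent (any one of them follows from the other two), so this constraint is what pins down a unique solution. We can then use Gaussian elimination to solve for the three unknowns from the four available equations.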
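As a rough sketch of the elimination step, the system can be handed to `np.linalg.solve`, which relies on an LU factorization (Gaussian elimination with pivoting). The choice to drop the $r_a$ equation and the names `A_flow`, `b_flow` and `r_flow` are assumptions made for this illustration; any one flow equation could be dropped instead.

```python
import numpy as np

# Unknowns are ordered (r_y, r_a, r_m).
# Row 1: r_y = r_y/2 + r_a/2   ->  -r_y/2 + r_a/2        = 0
# Row 2: r_m = r_a/2           ->            r_a/2 - r_m  = 0
# Row 3: normalization         ->   r_y   +  r_a  + r_m   = 1
# The r_a equation is left out because it follows from rows 1 and 2.
A_flow = np.array([
    [-0.5, 0.5,  0.0],
    [ 0.0, 0.5, -1.0],
    [ 1.0, 1.0,  1.0],
])
b_flow = np.array([0.0, 0.0, 1.0])

r_flow = np.linalg.solve(A_flow, b_flow)
print("r_y, r_a, r_m =", r_flow)

# The omitted equation r_a = r_y/2 + r_m should hold as well.
print(np.isclose(r_flow[1], r_flow[0] / 2 + r_flow[2]))
```

The solution should come out at approximately $r_y = 0.4$, $r_a = 0.4$, $r_m = 0.2$.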

#### Matrix Formulation

The above formulation can be expressed in matrix form. If there is a link $i \to j$, we set $M_{ji} = \frac{1}{d_i}$; otherwise we set $M_{ji} = 0$. Provided every node has at least one outgoing link, this ensures that the values in each column sum to 1. In other words, $M$ is column stochastic.

Reading across row $j$ of $M$, we see all the incoming edges of node $j$ and their weights.
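The construction rule can be sketched in a few lines. The edge list below is an assumption: it is inferred from the flow equations above rather than read off the figure, and the names `nodes`, `edges` and `M_built` are chosen only for this example.

```python
import numpy as np

nodes = ["y", "a", "m"]
index = {name: k for k, name in enumerate(nodes)}

# Directed edges (source, destination), inferred from the flow equations.
edges = [("y", "y"), ("y", "a"), ("a", "y"), ("a", "m"), ("m", "a")]

# Out-degree d_i of every node.
out_degree = {name: 0 for name in nodes}
for src, _ in edges:
    out_degree[src] += 1

# M[j, i] = 1 / d_i for every edge i -> j, and 0 elsewhere.
M_built = np.zeros((len(nodes), len(nodes)))
for src, dst in edges:
    M_built[index[dst], index[src]] = 1.0 / out_degree[src]

print(M_built)
print("column sums:", M_built.sum(axis=0))
```

Each column sum should be 1, which is the column-stochastic property described above.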

Recall that for every node $j$ we have $r_j\:=\:\sum_{i \to j}\frac{r_i}{d_i}$. We can therefore express the flow formulation in matrix form as $r\:=\:M \cdot r$. For the graph above, the flow equations become

$\begin{bmatrix}
    r_y \\
    r_a \\
    r_m  
\end{bmatrix}$
=
$\begin{bmatrix}
    \frac{1}{2}       & \frac{1}{2} & 0 \\
    \frac{1}{2}      & 0 & 1 \\
    0       & \frac{1}{2} & 0  
\end{bmatrix} \cdot$
$\begin{bmatrix}
    r_y \\
    r_a \\
    r_m  
\end{bmatrix}$


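Let's compute the PageRank below by finding the eigenvalues and eigenvectors of $M$. Since $r = M \cdot r$, the rank vector is the eigenvector of $M$ with eigenvalue 1.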

In [17]:
from numpy import linalg as LA
import numpy as np
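# Column-stochastic matrix of the graph above
# (float32: with float64 the eigenvalue differs from 1 by rounding error)
M = np.matrix([[0.5, 0.5, 0], [0.5, 0, 1], [0, 0.5, 0]], dtype = np.float32)
ev, evec = LA.eig(M)
# keep the eigenvector whose eigenvalue is 1; np.isclose tolerates rounding, unlike ==
ranks = evec[:, np.isclose(ev, 1)]
# ranks need to sum up to 1, normalize them
ranks /= ranks.sum()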
print('Page ranks are ', ranks.flatten())

Page ranks are  [[ 0.40000001  0.40000001  0.2       ]]
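As a quick sanity check, the printed ranks can be substituted back into $r = M \cdot r$. This is a small sketch with the values rounded to $(0.4, 0.4, 0.2)$; `M_check` and `r_check` are names introduced here so the notebook's own variables stay untouched.

```python
import numpy as np

M_check = np.array([
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 1.0],
    [0.0, 0.5, 0.0],
])
r_check = np.array([0.4, 0.4, 0.2])

# r is a fixed point of M: multiplying by M leaves it unchanged.
is_fixed_point = np.allclose(M_check @ r_check, r_check)
# The ranks also satisfy the normalization constraint.
sums_to_one = np.isclose(r_check.sum(), 1.0)

print(is_fixed_point, sums_to_one)
```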



#### Power Iteration Method

The power iteration method solves $r = M \cdot r$ iteratively: starting from a uniform vector, we repeatedly apply $r^{(t + 1)} = M \cdot r^{(t)}$
until $\vert r^{(t)} - r^{(t + 1)} \vert < \epsilon$. Let us implement the power iteration method below.

In [31]:
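def pagerank_pa(M, epsilon = 0.00000001):
    # start from the uniform distribution over all pages (column vector)
    numpages = M.shape[1]
    r = np.matrix(np.ones(numpages)).reshape((-1, 1)) / numpages
    r_new = M * r  # M is an np.matrix, so * is a matrix product
    iters = 1
    # repeat until no entry changes by more than epsilon
    while np.max(np.abs(r_new - r)) > epsilon:
        r = r_new
        r_new = M * r
        iters += 1

    return r_new, iters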


r, iters = pagerank_pa(M)
print('Page rank by power iteration is ', r.flatten(), ', found in', iters, ' iterations')

Page rank by power iteration is  [[ 0.4  0.4  0.2]] , found in 81  iterations
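NumPy's documentation no longer recommends `np.matrix`, so here is a variant of the same idea on plain arrays, offered as a sketch rather than a drop-in replacement. The function name `pagerank_power`, the `max_iters` safety cap and the L1-norm stopping test are choices made for this example; the cell above uses the largest absolute change instead, so iteration counts may differ slightly.

```python
import numpy as np

def pagerank_power(M, epsilon=1e-8, max_iters=1000):
    """Power iteration on a column-stochastic matrix given as a 2-D array."""
    M = np.asarray(M, dtype=float)
    n = M.shape[1]
    r = np.full(n, 1.0 / n)          # uniform starting vector
    for iters in range(1, max_iters + 1):
        r_next = M @ r               # one step: r(t+1) = M r(t)
        # stop when the L1 distance between successive vectors is small
        if np.abs(r_next - r).sum() < epsilon:
            return r_next, iters
        r = r_next
    raise RuntimeError("power iteration did not converge within max_iters")

M_example = np.array([
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 1.0],
    [0.0, 0.5, 0.0],
])
r_example, n_iters = pagerank_power(M_example)
print("ranks:", r_example, "iterations:", n_iters)
```

The resulting vector should agree with the ranks above, approximately $(0.4, 0.4, 0.2)$.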


#### Random Walk Interpretation