In [1]:
import numpy as np

# https://glowingpython.blogspot.com/2011/05/four-ways-to-compute-google-pagerank.html
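def get_eigenvector(A):
    """Return the eigenvector of A for eigenvalue 1, scaled so its entries sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(A)
    # np.linalg.eig does not order its output, so pick the column whose
    # eigenvalue is closest to 1 rather than assuming it is column 0.
    k = np.argmin(np.abs(eigenvalues - 1))
    v = np.real(eigenvectors[:, k])
    return np.abs(v) / np.linalg.norm(v, 1)

def print_page_scores(vector):
    # Pages are numbered from 1 in the exercises.
    for i, page in enumerate(vector, start=1):
        print("Page {} has score {:.3f}".format(i, page))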
        
def print_matrix(matrix):
    for row in matrix:
        for value in row:
            print("{:.2f}".format(value), end="\t")
        print("")

# Exercises

## Exercise 1

In [2]:
A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0]
])

vector = get_eigenvector(A)
print_page_scores(vector)

Page 1 has score 0.245
Page 2 has score 0.082
Page 3 has score 0.367
Page 4 has score 0.122
Page 5 has score 0.184


Adding page 5 raises page 3's score above that of page 1, for two reasons:
    1. The "vote" of page 3 is now split between page 1 and page 5, which reduces page 1's score.
    2. Page 5 gives its entire "vote" to page 3, which boosts page 3's score.
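
To see the effect numerically, the sketch below compares the five-page web with a four-page web obtained by deleting page 5. The four-page matrix `A4` is an assumption reconstructed from the matrix above (without page 5, page 3 links only to page 1), and `scores` is a hypothetical helper introduced for this illustration.

```python
import numpy as np

def scores(A):
    """Illustrative helper: eigenvector for the eigenvalue closest to 1, scaled to sum to 1."""
    w, v = np.linalg.eig(A)
    x = np.real(v[:, np.argmin(np.abs(w - 1))])
    return x / x.sum()

# Assumed four-page web: the five-page web with page 5 removed.
A4 = np.array([
    [0,   0,   1, 1/2],
    [1/3, 0,   0, 0],
    [1/3, 1/2, 0, 1/2],
    [1/3, 1/2, 0, 0],
])

# Five-page web from this exercise.
A5 = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0],
])

before = scores(A4)
after = scores(A5)
print("without page 5:", np.round(before, 3))
print("with page 5:   ", np.round(after, 3))
```

Under these assumptions page 1 leads in the four-page web, while page 3 leads once page 5 is added.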

## Exercise 2

In [3]:
A = np.array([
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1/2],
    [0, 0, 0, 0, 1, 0, 1/2],
    [0, 0, 0, 0, 0, 0, 0]
])

eigenvalues, _ = np.linalg.eig(A)

# Compare with a tolerance: exact equality on floating-point eigenvalues is fragile.
print(eigenvalues, np.count_nonzero(np.isclose(eigenvalues, 1)))

[ 1. -1.  1. -1.  1. -1.  0.] 3


The example web consists of 3 separate subwebs, similar to figure 2.2.
Three of its eigenvalues are equal to 1, which is consistent with the claim
that the dimension of $V_1(A)$ equals (or exceeds) the number of components in the web.
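
Counting eigenvalues equal to 1 gives an algebraic multiplicity, whereas the dimension of $V_1(A)$ is the nullity of $A - I$. A minimal sketch of a direct check, assuming the rank-nullity relation $\dim V_1(A) = n - \operatorname{rank}(A - I)$ and the name `dim_V1` chosen here for illustration:

```python
import numpy as np

# Seven-page web with three separate subwebs, as in this exercise.
A = np.array([
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1/2],
    [0, 0, 0, 0, 1, 0, 1/2],
    [0, 0, 0, 0, 0, 0, 0],
])

n = A.shape[0]
# dim V_1(A) = dim null(A - I) = n - rank(A - I)
dim_V1 = n - np.linalg.matrix_rank(A - np.eye(n))
print(dim_V1)
```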

## Exercise 3

In [4]:
A = np.array([
    [0, 1, 0, 0, 1/3],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1/3],
    [0, 0, 1, 0, 1/3],
    [0, 0, 0, 0, 0],
])

eigenvalues, _ = np.linalg.eig(A)

# Compare with a tolerance: exact equality on floating-point eigenvalues is fragile.
print(eigenvalues, np.count_nonzero(np.isclose(eigenvalues, 1)))

[ 1. -1.  1. -1.  0.] 2


## Exercise 4

In [18]:
A = np.array([
    [0, 0, 0, 1/2],
    [1/3, 0, 0, 0],
    [1/3, 1/2, 0, 1/2],
    [1/3, 1/2, 0, 0]
])

eigenvalues, eigenvectors = np.linalg.eig(A)

print(eigenvalues)

[ 0.        +0.j         -0.28067662+0.26395346j -0.28067662-0.26395346j
  0.56135324+0.j        ]


Thus, the largest positive (Perron) eigenvalue is 0.56135324.

In [33]:
v = eigenvectors[:4, 3]
v_scaled = np.abs(v) / np.linalg.norm(v, 1)
v_scaled

array([0.20664504, 0.12270648, 0.43864676, 0.23200172])

The resulting ranking seems reasonable. Page 3, linked to by all other pages, is the most important. Page 4, linked to by pages 1 and 2, scores second. Pages 1 and 2 are each linked to by only one page, receiving a 1/2 vote (from page 4) and a 1/3 vote (from page 1) respectively.

## Exercise 5 
Prove that in any web the importance score of a page with no backlinks is zero.

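Let $x_{i}$ denote the importance score of page $i$, so that the score vector $X$ satisfies $AX = \lambda X$ with $\lambda = 1$. If page $i$ has no backlinks, then the $i$-th row of the link matrix $A$ is a zero vector.
Thus, reading off the $i$-th component of $$AX = X$$ gives
$$x_{i} = \sum_{j} a_{ij}x_{j} = 0$$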
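
As a numerical illustration, the sketch below uses a small hypothetical web that does not come from the exercises: pages 1, 2, 3 form a cycle and page 4 links to page 1 but has no backlinks. The helper `scores` is likewise only an assumption for this example.

```python
import numpy as np

def scores(A):
    """Illustrative helper: eigenvector for the eigenvalue closest to 1, scaled to sum to 1."""
    w, v = np.linalg.eig(A)
    x = np.real(v[:, np.argmin(np.abs(w - 1))])
    return x / x.sum()

# Hypothetical web: 1 -> 2, 2 -> 3, 3 -> 1, 4 -> 1 (column = source, row = target).
A = np.array([
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],   # row 4 is zero: page 4 has no backlinks
])

x = scores(A)
print(np.round(x, 3))
```

The last entry of `x` is zero up to rounding, as the proof predicts.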

## Exercise 6

- 1.

To transpose page i and page j,
we need to swap both the outgoing links from pages i and j and the backlinks to pages i and j.

Compared with the original link matrix A,
the new link matrix $\overline{A}$ swaps row i with row j, and swaps column i with column j.
Let $P$ be the identity matrix with rows i and j swapped. Left-multiplying by $P$ swaps rows i and j and right-multiplying by $P$ swaps columns i and j, so $\overline{A} = PAP$.
Also, $P^{2} = I$. Thus, \begin{align}P\overline{A} &= AP \\
\overline{A}P &= PA\end{align}
Therefore, $P\overline{A}P = A$


- 2.

Suppose X is an eigenvector of A with eigenvalue $\lambda$, then $$AX = \lambda X $$
$$ PAX = \lambda PX $$
From 1, $$ \overline{A}P = PA $$ 
Thus, $$ \overline{A}PX = \lambda PX$$
Since $P$ is invertible, $PX \neq 0$. Therefore, $y=PX$ is an eigenvector for $\overline{A}$ with eigenvalue $\lambda$


- 3.

Suppose $X$ is the eigenvector of A with eigenvalue $\lambda=1$, i.e. the vector of importance scores for the original indexing:

$$ AX = X$$

From 2, $y = PX$ satisfies $$ \overline{A}y = y$$

so $y$ is the vector of importance scores for the re-indexed web. Since $P$ swaps rows i and j, $y = PX$ is $X$ with entries i and j swapped:

$$ y_{i} = x_{j}, \quad y_{j} = x_{i}, \quad y_{k} = x_{k} \text{ for } k \neq i, j$$

The page originally indexed i is now indexed j and has score $y_{j} = x_{i}$, and likewise for the page originally indexed j. Therefore every page keeps its importance score after the single transposition; only the labels move.

Any permutation of the page indices can be written as a sequence of transpositions, so applying the above argument to each transposition in turn shows that the importance scores are unchanged by any permutation.
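
The transposition argument can be checked numerically. The sketch below swaps pages 1 and 3 of the five-page web from Exercise 1; the helper `scores` and the choice of pages are assumptions made only for illustration.

```python
import numpy as np

def scores(A):
    """Illustrative helper: eigenvector for the eigenvalue closest to 1, scaled to sum to 1."""
    w, v = np.linalg.eig(A)
    x = np.real(v[:, np.argmin(np.abs(w - 1))])
    return x / x.sum()

A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0],
])
n = A.shape[0]

# Transposition matrix: the identity with rows i and j swapped (0-based indices).
i, j = 0, 2
P = np.eye(n)
P[[i, j]] = P[[j, i]]

A_bar = P @ A @ P          # swap rows i, j and columns i, j of A
x = scores(A)
y = scores(A_bar)

print(np.round(x, 3))
print(np.round(y, 3))
print(np.allclose(y, P @ x))   # scores of the relabeled web are the permuted scores
```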

## Exercise 7 

A is an $n \times n$ column-stochastic matrix, so its entries are non-negative and each column sums to 1:

$$\forall i,j, \; a_{ij} \in [0, 1]$$

$$\forall j, \; \sum_{i=1}^{n} a_{ij} = 1$$

For matrix S, 
$$\forall i,j, \; s_{ij}= \frac{1}{n}$$
$$\forall j, \; \sum_{i=1}^{n} s_{ij} = 1$$

Since $$m \in [0,1], 1-m \in [0,1]$$
then $$ 0 \leq (1-m)a_{ij} + ms_{ij} \leq (1-m) + m = 1$$

Also, for any column j
$$\sum_{i=1}^{n} \left((1-m)A+mS\right)_{ij} \\
= \sum_{i=1}^{n} \left((1-m)a_{ij}+ms_{ij}\right) \\
= (1-m)\sum_{i=1}^{n}a_{ij}+m\sum_{i=1}^{n}s_{ij}\\
=(1-m) + m = 1$$

Therefore, $M=(1-m)A+mS$ is also a column-stochastic matrix.
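
A quick numerical check of this result, using the five-page link matrix from Exercise 1 and a few sample values of m. The function name `is_column_stochastic` and the sampled values are assumptions for this sketch.

```python
import numpy as np

def is_column_stochastic(M, tol=1e-12):
    """True if all entries are non-negative and every column sums to 1."""
    return bool((M >= -tol).all() and np.allclose(M.sum(axis=0), 1.0))

A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0],
])
n = A.shape[0]
S = np.ones((n, n)) / n

# Check M = (1 - m) A + m S for several m in [0, 1].
checks = {m: is_column_stochastic((1 - m) * A + m * S) for m in (0.0, 0.15, 0.5, 1.0)}
print(checks)
```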

## Exercise 8

Suppose A and B are both $n \times n$ column-stochastic matrices, 

Thus,
$$\forall i,j, \; a_{ij} \in [0, 1], \quad \forall j, \; \sum_{i=1}^{n} a_{ij} = 1$$
$$\forall i,j, \; b_{ij} \in [0, 1], \quad \forall j, \; \sum_{i=1}^{n} b_{ij} = 1$$

For $$M = A \cdotp B$$

$$m_{ij} = \sum_{k = 1}^{n}a_{ik}b_{kj} = a_{i1}b_{1j}+a_{i2}b_{2j}+ ... + a_{in}b_{nj}$$

Thus, for each column j:

\begin{align}\sum_{i=1}^{n}m_{ij}
&= \sum_{i = 1}^{n}\sum_{k = 1}^{n}a_{ik}b_{kj} \\
&= (a_{11}b_{1j}+a_{12}b_{2j}+ ... + a_{1n}b_{nj}) + (a_{21}b_{1j}+a_{22}b_{2j}+ ... + a_{2n}b_{nj}) + ... + (a_{n1}b_{1j}+a_{n2}b_{2j}+ ... + a_{nn}b_{nj})\\
&= b_{1j}(a_{11}+a_{21}+...+a_{n1}) + b_{2j}(a_{12}+a_{22}+...+a_{n2})+ ... + b_{nj}(a_{1n}+a_{2n}+...+a_{nn}) \\
&= b_{1j} + b_{2j} + ... + b_{nj} = 1\end{align}

Each $m_{ij}$ is a sum of products of non-negative entries, so $m_{ij} \geq 0$ as well.

Therefore, $M=A \cdotp B$ is also a column-stochastic matrix
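
The same property can be spot-checked on random inputs. This is a minimal sketch, assuming a hypothetical helper `random_column_stochastic` that normalizes each column of a random non-negative matrix; it illustrates the result and does not replace the proof.

```python
import numpy as np

def random_column_stochastic(n, rng):
    """Random n x n matrix with non-negative entries and columns that sum to 1."""
    M = rng.random((n, n))
    return M / M.sum(axis=0)   # broadcasting divides column j by its own sum

rng = np.random.default_rng(0)
A = random_column_stochastic(4, rng)
B = random_column_stochastic(4, rng)

M = A @ B
print(M.sum(axis=0))     # column sums of the product
print((M >= 0).all())    # entries stay non-negative
```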

## Exercise 9
A page with no backlinks has a score of $\frac{m}{n}$. Let $x$ be the score vector of $M$, normalized so that $\sum_{j} x_{j} = 1$. Every entry of the matrix S is $\frac{1}{n}$ (S is column-stochastic and weights all pages equally), so S contributes $(Sx)_{i} = \frac{1}{n}\sum_{j} x_{j} = \frac{1}{n}$ to each page. As in exercise 5, the $i$-th row of the matrix A is zero for a page $i$ with no backlinks, so A contributes $(Ax)_{i} = 0$.

Therefore, the score of such a page is
$$x_{i} = (Mx)_{i} = (1-m)(Ax)_{i} + m(Sx)_{i} = (1-m) \cdot 0 + m\frac{1}{n} = \frac{m}{n}$$

Note how the term from A vanishes because the page has no backlinks.

## Exercise 10

Let A be a link matrix, so that the entry $a_{ij}$ is positive if page j links to page i and zero otherwise.

When A is multiplied by itself, the entry $(A^{2})_{ij}$ is positive exactly when there is a page k with a link from page j to page k and a link from page k to page i. This follows from the definition of matrix multiplication, $(A^{2})_{ij} = \sum_{k=1}^{n} a_{ik}a_{kj}$: every term is non-negative, so the sum is positive if and only if at least one product $a_{ik}a_{kj}$ is positive.

The same argument extends to p steps: $(A^{p})_{ij}$ is positive if and only if page i can be reached from page j in exactly p steps.
As an illustration, consider a linear web with nodes 0 through n and a link from node i to node i+1 for i in 0 through n-1.
By induction, assume that $(A^{k})_{i0}$ is positive exactly when i = k, i.e. node k is the only node reachable from node 0 in exactly k steps.
Then $(A^{k+1})_{i0} = \sum_{l} a_{il}(A^{k})_{l0} = a_{ik}(A^{k})_{k0}$, which is positive exactly when node k links to node i, i.e. when i = k+1.
[NEED TO PROVE for a general web: $(A^{k})_{ij}$ is 0 IFF page i is not reachable from page j in exactly k steps]
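
The linear web described above can be explored with matrix powers. The sketch below is only an illustration under assumptions: it picks n = 5, stores links with the column as source and the row as target (the convention used by the link matrices in this notebook), and uses unnormalized entries of 1 since each node has at most one outgoing link.

```python
import numpy as np

n = 5                      # nodes are numbered 0 .. n
size = n + 1
A = np.zeros((size, size))
for i in range(n):
    A[i + 1, i] = 1.0      # link from node i to node i + 1

# For each p, list the nodes reachable from node 0 in exactly p steps,
# i.e. the rows with a positive entry in column 0 of A^p.
reachable = {}
for p in range(1, n + 1):
    Ap = np.linalg.matrix_power(A, p)
    reachable[p] = np.nonzero(Ap[:, 0])[0].tolist()
    print(p, reachable[p])
```

In this chain the only node reachable from node 0 in exactly p steps is node p, matching the induction step.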



...

...

...

## Exercise 11 

In [45]:
# Figure 2.1 with addition of page 5 that links to page 3 and page 3 also links to page 5.
A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0]
])

# Let S denote an n x n matrix with all entries 1/n
S = np.ones((5,5)) / 5

# Calculate the new ranking by finding the eigenvector of M
# use m = 0.15
m = 0.15
M = (1-m)*A + m * S

vector = get_eigenvector(M)
print_page_scores(vector)

Page 1 has score 0.237
Page 2 has score 0.097
Page 3 has score 0.349
Page 4 has score 0.138
Page 5 has score 0.178


## Exercise 12

In [6]:
A = np.array([
    [0,   0,   1/2, 1/2, 0, 1/5],
    [1/3, 0,   0,   0,   0, 1/5],
    [1/3, 1/2, 0,   1/2, 1, 1/5],
    [1/3, 1/2, 0,   0,   0, 1/5],
    [0,   0,   1/2, 0,   0, 1/5],
    [0,   0,   0,   0,   0, 0  ]
])

# Let S denote an n x n matrix with all entries 1/n
S = np.ones((6,6)) / 6

# Calculate the new ranking by finding the eigenvector of M
# use m = 0.15
m = 0.15
M = (1-m)*A + m * S

print("The matrix M (weighted average of A and S) is:")
print_matrix(M)

The matrix M (weighted average of A and S) is:
0.02	0.02	0.45	0.45	0.02	0.20	
0.31	0.02	0.02	0.02	0.02	0.20	
0.31	0.45	0.02	0.45	0.88	0.20	
0.31	0.45	0.02	0.02	0.02	0.20	
0.02	0.02	0.45	0.02	0.02	0.20	
0.02	0.02	0.02	0.02	0.02	0.02	


In [78]:
vector = get_eigenvector(M)
print_page_scores(vector)

Page 1 has score 0.231
Page 2 has score 0.095
Page 3 has score 0.340
Page 4 has score 0.135
Page 5 has score 0.174
Page 6 has score 0.025


## Exercise 13

In [7]:
# example of 3 separate subwebs with seven pages
A = np.array([
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1/2],
    [0, 0, 0, 0, 1, 0, 1/2],
    [0, 0, 0, 0, 0, 0, 0]
])

# Let S denote an n x n matrix with all entries 1/n
S = np.ones((7,7)) / 7

# Calculate the new ranking by finding the eigenvector of M
# use m = 0.15
m = 0.15
M = (1-m)*A + m * S

print("The matrix M (weighted average of A and S) is:")
print_matrix(M)

The matrix M (weighted average of A and S) is:
0.02	0.87	0.02	0.02	0.02	0.02	0.02	
0.87	0.02	0.02	0.02	0.02	0.02	0.02	
0.02	0.02	0.02	0.87	0.02	0.02	0.02	
0.02	0.02	0.87	0.02	0.02	0.02	0.02	
0.02	0.02	0.02	0.02	0.02	0.87	0.45	
0.02	0.02	0.02	0.02	0.87	0.02	0.45	
0.02	0.02	0.02	0.02	0.02	0.02	0.02	


In [8]:
vector = get_eigenvector(M)
print_page_scores(vector)

Page 1 has score 0.143
Page 2 has score 0.143
Page 3 has score 0.143
Page 4 has score 0.143
Page 5 has score 0.204
Page 6 has score 0.204
Page 7 has score 0.021


## Exercise 14

## Exercise 15

## Exercise 16 

A = [[0,  1/2,  1/2],
     [0,  0,    1/2],
     [1,  1/2,  0  ]]

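$\lambda_{i} \in \{1, -1/2, -1/2\}$

The algebraic multiplicity of $\lambda = -1/2$ is 2.

The eigenspace of $\lambda = -1/2$ is the null space of $A + \frac{1}{2}I$, which has rank 2, so its dimension is 1. Since this dimension is not equal to the multiplicity of the eigenvalue, A is NOT diagonalizable.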
    

$A = $
\begin{bmatrix} 
        0 & 1/2 & 1/2 \\
        0 & 0 & 1/2 \\
        1 & 1/2 & 0
\end{bmatrix}

With $S$ the $3 \times 3$ matrix whose entries are all $1/3$, $M=(1-m)A+mS = $

\begin{bmatrix} 
        m/3 & 1/2-m/6 & 1/2-m/6 \\
        m/3 & m/3 & 1/2-m/6 \\
        1-2m/3 & 1/2-m/6 & m/3
\end{bmatrix}


The eigenvalues $\lambda$ of M are the solutions of

$\det(M - \lambda I) = 0$, which gives $\lambda_{1} = \lambda_{2} = \frac{1}{2}(m-1), \ \lambda_{3} = 1$

where $\lambda = \frac{1}{2}(m-1)$ has algebraic multiplicity 2.

The geometric multiplicity is given by the nullity of $R = M - \lambda I = $
\begin{bmatrix} 
        1/2-m/6 & 1/2-m/6 & 1/2-m/6 \\
        m/3 & 1/2-m/6 & 1/2-m/6 \\
        1-2m/3 & 1/2-m/6 & 1/2-m/6
\end{bmatrix}

Subtracting row 2 from rows 1 and 3, eliminating the resulting third row, and reordering gives the row-equivalent matrix

\begin{bmatrix} 
        m/3 & 1/2-m/6 & 1/2-m/6 \\
        1/2-m/2 & 0 & 0 \\
        0 & 0 & 0
\end{bmatrix}

which has rank 2 for $m \neq 1$, so the nullity of R is 1.

Thus, the dimension of the eigenspace corresponding to $\lambda=\frac{1}{2}(m-1)$ is 1, smaller than the algebraic multiplicity 2. This shows that the matrix M is not diagonalizable for $0 \leq m < 1$.
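
The multiplicities can also be checked numerically for one sample value. The sketch below assumes m = 0.15 (the value used in the earlier exercises) and explicit tolerances chosen for illustration, since the eigenvalues of a non-diagonalizable matrix are only computed approximately.

```python
import numpy as np

A = np.array([
    [0, 1/2, 1/2],
    [0, 0,   1/2],
    [1, 1/2, 0],
])
n = A.shape[0]
S = np.ones((n, n)) / n

m = 0.15                       # assumed sample value
M = (1 - m) * A + m * S
lam = (m - 1) / 2              # the repeated eigenvalue derived above

# Algebraic multiplicity: how many computed eigenvalues are (approximately) lam.
eigenvalues = np.linalg.eigvals(M)
algebraic = int(np.sum(np.isclose(eigenvalues, lam, atol=1e-6)))

# Geometric multiplicity: nullity of M - lam * I.
geometric = n - np.linalg.matrix_rank(M - lam * np.eye(n), tol=1e-9)

print(algebraic, geometric)
```

A geometric multiplicity smaller than the algebraic one is what rules out diagonalizability.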

## Exercise 17 - INCOMPLETE
The value of m controls how egalitarian the ranking of the web pages is. It can range from 0 to 1: with m = 0, M equals the original link matrix A, while with m = 1, M equals S, the purely egalitarian case in which all web pages have the same score $\frac{1}{n}$.
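
The two extremes can be illustrated on the five-page web from Exercise 1. This is a minimal sketch: the helper `scores` and the sampled values of m are assumptions, and it only shows how the scores move with m, not how computation time is affected.

```python
import numpy as np

def scores(M):
    """Illustrative helper: eigenvector for the eigenvalue closest to 1, scaled to sum to 1."""
    w, v = np.linalg.eig(M)
    x = np.real(v[:, np.argmin(np.abs(w - 1))])
    return x / x.sum()

A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0],
])
n = A.shape[0]
S = np.ones((n, n)) / n

# Scores for several values of m between the two extremes.
results = {}
for m in (0.0, 0.15, 0.5, 1.0):
    results[m] = scores((1 - m) * A + m * S)
    print(m, np.round(results[m], 3))
```

As m grows the scores flatten toward $\frac{1}{n}$, and at m = 1 they are all equal.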

INCOMPLETE: (How does this affect computation time)