In [50]:
import numpy as np

# https://glowingpython.blogspot.com/2011/05/four-ways-to-compute-google-pagerank.html
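def get_eigenvector(A):
    """Return the eigenvector of A for the eigenvalue closest to 1, scaled to sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(A)
    # np.linalg.eig does not order its eigenvalues, so pick the one nearest 1
    # instead of assuming it sits in column 0.
    k = np.argmin(np.abs(eigenvalues - 1))
    v = np.real(eigenvectors[:, k])

    # Normalize so the scores are nonnegative and sum to 1.
    return np.abs(v) / np.linalg.norm(v, 1)

def print_page_scores(vector):
    for i, page in enumerate(vector, start=1):
        print("Page {} has score {:.3f}".format(i, page))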

# Exercises

## Exercise 1

In [51]:
A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0]
])

vector = get_eigenvector(A)
print_page_scores(vector)

Page 1 has score 0.245
Page 2 has score 0.082
Page 3 has score 0.367
Page 4 has score 0.122
Page 5 has score 0.184


Page 3's score is now higher than page 1's.
Adding page 5 has two effects:
    1. The "vote" of page 3 is now split between page 1 and page 5, which reduces page 1's score.
    2. The "vote" of page 5 goes entirely to page 3, which boosts page 3's score.
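
To make the comparison concrete, the sketch below ranks the web before and after page 5 is added. The four-page matrix `A_before` is an assumption: it is the matrix above with page 5 deleted, so that page 3 links only to page 1. The helper `scores` is a hypothetical stand-in for `get_eigenvector`.

```python
import numpy as np

def scores(A):
    """Eigenvector of A for the eigenvalue closest to 1, scaled to sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(A)
    k = np.argmin(np.abs(eigenvalues - 1))
    v = np.abs(np.real(eigenvectors[:, k]))
    return v / v.sum()

# Assumed four-page web: the five-page matrix with page 5 removed,
# so page 3 gives its whole vote to page 1.
A_before = np.array([
    [0,   0,   1, 1/2],
    [1/3, 0,   0, 0],
    [1/3, 1/2, 0, 1/2],
    [1/3, 1/2, 0, 0],
])

# Five-page web from this exercise.
A_after = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0],
])

before = scores(A_before)
after = scores(A_after)

# Show how the scores of pages 1 and 3 move when page 5 is added.
for page in (1, 3):
    print("Page {}: {:.3f} -> {:.3f}".format(page, before[page - 1], after[page - 1]))
```

Under these assumptions page 1 should lead in the four-page web and page 3 should lead in the five-page web.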

## Exercise 2

In [52]:
A = np.array([
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1/2],
    [0, 0, 0, 0, 1, 0, 1/2],
    [0, 0, 0, 0, 0, 0, 0]
])

eigenvalues, _ = np.linalg.eig(A)

# Compare with a tolerance: exact equality is unreliable for floating-point eigenvalues
print(eigenvalues, np.count_nonzero(np.isclose(eigenvalues, 1)))

[ 1. -1.  1. -1.  1. -1.  0.] 3


The example web consists of 3 disjoint subwebs, similar to Figure 2.2.
The eigenvalue 1 appears three times, and each subweb contributes its own independent eigenvector for it.
This is consistent with the claim that the dimension of V_1(A) equals (or exceeds) the number of components in the web.
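
Counting repeated eigenvalues gives the algebraic multiplicity, which can be larger than the dimension of the eigenspace. A more direct check, sketched below, computes the dimension of V_1(A) as the nullity of A - I. The variable name `dim_V1` is illustrative only.

```python
import numpy as np

# Link matrix of the seven-page web from this exercise.
A = np.array([
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1/2],
    [0, 0, 0, 0, 1, 0, 1/2],
    [0, 0, 0, 0, 0, 0, 0],
])

n = A.shape[0]
# V_1(A) is the null space of (A - I), so its dimension is n - rank(A - I).
dim_V1 = n - np.linalg.matrix_rank(A - np.eye(n))
print(dim_V1)
```

For this web the rank of A - I works out to 4, so the reported dimension should match the three subwebs.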

## Exercise 3

In [53]:
A = np.array([
    [0, 1, 0, 0, 1/3],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1/3],
    [0, 0, 1, 0, 1/3],
    [0, 0, 0, 0, 0],
])

eigenvalues, _ = np.linalg.eig(A)

# Compare with a tolerance: exact equality is unreliable for floating-point eigenvalues
print(eigenvalues, np.count_nonzero(np.isclose(eigenvalues, 1)))

[ 1. -1.  1. -1.  0.] 2


## Exercise 4

## Exercise 5 - INCOMPLETE
Prove that in any web the importance score of a page with no backlinks is zero.

This may benefit from a code example, but the proof follows directly from the definition of the importance score.
A page's importance score is the sum, over every page that links to it, of the linking page's score divided by the number of outgoing links that page has.
Therefore, if a page has 0 backlinks, the sum has no terms and the page has a score of zero.
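
As a small illustration (not a proof), the sketch below reuses the link matrix from Exercise 3, where page 5 has no backlinks. Row 5 of A is all zeros, so the fifth entry of Ax is zero for any score vector x. The random vector `x` is just an arbitrary test input.

```python
import numpy as np

# Link matrix from Exercise 3; page 5 has no backlinks, so row 5 is all zeros.
A = np.array([
    [0, 1, 0, 0, 1/3],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1/3],
    [0, 0, 1, 0, 1/3],
    [0, 0, 0, 0, 0],
])

rng = np.random.default_rng(0)
x = rng.random(5)   # arbitrary nonnegative scores
x /= x.sum()        # scale so they sum to 1

# The score equation is x = A x, and row 5 of A contributes nothing.
print((A @ x)[4])
```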

## Exercise 6

## Exercise 7 - INCOMPLETE
Show that the product of two column-stochastic matrices is also column-stochastic.

Recall that a square matrix is called column-stochastic if all of its entries are nonnegative and the entries in each column sum to one.

Consider two matrices, A and B, which are both column-stochastic. Let both be of size n x n, where n is some positive integer.

Let A[i][j] denote the element of A in the i'th row and j'th column, with i and j running from 1 to n.

Then each element of the product is AB[i][j] = sum(A[i][k] * B[k][j] for k in range(1, n+1)). Every term is a product of nonnegative numbers, so every entry of AB is nonnegative.

The sum of a particular column, c, of AB is sum(AB[i][c] for i in range(1, n+1)).
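
One way the remaining step could go: swapping the order of summation turns the column sum into sum(B[k][c] * sum(A[i][k] for i) for k), and each inner sum is 1 because A is column-stochastic, leaving sum(B[k][c] for k) = 1. The sketch below checks this numerically on random matrices. The helper `random_column_stochastic` is hypothetical and exists only for this check.

```python
import numpy as np

def random_column_stochastic(n, rng):
    """Hypothetical helper: random nonnegative n x n matrix whose columns sum to 1."""
    M = rng.random((n, n))
    return M / M.sum(axis=0)   # divide each column by its own sum

rng = np.random.default_rng(0)
A = random_column_stochastic(4, rng)
B = random_column_stochastic(4, rng)
AB = A @ B

print(AB.sum(axis=0))     # column sums of the product
print((AB >= 0).all())    # all entries nonnegative
```

A numerical check like this only supports the claim for the sampled matrices; the summation argument is what covers the general case.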

## Exercise 8

## Exercise 9
A page with no backlinks will have a score of m/n. Let x be the score vector, scaled so that its entries sum to 1, so that x = Mx = (1-m)Ax + mSx. Every entry of the matrix S is 1/n (S must be column-stochastic and weight all pages equally), so every entry of Sx is 1/n. As in exercise 5, the row of A for a page with no backlinks is all zeros, so that page's entry of Ax is 0.

Therefore, the score of a particular page i with no backlinks is
    (1-m)(Ax)_i + m(Sx)_i = (1-m)*0 + m (1/n) = m/n

Note how the A term resolves to 0 because the page has no backlinks.
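
The result can be checked numerically. The sketch below reuses the link matrix from Exercise 3, where page 5 has no backlinks, and assumes m = 0.15; with n = 5 the predicted score is 0.03.

```python
import numpy as np

# Link matrix from Exercise 3; page 5 has no backlinks.
A = np.array([
    [0, 1, 0, 0, 1/3],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1/3],
    [0, 0, 1, 0, 1/3],
    [0, 0, 0, 0, 0],
])

n = A.shape[0]
m = 0.15                      # assumed value, as used in later exercises
S = np.ones((n, n)) / n       # every entry is 1/n
M = (1 - m) * A + m * S

# Eigenvector of M for the eigenvalue closest to 1, scaled to sum to 1.
eigenvalues, eigenvectors = np.linalg.eig(M)
k = np.argmin(np.abs(eigenvalues - 1))
x = np.abs(np.real(eigenvectors[:, k]))
x /= x.sum()

print(x[4], m / n)   # score of page 5 next to the predicted m/n
```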

## Exercise 10

## Exercise 11 - INCOMPLETE

In [54]:
# Same link matrix A as in Exercise 1; formula 3.2 combines it with S below
A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0]
])

# Let S denote an n x n matrix with all entries 1/n
n = A.shape[0]
S = np.ones((n, n)) / n

# Calculate the new ranking by finding the eigenvector of M
# use m = 0.15
m = 0.15
M = (1-m)*A + m * S

vector = get_eigenvector(M)
print_page_scores(vector)

Page 1 has score 0.237
Page 2 has score 0.097
Page 3 has score 0.349
Page 4 has score 0.138
Page 5 has score 0.178


## Exercise 12

In [55]:
# Page 6 links to each of the other five pages, so its column holds 1/5
A = np.array([
    [0,   0,   1/2, 1/2, 0, 1/5],
    [1/3, 0,   0,   0,   0, 1/5],
    [1/3, 1/2, 0,   1/2, 1, 1/5],
    [1/3, 1/2, 0,   0,   0, 1/5],
    [0,   0,   1/2, 0,   0, 1/5],
    [0,   0,   0,   0,   0, 0  ]
])

# Let S denote an n x n matrix with all entries 1/n
n = A.shape[0]
S = np.ones((n, n)) / n

# Calculate the new ranking by finding the eigenvector of M
# use m = 0.15
m = 0.15
M = (1-m)*A + m * S

vector = get_eigenvector(M)
print_page_scores(vector)

Page 1 has score 0.231
Page 2 has score 0.095
Page 3 has score 0.340
Page 4 has score 0.135
Page 5 has score 0.174
Page 6 has score 0.025


## Exercise 13

## Exercise 14

## Exercise 15

## Exercise 16

## Exercise 17 - INCOMPLETE
The value of m controls how egalitarian the ranking of the web pages is. It can range from 0 to 1, where m = 0 gives the original formula (M = A) and m = 1 gives the purely egalitarian approach (M = S), in which all web pages have the same score.

INCOMPLETE: (How does this affect computation time?)
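
One possible way to explore the open question is to count power-iteration steps for several values of m. The sketch below assumes the eigenvector is computed by power iteration (the cells above use `np.linalg.eig` instead), uses the five-page web from Exercise 1, and introduces a hypothetical helper `power_iteration` with an arbitrary tolerance.

```python
import numpy as np

def power_iteration(M, tol=1e-10, max_iter=10_000):
    """Hypothetical helper: iterate x <- M x until the L1 change drops below tol."""
    n = M.shape[0]
    x = np.ones(n) / n                     # start from the uniform vector
    for iteration in range(1, max_iter + 1):
        x_next = M @ x
        if np.abs(x_next - x).sum() < tol:
            return x_next, iteration
        x = x_next
    return x, max_iter

# Five-page web from Exercise 1.
A = np.array([
    [0,   0,   1/2, 1/2, 0],
    [1/3, 0,   0,   0,   0],
    [1/3, 1/2, 0,   1/2, 1],
    [1/3, 1/2, 0,   0,   0],
    [0,   0,   1/2, 0,   0],
])
n = A.shape[0]
S = np.ones((n, n)) / n

iterations = {}
for m in (0.0, 0.15, 0.5, 1.0):
    M = (1 - m) * A + m * S
    x, iterations[m] = power_iteration(M)
    print("m = {:.2f}: {} iterations, scores {}".format(m, iterations[m], np.round(x, 3)))
```

On this web, larger values of m should need no more iterations than smaller ones, and at m = 1 every page ends up with the same score of 1/n. Whether the same trend holds for the timing of a direct eigenvalue solver is a separate question that this sketch does not address.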