# Week 1 - Quiz

## Question 1

Consider three Web pages with the following links:

<img src="https://d396qusza40orc.cloudfront.net/mmds/images/otc_pagerank2.gif" style="float:left">

<br style="clear:both" />

Suppose we compute PageRank with β = 0.7, and we add the constraint that the PageRanks of the three pages must sum to 3, since otherwise any multiple of a solution would also be a solution. Compute the PageRanks a, b, and c of the three pages A, B, and C, respectively. Then identify the true statement from the list below.

<ol>
<li> a + c = 2.595
<li> a + c = 2.035
<li> b + c = 2.5
<li> b + c = 2.735
</ol>



```python
import numpy as np

# Probability of following a link (the damping factor β)
b = 0.7

# Initial rank vector
r = np.matrix([1/3, 1/3, 1/3]).T

# Column-stochastic link matrix: column j holds the out-link probabilities of page j
M = np.matrix([[0, 0, 0], [1/2, 0, 0], [1/2, 1, 1]])

# Teleport vector (uniform over the three pages)
N = np.matrix([1/3, 1/3, 1/3]).T

# Convergence threshold
e = 1 / 10000

def page_rank(r):
    # Iterate r <- b*M*r + (1-b)*N until the largest change falls below e
    while True:
        old = r
        r = b * M * r + (1 - b) * N
        diff = np.absolute(old - r).max()
        if diff < e:
            return r

r = page_rank(r)

# Rescale so the PageRanks sum to 3
r = r * 3

# Note: this rebinds b, so b must be reset before page_rank is called again
a, b, c = r.flat[0], r.flat[1], r.flat[2]

print("1:", round(a + c, 3) == 2.595)
print("2:", round(a + c, 3) == 2.035)
print("3:", round(b + c, 3) == 2.5)
print("4:", round(b + c, 3) == 2.735)
```

```
1: True
2: False
3: False
4: False
```
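
As a cross-check on the iteration above, the fixed point can also be obtained by solving the linear system (I − βM) r = (1 − β)/3 · 1 directly. The sketch below assumes the same link matrix and β = 0.7; the names `beta`, `teleport`, `ranks`, and `b_rank` are introduced here for illustration and are not part of the original solution.

```python
import numpy as np

beta = 0.7  # assumed damping factor, as in the question

# Same column-stochastic link matrix as above (columns = source pages A, B, C)
M = np.array([[0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 1.0]])
n = M.shape[0]

# Uniform teleport term (1 - beta) / n for every page
teleport = np.full(n, (1 - beta) / n)

# Solve (I - beta * M) r = teleport, then rescale so the ranks sum to 3
ranks = np.linalg.solve(np.eye(n) - beta * M, teleport)
ranks = 3 * ranks / ranks.sum()

a, b_rank, c = ranks
print("a + c =", round(a + c, 3))
print("b + c =", round(b_rank + c, 3))
```

Under these assumptions the solve should give a = 0.3, b = 0.405, and c = 2.295, which agrees with statement 1 (a + c = 2.595).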


## Question 2

Consider three Web pages with the following links:

<img src="https://d396qusza40orc.cloudfront.net/mmds/images/otc_pagerank3.gif" style="float:left">

<br style="clear:both" />

Suppose we compute PageRank with β = 0.85. Write the equations for the PageRanks a, b, and c of the three pages A, B, and C, respectively. Then identify, in the list below, one of these equations.

<ol>
<li>a = c + 0.15b
<li>0.95b = 0.475a + 0.05c
<li>85b = 0.575a + 0.15c
<li>0.85a = c + 0.15b
</ol>

```python
# Probability of following a link (the damping factor β)
b = 0.85

# Column-stochastic link matrix for the new graph
M = np.matrix([[0, 0, 1], [1/2, 0, 0], [1/2, 1, 0]])

# Re-initialize the rank vector
r = np.matrix([1/3, 1/3, 1/3]).T

r = page_rank(r)

a, b, c = r.flat[0], r.flat[1], r.flat[2]

# Each check compares the two sides of the corresponding candidate equation
print("1:", round(a, 3) == round(c + 0.15 * b, 3))
print("2:", round(0.95 * b, 3) == round(0.475 * a + 0.05 * c, 3))
print("3:", round(85 * b, 3) == round(0.575 * a + 0.15 * c, 3))
print("4:", round(0.85 * a, 3) == round(c + 0.15 * b, 3))
```

```
1: False
2: True
3: False
4: False
```
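
The numerical check can be backed by writing the equations out. The derivation below is a sketch that assumes the teleport term is written as (1 − β)/3 · (a + b + c) = 0.05(a + b + c), which is the form the answer options appear to use, and that the links are A → B, A → C, B → C, C → A, as encoded in `M`.

$$
\begin{aligned}
a &= 0.85\,c + 0.05\,(a + b + c) \\
b &= 0.85 \cdot \tfrac{a}{2} + 0.05\,(a + b + c) \\
c &= 0.85\left(\tfrac{a}{2} + b\right) + 0.05\,(a + b + c)
\end{aligned}
$$

Moving the 0.05b term in the second equation to the left-hand side gives

$$
0.95\,b = 0.475\,a + 0.05\,c ,
$$

which is option 2.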


## Question 3

Consider three Web pages with the following links:

<img src="https://d396qusza40orc.cloudfront.net/mmds/images/otc_pagerank3.gif" style="float:left">

<br style="clear:both" />

Assuming no "taxation," compute the PageRanks a, b, and c of the three pages A, B, and C, using iteration, starting with the "0th" iteration where all three pages have rank a = b = c = 1. Compute as far as the 5th iteration, and also determine what the PageRanks are in the limit. Then, identify the true statement from the list below.

<ol>
<li>After iteration 5, b = 9/16
<li>After iteration 4, c = 11/8
<li>After iteration 5, b = 5/8
<li>In the limit, a = 1
</ol>

```python
# Uses M from Question 2 and the convergence threshold e from Question 1

# Re-initialize the rank vector: 0th iteration, a = b = c = 1
r = np.matrix([1, 1, 1]).T

def page_rank_modified(r):
    # No taxation: iterate r <- M*r, keeping the 4th and 5th iterates
    for i in range(1, 10000000):
        old = r
        r = M * r
        if i == 4:
            r_4 = r
        if i == 5:
            r_5 = r
        diff = np.absolute(old - r).max()
        if diff < e:
            return r, r_4, r_5

r, r_4, r_5 = page_rank_modified(r)

a, b, c = r.flat[0], r.flat[1], r.flat[2]

# np.isclose avoids comparing a 3-decimal rounding against values such as 9/16 = 0.5625
print("1:", np.isclose(r_5.flat[1], 9/16))
print("2:", np.isclose(r_4.flat[2], 11/8))
print("3:", np.isclose(r_5.flat[1], 5/8))
print("4:", np.isclose(a, 1, atol=1e-3))
```

```
1: False
2: False
3: True
4: False
```
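
Because the options are stated as fractions, it can help to repeat the first five iterations in exact rational arithmetic. The following is a minimal sketch using Python's `fractions` module; it assumes the same link matrix as above, and the names `step`, `history`, and `limit` are illustrative only.

```python
from fractions import Fraction

# Column-stochastic link matrix for pages A, B, C (same graph as M above)
M = [[Fraction(0), Fraction(0), Fraction(1)],
     [Fraction(1, 2), Fraction(0), Fraction(0)],
     [Fraction(1, 2), Fraction(1), Fraction(0)]]

def step(r):
    """One untaxed PageRank update: r <- M r."""
    return [sum(M[i][j] * r[j] for j in range(3)) for i in range(3)]

r = [Fraction(1)] * 3   # 0th iteration: a = b = c = 1
history = [r]
for _ in range(5):
    r = step(r)
    history.append(r)

for k, (a, b, c) in enumerate(history):
    print(f"iteration {k}: a = {a}, b = {b}, c = {c}")

# Candidate limit from a = c, b = a/2, a + b + c = 3; it should be a fixed point
limit = [Fraction(6, 5), Fraction(3, 5), Fraction(6, 5)]
print("limit is a fixed point:", step(limit) == limit)
```

With this graph, iteration 4 gives (5/4, 1/2, 5/4) and iteration 5 gives (5/4, 5/8, 9/8), so b = 5/8 after iteration 5, while the limit has a = 6/5 rather than 1.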


## Question 4

Suppose our input data to a map-reduce operation consists of integer values (the keys are not important). The map function takes an integer i and produces the list of pairs (p,i) such that p is a prime divisor of i. For example, $\mathrm{map}(12) = [(2,12), (3,12)]$.
The reduce function is addition. That is, $\mathrm{reduce}(p, [i_1, i_2, \ldots, i_k])$ is $(p, i_1 + i_2 + \cdots + i_k)$.

Compute the output, if the input is the set of integers 15, 21, 24, 30, 49. Then, identify, in the list below, one of the pairs in the output.

<ol>
<li>(7,119)
<li>(7,70)
<li>(2,102)
<li>(5,49)
</ol>
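
The map and reduce functions described above can be simulated in a few lines of plain Python. This is a sketch of the computation on a single machine, not an actual distributed map-reduce job; the helper names `prime_divisors`, `map_fn`, `reduce_fn`, and `map_reduce` are assumptions made for this example.

```python
from collections import defaultdict

def prime_divisors(n):
    """Return the distinct prime divisors of n in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def map_fn(i):
    # Emit one (p, i) pair for every prime divisor p of i
    return [(p, i) for p in prime_divisors(i)]

def reduce_fn(p, values):
    # The reduce function is addition over all values sharing key p
    return (p, sum(values))

def map_reduce(inputs):
    # Group mapped values by key, then reduce each group
    groups = defaultdict(list)
    for i in inputs:
        for key, value in map_fn(i):
            groups[key].append(value)
    return [reduce_fn(k, groups[k]) for k in sorted(groups)]

output = map_reduce([15, 21, 24, 30, 49])
print(output)  # [(2, 54), (3, 90), (5, 45), (7, 70)]
```

Of the listed pairs, only (7,70) appears in this output: 7 divides 21 and 49, and 21 + 49 = 70.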