# Networks: structure, evolution & processes
**Internet Analytics - Lab 2**

---

**Group:** *Your group letter.*

**Names:**

* *Name 1*
* *Name 2*
* *Name 3*

---

#### Instructions

*This is a template for part 4 of the lab. Clearly write your answers, comments and interpretations in Markdown cells. Don't forget that you can add $\LaTeX$ equations in these cells. Feel free to add or remove any cell.*

*Please comment your code properly. Code readability will be considered for grading. To avoid long code cells in the notebook, you can also put long Python functions and classes in a separate module. Don't forget to hand in your module if that is the case. In several exercises, you are required to come up with your own method to solve various problems. Be creative, and clearly motivate and explain your methods. Creativity and clarity will be considered for grading.*

In [1]:
import networkx as nx
import numpy as np
from collections import Counter
from random import choice  # used by the random surfer functions below

In [2]:
def directed_graph_from_tsv(pathname):
    # Each line has the form "u<TAB>v1 v2 ...": node u followed by its out-links.
    G = nx.DiGraph()
    with open(pathname, 'r') as file:
        for line in file:
            content = line.split('\t')
            u = int(content[0])
            G.add_node(u)  # keep u even if it has no out-links
            for v in map(int, content[1].split()):
                G.add_edge(u, v)
    return G

In [3]:
# Question for these two functions: is a maximum number of hops the right stopping criterion?
def random_surfer(G, max_hops):
    counter = Counter()
    u = choice(list(G.nodes()))  # start at a node chosen uniformly at random
    for i in range(max_hops):
        links = list(G[u].keys())
        if len(links) == 0:  # Dangling node found!
            break
        v = choice(links)
        counter[v] += 1
        u = v
    # Normalise the visit counts into frequencies
    total = sum(counter.values())
    return {page: count / total for page, count in counter.items()}

#### Exercise 2.13

In [4]:
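def random_surfer_improved(G, max_hops, damping_factor=0.15):
    # Note: damping_factor is used here as the probability of restarting at a
    # random node, i.e. it plays the role of 1 - theta in the Google matrix.
    counter = Counter()
    nodes = list(G.nodes())
    u = choice(nodes)
    for i in range(max_hops):
        links = list(G[u].keys())
        # Restart at a random node on a restart draw or at a dangling node
        if np.random.binomial(1, damping_factor) or len(links) == 0:
            v = choice(nodes)
        else:  # otherwise follow an out-link chosen at random
            v = choice(links)
        counter[v] += 1
        u = v
    return counter  # raw visit counts (not normalised)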
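
The question raised above about the stopping criterion can be explored with a sketch like the following. It is only one possible approach, not the expected answer: instead of a fixed number of hops, the surfer runs in batches and stops once the estimated visit frequencies change by less than a tolerance between two consecutive checkpoints. The names `surf_until_stable`, `batch`, `tol`, `max_batches` and `seed` are hypothetical. The graph is a plain dict of out-links, copied from the adjacency list printed for `components.graph` later in this notebook, rather than a `networkx` object. The restart probability of 0.15 mirrors the default used above.

```python
import random
from collections import Counter

# Out-links of the eight-node example graph (node -> list of successors).
GRAPH = {0: [1], 1: [2, 3], 2: [0], 3: [2], 4: [5, 6], 5: [6], 6: [7], 7: [4]}

def surf_until_stable(graph, restart_prob=0.15, batch=10000, tol=0.01,
                      max_batches=1000, seed=0):
    """Random surfer with restarts that stops when the estimate stabilises.

    Hypothetical helper: after every `batch` hops, compare the current
    visit frequencies with those of the previous checkpoint and stop when
    their L1 distance drops below `tol` (or after `max_batches` batches).
    """
    rng = random.Random(seed)
    nodes = list(graph)
    counter = Counter()
    previous = {}
    u = rng.choice(nodes)
    for _ in range(max_batches):
        for _ in range(batch):
            links = graph[u]
            # Restart at a random node with probability restart_prob,
            # or whenever u is a dangling node.
            if not links or rng.random() < restart_prob:
                u = rng.choice(nodes)
            else:
                u = rng.choice(links)
            counter[u] += 1
        total = sum(counter.values())
        current = {n: counter[n] / total for n in nodes}
        change = sum(abs(current[n] - previous.get(n, 0.0)) for n in nodes)
        if change < tol:
            break
        previous = current
    return current

scores = surf_until_stable(GRAPH)
```

The cap `max_batches` is kept as a safety net, so the maximum number of hops becomes a fallback rather than the main criterion. Whether a tolerance on the change between checkpoints is a good proxy for convergence is itself a design choice to justify.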

---

### 2.4.2 Power Iteration Method

#### Exercise 2.14: Power Iteration method

In [7]:
g = directed_graph_from_tsv('../data/components.graph')

In [8]:
g.nodes()

[0, 1, 2, 3, 4, 5, 6, 7]

In [9]:
g.adjacency_list()

[[1], [2, 3], [0], [2], [5, 6], [6], [7], [4]]

In [11]:
nx.degree_mixing_matrix(g, normalized=False)

array([[ 0.,  0.,  0.],
       [ 0.,  4.,  2.],
       [ 0.,  2.,  2.]])

In [13]:
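def transition_matrix(graph):
    # H[u, v] = 1 / out_degree(u) if u links to v, else 0.
    # Assumes nodes are labelled 0..N-1 so they can be used as matrix indices.
    N = len(graph)
    H = np.zeros((N, N))
    for u in graph:
        out_degree_u = graph.out_degree(u)
        for v in graph.successors(u):
            H[u, v] = 1 / out_degree_u
    return H

def GoogleMatrix(graph, theta):
    N = len(graph)
    H = transition_matrix(graph)
    # w[i] = 1 if node i is dangling (all-zero row in H), 0 otherwise
    w = (H.sum(axis=1) == 0).astype(float).reshape(N, 1)
    onesT = np.ones((1, N))
    Hhat = H + (w @ onesT) / N
    return theta * Hhat + (1 - theta) * (onesT.T @ onesT) / N

def PageRank(google_matrix, pi_0, tol=1e-10, max_iter=1000):
    # Power iteration: repeat pi <- pi G until the L1 change drops below tol
    pi_k = np.asarray(pi_0, dtype=float)
    for _ in range(max_iter):
        pi_next = pi_k @ google_matrix
        if np.abs(pi_next - pi_k).sum() < tol:
            return pi_next
        pi_k = pi_next
    return pi_k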
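
As a sanity check of the power iteration in this exercise, the following self-contained sketch builds the Google matrix $G = \theta \hat{H} + (1 - \theta) \frac{1}{N} \mathbf{1}\mathbf{1}^T$, with $\hat{H} = H + \frac{1}{N} w \mathbf{1}^T$, for the eight-node graph whose adjacency list is printed above, and iterates $\pi_{k+1} = \pi_k G$ until the L1 change falls below a tolerance. The helper names `google_matrix_from_adjacency` and `power_iteration`, the value $\theta = 0.85$, and the tolerance are assumptions chosen for illustration. The sketch works directly on the adjacency list, so it does not need `networkx`.

```python
import numpy as np

# Adjacency list of the eight-node example: ADJ[u] lists the successors of u.
ADJ = [[1], [2, 3], [0], [2], [5, 6], [6], [7], [4]]

def google_matrix_from_adjacency(adj, theta):
    """Build G = theta * H_hat + (1 - theta) * (1/N) * ones (hypothetical helper)."""
    n = len(adj)
    H = np.zeros((n, n))
    for u, successors in enumerate(adj):
        for v in successors:
            H[u, v] = 1 / len(successors)
    # Dangling nodes (all-zero rows of H) jump uniformly at random.
    dangling = (H.sum(axis=1) == 0).astype(float).reshape(n, 1)
    H_hat = H + dangling @ np.ones((1, n)) / n
    return theta * H_hat + (1 - theta) * np.ones((n, n)) / n

def power_iteration(G, tol=1e-12, max_iter=10000):
    """Iterate pi <- pi G from the uniform distribution until the L1 change < tol."""
    n = G.shape[0]
    pi = np.full(n, 1 / n)
    for _ in range(max_iter):
        pi_next = pi @ G
        if np.abs(pi_next - pi).sum() < tol:
            return pi_next
        pi = pi_next
    return pi

G = google_matrix_from_adjacency(ADJ, theta=0.85)
pi = power_iteration(G)
```

Each row of `G` sums to 1, and the returned vector should satisfy $\pi \approx \pi G$ and sum to 1. A further check, if desired, is to compare `pi` with `nx.pagerank(g, alpha=0.85)` on the same graph.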

---

### 2.4.3 Gaming the system *(Bonus)*

#### Exercise 2.15 *(Bonus)*