# Networks: structure, evolution & processes
**Internet Analytics - Lab 2**

---

**Group:** *K*

**Names:**

* *Robin Lang*
* *Kim Lan Phan Hoang*
* *Julien Harbulot*

---

#### Instructions

*This is a template for part 4 of the lab. Clearly write your answers, comments and interpretations in Markdown cells. Don't forget that you can add $\LaTeX$ equations in these cells. Feel free to add or remove any cell.*

*Please comment your code properly. Code readability will be considered for grading. To avoid long code cells in the notebook, you can also put long Python functions and classes in a separate module. Don’t forget to hand in your module if that is the case. In multiple exercises, you are required to come up with your own method to solve various problems. Be creative and clearly motivate and explain your methods. Creativity and clarity will be considered for grading.*

In [1]:
import epidemics_helper
import json

import random
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

import networkx as nx
from networkx.readwrite import json_graph

---

## 2.4 PageRank

### 2.4.1 Random Surfer Model

#### Exercise 2.12

In [2]:
# Load the two example graphs as directed graphs.
# read_adjlist accepts a path, so no file handle is left open.
absorbing_graph = nx.read_adjlist("../data/absorbing.graph", create_using=nx.DiGraph())
components_graph = nx.read_adjlist("../data/components.graph", create_using=nx.DiGraph())

In [3]:
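def random_surfer(G, N=5):
    """Propagate the surfer's probability distribution for N steps.

    The start node is drawn uniformly at random. Probability mass that
    reaches a dangling node (no out-links) stays on that node.
    Node labels must be convertible to the integers 0..n-1.
    """
    # list(...) works whether networkx returns a list or a view/iterator.
    nodes = list(G.nodes())
    num_nodes = len(nodes)
    prob = np.zeros(num_nodes)
    first_node = random.choice(nodes)

    succ = list(G.successors(first_node))

    # A dangling start node traps the surfer immediately.
    if len(succ) == 0:
        prob[int(first_node)] = 1
        return prob

    # Step 1: spread the mass uniformly over the successors of the start node.
    for n in succ:
        prob[int(n)] = 1 / len(succ)

    # Steps 2..N: push the mass of every reached node along its out-links.
    for i in range(N - 1):
        prob_next = np.zeros(num_nodes)
        for j in nodes:
            if prob[int(j)] > 0:
                succ = list(G.successors(j))
                if len(succ) > 0:
                    for n in succ:
                        prob_next[int(n)] += prob[int(j)] / len(succ)
                else:
                    # Dangling node: the mass stays where it is.
                    prob_next[int(j)] += prob[int(j)]
        prob = prob_next
    return prob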

In [4]:
random_surfer(absorbing_graph)

array([ 0.        ,  0.83333333,  0.        ,  0.11111111,  0.05555556])

In [5]:
random_surfer(components_graph)

array([ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0.5,  0.5])
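The function above propagates a probability distribution rather than simulating one surfer. For comparison, here is a minimal sketch of a Monte Carlo variant that follows a single walk and counts visits. It uses only the standard library; the name `simulate_surfer` and the dictionary `toy_adj` are hypothetical and are not part of the lab data.

```python
import random
from collections import Counter


def simulate_surfer(adj, steps=1000, seed=0):
    """Simulate one surfer for `steps` steps and return visit frequencies.

    `adj` maps every node to the list of its successors (assumed format).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    current = rng.choice(nodes)
    visits = Counter()
    for _ in range(steps):
        visits[current] += 1
        successors = adj[current]
        if successors:
            # Follow one out-link chosen uniformly at random.
            current = rng.choice(successors)
        # Otherwise the node is dangling and the surfer stays stuck on it.
    return {n: visits[n] / steps for n in nodes}


# Hypothetical toy graph: node 2 has no out-links and absorbs the surfer.
toy_adj = {0: [1, 2], 1: [0, 2], 2: []}
freq = simulate_surfer(toy_adj, steps=10_000)
print(freq)
```

On a graph with an absorbing node, almost all visits end up on that node, which matches the behaviour seen on `absorbing_graph`.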

#### Exercise 2.13

In [6]:
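def method1(G, N=5):
    """Same propagation as random_surfer, except at dangling nodes.

    Mass sitting on a dangling node is moved to a single node drawn
    uniformly at random, so the result depends on the random draws.
    """
    nodes = list(G.nodes())
    num_nodes = len(nodes)
    prob = np.zeros(num_nodes)
    first_node = random.choice(nodes)

    succ = list(G.successors(first_node))

    # A dangling start node still ends the run immediately.
    if len(succ) == 0:
        prob[int(first_node)] = 1
        return prob

    # Step 1: spread the mass uniformly over the successors of the start node.
    for n in succ:
        prob[int(n)] = 1 / len(succ)

    for i in range(N - 1):
        prob_next = np.zeros(num_nodes)
        for j in nodes:
            if prob[int(j)] > 0:
                succ = list(G.successors(j))
                if len(succ) > 0:
                    for n in succ:
                        prob_next[int(n)] += prob[int(j)] / len(succ)
                else:
                    # Dangling node: restart from one random node.
                    prob_next[int(random.choice(nodes))] += prob[int(j)]
        prob = prob_next
    return prob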

In [7]:
method1(absorbing_graph, 100)

array([ 0.06360044,  0.46061609,  0.06360044,  0.0096789 ,  0.40250413])

In [8]:
method1(components_graph, 100)

array([ 0.28571351,  0.28571278,  0.28571572,  0.14285799,  0.        ,
        0.        ,  0.        ,  0.        ])

In [9]:
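def method2(G, N=5, damp=0.15):
    """Propagation with a damping factor (restart not implemented yet).

    With probability `damp` the surfer is meant to restart from a random
    node. The branch below only prints a marker, so the propagation itself
    is identical to random_surfer.
    """
    nodes = list(G.nodes())
    num_nodes = len(nodes)
    prob = np.zeros(num_nodes)
    first_node = random.choice(nodes)

    succ = list(G.successors(first_node))

    # A dangling start node ends the run immediately.
    if len(succ) == 0:
        prob[int(first_node)] = 1
        return prob

    for n in succ:
        prob[int(n)] = 1 / len(succ)

    for i in range(N - 1):
        # TODO: perform the restart here instead of only logging it.
        if np.random.choice([True, False], p=[damp, 1 - damp]):
            print("1-2-SWITCH")
        prob_next = np.zeros(num_nodes)
        for j in nodes:
            if prob[int(j)] > 0:
                succ = list(G.successors(j))
                if len(succ) > 0:
                    for n in succ:
                        prob_next[int(n)] += prob[int(j)] / len(succ)
                else:
                    prob_next[int(j)] += prob[int(j)]
        prob = prob_next
    return prob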

In [10]:
method2(absorbing_graph, 100)

array([ 0.,  1.,  0.,  0.,  0.])
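One possible way to handle both dangling nodes and disconnected components is to let a simulated surfer restart from a random node, either with probability `damp` or whenever it has no out-link to follow. The sketch below is an illustration under that assumption, not the method used in the cells above. The name `simulate_damped_surfer` and the graph `toy_adj` are hypothetical.

```python
import random
from collections import Counter


def simulate_damped_surfer(adj, steps=10_000, damp=0.15, seed=0):
    """Simulate a surfer with random restarts and return visit frequencies.

    `adj` maps every node to the list of its successors (assumed format).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    current = rng.choice(nodes)
    visits = Counter()
    for _ in range(steps):
        visits[current] += 1
        successors = adj[current]
        if not successors or rng.random() < damp:
            # Dangling node or damping event: restart from a random node.
            current = rng.choice(nodes)
        else:
            # Otherwise follow one out-link chosen uniformly at random.
            current = rng.choice(successors)
    return {n: visits[n] / steps for n in nodes}


# Hypothetical toy graph: two separate 2-cycles plus one dangling node.
toy_adj = {0: [1], 1: [0], 2: [3], 3: [2], 4: []}
scores = simulate_damped_surfer(toy_adj)
print(scores)
```

Because of the restarts, every node is visited regardless of the start node, so the frequencies no longer depend on which component the surfer starts in.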

---

### 2.4.2 Power Iteration Method

#### Exercise 2.14: Power Iteration method

In [11]:
def danglingNodes(G):
    """Return the indicator vector w, with w[i] = 1 if node i has no out-links."""
    nbNodes = G.number_of_nodes()
    w = np.zeros(nbNodes)

    for i in G.nodes():
        if G.out_degree(i) == 0:
            w[int(i)] = 1
    return w


def markovMatrix(G):
    """Return H, where H[n, i] = 1 / out_degree(n) if there is an edge n -> i."""
    nbNodes = G.number_of_nodes()
    H = np.zeros([nbNodes, nbNodes])

    for n in G.nodes():
        outDegreeCurrent = G.out_degree(n)
        for i in G.successors(n):
            H[int(n), int(i)] = 1 / outDegreeCurrent
    return H
    
    
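def powerIteration(G, run, theta):
    """Run `run` steps of power iteration and return the PageRank vector.

    theta is the probability of following a link; with probability
    1 - theta the surfer jumps to a node chosen uniformly at random.
    """
    N = G.number_of_nodes()

    w = danglingNodes(G)
    H = markovMatrix(G)

    # Replace each dangling row of H by the uniform distribution.
    # w[:, np.newaxis] applies the correction to rows, not to columns.
    H_hat = H + w[:, np.newaxis] / N

    # Google matrix: follow a link with probability theta, jump otherwise.
    Google = theta * H_hat + (1 - theta) / N

    # Start from the uniform distribution.
    rank = np.full(N, 1 / N)
    for i in range(run):
        rank = rank @ Google

    return rank


m = powerIteration(absorbing_graph, 2, 0.8)
print(m)
# Every row of the Google matrix sums to 1, so the scores must sum to 1.
print(m.sum())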
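A fixed number of iterations gives no guarantee of convergence. As a sketch, the same update $\pi \leftarrow \pi\,(\theta \hat{H} + \frac{1-\theta}{N}\mathbf{1}\mathbf{1}^T)$ can be iterated until the change between two steps falls below a tolerance. The version below uses only the standard library and works on an adjacency dictionary; the names `pagerank_power_iteration`, `tol`, `max_iter` and `toy_adj` are hypothetical and are not part of the lab code.

```python
def pagerank_power_iteration(adj, theta=0.85, tol=1e-10, max_iter=1000):
    """Iterate the PageRank update until the L1 change drops below `tol`.

    `adj` maps every node to the list of its successors (assumed format).
    """
    nodes = sorted(adj)
    n = len(nodes)
    index = {node: i for i, node in enumerate(nodes)}
    rank = [1.0 / n] * n  # start from the uniform distribution

    for _ in range(max_iter):
        # Mass on dangling nodes is redistributed uniformly over all nodes.
        dangling = sum(rank[index[u]] for u in nodes if not adj[u])
        base = (1.0 - theta) / n + theta * dangling / n
        new_rank = [base] * n

        # Every other node splits theta times its mass among its successors.
        for u in nodes:
            if adj[u]:
                share = theta * rank[index[u]] / len(adj[u])
                for v in adj[u]:
                    new_rank[index[v]] += share

        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break

    return dict(zip(nodes, rank))


# Hypothetical toy graph: node 2 is dangling and receives the most links.
toy_adj = {0: [1, 2], 1: [2], 2: []}
scores = pagerank_power_iteration(toy_adj, theta=0.8)
print(scores, sum(scores.values()))
```

Each update preserves the total mass, so the scores keep summing to 1, which is a quick sanity check for any implementation.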

---

### 2.4.3 Gaming the system *(Bonus)*

#### Exercise 2.15 *(Bonus)*