# Networks: structure, evolution & processes
**Internet Analytics - Lab 2**

---

**Group:** *D*

**Names:**

* Lejal Glaude Emma
* Bickel Marc
* Cadoux Cyril

---

#### Instructions

*This is a template for part 2 of the lab. Clearly write your answers, comments and interpretations in Markdown cells. Don't forget that you can add $\LaTeX$ equations in these cells. Feel free to add or remove any cell.*

*Please properly comment your code. Code readability will be considered for grading. To avoid long cells of codes in the notebook, you can also embed long python functions and classes in a separate module. Don’t forget to hand in your module if that is the case. In multiple exercises, you are required to come up with your own method to solve various problems. Be creative and clearly motivate and explain your methods. Creativity and clarity will be considered for grading.*

---

## 2.2 Network sampling

#### Exercise 2.7: Random walk on the Facebook network

In [2]:
import requests
import random  # To choose one of a user's friends at random
import math

##### Reasoning

Here we study part of the Facebook network using a random walk.
We start from a given user, record their age, and then continue the walk with one of their friends chosen uniformly at random.

In [3]:
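def get_Age_Friend_And_NbFriends(user_id):
    """Return the user's age, one friend chosen uniformly at random, and the number of friends."""
    # URL template of the API
    URL_TEMPLATE = 'http://iccluster118.iccluster.epfl.ch:5050/v1.0/facebook?user={user_id}'
    # The actual URL to call
    url = URL_TEMPLATE.format(user_id=user_id)
    # Execute the HTTP GET request
    response = requests.get(url)
    # Fail early on an HTTP error instead of trying to parse an error page
    response.raise_for_status()
    # Parse the JSON response into a Python dict
    data = response.json()

    friends = data['friends']
    # Return the user's age, one of their friends to continue the random walk,
    # and their number of friends (used as the weight in the de-biased estimate)
    return data['age'], random.choice(friends), len(friends)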


def mean_age(start_id, number_iterations):
    """Estimate the mean age as the plain average over the users visited by the random walk."""
    cumul = 0
    current_id = start_id

    for k in range(number_iterations):
        # The number of friends is not needed for this naive estimate
        age, new_id, _ = get_Age_Friend_And_NbFriends(current_id)
        cumul += age
        current_id = new_id  # Continue the random walk on a random friend

        # Print the progress from time to time
        if k % 47 == 0:
            print("\r", 100 * float(k) / number_iterations, " %", end='')

    print("\r100 %")

    return float(cumul) / number_iterations  # Returns the mean age


In [4]:
def print_converted(decimal_age):
    
    n_years = math.floor(decimal_age)
    
    # Get the integer number of months
    decimal_months = (decimal_age - n_years)*12
    n_months = math.floor(decimal_months)
    
    # Get the integer number of days
    # (approximation: every month is assumed to have 30 days)
    decimal_days = (decimal_months - n_months) * 30
    n_days = math.floor(decimal_days)
    
    print("The mean age that we computed is ", decimal_age, " which is roughly ", n_years, " years, ",
          n_months, " months and ",
          n_days, " days.")


Here we estimate the mean age with a random walk of 5000 steps.

In [5]:
meanAge = mean_age("f30ff3966f16ed62f5165a229a19b319", 5000)

100 %


In [6]:
print_converted(meanAge)

The mean age that we computed is  24.2778  which is roughly  24  years,  3  months and  10  days.


#### Exercise 2.8

1. Compared to the Facebook study, our estimate is far from the real average age. This is probably because the starting user is

In [7]:
print(get_Age_Friend_And_NbFriends("f30ff3966f16ed62f5165a229a19b319")[0])

19


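years old. It seems logical that their network of friends contains mostly young people. In other words, our first approach is naive and sensitive to 'local values'. More generally, in the long run a random walk visits a user with probability proportional to their number of friends, so well-connected users are over-represented in the plain average. We therefore have a bias that we need to correct.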



To do this, we can apply the de-biasing formula from slide 11 of lecture 4, which weights each visited user by the inverse of their number of friends, and we obtain the desired result.
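
Written out, the estimator computed by `mean_age_unbiaised` is the following. The notation is ours and may differ from the one used on the slide: $x_1, \dots, x_n$ are the users visited by the walk, $a(x_i)$ is the age of user $x_i$ and $d(x_i)$ is their number of friends.

$$
\hat{\mu} = \frac{\sum_{i=1}^{n} \dfrac{a(x_i)}{d(x_i)}}{\sum_{i=1}^{n} \dfrac{1}{d(x_i)}}
$$

Dividing each term by $d(x_i)$ gives less weight to users with many friends, and the denominator renormalizes the weights so that they sum to one.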


In [9]:
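def mean_age_unbiaised(start_id, number_iterations):
    """Estimate the mean age, weighting each visited user by 1 / (number of friends)."""
    cumul = 0.0                # Sum of age / weight over the visited users
    sum_one_over_weight = 0.0  # Sum of 1 / weight over the visited users
    current_id = start_id

    for k in range(number_iterations):

        # Query the current user; the weight is their number of friends
        age, new_id, weight = get_Age_Friend_And_NbFriends(current_id)

        # Accumulate the numerator and the denominator of the estimator
        sum_one_over_weight += 1.0 / weight
        cumul += float(age) / weight

        # Continue the random walk on a random friend
        current_id = new_id

        # Print the progress from time to time
        if k % 47 == 0:
            print("\r", 100 * float(k) / number_iterations, " %", end='')

    print("\r100 %")

    return cumul / sum_one_over_weight  # Returns the de-biased mean age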


In [18]:
unbiased_age = mean_age_unbiaised("f30ff3966f16ed62f5165a229a19b319",5000)

100 %


In [19]:
print_converted(unbiased_age)

The mean age that we computed is  42.897942526318666  which is roughly  42  years,  10  months and  23  days.


This result is much closer to the value announced by Facebook!
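
The effect of the correction can also be checked offline, without calling the API. The following is a minimal sketch on a made-up toy graph (the graph, the ages and the helper names `build_toy_graph` and `random_walk_estimates` are assumptions for illustration only, not part of the lab data): five young users who all know each other, each also linked to three older users who have no other friends.

```python
import random


def build_toy_graph():
    """Hypothetical toy network: 5 users aged 20 forming a clique,
    each also linked to 3 users aged 60 who have no other friends."""
    friends = {}
    ages = {}
    young = ["y{}".format(i) for i in range(5)]
    for y in young:
        ages[y] = 20
        friends[y] = [other for other in young if other != y]
    for i, y in enumerate(young):
        for j in range(3):
            old = "o{}_{}".format(i, j)
            ages[old] = 60
            friends[old] = [y]      # The older user only knows this young user
            friends[y].append(old)  # Friendship is symmetric
    return friends, ages


def random_walk_estimates(friends, ages, start, n_steps, rng):
    """Run one random walk and return (naive mean, reweighted mean)."""
    age_sum = 0.0
    weighted_age_sum = 0.0
    inverse_degree_sum = 0.0
    current = start
    for _ in range(n_steps):
        degree = len(friends[current])
        age_sum += ages[current]
        weighted_age_sum += ages[current] / degree
        inverse_degree_sum += 1.0 / degree
        # Move to a friend chosen uniformly at random
        current = rng.choice(friends[current])
    return age_sum / n_steps, weighted_age_sum / inverse_degree_sum


friends, ages = build_toy_graph()
true_mean = sum(ages.values()) / len(ages)
naive, corrected = random_walk_estimates(
    friends, ages, "y0", 20000, random.Random(0))
print("true mean:", true_mean)
print("naive estimate:", naive)
print("reweighted estimate:", corrected)
```

In this toy graph the young users have 7 friends each and the older users only 1, so a long walk spends about 70% of its time on young users. The naive average should therefore approach 32, while the reweighted estimate should approach the true mean of 50.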