##### MY470 Computer Programming

### Week 5 Assignment, MT 2020

#### \*\*\* Due 12:00 noon on Monday, November 9 \*\*\*

---
### Simulating contagion on a network

In this assignment, you are asked to write a program that simulates the contagion of disease or information on an empirical network. In academic research, contagion models have been used to study the properties of different types of networks. In practice, contagion models are extremely valuable to predict the spread of contagious disease such as the flu or STDs.

We will use the simplest of contagion models — the SI model. SI stands for susceptible and infected. The SI model assumes that once a susceptible individual is infected, there is no recovery. This is a good representation for the spread of non-curable but non-deadly infectious disease such as Herpes simplex or for the spread of new technologies and knowledge.

In the SI model we will implement, we will start with a population where everyone is susceptible. Then we will randomly pick a small number of individuals ("Patients 0") and infect them. In the next period, all the contacts of the infected individuals will get infected (thus, we will assume that the probability of transmission is 1). And so on. We will repeat the process until everyone in the network is infected or until a certain number of periods have passed (the latter is necessary for networks that are not connected and have separate components; in such situations, it is possible that the contagion never reaches some individuals). 

We will run the model on a real network. For simplicity, we will reuse the co-authorship network we analyzed in Assignment 3. So think about the contagion in this case as learning about a new research technique, empirical finding, or theoretical result.

#### Hints

Use docstrings to describe your methods. We will subtract points from your mark if you do not use appropriate description of your code.


### Problem 1: Working in a team

Work with your assigned partner to complete and submit the assignment. You can meet in person to discuss the division of labor but we expect you to use GitHub to communicate when coding your part and merging your contributions. We will  review the Issues, Pull request, and Wiki stats for your repository. You will get the full points for this problem if we find sufficient evidence that you have made use of GitHub as a collaboration tool. 

#### Hints

One reasonable way to divide the work is to assign Problems 2 and 3 to Student A and Problems 4 and 5 to Student B.


### Problem 2: Class for network

Create a class called `UndirectedNetwork`. The class should have the following data attributes:

* `nodes` — a dictionary where the node id is a key and the value is a list with the ids of the node's neighbors (coauthors for our data); initially empty

and the following methods:

* `add_node` — takes node_id and initializes it as a key to `nodes` if it is not already there
* `add_neighbors` — takes two arguments: ego_id and alter_id and adds alter_id to ego_id's list of neighbors and ego_id to alter_id's list of neighbors, if they are not already there
* `get_node_ids` — returns the ids of the nodes in the network one at a time
* `get_node_neighbors` — takes node_id and returns its neighbors one at a time

Calling the `print()` function on a `UndirectedNetwork` object should print the number of nodes in the network, e.g. "Undirected network with 455 nodes".


In [77]:
# Enter your answer to Problem 2 below. 

class UndirectedNetwork:
    
    def __init__(self, nodes = {}):
        """Creates dictionary of ids and neighbours."""
        self.nodes = nodes          
    
    def add_node(self, node_id):
        """Initializes node_id as key, 
        if not already there.
        """
        if node_id not in self.nodes:
            self.nodes[node_id] = []
    
    def add_neighbors(self, ego_id, alter_id):
        """Adds alter_id to ego_id's list of neighbors
        and vice versa, if not already there.
        """  
        if ego_id != alter_id:
            if alter_id not in self.nodes[ego_id]:
                self.nodes[ego_id].append(alter_id)
                
            if ego_id not in self.nodes[alter_id]:
                self.nodes[alter_id].append(ego_id)
    
    def get_node_ids(self):
        """Returns ids of nodes in the network
        one at a time.
        """
        for i in self.nodes:
            yield i
    
    def get_node_neighbors(self, node_id):
        """Takes node_id and returns its neighbors,
        one at a time."""
        for i in self.nodes[node_id]:
            yield i
        
    def __str__(self):
        """Print number of nodes in the network."""
        return 'Undirected network with ' + str(len(self.nodes)) + ' nodes.'

### Problem 3: Create an instance of the network class

Read the data from the file "ca-GrQc.txt" and save it in an instance of the UndirectedNetwork class you created. Call print on the instance.

#### Hints

Feel free to reuse code from Assignment 3. If your code had mistakes, make sure you fix them when you copy the code here.


In [78]:
# Enter your answer to Problem 3 below. 

coauthors = []
for line in open('ca-GrQc.txt', 'r'):
    if line[0] != '#':
        strlst = line.strip().split('\t')
        coauthors.append([int(i) for i in strlst])

network = UndirectedNetwork()

for i in coauthors:
    network.add_node(i[0])
    network.add_node(i[1])
    
    network.add_neighbors(i[0], i[1])
print(network)

Undirected network with 5242 nodes.


---
### Problem 4: Class for SI model


Create a class called `SIModel` that has the following data attributes:

* `network` — an instance of class UndirectedNetwork taken at instantiation
* `susceptible_nodes` — a list of ids for nodes that are not yet infected; initially includes all nodes from `network`
* `infected_nodes` — a list of ids for nodes that are infected; initially empty
* `num_infected` — keeps track of the number of infected nodes; initially `0`

and the following methods:

* `initialize` — randomly selects `n` number of nodes and infects them; then prints the number of infected nodes
* `update` — iterates over the susceptible nodes in random order and infects those who have at least one infected neighbor; then prints the number of infected nodes. The process should be asynchronous, in the sense that a node immediately becomes infected and will then infect any susceptible neighbors who are yet to be iterated over.
* `run` — repeats `update` until all nodes are infected or until `update` has been run 30 times

Calling the `print()` function on a `SIModel` object should print `num_infected`.

#### Hints

In this problem you will need to use functions from the `random` module. You can read more about it [here](https://docs.python.org/3/library/random.html).

Make sure the methods update all the relevant data attributes when called.

In [79]:
import random 

class SIModel:

    def __init__(self, network=UndirectedNetwork()):
        """Creates an SIModel given an instance of UndirectedNetwork."""
        self.network = network 
        self.susceptible_nodes = [node for node in network.get_node_ids()]
        self.infected_nodes = []
        self.num_infected = 0
    
    def initialize(self, n):
        """Assumes n is an integer.
        Infects a randomly selected n number of nodes in the network.
        Returns the number of infected nodes.
        """
        self.num_infected = n
        
        # Randomly select n number of nodes to be infected
        self.infected_nodes = random.sample(self.susceptible_nodes, self.num_infected)
        
        # Update susceptible nodes by removing selected infected nodes
        for node in self.infected_nodes:
            self.susceptible_nodes.remove(node)
        return "Number of infected nodes is " + str(self.num_infected)
    
    def update(self):
        """Iterates over the susceptible nodes in random order.
        Infects nodes that have at one infected neighbours.
        """   
        # Shuffle the order of susceptible nodes. 
        random.shuffle(self.susceptible_nodes) 
    
        for node in self.susceptible_nodes:
            for neighbor in self.network.get_node_neighbors(node):
                if neighbor in self.infected_nodes:
                    self.infected_nodes.append(node) 
                    self.susceptible_nodes.remove(node)
                    self.num_infected += 1 
                    break
        return "Number of infected nodes is " + str(self.num_infected)
    
    def run(self):
        """Repeats the 'update' method until 30 times 
        or until all nodes are infected. 
        """
        time = 0
        while self.num_infected <= len(self.network.nodes) and time <= 30:
            self.update()
            time += 1 
        
    def __str__(self):
        """Returns number of infected nodes."""
        return str(self.num_infected)

---
### Problem 5: Run the model

Run SIModel using the network from Problem 2. You should initialize the simulation with 3 seeds.


In [80]:
model = SIModel(network)
model.initialize(3)
model.update()
model.run()
print(model)

4158


---

### Evaluation

| Problem | Mark     | Comment   
|:-------:|:--------:|:----------------------
| 1       |   /1    |   
| 2       |   /4    |      
| 3       |   /1    | 
| 4       |   /5    | 
| 5       |   /1    |
| Legibility   |   /2    | 
| Modularity   |   /2    | 
| Efficiency   |   /4    | 
|**Total**|**/20**  | 
