### Simulating contagion on a network

In this project, I write a program that simulates the contagion of disease or information on an empirical network. In academic research, contagion models have been used to study the properties of different types of networks. In practice, contagion models are valuable for predicting the spread of contagious diseases such as the flu or STDs.

I will use the SI (susceptible-infected) contagion model. The SI model assumes that once a susceptible individual is infected, there is no recovery. This is a good representation of the spread of incurable but non-deadly infectious diseases such as herpes simplex, or of the spread of new technologies and knowledge.

In the SI model I will implement, I start with a population where everyone is susceptible. Then I randomly pick a small number of individuals ("Patients 0") and infect them. In the next period, all the contacts of the infected individuals get infected (thus, I assume that the probability of transmission is 1), and so on. I repeat the process until everyone in the network is infected or until a certain number of periods have passed. The latter condition is necessary for networks that are not connected and have separate components; in such networks, the contagion may never reach some individuals.
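
The spreading rule described above can be sketched on a toy network before turning to the real data. This is a minimal sketch, not the implementation used later: it assumes the network is a plain dict mapping each node to a list of neighbors, and the name `si_step` is hypothetical. It updates synchronously, so the infection advances exactly one hop per period.

```python
def si_step(adjacency, infected):
    """Return the infected set after one synchronous SI period.

    adjacency: dict mapping node -> iterable of neighbors (assumed format).
    infected: set of currently infected nodes.
    """
    newly_infected = set()
    for node in infected:
        for nbr in adjacency[node]:
            if nbr not in infected:
                newly_infected.add(nbr)
    # Transmission probability is 1 and nobody recovers
    return infected | newly_infected


# Toy path network 0 - 1 - 2 - 3, plus an isolated node 4
toy = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}

infected = {0}
history = [len(infected)]
for period in range(5):
    infected = si_step(toy, infected)
    history.append(len(infected))

print(history)  # [1, 2, 3, 4, 4, 4]
```

Node 4 is never infected because it sits in a separate component, which is why the loop needs a cap on the number of periods.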

I will run the model on a real network. For simplicity, I will reuse the co-authorship network I analyzed previously.

### Class for network

In [1]:
# For our network example, we know that the data file contains 
# both the i-j and the j-i edges so all the checks 
# in add_neighbors() are unnecessary. However, this may not be 
# the case in another dataset and the power of classes is that
# they can cover many different situations and circumstances.

class UndirectedNetwork(object):
    """A class used to represent a network."""
    
    def __init__(self):
        """Create a new empty network."""        
        self.nodes = {}
    
    def add_node(self, node_id):
        """Take node_id and add it to the network if it is not there."""
        if node_id not in self.nodes:
            self.nodes[node_id] = []
    
    def add_neighbors(self, ego_id, alter_id):
        """Take ego_id and alter_id and update ego_id's list of neighbors."""
        
        # Make sure nodes are added to the network
        self.add_node(ego_id)
        self.add_node(alter_id)  
        
        # Add the neighbors if they are not duplicates        
        if alter_id not in self.nodes[ego_id]:
            self.nodes[ego_id].append(alter_id)
        if ego_id not in self.nodes[alter_id]:
            self.nodes[alter_id].append(ego_id)
         
    def get_node_ids(self):
        """Return the network node ids one at a time."""        
        for i in self.nodes:
            yield i
    
    def get_node_neighbors(self, node_id):
        """Take node_id and return its neighbors one at a time."""        
        for i in self.nodes[node_id]:
            yield i

    def __str__(self):
        """Return a short description with the number of nodes in the network."""
        return "Undirected network with " + str(len(self.nodes)) + " nodes"
        

### Instance of the network class

In [2]:
net = UndirectedNetwork()

# Use a context manager so the file is closed after reading
with open('../data/ca-GrQc.txt', 'r') as f:
    for line in f:
        # Ignore the comment lines at the beginning of the file
        if line[0] != '#':
            strlst = line.strip().split('\t')
            if strlst[0] != strlst[1]:  # Remove self-loops
                net.add_neighbors(int(strlst[0]), int(strlst[1]))

print(net)

Undirected network with 5241 nodes
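
To see what the loading loop does with comment lines, self-loops, and repeated edges, here is a small self-contained sketch. The sample text is made up for illustration and is not taken from `ca-GrQc.txt`; the helper `read_edges` is a hypothetical name, and it stores neighbors in sets rather than in the lists used by `UndirectedNetwork`.

```python
import io


def read_edges(lines):
    """Build {node: set of neighbors} from tab-separated edge lines."""
    nodes = {}
    for line in lines:
        # Skip comment lines and blank lines
        if line.startswith('#') or not line.strip():
            continue
        ego, alter = (int(x) for x in line.strip().split('\t'))
        if ego == alter:  # Drop self-loops
            continue
        # Sets make repeated i-j / j-i edges harmless
        nodes.setdefault(ego, set()).add(alter)
        nodes.setdefault(alter, set()).add(ego)
    return nodes


# Made-up sample: header comment, both directions of 1-2, a self-loop on 3
sample = io.StringIO("# FromNodeId\tToNodeId\n1\t2\n2\t1\n3\t3\n2\t3\n")
toy_net = read_edges(sample)
print(toy_net)
```

In this sample, node 3 enters the network only through the 2-3 edge; a node whose only edge is a self-loop would not appear at all, which matches the behavior of the loading loop.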



### Class for SI model

In [3]:
# Imports normally belong at the top of the file, before any
# other code. I import random here because it is only needed
# from this point on.
import random as ran

class SIModel(object):
    """A class used to simulate susceptible-infected contagion on a network."""
    
    def __init__(self, net):
        """Assume net is an object of type UndirectedNetwork.
        Create a new SI model using net.
        """        
        self.network = net
        self.susceptible_nodes = [i for i in net.get_node_ids()]
        self.infected_nodes = []
        self.num_infected = 0
    
    
    def initialize(self, n):
        """Assume n is an integer.
        Randomly select n nodes and infect them.
        Print the number of infected nodes.
        """        
        patients0 = ran.sample(self.susceptible_nodes, n)
        self.infected_nodes.extend(patients0)
        for i in patients0:
            self.susceptible_nodes.remove(i)
        self.num_infected = n
        print(self)
        
        
    def update(self):
        """Iterate over all susceptible nodes in random order and 
        infect those who have at least one infected neighbor.
        Implement asynchronous updating.
        Print the number of infected nodes.
        """        
        # Remember not to iterate over a list you are changing
        temp = self.susceptible_nodes[:]
        ran.shuffle(temp)
        for i in temp:
            
            # Get an iterator over i's neighbors
            nbrs = self.network.get_node_neighbors(i)
            
            # Infect if at least one neighbor is infected;
            # any() stops at the first infected neighbor it finds
            if any(j in self.infected_nodes for j in nbrs):
                self.infected_nodes.append(i)
                self.susceptible_nodes.remove(i)
                self.num_infected += 1
        print(self)
        
        
    def run(self, num_iter=100):
        """Run update and print the number of infected nodes
        until all nodes are infected or until update has been run
        num_iter times.
        """        
        p = 0
        # While there are any susceptible nodes 
        # and for not more than num_iter iterations
        while self.susceptible_nodes and p < num_iter:
            self.update()
            p += 1
    
    
    def __str__(self):
        """Return the number of infected nodes as a string."""
        return str(self.num_infected)
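
Because `update` is asynchronous, a node infected early in a pass can infect its own neighbors later in the same pass, so the infection may travel more than one hop per call, depending on the random order. The sketch below illustrates this on a toy path network. It is a standalone illustration, not a replacement for the class: the dict-of-lists format, the function name `async_step`, and the use of a set for the infected nodes are all assumptions made for the example.

```python
import random


def async_step(adjacency, infected, rng):
    """Run one asynchronous SI pass, modifying infected in place."""
    susceptible = [n for n in adjacency if n not in infected]
    rng.shuffle(susceptible)
    for node in susceptible:
        # Nodes infected earlier in this pass already count here
        if any(nbr in infected for nbr in adjacency[node]):
            infected.add(node)
    return infected


# Toy path network 0 - 1 - 2 - 3 with node 0 infected
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
rng = random.Random(2)  # Local generator so the global seed is untouched

sizes = set()
for trial in range(200):
    sizes.add(len(async_step(path, {0}, rng)))

print(sorted(sizes))
```

Depending on the shuffle, a single pass on this path leaves 2, 3, or 4 nodes infected, whereas a synchronous update would always leave exactly 2.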
    


### Running the model

In [4]:
# The output will vary because the simulation is initialized 
# with a random process. For replication purposes, we will
# fix the random seed.
ran.seed(2)
si = SIModel(net) 
si.initialize(3)
si.run(num_iter=30)


3
416
2616
3818
4118
4157
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
4158
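
The count stops growing at 4158 even though the network has 5241 nodes, which is what one would expect if the remaining nodes cannot be reached from the initial patients. One way to check this kind of plateau is to compute the set of nodes reachable from the seeds with a breadth-first search. The sketch below does this on a made-up toy network; `reachable_from` is a hypothetical helper, and the dict-of-lists input is an assumption rather than the `UndirectedNetwork` interface.

```python
from collections import deque


def reachable_from(adjacency, seeds):
    """Return the set of nodes reachable from any seed (breadth-first search)."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen


# Made-up network with two components: {0, 1, 2} and {3, 4}
toy = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}

print(len(reachable_from(toy, [0])))     # 3
print(len(reachable_from(toy, [0, 3])))  # 5
```

With a transmission probability of 1 and enough periods, the final number of infected nodes in an SI run equals the size of this reachable set.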
