Our project aims to simulate a covid outbreak throughout the Bristol first-year population using the SIR method of simulating epidemics. This approach splits the population into three categories: susceptible, infected and recovered. New infections are then modelled from the interactions between susceptible and infected people. We sought to analyse the effect of international students entering the population from areas whose infection rates differ from that of the local population, as well as the effect of social interactions and university holidays on covid prevalence in the population. To run, the programme must be given initial conditions: a text file of the names of all the people in the population, and initial infection proportions for the top five most prevalent nationalities entering the university at the start of term.
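As a rough illustration of the initial-conditions input, the sketch below parses a small in-memory CSV in the same way Assignments.py reads InitialConditions.csv (header row skipped, reading stops at the first blank row). The header names are assumptions, since the real file is not shown; the values are the commented-out defaults from Assignments.py.

```python
import csv
import io

# Hypothetical file contents: the header names are assumptions; the values
# mirror the commented-out defaults in Assignments.py.
SAMPLE_CSV = """Nation,PopulationWeight,InfectedProportion
China,0.1,0.01
US,0.1,0.002
UK,0.7,0.001
EU,0.1,0.005
"""

def read_initial_conditions(handle):
    """Return (nations, nation_weights, infection_dict) from an open CSV handle."""
    rows = list(csv.reader(handle))
    nations, weights, infected = [], [], {}
    for entry in rows[1:]:      # skip the header row
        if not entry:           # stop at the first blank row
            break
        nations.append(entry[0])
        weights.append(float(entry[1]))
        infected[entry[0]] = float(entry[2])
    return nations, weights, infected

nations, weights, infected = read_initial_conditions(io.StringIO(SAMPLE_CSV))
print(nations)
print(infected["UK"])
```

Passing an open file handle instead of the `io.StringIO` object would read a real file in the same way.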

The script comprises four modules: Main.py, Metrics.py, Functions.py and Assignments.py.

In [1]:
from random import choices , randint , choice
import numpy as np
import csv

with open('names.txt' , 'r') as names:
        Names = (names.read().replace('\n' , ' ')).split(' ') # creates a list of names from a text file

with open('InitialConditions.csv', newline='') as Document: # imports the values from a csv and closes the file afterwards
    Rows = list(csv.reader(Document))
Nations , NationWeight , InfWeight = [] , [] , [] 
for entry in Rows[1:]:
    if entry == []:
        break
    Nations.append(entry[0])
    NationWeight.append(float(entry[1]))
    InfWeight.append(float(entry[2]))
InfDict = {j:InfWeight[i] for i , j in enumerate(Nations)} 

# Graphs wanted by the user 
with open('GraphSelection.csv', newline='') as Document2:
    Rows2 = list(csv.reader(Document2))
Plot ,PlotOrNot, Day , Sir = [] , [] , [] , []
for entry in Rows2[1:]:
    if entry == []:
        break

    Plot.append((entry[0]))
    PlotOrNot.append((entry[1]))
    Day.append((entry[2]))
    Sir.append((entry[3]))


###Initial conditions

Halls = ['East' , 'West' , 'North' , 'South']
HallWeight = [0.1 , 0.2 , 0.4 , 0.3]

Country = Nations
CountryWeight = NationWeight
CountryInfDict = InfDict

# Country = ['China' , 'US' , 'UK' , 'EU']
# CountryWeight = [.1 , .1 , .7 , .1]

# CountryInfDict = {'China':0.01 , 'UK':0.001 , 'US':0.002 , 'EU':.005} #infect % of individual nations

PopulationSize = 1000
Names = Names[:PopulationSize] # trims the names list to the population size

Status = ['I' , 'S' ,'R' ]

Courses =[ 'STEM' , 'ARTS' ]

ParameterValuesDict = {}

ParameterValuesDict['Halls'] = Halls
ParameterValuesDict['Origin'] = Country
ParameterValuesDict['Name'] = Names
ParameterValuesDict['Course'] = Courses



###Initial Allocations
def Jokes(subject):
    if subject == 'STEM':
        return 0
    else:
        return 2   ##Unsure if we carry this into final version 

def InfStatus(Nation):
    affected = CountryInfDict[Nation]
    InfWeight = [affected , 1-affected , 0] # Currently starts off with a recovered rate of 0
    InfList = choices(Status , InfWeight )
    return InfList[0]  # Index zero returns the single chosen status rather than a list of length 1

class Person:
    def __init__(self, StudentNumber ):
        self.StudentNumber = (StudentNumber)
        self.Halls = choices(Halls, HallWeight)[0]
        self.Origin = choices(Country, CountryWeight)[0]
        self.Name = Names[StudentNumber]
        self.SIR = InfStatus(self.Origin) 
        self.Course = choice(Courses)
        
        # Draws a sociability value from a normal distribution (mean 4, standard deviation 1) so that there is a spread of sociability
        self.Social = np.random.normal(4, 1) # min(randint(1,10)+Jokes(self.Course) , 10) #Used to determine n# of non uni interactions
        self.DayInfected = [0,0]
        self.DayRecovered = [0,0]

StudentObjects = [Person (i) for i in range(PopulationSize)] #Allocates every student a number

#print(vars(StudentObjects[100]))

We sought to use agent-based modelling of infections, meaning our model considers individual interactions between actors to produce new infections. Each actor was created as a class instance with attributes including halls, origin and course, all of which were used in conjunction with a randomly assigned social number to determine the number of social interactions each student took part in and, as a result, the number of opportunities for infection. We chose classes because they allowed us to store a large number of attributes for a single object, while also allowing the set of actors in our population to be stored in lists.
Attributes such as halls and country of origin were allocated randomly from a weighted distribution, allowing for the creation of unique actors. For the country of origin attribute, our weightings are imported from a .CSV file, allowing us to model a variety of scenarios and the consequences that external populations, and their relative infection rates, have on this hypothetical covid outbreak. These properties are initialised in the class constructor, as are all other attributes. Since our class takes only one integer argument, we were able to create instances using simple iteration, creating as many objects as specified in our population variable.
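The weighted allocation described above can be checked with a short sketch. It draws 1000 halls using the hall weights from Assignments.py and compares the counts with what the weights predict. The fixed seed and the use of a separate `Random` instance are assumptions made only so the sketch is repeatable.

```python
from collections import Counter
from random import Random

halls = ["East", "West", "North", "South"]
hall_weights = [0.1, 0.2, 0.4, 0.3]

rng = Random(0)  # fixed seed (an assumption) so the sketch is repeatable

# One weighted draw per simulated student
draws = rng.choices(halls, weights=hall_weights, k=1000)
hall_counts = Counter(draws)

for hall, weight in zip(halls, hall_weights):
    print(hall, hall_counts[hall], "expected about", round(1000 * weight))
```

The observed counts fluctuate around the expected values, which is what gives each simulated population its own make-up.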

In [2]:
import pandas as pd 
import random as r
import Assignments as s

# Function to create dataframe of students information  -- returns dataframe of all the students attributes 
def CreateDataFrame():
    
    # Creation of empty variable list to append column headings
    Columns = []

    # Create empty dictionary for values of each column for each student
    StudentsDict = {}

    # Iterates through the attributes of each student defined by the Person class and assigns an empty list to each one
    for keys, values in vars(s.StudentObjects[0]).items():
        Columns.append(keys)
        StudentsDict[keys] = []

    # Iterate through Student Numbers 
    for StudentNumber in range(s.PopulationSize):  

        # Extract student as an object 
        StudentInfo = s.StudentObjects[StudentNumber]

        # Extract student attributes iteratively and add to the list which is a value of the attribute key
        for column in Columns:
            value = getattr(StudentInfo, column)
            StudentsDict[column].append(value) 
            
    # Convert dictionary into a pandas dataframe 
    StudentInfoDataFrame = pd.DataFrame(StudentsDict) 
    
    return StudentInfoDataFrame

print(CreateDataFrame())
        

     StudentNumber  Halls Origin         Name SIR Course    Social  \
0                0   East     UK      Michael   S   STEM  3.591004   
1                1  South     EU  Christopher   S   STEM  3.432075   
2                2   West     UK      Jessica   S   STEM  3.987709   
3                3  North     UK      Matthew   S   ARTS  3.805069   
4                4  North     UK       Ashley   S   STEM  4.877696   
..             ...    ...    ...          ...  ..    ...       ...   
995            995  South     UK         Mike   S   STEM  4.275520   
996            996  North     UK        Chloe   S   STEM  4.978867   
997            997   East  China       Alecia   S   ARTS  3.844541   
998            998  South     UK          Sam   S   ARTS  1.495135   
999            999   West     UK        Rocio   S   ARTS  2.665871   

    DayInfected DayRecovered  
0        [0, 0]       [0, 0]  
1        [0, 0]       [0, 0]  
2        [0, 0]       [0, 0]  
3        [0, 0]       [0, 0]  
4   

Metrics.py contains the functions which prepare the data for visualisation, as well as those which collect the data the user wants to see. The main function in this module is CreateDataFrame(). This function does not take any arguments. It iterates through the attributes of the Person class and adds these as columns. It then appends each student's attribute values to the list for the matching column. The function returns a pandas data frame of this data. 
Another very important function in the Metrics.py module is CollectSIRdata(S,I,R). This uses the SIRinfoFilter function to collect all the people with each SIR status and appends the proportion of the population they make up to the list for that status, which is later used in Main.py as the data series for that status. In Main.py, this function is called every day to record how that day's infection function has affected the population. The raw frequencies, which do not take the population size into account, are computed earlier in the function and could easily be collected as well. 
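For comparison, the daily proportions can also be computed directly from the agent objects without rebuilding the data frame. This is only a sketch of an alternative, not the project's implementation; `sir_proportions` is a hypothetical helper and `SimpleNamespace` stands in for the Person class.

```python
from collections import Counter
from types import SimpleNamespace

def sir_proportions(students):
    """Return the share of students in each SIR status as a dictionary."""
    counts = Counter(student.SIR for student in students)
    total = len(students)
    return {status: counts[status] / total for status in ("S", "I", "R")}

# Ten stand-in students: six susceptible, two infected, two recovered
students = [SimpleNamespace(SIR=status) for status in "SSSSSSIIRR"]
proportions = sir_proportions(students)
print(proportions)
```

The three values could then be appended to the S, I and R lists exactly as CollectSIRdata does.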

In [3]:
# Function to collect SIR data for population -- appends today's S, I and R proportions to the lists passed in
def CollectSIRdata(S,I,R):
    
    # Uses SIRinfoFilter to find all the people of SIR values respectively 
    todaysFrequencyS = len(SIRinfoFilter('S', None, None, None, None, True,True))
    todaysFrequencyI = len(SIRinfoFilter('I', None, None, None, None, True,True))
    todaysFrequencyR = len(SIRinfoFilter('R', None, None, None, None, True,True))
    
    # Find Proportion
    todaysProportionS = todaysFrequencyS/s.PopulationSize
    todaysProportionI = todaysFrequencyI/s.PopulationSize
    todaysProportionR = todaysFrequencyR/s.PopulationSize
    
    proportionsSIR = [todaysProportionS,todaysProportionI,todaysProportionR]
    
    S.append(proportionsSIR[0])
    I.append(proportionsSIR[1])
    R.append(proportionsSIR[2])     

The most important and versatile function in the Metrics.py module is the SIRinfoFilter function. It is reused throughout the code because it returns a dataframe containing all the agents of the population whose attributes match the values passed as arguments. Passing None for an attribute leaves it unfiltered. The function also takes a day and a collection day, and it only returns data when the two are equal, so a call can be made conditional on a particular day. 
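The "filter only by the attributes that were given" idea can be sketched on plain objects. This is a simplified illustration under stated assumptions, not the pandas implementation used in Metrics.py: `filter_students` is a hypothetical helper, and `SimpleNamespace` stands in for the Person class.

```python
from types import SimpleNamespace

def filter_students(students, **criteria):
    """Keep students whose attributes equal every criterion that is not None."""
    active = {name: value for name, value in criteria.items() if value is not None}
    return [
        student
        for student in students
        if all(getattr(student, name) == value for name, value in active.items())
    ]

students = [
    SimpleNamespace(Name="Michael", SIR="S", Origin="UK", Halls="East", Course="STEM"),
    SimpleNamespace(Name="Alecia", SIR="I", Origin="China", Halls="East", Course="ARTS"),
    SimpleNamespace(Name="Sam", SIR="I", Origin="UK", Halls="South", Course="ARTS"),
]

# Criteria left as None are ignored, so this filters by status and origin only
infected_uk = filter_students(students, SIR="I", Origin="UK", Halls=None)
print([student.Name for student in infected_uk])
```

Calling `filter_students(students)` with no criteria returns every student, which mirrors passing None for every attribute of SIRinfoFilter.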

In [4]:
def SIRinfoFilter(SIRstatus, Origin, Name, Halls, Course, Day,CollectDataDay):
    
    if Day == CollectDataDay:
        
        # Reuses CreateDataframe function to produce a data frame of all students data
        StudentInfoDataFrame = CreateDataFrame()
        
        # Checks whether each attribute to filter by was given as a string when calling the function 
        if isinstance(SIRstatus,str) == True:    
            
            # If it was given, the filtering condition is applied to the dataframe
            StudentInfoDataFrameFiltered = StudentInfoDataFrame[(StudentInfoDataFrame['SIR'] == SIRstatus)]   
        else:
            
            # If no value was given (None), the negated comparison is true for every row, so no filtering happens       
            StudentInfoDataFrameFiltered = StudentInfoDataFrame[~(StudentInfoDataFrame['SIR'] == SIRstatus)] 
            
        
        # These steps are repeated for each attribute that could be filtered by  
        if isinstance(Origin,str) == True:       
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[(StudentInfoDataFrameFiltered['Origin'] == Origin) ]
        else:        
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[~(StudentInfoDataFrameFiltered['Origin'] == Origin) ]
            
        
        if isinstance(Name,str) == True:       
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[(StudentInfoDataFrameFiltered['Name'] == Name) ]
        else:         
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[~(StudentInfoDataFrameFiltered['Name'] == Name) ]
            
        
        if isinstance(Halls,str) == True:
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[(StudentInfoDataFrameFiltered['Halls'] == Halls) ]
        else:         
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[~(StudentInfoDataFrameFiltered['Halls'] == Halls) ]
                 
            
        if isinstance(Course,str) == True:
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[(StudentInfoDataFrameFiltered['Course'] == Course) ]
        else:        
            StudentInfoDataFrameFiltered = StudentInfoDataFrameFiltered[~(StudentInfoDataFrameFiltered['Course'] == Course) ]
             
        
        # The filtered dataframe is returned
        return  StudentInfoDataFrameFiltered    
                         

To seed infections we use the InfectRandomStudent function, which picks a student at random and infects them if they are susceptible; calling it repeatedly infects a chosen share of the student population. Day-to-day spread is handled by the infected function, which takes the arguments (Day, DateRange, S, I, R) and sets N = 0.2, a number that defines how likely an infectious person is to infect a person they interact with. The function iterates over every student each day and branches on their SIR status. For infected students we check how long they have been infected. Once that exceeds an infection period drawn from a normal distribution with a mean of 14 days, the student recovers and their date of recovery is stored as an attribute, which lets us keep track of people recovering. Otherwise, we multiply the student's social number (the number of people they interact with) by N and by the current susceptible proportion to get a number of infection attempts. Each attempt picks a random student; if that student is susceptible they become infected, and the date of infection is stored as an attribute. Recovered students stay immune for about 30 days (again drawn from a normal distribution), after which they become susceptible again. We used classes to create objects holding the data for each student and used for loops over the student population to iterate through each person. 
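A small worked example of the attempt count may help. The sketch below reproduces only the arithmetic `int(round(S[-1] * Social * N))` from the infected function; `infection_attempts` is a hypothetical helper name and the input values are illustrative.

```python
def infection_attempts(susceptible_share, social, transmission=0.2):
    """Number of random students an infected person tries to infect in one day."""
    return int(round(susceptible_share * social * transmission))

# 95% susceptible, social number 4: 0.95 * 4 * 0.2 = 0.76, which rounds to 1
print(infection_attempts(0.95, 4))

# 50% susceptible, social number 4: 0.5 * 4 * 0.2 = 0.4, which rounds to 0
print(infection_attempts(0.5, 4))

# Everyone susceptible, social number 8: 1.0 * 8 * 0.2 = 1.6, which rounds to 2
print(infection_attempts(1.0, 8))
```

Because each attempt targets a random student and only succeeds if that student is susceptible, the number of new infections per day is at most the attempt count.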

In [5]:
import random as r
import numpy as np
import Assignments as s

def InfectRandomStudent(Day,DateRange):
    
    randomStudent = r.randint(0,s.PopulationSize-1)
    
    if s.StudentObjects[randomStudent].SIR == 'S':
        s.StudentObjects[randomStudent].SIR = 'I'  
        s.StudentObjects[randomStudent].DayInfected = ([DateRange[Day],Day])
        
def infected(Day , DateRange,S,I,R):
    
    # Number that defines how likely an infectious person is to infect the person they interact with
    N = 0.2
        
    
    for StudentNumber in range(s.PopulationSize):
        
        # Susceptible people 
        if s.StudentObjects[StudentNumber].SIR == 'S': 
            continue    
        
        # Infected people 
        elif s.StudentObjects[StudentNumber].SIR == 'I':
            
            # A student recovers once they have been infected for longer than an infection period drawn from a normal distribution (mean 14 days, s.d. 6); their date of recovery is then stored as an attribute
            if (Day - s.StudentObjects[StudentNumber].DayInfected[-1] ) >= r.choice(np.random.normal(14,6, s.PopulationSize)):
                s.StudentObjects[StudentNumber].SIR = 'R'
                s.StudentObjects[StudentNumber].DayRecovered = [DateRange[Day],Day]
                continue
            else:
                
                # Number of infection attempts today = susceptible proportion x social number x N, rounded to an integer 
                for SingleInfection in range(int(round(S[-1]*s.StudentObjects[StudentNumber].Social*N))):
                    
                    # Picks a student at random
                    randomStudent = r.randint(0,s.PopulationSize-1)
                                     
                    if s.StudentObjects[randomStudent].SIR == 'S':
                        s.StudentObjects[randomStudent].SIR = 'I'
                        
                        # Adds date of infection as attribute
                        s.StudentObjects[randomStudent].DayInfected = ([DateRange[Day],Day])
                    else:
                        continue            
            
        # Immunity lasts about 30 days (normal distribution, mean 30, s.d. 3), after which the student becomes susceptible again       
        elif s.StudentObjects[StudentNumber].SIR == 'R':
            if (Day - s.StudentObjects[StudentNumber].DayRecovered[-1] ) >= r.choice(np.random.normal(30,3, s.PopulationSize)):
                s.StudentObjects[StudentNumber].SIR = 'S'
            # Otherwise the student stays recovered, so nothing needs to change

The Main.py file produces an array of dates from September until the present to initiate the ‘SIR’ time series. For each ‘SIR’ status an empty list is created to collect that status's daily data. University breaks such as Christmas, Easter and reading weeks are also accounted for, because during breaks people tend to travel home, which increases their chances of being infected. We did this by defining ranges of days for these periods and, on certain days within them, randomly infecting some people to represent infections due to travelling. On weekdays we add normal infection data. On weekend days (Sat, Sun) we iterate through the students and apply a weighted choice of whether or not each individual studies that day, along with further weighted choices from lists of places to study, places to go out and activities people enjoy. Each entry in these lists contributes a number of people the student interacts with, and the sum becomes the student's social value. This gives changing values for the interactions each person has and makes the model as realistic as possible. Finally, we format the ‘SIR’ values into a time series and call a function that plots the main ‘SIR’ time series for the whole simulation. At the end, an Excel file with all the attributes of the population is created in the directory.
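The day-by-day branching described above can be summarised in a short sketch. The day ranges and the start date are taken from Main.py; `classify_day` is a hypothetical helper, and deciding weekends from the calendar date (rather than from the day index) is an assumption of this sketch.

```python
from datetime import date, timedelta

START = date(2021, 9, 17)  # first simulated day, taken from Main.py (a Friday)

def classify_day(day):
    """Return which branch of the main loop a given day index falls into."""
    if day in range(30, 37):
        return "reading week"
    if day in range(90, 120):
        return "Christmas holidays"
    if day in range(200, 214):
        return "Easter holidays"
    current = START + timedelta(days=day)
    # weekday() returns 0 for Monday through 6 for Sunday
    return "weekend" if current.weekday() >= 5 else "weekday"

for day in (0, 1, 3, 31, 100, 205):
    print(day, START + timedelta(days=day), classify_day(day))
```

Holiday ranges are checked first, so a Saturday inside a holiday is treated as a holiday day, matching the order of the branches in the main loop.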

In [6]:
import pandas as pd
from datetime import date as d 
import Assignments as s
import argparse as ap
import Metrics as m
import Functions as f
import matplotlib.pyplot as plt
import plotly.express as px
import random as r
import numpy as np

# Today's date 
Today = d.today()
    
# Produces an array of the dates from 17 September 2021 until now to initiate the SIR time series 
# An ISO start date and a date object are used so that day and month cannot be swapped when parsing
DateRange = pd.date_range('2021-09-17', Today)

# Creates the SIR time series from the start of university 2021 until now
SIRSeries = pd.DataFrame(DateRange, columns=['Date'])

# Empty lists to collect the daily proportion of people with each SIR status 
S = []
I = []
R = []

collectionDayHalls = int(s.Day[0])
collectionDayCountries = int(s.Day[2])
collectionDayCourse = int(s.Day[1])

# Going through each day until now 
for Day in range(0,len(DateRange)):
    
    # Data collection functions 
    m.BarChartHalls(Day, DateRange, collectionDayHalls)
    m.BarChartCountries(Day, DateRange,collectionDayCountries)
    m.BarChartCourse(Day, DateRange, collectionDayCourse)
    # Holidays only add the data and do not change the infection numbers
    if Day in range (30,37):
        
        # Each time this is called it collects the data and adds it to the time series 
        m.CollectSIRdata(S,I,R)
                
        # Students travel home at the start of reading week 1
        if Day == 31:            
            
            # Makes PopulationSize/20 random infection attempts (up to 5 percent of the population) when they travel home - this function can be tailored to weight certain nationalities 
            for StudentNumber in range(0,round(s.PopulationSize/20)):                 
                    f.InfectRandomStudent(Day,DateRange)
        else:
            continue
            
        # Christmas holidays
    elif Day in range (90,120):        
        m.CollectSIRdata(S,I,R)               
        if Day == 118:           
            for StudentNumber in range(0,round(s.PopulationSize/90)):                   
                   f.InfectRandomStudent(Day,DateRange)                    
        elif Day == 119:         
           
            # Collects whatever data you want at the start of the Christmas holidays- if you want data on a day other than a collect data day then call today as the collectdataday
            ChristmasData = m.SIRinfoFilter('I', None, None, None, None, Day,Day)     
        else:
            continue    
        
    # Easter holidays
    elif Day in range (200,214):        
        m.CollectSIRdata(S,I,R)
        
        # Collects whatever data you want at the start of the reading week 2
        if Day == 213:            
            for StudentNumber in range(0,round(s.PopulationSize/17)):                 
                    f.InfectRandomStudent(Day,DateRange)                   
            ReadingWeek2Data = m.SIRinfoFilter('I', None, None, None, None,Day,Day)
        else:
            continue
    else:
        
        # Week days (Monday to Friday) add normal infection data            
        if DateRange[Day].weekday() < 5:            
            m.CollectSIRdata(S,I,R)               
            f.infected( Day , DateRange,S,I,R)           
            continue
        
        # Weekends (Saturday, Sunday) assign everyone a new social number
        else:       
            m.CollectSIRdata(S,I,R)       
            for StudentNumber in range(s.PopulationSize):            
                Library = 2 # adds 2 people
                Home = 0 # adds 0 people
                Coffeeshop = 2 # adds 2 people
                Exercise = 1 # adds 1 person
                Friends = 3 #adds 3 people
                Clubs = 4 # adds 4 people
                Restaurants = 2 # adds 2 people
                Parties = 4 # adds 4 people
                StudyTime = [Library, Home, Coffeeshop] # places to study list
                Hobbies = [Exercise, Friends, Home] # activities/hobbies list
                Nightlife = [Clubs, Restaurants, Parties, Home] # nightlife places list for weekends
                WStudyTime = [True, False] # WeekEnd StudyTime
                ChoiceWS = np.random.choice(WStudyTime, p=[0.7,0.3]) # A weighted assumed choice whether someone studies on the weekend (70% YES/ 30% NO)
                if ChoiceWS == True: 
                    s.StudentObjects[StudentNumber].Social = np.random.choice(StudyTime, p=[0.5, 0.2, 0.3]) + np.random.choice(Hobbies, p=[0.4, 0.4, 0.2]) + np.random.choice(Nightlife, p=[0.25,0.2,0.4,0.15]) # assigns each person a social value
                else:
                    s.StudentObjects[StudentNumber].Social = np.random.choice(Hobbies, p=[0.4, 0.4, 0.2]) + np.random.choice(Hobbies, p=[0.4, 0.4, 0.2]) + np.random.choice(Nightlife, p=[0.25,0.2,0.4,0.15]) #if someone does not study over the weekend then 2x hobbies higher social value
                                        
            f.infected( Day , DateRange,S,I,R) 
        
# Formats the SIR vals for plotting into a time series 
SIRSeries['S'] = S
SIRSeries['I'] = I           
SIRSeries['R'] = R
SIRSeries.set_index('Date',inplace=True)

# Function that plots the main SIR time series for the whole simulation
if s.PlotOrNot[3] == 'TRUE':
    m.MainTimeSeriesPlot(SIRSeries)
else:
    print('Simulation finished')

# Creates an Excel file in the directory with all the attributes of the population 
m.SaveMainDataframeOfAttributes()        

Together, these functions and classes let us manipulate the data in several ways. The following code is largely reused, with changes to the variables and their positions, to show different plots such as the distribution of SIR quantities within different courses, halls and nationalities.
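Since the three bar-chart functions differ mainly in which attribute they group by, the counting step could be shared. The sketch below is one possible way to do that, not the project's code: `count_by_attribute` is a hypothetical helper and `SimpleNamespace` stands in for the Person class.

```python
from collections import Counter
from types import SimpleNamespace

def count_by_attribute(students, attribute, status):
    """Count students with the given SIR status, grouped by one attribute."""
    return Counter(
        getattr(student, attribute) for student in students if student.SIR == status
    )

students = [
    SimpleNamespace(SIR="I", Origin="UK", Halls="East", Course="STEM"),
    SimpleNamespace(SIR="I", Origin="UK", Halls="North", Course="ARTS"),
    SimpleNamespace(SIR="I", Origin="China", Halls="North", Course="ARTS"),
    SimpleNamespace(SIR="S", Origin="EU", Halls="West", Course="STEM"),
]

infected_by_origin = count_by_attribute(students, "Origin", "I")
infected_by_halls = count_by_attribute(students, "Halls", "I")
print(infected_by_origin)
print(infected_by_halls)
```

The keys and values of the resulting counter can be passed straight to a bar chart, so the same helper would serve the halls, course and nationality plots.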

In [7]:
def BarChartCountries(Day, DateRange,CollectDataDay):
    
    if Day == CollectDataDay:
        
        countries = 'China' , 'US' , 'UK' , 'EU'

        SIRinputChoice = input('input S,I,R for SIR vals in y axis for countries plot:')
        
        if SIRinputChoice == 'S':
            yAxisName = "Susceptible"
        elif SIRinputChoice == 'I':
            yAxisName = "Infected"
        else:
            yAxisName = "Recovered"
        
        china = len(SIRinfoFilter(SIRinputChoice, 'China', None, None, None, Day,Day))
        us = len(SIRinfoFilter(SIRinputChoice, 'US', None, None, None, Day,Day))
        uk = len(SIRinfoFilter(SIRinputChoice, 'UK', None, None, None, Day,Day))
        eu = len(SIRinfoFilter(SIRinputChoice, 'EU', None, None, None, Day,Day))
        
        SIR = [china,us,uk,eu]
    
        plt.bar(countries,SIR)
        plt.title('Nationality infection data:'+ str(DateRange[Day]))
        plt.xlabel('Nationality')
        plt.ylabel(yAxisName)
        plt.show()
        # Saves every student's attributes to a spreadsheet; to_excel opens and closes the file itself
        df = CreateDataFrame()
        df.to_excel('MainSpreadsheet.xlsx')