Network Generation

Ryan Seamus McGee edited this page Aug 9, 2020 · 7 revisions

The Basic SEIRS Network Model and Extended SEIRS Network Model implement models of epidemic dynamics for populations with structured contact networks (as opposed to standard mean-field compartment models, which assume uniform mixing of the population). When using these network models, the user must supply a graph defining the contact network, where each node represents an individual in the population and edges connect individuals who have regular interactions.

Interpretation of Contact Networks in SEIRS+

In the SEIRS+ framework, the contact network defines the set of close contacts for each individual in the population (black edges). Close contacts are individuals with whom one has non-cursory (e.g., repeated, sustained, and/or physical) interactions on a regular basis, such as housemates, family members, close coworkers, close friends, etc. Casual contacts -- individuals with whom one has incidental, brief, or superficial contact on an infrequent basis (e.g., at the grocery store, on transit, at a public event, in the elevator) -- are also represented in these models in the form of a parallel mode of mean-field global transmission. The product of the network locality parameter p and the respective global and local transmissibility parameters sets the relative frequency and weight of transmission among close (local network) and casual (global) contacts in the modeled population.
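As a rough illustration of how the two transmission modes can be combined, the sketch below weights a mean-field (global) term by p and a network (local) term by 1 - p. The function name, its arguments, and the exact form of the blend are assumptions made for illustration; consult the model descriptions for the expressions the SEIRS+ models actually use.

```python
def exposure_propensity(p, beta_global, beta_local,
                        prevalence, infectious_neighbors, degree):
    """Hypothetical blend of global and local transmission for one susceptible node.

    p                     -- network locality parameter (share of global interactions)
    beta_global           -- transmissibility for casual (global) contacts
    beta_local            -- transmissibility for close (network) contacts
    prevalence            -- fraction of the whole population that is infectious
    infectious_neighbors  -- number of the node's close contacts that are infectious
    degree                -- number of close contacts the node has
    """
    global_term = p * beta_global * prevalence
    # A node with no close contacts has no local exposure.
    local_fraction = infectious_neighbors / degree if degree else 0.0
    local_term = (1.0 - p) * beta_local * local_fraction
    return global_term + local_term
```

Under this assumed form, p = 0 means an individual is exposed only through close contacts, while p = 1 reduces to a uniformly mixed population.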

Properties of Real-World Contact Networks

Every population has unique patterns of interactions, and the unique properties of different contact network structures can have important impacts on epidemic dynamics and outcomes. It is important to carefully consider the contact patterns of each population of interest as well as the relevance of assumptions made by networks defined to represent them.

There are some properties that are shared by many human interaction networks, which the authors make an effort to capture in the contact networks generated for use with the SEIRS+ models:

  • Heterogeneity: Degree (number of contacts per individual) varies across individuals and groups of like-individuals (e.g., age groups). Groups of individuals may differ in the numbers of within- and between-group contacts they make.
  • Broad degree distribution: Most individuals have roughly average connectivity (degree), but there is individual variation around the mean degree (this is in contrast with scale-free networks, where most individuals have very low degree and the mode is often well below the mean).
  • Heavy-tailed degree distribution: A small number of individuals have many more contacts than average, so the degree distribution tends to have a relatively long right tail.
  • Assortativity: There tends to be correlation in degree between adjacent nodes in the contact network. That is, highly-connected individuals tend to have highly-connected contacts.
  • Transitivity (aka clustering): Individuals A and B are relatively likely to be contacts of each other if they both share a mutual contact C.
  • Community Structure: Contact networks often have communities of individuals (groups of nodes) that are more likely to be contacts of each other than they are to be with individuals from another community.
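To make a few of these properties concrete, here is a minimal pure-Python sketch that measures mean degree, transitivity, and degree assortativity on a graph stored as a dict mapping each node to the set of its neighbors. The function names and the adjacency-dict representation are choices made for this illustration only; networkx provides comparable measures (for example nx.transitivity and nx.degree_assortativity_coefficient) for Graph objects.

```python
from itertools import combinations


def mean_degree(adjacency):
    """Average number of contacts per node."""
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


def transitivity(adjacency):
    """Fraction of connected triples A-C-B in which A and B are also contacts."""
    triples = closed = 0
    for nbrs in adjacency.values():
        # Every pair of neighbors of a node forms a triple centered on that node.
        for a, b in combinations(nbrs, 2):
            triples += 1
            closed += b in adjacency[a]
    return closed / triples if triples else 0.0


def degree_assortativity(adjacency):
    """Pearson correlation between the degrees at the two ends of each edge."""
    # List each undirected edge in both directions so the measure is symmetric.
    pairs = [(len(adjacency[u]), len(adjacency[v]))
             for u in adjacency for v in adjacency[u]]
    if not pairs:
        return None
    mean = sum(x for x, _ in pairs) / len(pairs)
    cov = sum((x - mean) * (y - mean) for x, y in pairs)
    var = sum((x - mean) ** 2 for x, _ in pairs)
    return cov / var if var else None  # undefined when every node has the same degree


# A star (one hub, three leaves) has no triangles and is maximally disassortative.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
star_summary = (mean_degree(star), transitivity(star), degree_assortativity(star))
```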

Network Generation

Contact networks can be defined and generated by any method that is appropriate for representing the user's population and scenario of interest. Network generation is not a focus of the SEIRS+ package itself, but a few network tools are briefly described here.

networkx generators

The networkx package includes network generation functions for a number of classes of networks. Many of these are not particularly relevant to contact network structures of interest, but a few of them can be useful. In general it's best to closely tailor network definition to your population of interest, but networkx is a readily available package and its off-the-shelf generators can be handy for quick exploration.

LFR Networks

See networkx LFR_benchmark_graph

The LFR algorithm generates networks that have a known community structure and a broad (roughly bell-shaped) degree distribution with an exponential-like right tail. For an off-the-shelf generator, the LFR network is in the ballpark.

Caveats: LFR networks typically have very low assortativity and transitivity, which are important features of many real contact networks. In addition, the implementation of the LFR algorithm in the networkx generator function has a known bug that causes it to randomly fail to converge on a generated network in some attempts (calling the generator function until it successfully returns is one workaround).
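The retry workaround mentioned above can be sketched as a small wrapper. The helper names generate_with_retries and make_lfr_graph are hypothetical, and the LFR parameter values are illustrative only, not recommendations for any particular population. The sketch assumes that a failed attempt surfaces as networkx's ExceededMaxIterations exception.

```python
def generate_with_retries(generator, max_attempts=20, retry_on=(Exception,)):
    """Call generator() until it returns, giving up after max_attempts failures."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return generator()
        except retry_on as err:
            last_error = err  # remember the failure and try again
    raise RuntimeError(
        f"no network generated after {max_attempts} attempts") from last_error


def make_lfr_graph(n=250):
    """Illustrative use with the networkx LFR generator (parameter values are examples)."""
    import networkx as nx  # imported here so the retry helper has no dependencies

    # No fixed seed is passed: retrying with the same integer seed would
    # simply repeat the same failed attempt.
    return generate_with_retries(
        lambda: nx.LFR_benchmark_graph(n, tau1=3, tau2=1.5, mu=0.1,
                                       average_degree=5, min_community=20),
        retry_on=(nx.ExceededMaxIterations,),
    )
```

Capping the number of attempts keeps a badly chosen parameter set from looping forever.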

Barabasi-Albert (BA) Networks

See networkx barabasi_albert_graph

The Barabasi-Albert algorithm generates random scale-free networks using a preferential attachment mechanism. The power law degree distribution of the BA network is relevant to some human networks (e.g., the internet, citation networks, some social networks). However, BA networks do not have broad degree distributions or levels of assortativity, transitivity, or community structure that are reasonable for many such networks. That said, the networkx BA generator is fast and reliable, so it can sometimes be useful for rapid testing and prototyping with the SEIRS+ models.

SEIRS+ network generation tools

FARZ Networks

The FARZ algorithm generates networks with built-in community structure and broad, heavy-tailed distributions for the degree of nodes and sizes of communities. The FARZ algorithm has parameters for average degree, number of communities, strength of the community structure, the transitivity, assortativity (degree correlation), and the distribution of the community sizes. The tunability of these properties makes the FARZ generator an attractive method for generating contact networks for use with SEIRS+.

Code implementing the FARZ algorithm can be found on GitHub, and a version of that generator function is included in the FARZ.py module of the SEIRS+ package.

FARZ Parameters

| Parameter | Description | Default Value |
| --- | --- | --- |
| n | number of nodes | REQUIRED |
| m | number of edges created per node | REQUIRED |
| k | number of communities | REQUIRED |
| beta | probability of edge formation within communities, rather than between (strength of community structure) | 0.8 |
| alpha | strength of the common neighbors' effect on edge formation (tunes transitivity, clustering) | 0.5 |
| gamma | strength of the degree similarity effect on edge formation (tunes assortativity) | 0.5 |
| r | maximum number of communities each node can belong to | 1 |
| q | probability of a node belonging to multiple communities | 0.5 |
| phi | constant added to all community sizes; higher values make the communities more balanced in size, and 1 results in a power law community size distribution | 10 |
| epsilon | probability of noisy/random edges | 0.0000001 |
| t | probability of also connecting to the neighbors of a node that each node connects to (tunes transitivity, clustering) | 0 |
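One way to keep these settings organized is to hold the documented defaults in a dict and override only what a scenario needs. The helper below is a hypothetical convenience, not part of the FARZ module; how the resulting dict is handed to the generator is not covered on this page and is left out here.

```python
# Defaults copied from the parameter table above.
FARZ_DEFAULTS = {
    "beta": 0.8, "alpha": 0.5, "gamma": 0.5,
    "r": 1, "q": 0.5, "phi": 10,
    "epsilon": 0.0000001, "t": 0,
}


def farz_params(n, m, k, **overrides):
    """Combine the required n, m, k with the defaults and any overrides."""
    unknown = set(overrides) - set(FARZ_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown FARZ parameters: {sorted(unknown)}")
    return {"n": n, "m": m, "k": k, **FARZ_DEFAULTS, **overrides}


# Example: stronger clustering than the default, everything else unchanged.
example_params = farz_params(n=1000, m=5, k=20, alpha=2.0)
```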

Demographic Community Network

We define a function for generating community-level contact networks with realistic network properties as well as age-stratification, households, and communities (e.g., schools, workplaces) that are calibrated to demographic statistics for a population of interest.

Each node is assigned an age bracket (0-9, 10-19, … 70-79, 80+) according to a population-level age distribution (e.g., from census data). FARZ network layers are generated to represent the out-of-household regular contacts amongst individuals of certain age groups (i.e., children, adults, seniors). FARZ networks have a community structure, parameterized in this function such that half of an individual's connections are with members of their own community and half are with individuals from outside it. Separate FARZ network layers are generated for the 0-9 age group (communities can be thought of as primary schools), the 10-19 age group (communities can be thought of as secondary schools), the 20-59 age group (communities can be thought of as workplaces), and the 60+ age group. The degree distributions of these networks are broad with a heavy tail. The mean degree for each layer is calibrated to the average number of contacts by age group from this study.

Nodes are divvied up into households such that the distribution of household sizes and the household age demographics match the data provided to the function. All of the nodes in a household are strongly connected, which rivets together the network layers for each age group. The resulting graph ends up resembling age-age interaction matrices estimated by this study (but this data isn’t used directly).

In the SEIRS+ network models, there is also a probability p of well-mixed global interactions (nodes interacting with a randomly drawn node from anywhere in the network), which is an avenue for both within- and between-age-group contacts.

This network generation function can also return versions of the same contact network where social distancing and/or age group isolation ("cocooning") has been applied.

Demographic calibration

This function generates networks that are calibrated to age distribution, household size, and household age composition figures that are specified by the user. The function expects these statistics to be provided in a dict that has the following structure (figures shown are from US census data).

household_data = {
                   'age_distn':{'0-9': 0.121, '10-19': 0.131, '20-29': 0.137, '30-39': 0.133, '40-49': 0.124, '50-59': 0.131, '60-69': 0.115, '70-79': 0.070, '80+'  : 0.038  },
                   'household_size_distn':{ 1: 0.284, 2: 0.345, 3: 0.151, 4: 0.128, 5: 0.058, 6: 0.023, 7: 0.012 },
                   'household_stats':{ 'pct_with_under20': 0.337,                      # percent of households with at least one member under 20
                                       'pct_with_over60': 0.380,                       # percent of households with at least one member over 60
                                       'pct_with_under20_over60':  0.034,              # percent of households with at least one member under 20 and at least one member over 60
                                       'pct_with_over60_givenSingleOccupant': 0.110,   # percent of households with a single-occupant that is over 60
                                       'mean_num_under20_givenAtLeastOneUnder20': 1.91 # number of people under 20 in households with at least one member under 20
                                     }
                 }
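Before passing a dictionary like this to the generator, it can help to sanity-check it. The sketch below is a hypothetical pre-flight check, not something the package is known to perform: it confirms that each distribution sums to roughly 1 and reports the implied mean household size. The tolerance is an assumption that allows for rounding in published figures.

```python
age_distn = {'0-9': 0.121, '10-19': 0.131, '20-29': 0.137, '30-39': 0.133,
             '40-49': 0.124, '50-59': 0.131, '60-69': 0.115, '70-79': 0.070,
             '80+': 0.038}
household_size_distn = {1: 0.284, 2: 0.345, 3: 0.151, 4: 0.128,
                        5: 0.058, 6: 0.023, 7: 0.012}


def distribution_problems(name, distn, tol=0.01):
    """Return a list of human-readable problems found in a probability dict."""
    problems = []
    if any(prob < 0 for prob in distn.values()):
        problems.append(f"{name} contains a negative probability")
    total = sum(distn.values())
    if abs(total - 1.0) > tol:
        problems.append(f"{name} sums to {total:.3f}, expected about 1")
    return problems


def mean_household_size(size_distn):
    """Expected household size, renormalized in case of rounding error."""
    total = sum(size_distn.values())
    return sum(size * prob for size, prob in size_distn.items()) / total


problems = (distribution_problems('age_distn', age_distn)
            + distribution_problems('household_size_distn', household_size_distn))
```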

Layer definitions

The age brackets to be included in each network layer and the target mean degrees for each layer are defined by the layer_info dictionary. The default dictionary defining the layers is the following, which is based on degree data from this study:

layer_info  = { '0-9':   {'ageBrackets': ['0-9'],   'meanDegree': 8.6,  'meanDegree_CI': (0.0, 17.7) },
                '10-19': {'ageBrackets': ['10-19'], 'meanDegree': 16.2, 'meanDegree_CI': (12.5, 19.8) },
                '20-59': {'ageBrackets': ['20-29', '30-39', '40-49', '50-59'], 'meanDegree': ((age_distn_given20to60['20-29']+age_distn_given20to60['30-39'])*15.3 + (age_distn_given20to60['40-49']+age_distn_given20to60['50-59'])*13.8), 'meanDegree_CI': ( ((age_distn_given20to60['20-29']+age_distn_given20to60['30-39'])*12.6 + (age_distn_given20to60['40-49']+age_distn_given20to60['50-59'])*11.0), ((age_distn_given20to60['20-29']+age_distn_given20to60['30-39'])*17.9 + (age_distn_given20to60['40-49']+age_distn_given20to60['50-59'])*16.6) ) },
                '60+':   {'ageBrackets': ['60-69', '70-79', '80+'], 'meanDegree': 13.9, 'meanDegree_CI': (7.3, 20.5) } }

The user can provide their own dictionary with the same structure to override the default layer definitions above.
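The default dictionary refers to age_distn_given20to60, which is not defined in the excerpt above. The sketch below assumes it is the age distribution renormalized over the four brackets from 20 to 59, and shows how the '20-59' mean degree and its interval would follow from the example US age distribution under that assumption.

```python
age_distn = {'0-9': 0.121, '10-19': 0.131, '20-29': 0.137, '30-39': 0.133,
             '40-49': 0.124, '50-59': 0.131, '60-69': 0.115, '70-79': 0.070,
             '80+': 0.038}


def conditional_distn(age_distn, brackets):
    """Age distribution restricted to the given brackets and renormalized."""
    total = sum(age_distn[b] for b in brackets)
    return {b: age_distn[b] / total for b in brackets}


# Assumed meaning of the name used in the default layer_info dictionary.
age_distn_given20to60 = conditional_distn(
    age_distn, ['20-29', '30-39', '40-49', '50-59'])


def blend(value_20to39, value_40to59):
    """Population-weighted average of a 20-39 figure and a 40-59 figure."""
    g = age_distn_given20to60
    return ((g['20-29'] + g['30-39']) * value_20to39
            + (g['40-49'] + g['50-59']) * value_40to59)


mean_degree_20to59 = blend(15.3, 13.8)                      # about 14.6
mean_degree_ci_20to59 = (blend(12.6, 11.0), blend(17.9, 16.6))
```

A custom layer_info dictionary for another population could reuse the same weighting with that population's age distribution.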

Social distancing

This function can optionally return a version of the generated network where social distancing has been applied by using the edge pruning mechanism of the custom_exponential_graph() function (also included in this package). The user provides a list of distancing magnitude values to the distancing_scales argument of the generate_demographic_contact_network() function (which are passed to the scale argument of the custom_exponential_graph() function); the smaller the scale value, the more edge pruning and thus distancing is applied. A version of the generated network is returned for every distancing scale in the list provided to distancing_scales (in addition to the baseline network).

Age group isolation

This function can optionally return a version of the generated network where some age groups have been isolated by having their out-of-household connections removed (within-household connections remain). The user provides a list of age group labels ('0-9', '10-19', '20-29', etc.) to the isolation_groups argument of the generate_demographic_contact_network() function. If one or more age groups are provided to isolation_groups, a version of the generated network is returned where all specified age groups have had their out-of-household edges removed (in addition to the baseline network).
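The isolation rule described above can be sketched on a plain edge list. This is an illustrative reimplementation under simple assumptions (nodes are integer IDs, age_labels is indexed by node ID, households is a list of lists of node IDs), not the package's own code, and the function name is hypothetical.

```python
def isolate_age_groups(edges, age_labels, households, isolation_groups):
    """Drop out-of-household edges that touch a node in an isolated age group."""
    household_of = {node: h_id
                    for h_id, members in enumerate(households)
                    for node in members}
    isolated = set(isolation_groups)
    kept = []
    for u, v in edges:
        same_household = (u in household_of
                          and household_of[u] == household_of.get(v))
        touches_isolated = age_labels[u] in isolated or age_labels[v] in isolated
        # Within-household edges always remain; other edges remain only if
        # neither endpoint belongs to an isolated age group.
        if same_household or not touches_isolated:
            kept.append((u, v))
    return kept


# Node 0 is in the '70-79' group and shares a household with node 1.
age_labels = ['70-79', '30-39', '30-39', '10-19']
households = [[0, 1], [2, 3]]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
isolated_edges = isolate_age_groups(edges, age_labels, households, ['70-79'])
```

In this toy example only the edge between nodes 0 and 2 is removed, since it is the one out-of-household edge that involves the isolated group.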

The function that performs this network generation has the following arguments

| Argument | Description | Data Type | Default Value |
| --- | --- | --- | --- |
| N | total number of nodes in the population | int | REQUIRED |
| demographic_data | dictionary specifying age and household composition distributions (see Demographic calibration for more info) | dict | REQUIRED |
| layer_generator | the algorithm to use in generating each network layer ('FARZ' or 'LFR') | string | 'FARZ' |
| layer_info | dictionary specifying the age groups and mean degree targets for each network layer (see Layer definitions for more info) | dict | None (use default layers) |
| distancing_scales | list of social distancing scales for which versions of the network should be returned (see Social distancing for more info) | list | [] |
| isolation_groups | list of age groups for which a version of the network should be returned with their out-of-household edges removed (see Age group isolation for more info) | list | [] |

This function returns the following

| Returned | Description | Data Type |
| --- | --- | --- |
| graphs | list of graphs (networks) generated; always includes the baseline network, and includes distancing and/or age group isolation versions as applicable | list of networkx Graph objects |
| individualAgeBracketLabels | list of the age groups assigned to each of the N nodes | list of strings |
| households | list of lists giving the node IDs for each household | list of lists of ints |

Workplace Network

We define a function for generating contact networks that resemble workplaces and other multi-level modular groups.

FARZ network layers are generated to represent cohorts of employees (e.g., departments, floors, shifts). FARZ networks have a tunable community structure, so each cohort includes some number of communities, which can be thought to represent teams (i.e., groups of employees that work closely with each other). Employees may belong to more than one team (specified by a FARZ parameter), but employees belong to only one cohort. An employee's intra-team and intra-cohort contacts are defined by the FARZ cohort network they belong to. A specified percentage of each employee's total number of workplace contacts can be with individuals from other cohorts. An employee's inter-cohort contacts are drawn randomly from the pool of individuals outside their own cohort.

The number of cohorts, number of employees per cohort, number of teams per cohort, number of teams employees belong to, mean intra-cohort degree, percent of within- and between-team connections, and percent of intra- and inter-cohort connections can be controlled with the arguments to this function (some of which are passed as parameters to the FARZ generator).

Distancing

This function can optionally return a version of the generated network where social distancing has been applied by using the edge pruning mechanism of the custom_exponential_graph() function (also included in this package). The user provides a list of distancing magnitude values to the distancing_scales argument of this function (which are passed to the scale argument of the custom_exponential_graph() function); the smaller the scale value, the more edge pruning and thus distancing is applied. A version of the generated network is returned for every distancing scale in the list provided to distancing_scales (in addition to the baseline network).

The function that performs this network generation has the following arguments

| Argument | Description | Data Type | Default Value |
| --- | --- | --- | --- |
| num_cohorts | number of cohort layers to generate | int | 1 |
| num_nodes_per_cohort | number of nodes per cohort, i.e., number of nodes per FARZ layer, FARZ param n (can provide a single value to use in all cohorts or a list of values for each cohort) | int or list | 100 |
| num_teams_per_cohort | number of teams per cohort, i.e., number of communities per FARZ layer, FARZ param k (can provide a single value to use in all cohorts or a list of values for each cohort) | int or list | 10 |
| mean_intracohort_degree | mean number of within-cohort contacts per individual, i.e., mean degree per FARZ layer, FARZ param m (can provide a single value to use in all cohorts or a list of values for each cohort) | int or list | 6 |
| pct_contacts_intercohort | percentage of each employee's total workplace contacts (total degree) that are inter-cohort interactions | float | 0.2 |
| farz_params | dictionary specifying parameters for the FARZ network generator, other than params n, k, and m, which are given by the arguments above | dict | {'alpha':5.0, 'gamma':5.0, 'beta':0.5, 'r':1, 'q':0.0, 'phi':1, 'b':0, 'epsilon':1e-6} |
| distancing_scales | list of social distancing scales for which versions of the network should be returned (see Distancing for more info) | list | [] |

This function returns the following

| Returned | Description | Data Type |
| --- | --- | --- |
| workplaceNetwork | dictionary of graphs (networks) generated; always includes the baseline network, and includes distancing versions as applicable | dict of networkx Graph objects |
| cohorts_indices | list of lists giving the node IDs belonging to each cohort | list of lists of ints |
| teams_indices | list of lists giving the node IDs belonging to each team | list of lists of ints |

custom_exponential_graph() function

This function defines an edge pruning mechanism that returns a modified version of a graph where a subset of the original edges have been removed.

Here is the process:

  • For each node:
    • Count the number of neighbors of the node N
    • Draw a random number R from an exponential distribution with some mean=M. If R > N, set R=N.
    • Randomly select R of this node’s neighbors to keep, delete the edges to all other neighbors.

This results in a network whose edges are a subset of the original network's edges and whose mean degree has been decreased.
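The pruning steps listed above can be sketched in pure Python on a graph stored as a dict mapping each node to the set of its neighbors. This is an illustrative reimplementation, not the package's custom_exponential_graph() function. The function name, the adjacency-dict representation, and the details of how the draw is truncated to an integer are assumptions.

```python
import random


def prune_edges_exponential(adjacency, mean, min_keep=0, seed=None):
    """Return a pruned copy of an undirected graph given as {node: set(neighbors)}."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    rng = random.Random(seed)
    pruned = {node: set(nbrs) for node, nbrs in adjacency.items()}  # copy the input
    for node in pruned:
        neighbors = sorted(pruned[node])          # neighbors still attached to node
        n = len(neighbors)
        r = int(rng.expovariate(1.0 / mean))      # exponential draw with the given mean
        r = max(min(r, n), min(min_keep, n))      # cap at N, keep at least min_keep
        keep = set(rng.sample(neighbors, r))
        for nbr in neighbors:
            if nbr not in keep:
                # Removing an edge affects both endpoints of the undirected graph.
                pruned[node].discard(nbr)
                pruned[nbr].discard(node)
    return pruned


# A ring of six nodes pruned with a small mean loses most of its edges.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
distanced = prune_edges_exponential(ring, mean=0.5, seed=1)
```

Because nodes visited later may drop edges that an earlier node chose to keep, min_keep in this sketch is a floor only at the moment each node is processed.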

This method is useful for generating quarantine or social distancing versions of a baseline contact network.

The function that performs this network modification has the following arguments

| Argument | Description | Data Type | Default Value |
| --- | --- | --- | --- |
| base_graph | the base graph that is to be modified (if None is provided, this function will generate a BA network as a starting point using the optional m and n arguments) | networkx Graph object | None |
| scale | the scale (mean) of the exponential distribution used in the edge pruning method (denoted M above); the smaller the scale, the more edges are pruned | int | 100 |
| min_num_edges | a minimum number of edges to ensure all nodes are left with after pruning | int | 0 |
| m | the m argument of the networkx BA network generator (only relevant if no base_graph is provided) | int | 9 |
| n | the size (number of nodes) of the networkx BA network to be generated (only relevant if no base_graph is provided) | int | None |