# Lecture 8


Topics:

* Link clustering
* Percolation

## Link Clustering

* A simple way to obtain overlapping clusters: cluster the links; each node then belongs to all clusters of its adjacent edges.
* Natural: an edge typically captures a single relation within one community.

Example: Link clustering by Ahn et al.
Tries to capture the topological role of edges (what do they connect to?). Only links that share a node, i.e. lie in a connected neighbourhood, are compared.


* Step 1: Define link similarity - by neighbourhood of nodes connected to them $$S(e_{ik}, e_{jk}) = \frac{|n_+(i)\cap n_+(j)|}{|n_+(i)\cup n_+(j)|}$$\
  $n_+(i)$: the set of neighbours of node $i$, including $i$ itself.\
  $S$ measures the relative number of common neighbours $i$ and $j$ have (1 for same neighbours).\
  The less the overlap, the smaller $S$.
* Step 2: Apply hierarchical clustering - using similarity matrix, single linkage. Assign nodes to adjacent edge clusters (weighted).
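
A minimal sketch of the similarity from Step 1, assuming the graph is stored as a dictionary `adj` that maps each node to the set of its neighbours. The function name `link_similarity` and this representation are illustrative choices, not taken from Ahn et al.

```python
def link_similarity(adj, i, j, k):
    """Similarity S(e_ik, e_jk) of two links that share the node k.

    adj maps each node to the set of its neighbours (assumed representation).
    """
    if k not in adj[i] or k not in adj[j]:
        raise ValueError("both links must be attached to the shared node k")
    n_plus_i = adj[i] | {i}  # neighbours of i, including i itself
    n_plus_j = adj[j] | {j}  # neighbours of j, including j itself
    return len(n_plus_i & n_plus_j) / len(n_plus_i | n_plus_j)
```

In a triangle the two unshared end nodes have identical inclusive neighbourhoods, so $S = 1$; on a path $i - k - j$ only $k$ is shared, so $S = \frac{1}{3}$.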

Computational complexity:
* Step 1: Comparing two links requires $\max(k_1, k_2)$ steps. For scale-free networks with degree exponent $\gamma$, the step has complexity $\mathcal{O}(N^\frac{2}{\gamma - 1})$, determined by the largest degree.
* Step 2: Hierarchical clustering requires $\mathcal{O}(M^2)$, i.e. $\mathcal{O}(N^2)$ for sparse networks.

#### More on this
This is probably a really good and brief summary of the topic. For my own understanding, I just want to add some more detailed explanation from the Ahn et al. source and possibly the networks book.

In the authors' view in 2010, link clustering already appeared to be a clearly superior method.

The interesting question is of course whether this is still seen that way today, which can probably be judged by whether implementations are available in the common packages.

There is essentially only one public implementation, linkcomm, an R package. It can be used to generate, visualise and analyse link communities. There is also an accompanying paper.

One could also look at Connected Papers, but there is not a great deal to see there. In any case, there is no paper that is simply a "link communities part 2" or similar.



## Spectral Clustering

* Uses the Eigenvalues of the Laplacian matrix $L$.
  An $N$-dimensional vector $v$ is an Eigenvector of a matrix $A$ with Eigenvalue $\delta$ (scalar) if $Av = \delta v$ (multiplying by $A$ results in a vector in the same or opposite direction, just scaled)
* $L$ has no negative Eigenvalues.
* The smallest Eigenvalue of $L$ is 0, with the all-ones vector as Eigenvector (each row calculates the degree minus the sum of adjacent entries per node).
* The multiplicity of Eigenvalue 0 (i.e. the number of linearly independent Eigenvectors for it) equals the number of connected components in the graph.
* Obtain the $k$ Eigenvectors associated with the $k$ smallest Eigenvalues of $L$ (standard calculation in algebra packages)
* Represent each node as a $k$-dimensional vector
* Cluster the nodes as points in space using $k$-means clustering (or use any other method)
* The spectral clustering algorithm is related to cut-set partitioning - it can be thought of as solving a continuous relaxation of this problem. (No further details here)


## Conductance

* Modularity is investigated and used quite often, but problematic for interpretation (can be unintuitive).
* One simple alternative (out of many - just to mention more than one): **Conductance**
* Conductance of a cluster $C$ measures the fraction of total edge volume that points outside of the cluster. $f(C) = \frac{O_C}{2M_C + O_C}$ where $O_C$ is the number of edges leaving $C$ and $M_C$ the number of edges in $C$.
* Compared to modularity, it is not based on a network model.
* Optimisation for a graph is also NP-hard

* (Variant of spectral clustering: Use only one Eigenvector (second), sort vertices by component value, split at the lowest conductance.)
* See Yang&Leskovec for practical evaluation of measures and their correlation.
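
A minimal sketch of the conductance formula above, assuming an undirected graph stored as a dictionary `adj` mapping each node to the set of its neighbours (the function name and representation are illustrative assumptions).

```python
def conductance(adj, cluster):
    """f(C) = O_C / (2 * M_C + O_C) for an undirected graph.

    adj maps each node to the set of its neighbours (assumed representation).
    """
    cluster = set(cluster)
    internal_ends = 0  # every internal edge is seen from both ends: 2 * M_C
    leaving = 0        # edges with exactly one end in the cluster: O_C
    for u in cluster:
        for v in adj[u]:
            if v in cluster:
                internal_ends += 1
            else:
                leaving += 1
    volume = internal_ends + leaving
    return leaving / volume if volume else 0.0
```

For two triangles joined by a single edge, one triangle has $M_C = 3$ and $O_C = 1$, which gives $f(C) = \frac{1}{7}$.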

## Threats and Robustness

Robust: strong, healthy, unlikely to break or fail\
From Latin robustus, made out of robur (hardwood/oak)

For networks?

Networks as a model representation for real-world processes and structures

* Computer networks
* Power networks
* Physical contacts between humans
* ...

Different potential, requirements and impact of changes in network structure.

Structure-function relation, can we predict behaviors?

Important, e.g., for communication or commodity networks.

**Analysis of network robustness / resilience against damage or attacks**

Investigate the ability of a system to maintain its connectivity and function, tolerance to errors.

Damage: Errors in function or random failure.\
Attack: Deliberate manipulation / interference - probably specific entities (hubs) targeted

Most simple damage: deletion of nodes and edges, e.g. failure of internet router or power transmission line.

Examples: Power Grid

You can also have combined effects in interrelated networks...

Single local events vs. contagious processes (later)
Impact of event - does the graph still have a certain property / set of properties?

Most simple event: deletion of node or edge - e.g. is the remaining graph still k-connected, do we have a dominating large component?

Impact of process - how do the dynamics proceed and spread certain states or properties?

Technical / digital networks

Error:

* Failure of internet router or power transmission line.

Attacks:

* Bluetooth viruses - need physical proximity similar to influenza / pandemic. Can also be used to simulate spreading or control contact.
* SMS/MMS viruses from contact list - similar to computer virus rather than influenza.

In general:

Modelling might need information on entities - e.g. states or compartmentalisation as susceptible, infected, recovered, immune, latent (not yet contagious but exposed).

Entities can switch between states or accumulate several. Often only certain trajectories are possible, e.g. immune $\rightarrow$ infected is not possible.



### Percolation
> how do phenomena spread?

The term "percolation" was introduced in 1957 by Broadbent and Hammersley in connection with their new class of mathematical problems. These problems concerned the flow of a liquid through a random maze, and thus the name "percolation theory".

* Internet routers fail (always a fraction) $\rightarrow$ How does the remainder perform?
* People get immunised $\rightarrow$ How does a disease spread through a population?
* Proteins misfold, DNA mutates $\rightarrow$ What is the impact on the organism's health?
* Understanding of mechanisms (metabolism, gene regulation, ...) through knock-outs

In some networks, entities have limited time span, e.g. actors or scientists publishing.\
Knock-on effects (domino, chain reaction) of hampering or benefitting others, e.g. through threshold for herd immunity.

How to investigate the corresponding effects? Percolation theory\
Percolation: Removal / non-functioning of nodes or edges - site / bond percolation

**Basic strategies for node removal (site-percolation):**

* uniformly random (random drop-outs - error)
* according to their degree (attack goal or more complex thus higher failure rate)
* high betweenness centrality
* ...

**Basic properties for investigation of impact:**

* Connectivity of structure: components/size
* Largest cluster size / containment probability
* Average shortest path

Different classes/instances of networks might perform quite differently.

Note the similarity (inverse process) to our investigation of growing networks.

Percolation quantity $\sigma$ = probability of nodes functioning\
Functional nodes: occupied ($\sigma$ also occupation quantity)\
$\sigma = 1$: all nodes present / functioning, $\sigma = 0$: no nodes present

For large percolation quantity still mostly connected - giant components exist.

Network breaks at some threshold.

Remember the dynamic forming process of evolving networks. We can look at such processes from two directions - growing or disintegrating. (Note the difference - probability of edges vs. nodes)

Giant component breaks apart / forms at some threshold

Here: A percolation transition occurs. The network percolates at formation of giant component.

To distinguish the parts that result from the disintegration from the components of the original graph, they are called clusters in context of percolation.

Note that this is a quite universal phenomenon in many cases:

* Water to ice
* Magnetism
* Whipped cream

Let's consider a very simple model - a square lattice (infinite)\
At each intersection, with probability $p$ we place a pebble.\
Neighbouring pebbles are considered connected - they form clusters.

* What is the expected size of the largest cluster (similar to giant component)?
* What is the average cluster size?

The higher $p$, the larger the clusters. \
Given our experience with giant component formation, intuition:

* Cluster size does not gradually change with $p$
* For large range many tiny clusters before approaching critical value $p_C$
* Large cluster emerges at $p_C \Rightarrow$ phase transition "percolate whole lattice" (considered infinite)
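
The pebble model can be tried out numerically. The following is a small sketch on a finite `size` x `size` lattice with open boundaries (the lecture considers an infinite lattice, so the transition is only approximated); the function name and parameters are illustrative assumptions.

```python
import random
from collections import deque


def largest_cluster_fraction(size, p, rng):
    """Place a pebble on each site with probability p and return the size of
    the largest cluster divided by the total number of sites."""
    occupied = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    seen = [[False] * size for _ in range(size)]
    largest = 0
    for row in range(size):
        for col in range(size):
            if not occupied[row][col] or seen[row][col]:
                continue
            # Breadth-first search over the cluster containing (row, col).
            seen[row][col] = True
            queue = deque([(row, col)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < size and 0 <= nx < size
                            and occupied[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            largest = max(largest, count)
    return largest / (size * size)
```

Calling it for a range of $p$ values with, e.g., `rng = random.Random(0)` and plotting the result should show the largest cluster staying small for low $p$ and growing sharply in a narrow range of $p$.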


The percolation transition can be characterised through power laws.

**Cluster size $\bar s$:** average size of all finite clusters for a given $p$

$$\bar s \sim |p - p_C|^{-\gamma}$$
i.e. cluster size diverges when approaching $p_C$


**Order parameter $P_\infty$:** probability that a pebble belongs to the largest cluster
$$P_\infty \sim (p - p_C)^\beta$$
i.e., drops to zero as $p$ decreases towards $p_C$

**Correlation length:** mean distance between two sites on the same finite cluster, diverges at $p_C$.
$$\xi \sim |p - p_C|^{-\nu}$$

$\gamma, \beta, \nu$ are called critical exponents of the medium (here: the grid).


Note that $p_C$ depends on lattice type and dimension.

E.g.

* $p_C \approx 0.593$ for 2-dim. square lattice
* $p_C = 0.5$ for 2-dim. triangular lattice
* $p_C \approx 0.3116$ for cubic lattice (need to cover less to reach percolation - degree)

Interestingly, the critical exponents are universal in the sense that they depend only on the dimensionality $d$, not on the lattice type.

For $d \geq 6$, we even have $\gamma = 1, \beta = 1, \nu = \frac{1}{2}$, i.e.
independence of $d$ (note: similar to random networks).

Interpretation as forest fire: Each pebble a tree, which ignites its neighbours after catching fire.

Fire spreads until no burning tree has a non-burning neighbour. Cluster sizes and average path length - what fraction burns down and how quickly?

At critical point (middle) large component burns down, but it takes a very long time.

Clearly, we can investigate the process from the opposite direction: Given a certain structure, what happens if we remove a fraction $f$ of the pebbles?

Probability to be contained in large cluster drops sharply at critical fraction.

In summary, the breakdown of a lattice under random node removal is not a gradual process.

**Removal of a small fraction has limited impact.**\
**Sudden breakdown with a phase transition.**

$0 \leq f < f_C$: There is a giant component. $P_\infty \sim |f - f_C|^\beta$

$f = f_C$: The giant component vanishes.

$f > f_C$: The lattice breaks into many tiny components.


* How does this work with our usual graph models?


#### Uniform Removal for CM

Consider the configuration model (CM) with specified degree distribution $P_k$.

Calculate giant percolation cluster $C$ properties similar to giant component investigation.

Consider still present node $v$:

* $v$ in $C$ then connected to $C$ via its neighbours
* $v$ not in $C$ then not connected via its neighbours

Let $u$ be the average probability that a node is not connected to $C$ via a particular neighbour $w$. For a node of degree $k$, the total probability of not being connected to $C$ is then $u^k$.

Averaging over the degree distribution $P_k$, the average probability of not being in the giant cluster is $$\sum_k P_k u^k = g_0(u)$$ where $g_0(z) = \sum_{k=0}^\infty P_k z^k$ is the generating function for the degree distribution (simply a power series).

Average probability to belong to giant cluster $1 - g_0(u)$.

Note that this is for a node that has NOT been removed. The fraction $S$ of all nodes that are in the giant cluster is given by the non-removed fraction $\phi$ times this probability: $$S = \phi [1 - g_0(u)]$$

But what is $u$? It is the average probability that a node is not connected to $C$ via a particular neighbour $w$.

Similar to the giant component analysis, but there are two ways here: neighbour $w$ is removed (probability $1 - \phi$), or it is present but not in $C$ (probability $\phi u^k$).

$w$ is not connected to $C$ via any of its $k$ other edges (not the one to $v$): $u^k$ for functional nodes

Thus, the total probability to be not connected to $C$ via $w$ is $1 - \phi + \phi u^k$

Here $k$ is the excess degree of $w$ as discussed previously, following distribution $q(k) = \frac{k + 1}{\bar k} P_{k+1}$.

We can average over the distribution $$u = \sum_{k = 0}^\infty q(k)(1-\phi + \phi u^k) = 1 - \phi + \phi \sum_{k=0}^\infty q(k)u^k = 1-\phi + \phi g_1(u)$$ where $g_1(z) = \sum_{k=0}^\infty q(k)z^k$ is the generating function for excess degree distribution as seen before.

Note that $\sum_{k=0}^\infty q(k)=1$ as sum of probabilities.


Averaging over the distribution thus gives the self-consistency equation $u = 1 - \phi + \phi g_1(u)$, but it is often still not possible to solve it in closed form.

Intuitive way of graphical representation depending on degree distribution:

* $g_1$ is a power series with only non-negative coefficients (probabilities)
* for non-negative $u$ it must have a non-negative value (as must all its derivatives).


Now multiply by $\phi$ (compress) and then shift upwards by $1 - \phi$.\
Points at which the curve crosses the line $y=u$ are solutions.
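
When no closed form is available, the crossing point can be found numerically. The following is a sketch under the assumption of a Poisson degree distribution with mean degree $c$, for which $g_0(z) = g_1(z) = e^{c(z-1)}$; the function name and the simple fixed-point iteration are illustrative choices.

```python
import math


def giant_cluster_fraction(c, phi, tol=1e-12, max_iter=100_000):
    """Iterate u <- 1 - phi + phi * g1(u) from u = 0 and return
    S = phi * (1 - g0(u)), assuming a Poisson degree distribution."""
    def g(z):
        # For Poisson degrees, g0 and g1 coincide: exp(c * (z - 1)).
        return math.exp(c * (z - 1.0))

    u = 0.0
    for _ in range(max_iter):
        u_next = 1.0 - phi + phi * g(u)
        converged = abs(u_next - u) < tol
        u = u_next
        if converged:
            break
    return phi * (1.0 - g(u))
```

Starting from $u = 0$, the iteration increases monotonically towards the smallest solution, so it ends at $u = 1$ (no giant cluster, $S = 0$) only when no smaller crossing point exists. Convergence becomes slow close to the critical value of $\phi$.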

At the percolation threshold the curve is tangent to the line $y=u$ at $u=1$, i.e. it has gradient 1 there:


Derivative $\left[\frac{d}{du}(1 - \phi + \phi g_1(u))\right]_{u=1} = \phi g_1'(1) = 1$

Thus the value of $\phi$ at the transition (critical value) is $\phi_C = \frac{1}{g_1'(1)}$

Given $g_1(z)= \sum_{k=0}^\infty q(k)z^k$ and $q(k)=\frac{k+1}{\bar k} P_{k+1}$ we have
$$g_1'(1) = \frac{1}{\bar k} \sum_{k=0}^\infty k(k+1)P_{k+1} = \frac{1}{\bar k} \sum_{k=0}^\infty k(k-1) P_k = \frac{\bar{k^2} - \bar k}{\bar k}$$

Then $\phi_C = \frac{1}{g_1'(1)} = \frac{\bar k}{\bar{k^2} - \bar k}$

This is the minimum fraction of vertices that must be present / occupied in a CM network for a giant cluster to exist.

In practice, e.g. for our communication networks / the internet, we would like to make $\phi_c$ low so that a giant cluster exists even when some fraction of nodes is non-functional.

Need $\bar{k^2} \gg \bar k$

For a Poisson distribution with mean degree $c = \bar k$ we have $\bar{k^2} = c(c+1)$ and $$P_k = e^{-c} \frac{c^k}{k!}$$

Thus $\phi_c = \frac{1}{c}$

Make $c$ large to withstand attacks - large average degree. For $c=4$, $\phi_c = \frac{1}{4}$, meaning three quarters of the nodes need to fail before the giant cluster is destroyed, i.e. the network is robust against random failure.
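
As a rough illustration, the threshold formula can be evaluated from the first two moments of a degree sequence. This sketch plugs empirical moments into the CM result, which is only an approximation for a concrete finite network; the function name is an assumption.

```python
import math


def critical_occupation(degrees):
    """phi_c = <k> / (<k^2> - <k>) computed from a list of node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    excess = mean_k2 - mean_k
    # If <k^2> <= <k>, no occupation probability yields a giant cluster.
    return mean_k / excess if excess > 0 else math.inf
```

A regular network with all degrees equal to 3 gives $\phi_c = \frac{3}{9 - 3} = \frac{1}{2}$, whereas a sequence with one large hub and many degree-1 nodes gives a much smaller value because the hub dominates the second moment.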

**Internet degree closer to scale-free though, not Poissonian. $\bar{k^2}$ can be very large for such networks, thus $\phi_c$ very small.**

In theory, performance is hardly affected by any outage even though structure not 1:1 to the model.

In practice very robust but you still have an outage for all the nodes that are taken out...



* **Molloy-Reed criterion**: Any randomly wired (CM) network (any deg. dist.) has a giant component if $\frac{\bar{k^2}}{\bar k} > 2$
* Compare $g'_1(1) = \frac{1}{\bar k} \sum_{k=0}^\infty k(k -1) P_k = \frac{\bar{k^2} - \bar k}{\bar k}$ (average excess degree) and our critical threshold $\phi_c = \frac{1}{g'_1(1)} = \frac{\bar k}{\bar{k^2} - \bar k}$, CM avg. Neighbour degree $\frac{\bar{k^2}}{\bar k}$


($g_0(z) = \sum_{k=0}^\infty P_k z^k$ is the generating function for the degree distribution, $g_1(z) = \sum_{k=0}^\infty q(k) z^k$ is the generating function for the excess degree distribution, and at least a fraction $\phi_c$ of the nodes needs to be still there)

* For a random network we had second moment $\bar{k^2} = \bar k (1 + \bar k)$; then MR gives $\frac{\bar k (1 +\bar k)}{\bar k} = 1 + \bar k > 2$, i.e. $\bar k > 1$ (intuition behind MR: a node in the GC should on average be connected to at least two other nodes)

### Percolation - Scale Free Networks
* Compare lattice to scale-free: Hubs are present
* Impact of random removal?

Given that hubs are rare but exist and connect many elements, random removal will hit a hub, and thereby destroy many connections, only with low probability.

Hitting a hub however is bad.

**Percolation Simulation**

* Attack on scale-free networks
* Rapid breakdown under removal of hubs
* Single runs (not averaged)
* Fraction of nodes in largest CC vs. Nodes removed
    * Random node removal in random graph
    * Random node removal in scale-free
    * Dashed line: Maximum possible
    * orange: vs original graph size
    * blue: vs remainder


**Speeding up Calculations**

Multiple runs better (average results, make curves smooth) but significant impact on running time
* Main calculation effort is finding the clusters – e.g. BFS $O(n+m)$
* One run: Remove nodes one by one, calculate clusters from scratch – $O(n(n+m))$

Instead: Can we detect the clusters for $k$ nodes based on the last result?

* If we remove nodes, we need to check for cluster split – have a giant cluster for a long time.
* If we add nodes, we exactly know if clusters are merged by checking cluster labels.

1) Start with single node (cluster), label with cluster ID
2) Add nodes one by one.\
    For each adjacent edge, check whether its other end node has already been added; if so, add the edge, and if two clusters are merged (different IDs, always the case for the node's first edge), relabel the smaller cluster. Keep cluster info (size).

Maximum number of relabelings per node: $\log(n)$, since the cluster of a relabelled node at least doubles in size (otherwise more than $n$ nodes merged) – $O(n \log n)$ relabelings in total.\
Effort per node: Relabel plus average edges followed $\frac{2m}{n}$ (mean degree)

$O\left(\left(1 + \frac{m}{n}\right) \cdot n \log (n)\right) = O((m+n) \log(n))$. Repeat many times.
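
The node-addition procedure can be sketched as follows, assuming the graph is a dictionary `adj` of neighbour lists and `order` is the sequence in which nodes are added (for a removal experiment, the reverse of the removal order). Names and data layout are illustrative assumptions, not a reference implementation.

```python
def largest_cluster_curve(adj, order):
    """Add nodes one by one and record the largest cluster size after each
    addition, relabelling the smaller cluster whenever two clusters merge."""
    label = {}    # node -> cluster ID
    members = {}  # cluster ID -> list of nodes in that cluster
    largest = 0
    curve = []
    for node in order:
        label[node] = node
        members[node] = [node]
        for neighbour in adj[node]:
            if neighbour not in label:
                continue  # end node not added yet, edge stays inactive
            a, b = label[node], label[neighbour]
            if a == b:
                continue  # both ends already in the same cluster
            if len(members[a]) < len(members[b]):
                a, b = b, a  # make b the smaller cluster
            for v in members[b]:
                label[v] = a
            members[a].extend(members.pop(b))
        largest = max(largest, len(members[label[node]]))
        curve.append(largest)
    return curve
```

Reading the returned curve backwards gives the largest cluster size as nodes are removed in the reverse order, so one pass replaces a cluster search from scratch after every removal.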



### Robustness

* Maintain basic functions in the presence of errors, i.e. missing nodes and links.
* Cover both attacks (specific nodes) as well as random errors.
* Example Star Graph:
    * Random failure: breakdown probability $\frac{1}{N}$
    * Attack: direct breakdown for core $\bar k = \frac{16}{9}$
* Example Triangle Fan:
    * More robust against attack $\bar k = \frac{32}{9}$
    * However, cost is often correlated with $\bar k$

* Any way to maximise robustness without increasing the cost?
* Decrease the critical transition value $\phi_c$ (or increase $f_c = 1 - \phi_c$)
* As $\phi_c = \frac{1}{g'_1 (1)} = \frac{\bar k}{\bar{k^2} - \bar k}$ it depends only on $\bar k$ and $\bar{k^2}$
* As $\bar k$ is correlated with our cost, we need to maximise $\bar{k^2}$ while keeping $\bar k$
* Bimodal distribution: Nodes with either degree $k_{min}$ or $k_{max}$
* $p_k=(1-r)\delta(k - k_{min}) + r \delta (k - k_{max})$ for $r$ fraction with maximum degree
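
As a worked step (a direct substitution into the formulas above, not taken from the lecture), the moments of the bimodal distribution are

$$\bar k = (1-r)k_{min} + r k_{max}, \qquad \bar{k^2} = (1-r)k_{min}^2 + r k_{max}^2$$

so that the critical value becomes

$$\phi_c = \frac{\bar k}{\bar{k^2} - \bar k} = \frac{(1-r)k_{min} + r k_{max}}{(1-r)k_{min}(k_{min}-1) + r k_{max}(k_{max}-1)}$$

For fixed $\bar k$, the denominator grows with $k_{max}$, which suggests why a small fraction $r$ of very high-degree nodes lowers $\phi_c$.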


Example: One hub
* With small $\bar k$ only hub holds network together.
* With larger $\bar k$ GC without hub, but hub still.
* With even larger $\bar k$ no difference between attack and error.

Optimised if single node with maximum degree and the rest minimum degree.

But wait – didn't we have that already in the star and said it is not sufficient?

There we had one hub with $k_{max}$ and the rest had $k_{min}$.

But if $k_{min}>1$ then even without the hub the remainder must form a large connected component that is robust against targeted attacks.



## Spreading of Phenomena

What are the effects of failure / exposure events?

Starting events at single locations or small areas have the potential to cause failures / problems throughout a diversity of networks – failures, disease contraction, rumours...

* Disease spreading in social networks – e.g. Covid, influenza. Contacts among family, friends, work, etc. can be modelled as networks. What is the transmission structure, and how can we influence it? Super-spreaders?

* Contagion in technical networks – e.g. computer or bluetooth virus. Short range (similar to human disease) and long range transmission possible.

Cascading effects:
* power networks (note: much more complicated than just looking at the structure)
* financial crisis (One bank topples, panic starts, ...)
* transport delays (FAA estimation 2018: 28 billion delay cost)


### Models for Cascading Failures
Influential factors:
* Network structure
* Propagation process
* Failure criteria for components

Differ for different applications and instances, but we can identify and model universal effects across different systems.

Modeling needs a decision on some abstraction level.

Cascade characterisation
* Some flow / exchange over the network, e.g. power, commodity, or information
* Local breakdown rule for each component that determines when it contributes to cascade by failing (power grid, finances, internet) or passing commodity / information (social networks)
* Redistribution mechanism of the system for the traffic from one node to another upon failure or activation.

Different abstraction levels.


### Failure Propagation Model

Introduced for modelling of information spreading.

Assumes network with arbitrary degree distribution.
* Each node contains an agent in state 0 (active/healthy) or 1 (inactive or failed)
* Same breakdown threshold value $\phi = \phi_i$ for all nodes $i$. Healthy node breaks down if at least fraction $\phi$ of neighbours has failed.

Simulation:
* Start with all agents' states as 0
* At time $t_0$ one agent switches to state 1 (component failure or information release)
* In each subsequent time step, randomly pick an agent and update the state with the threshold rule.
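
A small sketch of this simulation, assuming the network is a dictionary `adj` of neighbour lists. Instead of picking one random agent per time step, it sweeps over all healthy agents in random order until nothing changes; since failed agents never recover under the threshold rule, the final set of failed agents should be the same. The function name is an assumption.

```python
import random


def threshold_cascade(adj, seed_node, phi, rng):
    """Return the set of failed (state 1) nodes after the cascade started at
    seed_node has stopped; a node fails once at least a fraction phi of its
    neighbours has failed."""
    failed = {seed_node}
    changed = True
    while changed:
        changed = False
        healthy = [v for v in adj if v not in failed]
        rng.shuffle(healthy)  # random update order within a sweep
        for v in healthy:
            neighbours = adj[v]
            if not neighbours:
                continue  # isolated nodes cannot be pushed over the threshold
            failed_fraction = sum(w in failed for w in neighbours) / len(neighbours)
            if failed_fraction >= phi:
                failed.add(v)
                changed = True
    return failed
```

On a star with four leaves and $\phi = 0.4$, starting at the centre takes down every leaf (their only neighbour has failed), while starting at a leaf leaves the centre at a failed fraction of $\frac{1}{4} < 0.4$, so the perturbation dies out immediately.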

Initial perturbation of states might die out immediately depending on local structure.

Cascading in small network with $\phi = 0.4$\
Purple: Inactive node

Starting with A, the whole network is inactive within two steps

What happens when we start with B?

* Subcritical regime: With high $\bar k$ it is unlikely that a change moves nodes over threshold.
* Supercritical regime: With small $\bar k$ flipping a single node can put several neighbours over threshold, triggering a cascade, major breakdown.
* Critical regime: Boundary with widely different cascade sizes. In Barabási's simulations, the probability distribution of the cascade size follows $p(s) \sim s^{-\alpha}$, where $\alpha$ is the avalanche exponent, with $\alpha = \frac{3}{2}$ for random networks.


### Branching Model

* Hard to analytically predict scaling behavior of cascades (avalanches) in propagation model
* Most simple model: Branching model
* Failure cascades resemble a branching process on the network structure – the initial trigger node is the root, creating branches of triggered nodes in subsequent steps.

1) Start with a single active node
2) In each step, each active node produces $k$ offspring, where $k$ is selected from a $p_k$ distribution.
3) If a node selects $k=0$, the branch dies out. Otherwise we have $k>0$ new active sites.

The avalanche size corresponds to the size of the tree when all active sites have died out.
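
The three steps can be sketched as a short simulation. Here `offspring_probs[k]` is assumed to be the probability $p_k$ of producing $k$ offspring, and `max_size` is an artificial cap (an assumption of this sketch) so that avalanches that do not die out still terminate.

```python
import random


def avalanche_size(offspring_probs, rng, max_size=10_000):
    """Simulate one branching process and return the size of the resulting
    tree, capped at max_size."""
    offspring_counts = list(range(len(offspring_probs)))
    active = 1  # step 1: a single active root node
    size = 1
    while active > 0 and size < max_size:
        # Step 2: every active node draws its number of offspring from p_k.
        children = sum(rng.choices(offspring_counts, weights=offspring_probs, k=active))
        size += children
        active = children  # step 3: nodes that drew k = 0 end their branch
    return min(size, max_size)
```

Collecting the sizes of many runs, e.g. with `rng = random.Random(0)`, gives an empirical avalanche size distribution for a chosen $p_k$.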

Same phases as in the cascading failures model.
Determined by $\bar k$, the mean of the $p_k$ distribution.

Examples of created trees depending on the regime. Green node marks root.

* Subcritical regime: $\bar k < 1$ on average less than one offspring. Exponential distribution of sizes.
* Supercritical regime: $\bar k > 1$ on average more than one offspring, all avalanches are global.
* Critical regime: $\bar k = 1$ on average exactly one offspring. Some trees are large, others die out quickly. Avalanche size distribution follows a power law.

Can be solved analytically. E.g.: if $p_k$ is scale-free, the avalanche exponent $\alpha$ depends on the degree exponent $\gamma$ following $\alpha = \begin{cases} \frac{3}{2}, & \gamma \geq 3 \\
\frac{\gamma}{\gamma - 1}, & 2 \leq \gamma < 3
\end{cases}$

The fact as such is not surprising – depending on how your degrees are distributed, the branching can succeed in a similar fashion.

Further models exist, in particular for specific use-cases (some more in the literature)




## Epidemics

Epidemic modelling – use the structure of the transmission network and the transmission mechanisms to investigate the dynamics of spreading.

Differing transmission mechanisms regarding e.g. direct entity contacts / remote, same time / asynchronous, length of contact or availability of exposition...

E.g. information or rumour spreading – can be indirect by making information available somewhere.

Epidemic:
1) an outbreak of disease that spreads quickly and affects many individuals at the same time; an outbreak of epidemic disease.
2) an outbreak or product of sudden rapid spread, growth, or development.

Analysis similar to our previous work on structure and percolation analysis – create models that describe scenarios with specific characteristics.

These models simplify spreading characteristics – consider e.g. the complex response of organisms to an infection, depending also on multiple factors such as the current health status, immune system, ...

Simplify  these processes to simple states with potential state changes / sequences.

Often states = compartments, thus compartmental models.

Note that now we also need to take into account time for durations of states.

Most simple case: Two states, susceptible and infected. **SI model**
* Susceptible (S): "Healthy" (does not have disease), but could catch disease if contact
* Infected / Infectious (I): Contagious people that can spread the infection

Note that
* an infected person could be healthy and
* after infection often it takes time until being infectious

No modeling of life times etc.

If a reasonable network structure is missing: Homogeneous mixing (fully mixed / mass-action approximation) – each individual has an equal chance per unit time to come into contact with every other.

Not very realistic – remember our degree discussion for random networks.

Modeling the dynamics:
* Expected / average number of susceptible individuals at time $t$: $S(t)$ (short: $S$)
* Expected / average number of infected individuals at time $t$: $I(t)$ (short: $I$)

Note: Average not integer even though actual numbers have to be.

Transmission process:
* People meet at random, susceptible individuals contract the infection from infected ones.
* Per-individual rate $\beta$ (randomly chosen contacts with others per unit time)

Average probability to meet susceptible persons in population of $N$ people: $\frac{S}{N}$

Average susceptible contacts of infected person per unit time: $\beta \frac{S}{N}$

* I infected individuals on average, thus average rate of new infections $\beta \frac{S\cdot I}{N}$
* Infected rate of change (diff. eq.) $\frac{dI}{dt} = \beta \frac{S\cdot I}{N}$
* Susceptible rate of change $\frac{dS}{dt} = -\beta \frac{S\cdot I}{N}$ accordingly as $S + I = N$

For convenience $s = \frac{S}{N}, i = \frac{I}{N}$ then rates $\frac{di}{dt} = \beta s i, \frac{ds}{dt} = -\beta s i$

With $s=1-i$ we get $\frac{di}{dt} = \beta (1 - i)i$,
the logistic equation (known from population growth models). Solving gives $$i(t) = \frac{i_0 e^{\beta t}}{1 - i_0 + i_0 e^{\beta t}}$$ with value $i_0$ at $t=0$.

An S-shaped growth curve: initially increasing exponentially, then saturating.
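
As a quick check of the logistic solution, the closed form can be compared with a simple forward-Euler integration of the rate equation for the infected fraction. Function names and the step size `dt` are illustrative assumptions.

```python
import math


def si_fraction(t, beta, i0):
    """Closed-form infected fraction of the fully mixed SI model at time t."""
    growth = i0 * math.exp(beta * t)
    return growth / (1.0 - i0 + growth)


def si_euler(t_end, beta, i0, dt=1e-3):
    """Integrate di/dt = beta * (1 - i) * i with forward Euler steps."""
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += dt * beta * (1.0 - i) * i
    return i
```

With, e.g., `beta = 0.5` and `i0 = 0.01`, both functions should agree closely, and the infected fraction approaches 1 for large $t$, since nobody recovers in the SI model.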

Note that there are many infectious diseases that don't spread like this. Also deaths are not taken into account.

* For many diseases, individuals recover and for a certain time are not susceptible.
* Additional state: recovered (R)
* Note: Death has the same effect in this model (removed)

Two-stage dynamics:
* Contacts and infections as before, contact rate $\beta$
* Second stage: At constant rate $\gamma$, infected individuals recover (or die)
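
Under the homogeneous-mixing assumption, and writing $r$ for the recovered fraction by analogy with $s$ and $i$ (notation assumed here), these two stages would lead to the standard SIR rate equations:

$$\frac{ds}{dt} = -\beta s i, \qquad \frac{di}{dt} = \beta s i - \gamma i, \qquad \frac{dr}{dt} = \gamma i, \qquad s + i + r = 1$$

The infection term is the same as in the SI model; the only new term is the outflow $\gamma i$ from the infected to the recovered state.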

Length of time $\tau$ that an infected individual stays infected?

Probability of recovery in an interval $\delta \tau$ is $\gamma \delta \tau$, of not recovering $1 - \gamma \delta \tau$.

Probability
* of still being infected after total time $\tau$ is $\lim_{\delta \tau \rightarrow 0} (1 - \gamma \delta \tau)^{\frac{\tau}{\delta \tau}} = e^{-\gamma \tau}$
* that individual remains infected for time $\tau$ and then recovers in interval $\tau$ to $\tau + d\tau$ is the above times $\gamma d \tau: \gamma e^{-\gamma \tau} d\tau$ (exponential distribution)
* of not getting infected $e^{-\beta \tau}$, transmission probability then $\phi = 1 - e^{- \beta \tau}$ (SI: $\phi = 1$)

The probability of recovering shortly after infection is thus high, but the state of infection might also last very long (compared to the mean infectious time $\frac{1}{\gamma}$). This is unlike real-world infections (where we have a narrow peak around some average, not the exponential decline).



