# Total size distribution of Continuous-Time small outbreaks: $N \to \infty$

We now study the total number infected in small outbreaks in the $N \to \infty$ limit.  We do this by studying a Galton-Watson process with $r_2 = \beta$ and $r_0=\gamma$ having 

$$
\hat{\mu}(x) = \frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma}x^2
$$
and starting with $X(0)=1$.

Although it is possible to solve the Forward Kolmogorov equations for this model analytically, the resulting solution only tells us the probability of having a given size at each given time.  The solution does not tell us about how past sizes and current sizes are correlated, and so it does not directly give us the total size distribution.

Instead we will use a different approach.  We first make some observations about the trees that emerge from this continuous-time Galton-Watson process.  A sample small outbreak is shown in {numref}`fig-BinaryTreeWithTime`, for which each infection event corresponds to the "death" of a node and replacement by two nodes.  
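
The continuous-time process just described can be sketched in a few lines of Python.  This is a minimal illustration rather than a definitive implementation: the function name, the `max_infected` cutoff and the returned tuple are all assumptions made for this example.  It relies on the standard fact that, with $I$ nodes each having an event at rate $\beta+\gamma$, the next event anywhere occurs at rate $I(\beta+\gamma)$.

```python
import random


def simulate_small_outbreak(beta, gamma, rng, max_infected=1000):
    """Simulate I(t) starting from a single infected individual.

    Returns (times, counts, total_infected), or None if the number of
    currently infected individuals reaches max_infected (hypothetical
    cutoff used here to label the outbreak as "not small").
    """
    t, infected, total = 0.0, 1, 1
    times, counts = [t], [infected]
    while infected > 0:
        if infected >= max_infected:
            return None
        # Waiting time until the next event among all current nodes.
        t += rng.expovariate(infected * (beta + gamma))
        if rng.random() < beta / (beta + gamma):
            infected += 1  # k=2: a node is replaced by two nodes
            total += 1     # one new individual has been infected
        else:
            infected -= 1  # k=0: recovery
        times.append(t)
        counts.append(infected)
    return times, counts, total


result = simulate_small_outbreak(beta=1.5, gamma=1.0, rng=random.Random(3))
if result is not None:
    times, counts, total = result
    print(f"{len(times) - 1} events, {total} individuals infected")
```

Each event in the returned trajectory corresponds to one node of the Galton-Watson tree, so the number of events plays the role of the number of nodes.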

```{figure} BinaryTreeWithTime.png
---
width: 400px
name: fig-BinaryTreeWithTime
---
A sample illustration of $I(t)$ for a small outbreak (which could be SIS or SIR), using a Galton-Watson conceptualization.  Each node persists for an exponentially distributed random time with rate $\beta+\gamma$.  At that point it is replaced by either $k=0$ (crosses) or $k=2$ nodes with identical properties. The corresponding probabilities are $p_0=\gamma/(\beta+\gamma)$ and $p_2=\beta/(\beta+\gamma)$.  In the disease conceptualization, the $k=0$ case corresponds to recovery.  The $k=2$ case corresponds to the infected individual infecting another individual, with the same individual represented by different nodes before and after the event (we can think of the original infected individual being the left offspring and the newly infected individual being the right offspring).  Although this plot shows $9$ nodes, it corresponds to exactly $5$ total individuals.
```
We can convert this outbreak to a different representation which is simpler to analyze (but loses the time dependence):

```{figure} BinaryTreeNoTime.png
---
width: 400px
name: fig-BinaryTreeNoTime
---
The same outbreak as in {numref}`fig-BinaryTreeWithTime`, but without the time dependence.  The number of offspring of each node is more clearly visible.  Again, an individual in the disease model may be represented by multiple nodes in this Galton-Watson representation.
```

To determine the probability of a particular size, we will want to calculate the probability of a tree like the one in {numref}`fig-BinaryTreeNoTime`.  To do this, we first look for properties of the tree.

The following properties are relatively easy to confirm:

- Each node in the Galton-Watson process has either $k=0$ or $k=2$ offspring.  
- Aside from the initial node, each node has a single parent.
- The total number of nodes with $k=0$ corresponds to the total number of individuals infected (each infected individual eventually recovers exactly once).

If the total number of *nodes* in the tree is $j$, then because every node except the first has a single parent, the total number of parent-offspring pairs is $j-1$.  This count is also the sum of the offspring counts, so $\sum k_i = j-1$.  Since every non-zero $k_i$ equals $2$, there are $\frac{j-1}{2}$ nodes with $2$ offspring (so $j$ must be odd).  The remaining $j - \frac{j-1}{2} = \frac{j+1}{2}$ nodes have $k=0$, so the total number of infections is $\frac{j+1}{2}$.  For example, the $9$ nodes in {numref}`fig-BinaryTreeWithTime` give $(9+1)/2 = 5$ individuals.

We do not yet have enough information to calculate the probability of {numref}`fig-BinaryTreeNoTime`.  First, we will convert it into a sequence of $k_i$, then we will analyze the probability of that sequence.  There are two natural ways to order the nodes in {numref}`fig-BinaryTreeNoTime`, shown in {numref}`fig-BFSvsDFS`.

```{figure} BFSvsDFS.png
---
name: fig-BFSvsDFS
width: 600px
---
The difference between Breadth-First-Search (BFS) and Depth-First-Search (DFS) in a tree.  In BFS we start from the root and find all nodes at distance $1$, then all nodes at distance $2$, etc.  Each time we encounter a node, we record its number of offspring.  In DFS we first travel down one branch recursively, "exhausting" each branch before looking at the next branch.
```
Many people more naturally consider Breadth-First-Search (BFS).  However, for us it will be useful to use Depth-First-Search (DFS).  DFS can be compared to how we might expect inheritance to travel in a royal family in Europe.  The king's first son has priority, and that son's sons come next, etc.  So long as there is at least one son along that branch of the family tree, the king's second son is not considered.

The advantage of DFS for the proof we will do later is that, in DFS, a node's first offspring immediately follows it in the list, and all of its descendants form one contiguous block after it.  In BFS, the location of a node's offspring in the list depends on what happens in other parts of the tree.  It will be much easier to reconstruct a tree from its sequence of offspring counts in the DFS case than in the BFS case.

The sequence of offspring counts in the DFS case is called a Łukasiewicz word:
```{prf:definition} Łukasiewicz word.
:label: def-LukWord

If we find a list of nodes $v_1, \ldots, v_j$ through a Depth-First-Search of a Galton-Watson tree, then the sequence $\mathcal{S}=(k_1, \ldots, k_j)$, where $k_i$ is the number of offspring of $v_i$, is called a **Łukasiewicz word**.
```
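
As a minimal sketch of this definition, the Python below computes the Łukasiewicz word of a small hypothetical tree and, for comparison, the offspring counts in BFS order.  The `children` dictionary, the node labels and the function names are illustrative assumptions; the tree is not claimed to be the one drawn in the figures.

```python
from collections import deque


def lukasiewicz_word(children, root):
    """Offspring counts in depth-first order (a node before its descendants)."""
    word = []
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        word.append(len(kids))
        # Push in reverse so that the leftmost offspring is visited first.
        stack.extend(reversed(kids))
    return word


def bfs_word(children, root):
    """Offspring counts in breadth-first order (generation by generation)."""
    word = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        kids = children.get(node, [])
        word.append(len(kids))
        queue.extend(kids)
    return word


# Hypothetical 9-node tree in which every node has 0 or 2 offspring.
children = {
    "a": ["b", "c"],
    "b": ["d", "e"],
    "c": ["h", "i"],
    "e": ["f", "g"],
}

print(lukasiewicz_word(children, "a"))  # [2, 2, 0, 2, 0, 0, 2, 0, 0]
print(bfs_word(children, "a"))          # [2, 2, 2, 0, 2, 0, 0, 0, 0]
```

Both lists have $9$ entries summing to $8$, but only the DFS list keeps each subtree in one contiguous block.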


## The Cycle Lemma

An important part of our proof is the Cycle Lemma.  Before giving it, we must
define a cyclic permutation:

```{prf:definition} Cyclic Permutation
:label: definition-CyclicPerm

Given a sequence $\mathcal{S} = (s_1, s_2, \ldots, s_j)$, the $j$ cyclic permutations of $\mathcal{S}$ are: 

\begin{align*}
&(s_1, s_2, \ldots, s_j)\\
&(s_2, s_3, \ldots, s_j, s_1)\\
&(s_3, s_4, \ldots, s_j, s_1, s_2)\\
& \vdots\\
& (s_j, s_1, s_2, \ldots, s_{j-1})
\end{align*}
```
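
For concreteness, the definition can be written as a one-line Python helper.  The function name is an assumption made for this sketch.

```python
def cyclic_permutations(seq):
    """Return the len(seq) cyclic permutations of seq, starting with seq itself."""
    seq = tuple(seq)
    return [seq[i:] + seq[:i] for i in range(len(seq))]


print(cyclic_permutations((2, 0, 0)))  # [(2, 0, 0), (0, 0, 2), (0, 2, 0)]
```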

Now we are ready to give the Cycle Lemma:
```{prf:lemma} Cycle Lemma
:label: lemma-CycleLemma

Given a sequence $S$ of $j$ non-negative integers summing to $j-1$, there is a unique tree whose Łukasiewicz word is one of the cyclic permutations of $S$.
```

```{prf:remark}
Note that in the trees above $k$ was restricted to $0$ or $2$, but the cycle lemma does not have this restriction.
```

The proof will proceed by induction and create the unique tree whose Łukasiewicz word is a cyclic permutation of $S$.  Implicitly the inductive proof creates the following algorithm:

```{prf:algorithm} Constructing a tree from $S$
:label: algorithm-GenerateTree

**Input** 
- A length-$j$ sequence $S$ of non-negative integers summing to $j-1$

**Output**
- The unique tree whose Łukasiewicz word is a cyclic permutation of $S$

**Steps**

1. Place $j$ nodes $u_1$, $u_2$, $\ldots$, $u_j$ clockwise around a circle with $u_1$ at the top.  
2. Label each $u_i$ with $s_i$.
3. Repeat the following steps as long as more than one node remains:
   **(i)** Identify a pair of adjacent nodes $u_m$, $u_n$ on the current cycle such that $u_n$ immediately follows $u_m$ clockwise, $s_m>0$, and $s_n=0$.  There may be multiple such pairs; the choice is arbitrary.
   **(ii)**  Add an edge from $u_m$ to $u_n$.  Remove $u_n$ from the cycle, and reduce $s_m$ by $1$.  
```
Note that each pass through step 3 reduces the number of nodes on the cycle by $1$ and reduces the sum of the $s_i$ by $1$.  Thus the sum is always one less than the number of nodes.  As long as at least two nodes remain, this means there is at least one positive label and at least one zero label, which guarantees that there is always a pair of nodes for step (3.i) until only one node remains.
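
The steps above can be sketched in Python as follows.  This is an illustrative implementation under stated assumptions: nodes are identified by their $0$-based position in $S$ (so index `i` plays the role of $u_{i+1}$), the first eligible pair found is always the one used, and the function name and return format are hypothetical.

```python
def build_tree(S):
    """Return (root, edges) built from S by the cycle construction.

    Each edge is a pair (parent_index, offspring_index) of 0-based
    positions in S.
    """
    j = len(S)
    if j == 0 or min(S) < 0 or sum(S) != j - 1:
        raise ValueError("S must hold j non-negative integers summing to j-1")
    labels = list(S)        # current labels s_i (decremented as edges are added)
    cycle = list(range(j))  # nodes still on the cycle, in clockwise order
    edges = []
    while len(cycle) > 1:
        for pos, m in enumerate(cycle):
            n = cycle[(pos + 1) % len(cycle)]  # node that follows m clockwise
            if labels[m] > 0 and labels[n] == 0:
                edges.append((m, n))
                labels[m] -= 1
                cycle.remove(n)
                break
        else:
            # Cannot happen for valid input; kept as a safety net.
            raise RuntimeError("no eligible pair found")
    return cycle[0], edges


S = [0, 2, 0, 2, 0]
root, edges = build_tree(S)
print(root, edges)          # 1 [(1, 2), (3, 4), (3, 0), (1, 3)]
print(S[root:] + S[:root])  # [2, 0, 2, 0, 0]
```

Reading $S$ cyclically from the root gives the cyclic permutation that is the Łukasiewicz word of the constructed tree.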

```{prf:proof}
Consider a length-$j$ sequence $S = (s_1, s_2, \ldots, s_j)$ of non-negative integers that sum to $j-1$.  

We will use induction on $j$.  If $j=1$, then $S=(0)$.  This is a Łukasiewicz word, and the corresponding tree is simply the isolated node $u_1$ with no offspring.

Now consider $j \geq 2$.  Place the nodes in a cycle and label them following the first two steps of {prf:ref}`algorithm-GenerateTree`.  We will prove that there is a unique tree that can be constructed with the nodes on this cycle whose ordering corresponds to a Depth-First-Search (starting from whichever node ends up being the root).  

The $s_i$ sum to $j-1$ and are all non-negative integers.  Because $0 < j-1< j$, we are guaranteed at least one value of $0$ and one non-zero value.  It follows that somewhere there is a non-zero value $s_i>0$ which is followed immediately by $s_{i+1}=0$ (taking indices modulo $j$, so that if $i=j$ then $s_{j+1}=s_1$).

No matter what cyclic permutation of $S$ we consider, if it is a Łukasiewicz word there must be an edge from $u_i$ to $u_{i+1}$.  Add an edge from $u_i$ to $u_{i+1}$.  

Consider now the new sequence $\hat{S} = (s_1, \ldots, s_{i-1}, s_i-1, s_{i+2}, \ldots, s_j)$, obtained by deleting $s_{i+1}$ and reducing $s_i$ by $1$, which has length $j-1$ and sum $j-2$.  By the inductive hypothesis, there is a unique tree on the $j-1$ nodes whose Łukasiewicz word is a cyclic permutation of $\hat{S}$, and this tree is constructed by {prf:ref}`algorithm-GenerateTree`.  Add the edges of this tree.  Along with the $u_i$ to $u_{i+1}$ edge, we have a new tree with $j-1$ edges and $j$ nodes.  This tree is the only possible tree constructed with this cyclic orientation.

We find the (unique) root of this tree $u_\ell$, and create the sequence $(s_\ell, s_{\ell+1}, \ldots, s_j, s_1, \ldots, s_{\ell-1})$.  This is a cyclic permutation of $S$ and it is a Łukasiewicz word.  
```
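
The lemma can also be checked by brute force for small $j$.  The sketch below assumes a standard characterization that is not stated in the text: a sequence is a Łukasiewicz word exactly when the running count of "nodes still to be visited" (starting at $1$ and changing by $k_i - 1$ at each entry) first reaches $0$ at the final entry.  The function names are hypothetical.

```python
from itertools import product


def is_lukasiewicz(word):
    """True if word is the DFS offspring sequence of some tree."""
    pending = 1  # nodes discovered but not yet visited (initially the root)
    for idx, k in enumerate(word):
        pending += k - 1
        if pending == 0:
            # The tree is complete here, so this must be the last entry.
            return idx == len(word) - 1
    return False


def count_lukasiewicz_rotations(S):
    """Number of the len(S) cyclic permutations of S that are Łukasiewicz words."""
    S = tuple(S)
    return sum(is_lukasiewicz(S[i:] + S[:i]) for i in range(len(S)))


# Exhaustive check: every valid sequence has exactly one such rotation.
for j in range(1, 7):
    for S in product(range(j), repeat=j):
        if sum(S) == j - 1:
            assert count_lukasiewicz_rotations(S) == 1
```

The check covers all non-negative integer sequences of length at most $6$, not only those with entries in $\{0, 2\}$.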



## Finding the probability of a Galton-Watson tree with $j$ nodes

We consider a Galton-Watson tree as above with $p_0 = \frac{\gamma}{\beta+\gamma}$ and $p_2 = \frac{\beta}{\beta+\gamma}$.  When we construct the Łukasiewicz word, at each stage the probability that the next node has $k=0$ or $k=2$ is an independent choice with probabilities $p_0$ and $p_2$.  This means that the probability of a tree with a specific length-$j$ Łukasiewicz word is equal to the probability that a length-$j$ sequence of values chosen independently from the offspring distribution is that Łukasiewicz word.

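We now consider the set of all length-$j$ sequences that sum to $j-1$.   Any two sequences that are cyclic permutations of one another have equal probability, because they contain the same values in a different order.  By {prf:ref}`lemma-CycleLemma`, each such sequence has $j$ cyclic permutations, of which exactly one is a Łukasiewicz word.  This means that the combined probability of all length-$j$ sequences that sum to $j-1$ is $j$ times the combined probability of all length-$j$ Łukasiewicz words.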

Now we observe that the probability a length-$j$ sequence sums to $j-1$ is equal to the coefficient of $x^{j-1}$ of the product $\hat{\mu}(x)^j$. So

\begin{align*}
\mathbb{P}[j \text{ nodes}] &= \frac{1}{j} \left[x^{j-1}\right]\big( \hat{\mu}(x)^j\big)\\
   &= \frac{1}{j} \left[x^{j-1}\right] \left(\left(\frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma} x^2 \right)^j\right)\\
   &= \frac{1}{j} \left[x^{j-1}\right] \sum_{k=0}^j  \binom{j}{k} \frac{\gamma^k}{(\beta+\gamma)^k} \frac{\beta^{j-k} x^{2(j-k)}}{(\beta+\gamma)^{j-k}}\\
   &= \frac{1}{j} \frac{1}{(\beta+\gamma)^j}\left[x^{j-1}\right] \sum_{k=0}^j  \binom{j}{k} \gamma^k\beta^{j-k}x^{2(j-k)}\\
   &= \begin{cases} 
   \frac{1}{j} \frac{1}{(\beta+\gamma)^j} \binom{j}{\frac{j+1}{2}} \gamma^{(j+1)/2}\beta^{(j-1)/2} & j \text{ odd}\\
   0 & j \text{ even}
   \end{cases}
   \end{align*}
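
As a sanity check on the algebra, the following sketch multiplies out $\hat{\mu}(x)^j$ with exact rational arithmetic and compares $\frac{1}{j}[x^{j-1}]\hat{\mu}(x)^j$ with the closed form in the last line.  The function names and the choice $\beta = 3/2$, $\gamma = 1$ are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def prob_nodes_by_expansion(j, beta, gamma):
    """(1/j) [x^(j-1)] mu_hat(x)^j, computed by expanding the polynomial."""
    p0 = Fraction(gamma) / (Fraction(beta) + Fraction(gamma))
    p2 = 1 - p0
    poly = [Fraction(1)]  # poly[n] is the coefficient of x^n
    for _ in range(j):
        new = [Fraction(0)] * (len(poly) + 2)
        for power, coeff in enumerate(poly):
            new[power] += coeff * p0      # multiply by the constant term
            new[power + 2] += coeff * p2  # multiply by the x^2 term
        poly = new
    return poly[j - 1] / j


def prob_nodes_closed_form(j, beta, gamma):
    """Closed form for the probability of a tree with exactly j nodes."""
    if j % 2 == 0:
        return Fraction(0)
    beta, gamma = Fraction(beta), Fraction(gamma)
    half = (j + 1) // 2
    return (Fraction(comb(j, half), j)
            * gamma**half * beta**(half - 1) / (beta + gamma)**j)


for j in range(1, 12):
    expanded = prob_nodes_by_expansion(j, Fraction(3, 2), 1)
    assert expanded == prob_nodes_closed_form(j, Fraction(3, 2), 1)
```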
   
   Since $\ell = (j+1)/2$ is the total number of individuals infected, we conclude that the probability of $\ell$ infections is

\begin{align*}
\mathbb{P}[\ell \text{ infections}] &= \frac{1}{2\ell-1} \frac{1}{(\beta+\gamma)^{2\ell-1}} \binom{2\ell-1}{\ell} \gamma^\ell \beta^{\ell-1}\\
&= \frac{\gamma^\ell \beta^{\ell-1}}{(\beta+\gamma)^{2\ell-1}} \frac{1}{2\ell-1}\frac{(2\ell-1)!}{\ell! (\ell-1)!}\\
&= \frac{1}{\ell}\frac{\gamma^\ell \beta^{\ell-1}}{(\beta+\gamma)^{2\ell-1}} \frac{(2\ell-2)!}{(\ell-1)! (\ell-1)!}\\
&= \frac{1}{\ell}\frac{\gamma^\ell \beta^{\ell-1}}{(\beta+\gamma)^{2\ell-1}} \binom{2\ell-2}{\ell-1}
\end{align*}

```{prf:example} Small Outbreak Size Distribution if $\beta = 3\gamma/2$

If $\beta = 3 \gamma/2$ as in the simulations performed in earlier sections, then 

\begin{align*}
\mathbb{P}[\ell \text{ infections}] &= \frac{1}{\ell}\frac{(3/2)^{\ell-1}}{(5/2)^{2\ell-1}} \binom{2\ell-2}{\ell-1}
\end{align*}
This leads to

|Size | Probability|
|--|--|
|$1$|$2/5 = 0.4$|
|$2$|$12/125 =0.096$|
|$3$|$3^22^4/5^5 = 0.04608$|

```
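
The table in the example can be reproduced, and compared against simulation, with a short script.  This is a rough sketch under assumptions: the function names, the sample count, the seed and the `cap` used to give up on large outbreaks are all illustrative choices, and the simulation ignores event times because only the total size matters here.

```python
import random
from collections import Counter
from fractions import Fraction
from math import comb


def prob_infections(ell, beta, gamma):
    """Closed-form probability of exactly ell infections."""
    beta, gamma = Fraction(beta), Fraction(gamma)
    return (Fraction(comb(2 * ell - 2, ell - 1), ell)
            * gamma**ell * beta**(ell - 1) / (beta + gamma)**(2 * ell - 1))


def sample_total_infected(p2, rng, cap=200):
    """Total infections in one outbreak, or None once cap is reached."""
    infected, total = 1, 1
    while infected > 0:
        if rng.random() < p2:  # next event is an infection
            infected += 1
            total += 1
            if total >= cap:
                return None    # treated as not a small outbreak
        else:                  # next event is a recovery
            infected -= 1
    return total


def estimate_size_distribution(beta, gamma, samples=20_000, seed=1):
    """Monte Carlo estimate of the small-outbreak size probabilities."""
    rng = random.Random(seed)
    p2 = beta / (beta + gamma)
    sizes = Counter(sample_total_infected(p2, rng) for _ in range(samples))
    return {s: n / samples for s, n in sizes.items() if s is not None}


estimates = estimate_size_distribution(beta=1.5, gamma=1.0)
for ell in (1, 2, 3):
    exact = prob_infections(ell, Fraction(3, 2), 1)
    print(ell, exact, float(exact), estimates.get(ell, 0.0))
```

The exact column should match the table entries, while the simulated column is only expected to agree up to sampling error.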