# von Foerster Hazard Rate Models

## Background

The canonical von Foerster relationship states that for a stochastic process on age $a$ and time $t$, the infinitesimal generator of probability is the kernel of the evolution of the expectation. Formally, given the [hazard rate](https://en.wikipedia.org/wiki/Hazard_ratio) $h$ of the stopping time statistic $A$ of an age dependent stochastic process:

$$
\mathbb{P}\left[ A = a \parallel A \ge a \right] = h
$$

and the conditional expectation of density of states statistic $N$:

$$
\mathbb{E}\left[N \parallel A=a, T=t\right]=n
$$

the hazard rate $h$ is the kernel of the [von Foerster forward evolution operator](https://en.wikipedia.org/wiki/Von_Foerster_equation) acting on the expected density of states $n$ along the age $a$ and time $t$ diagonal:

$$
\partial_a n + \partial_t n = - h \cdot n
$$

For a general population the hazard rate is a matrix of transition rates with fixed column sums $\vec{1}^\dagger H = 0$, ensuring local conservation of probability. Given a transition of the density of states statistics vector $\vec{N}$ from the density of states vector $\vec{m}$ to the density of states vector $\vec{n}$, the hazard rate matrix is:

$$
\mathbb{P}\left[ T = t, \vec{N} = \vec{n} \parallel T \ge t, \vec{N} = \vec{m} \right] = H
$$

In turn, the possibly inhomogeneous hazard rate matrix determines the evolution of the density of states vector field:

$$
\partial_a \vec{n} + \partial_t \vec{n} = - H \vec{n}
$$

We will use the preceding density of states vector equation to phenomenologically model the age dependent dynamics of communicable diseases, subject to the boundary conditions of an initial demographic $\vec{n}(a,0)$, and constant birth rate $\vec{n}(0,t)=\vec{n}_\text{B}$.

## Methods

If we are working with a continuous density of states $\hat{n}$ we can exploit the observation that the von Foerster operator evolves along the fixed age $a$ and time $t$ diagonal to discretize the evolution operator on an equal grid scale of one day in age and time $\Delta a = \Delta t = 1 \text{ day}$:

$$
\vec{n}\left(a + \Delta t, t + \Delta t \right) = \vec{n}\left(a, t\right) - H \vec{n}\left(a, t\right) \cdot \Delta t
$$
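
As a minimal sketch of the discretized update, the Julia snippet below applies one explicit step along the diagonal and then advances a small age grid by one day. The function names `step_diagonal` and `step_grid`, the two-state matrix, and the rate `h = 0.1` per day are illustrative assumptions, not part of the model.

```julia
# One explicit step along the age-time diagonal:
# n(a + Δt, t + Δt) = n(a, t) - H * n(a, t) * Δt
step_diagonal(n::AbstractVector, H::AbstractMatrix, Δt::Real) = n - H * n * Δt

# Advance a whole age grid by one step. Column k holds the state vector at age k - 1 days.
# `Hs` is a hypothetical function returning the hazard matrix at a given age in days.
function step_grid(N::AbstractMatrix, Hs::Function, n_birth::AbstractVector, Δt::Real)
    next = similar(N)
    next[:, 1] = n_birth                      # boundary condition n(0, t) = n_B
    for k in 1:size(N, 2) - 1
        next[:, k + 1] = step_diagonal(N[:, k], Hs(k - 1), Δt)
    end
    return next                               # the oldest column ages off the grid
end

# Hypothetical two-state example: state 1 drains into state 2 at rate h.
h = 0.1                                       # per day, assumed for illustration
H = [h 0.0; -h 0.0]                           # columns sum to zero
n = [100.0, 0.0]
n_next = step_diagonal(n, H, 1.0)

N = [100.0 100.0; 0.0 0.0]                    # two states by two ages
N_next = step_grid(N, a -> H, [5.0, 0.0], 1.0)
```

Because the columns of `H` sum to zero, the total occupancy carried along each diagonal is unchanged by the step.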

However, if we wish to capture the subtlety of integer occupancy in the density of states $\hat{n}$, then we need to use explicit transition probabilities. Assuming that a single person undergoes at most one transition in a single day, $\Delta a = \Delta t = 1 \text{ day}$, we can count the number of transitions between states $i \ne j$. We approximate the integral of the hazard as linear in time, which holds if the hazard is roughly constant over a single day, and write $h_{i \rightarrow i}$ for the total hazard of leaving state $i$:

$$
\begin{array}{rcl}
    n_{i \rightarrow j} & = & n_i \mathbb{P}\left[ T_{i \rightarrow j} \le \Delta t \right]\\
                        & = & n_i \frac{h_{i \rightarrow j}}{h_{i \rightarrow i}}\left(1 - e^{-h_{i \rightarrow i} \Delta t}\right)
\end{array}
$$

If the count is less than one, $n_{i \rightarrow j} \lt 1$, we then draw a value $u$ from the uniform probability distribution on $\left[0,1\right)$, assigning $n_{i \rightarrow j} = 1$ if $n_{i \rightarrow j} \ge u$ and $n_{i \rightarrow j} = 0$ otherwise. This last step ensures that we can work with the nuances of integer counts without excluding low hazard transitions due to an arbitrary choice in rounding. To ensure conservation of state occupancy $n_i$ we order the transition counts $n_{i \rightarrow j}$ from largest to smallest and cut off immediately before the cumulative losses exceed the state occupancy $n_i$.
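
The integer counting procedure can be sketched in Julia as follows. The function names are hypothetical, the transition probability uses the competing risk form $\frac{h_{i \rightarrow j}}{h_{i \rightarrow i}}\left(1 - e^{-h_{i \rightarrow i} \Delta t}\right)$ with a constant hazard over the step, and rounding counts of one or more to the nearest integer is an assumption, since the text only specifies the treatment of counts below one.

```julia
using Random

# Expected number of i -> j transitions in one step, with h_out the total
# hazard of leaving state i (assumed constant over Δt).
expected_count(n_i, h_ij, h_out, Δt) = n_i * (h_ij / h_out) * (1 - exp(-h_out * Δt))

# Convert an expected count to an integer. Counts below one become 1 with
# probability equal to the count; larger counts are rounded (an assumption).
function integer_count(x::Real, rng::AbstractRNG)
    return x < 1 ? (rand(rng) < x ? 1 : 0) : round(Int, x)
end

# Keep transitions from largest to smallest, stopping immediately before the
# cumulative losses would exceed the occupancy n_i.
function capped_counts(counts::Vector{Int}, n_i::Int)
    kept = zeros(Int, length(counts))
    total = 0
    for j in sortperm(counts, rev = true)
        total + counts[j] > n_i && break
        kept[j] = counts[j]
        total += counts[j]
    end
    return kept
end

rng = MersenneTwister(1)                       # fixed seed for reproducibility
rare = integer_count(0.3, rng)                 # either 0 or 1
kept = capped_counts([5, 3, 1], 7)             # the 3 would push losses past 7
```

Stopping at the first transition that does not fit follows the cutoff rule literally; skipping it and trying smaller transitions would be a different design choice.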

## Model

Even the most concise model of age dependent communicable diseases, in which we only account for a handful of observed states and transitions, still requires $8$ density of states functions to distinguish between all the observed outcomes, particularly when permuting the absorbing state of fatality by causes and settings:

$$
\vec{n} =
\begin{pmatrix}
  n_\text{S}\\
  n_\text{I}\\
  n_\text{H}\\
  n_\text{R}\\
  n_\text{D}\\
  n_\text{FA}\\
  n_\text{FI}\\
  n_\text{FH}
\end{pmatrix}
\begin{array}{cl}
  \rightarrow & \text{Susceptible}\\
  \rightarrow & \text{Infected}\\
  \rightarrow & \text{Hospitalized}\\
  \rightarrow & \text{Recovery of infected}\\
  \rightarrow & \text{Discharge of hospitalized}\\
  \rightarrow & \text{Fatalities due to ageing}\\
  \rightarrow & \text{Fatalities of infected}\\
  \rightarrow & \text{Fatalities of hospitalized}
\end{array}
$$

Of the possible transitions only the one from susceptible to infected is a strongly exogenous process, representing the transmission of the communicable disease. The other possible transitions represent endogenous biological processes. Over nearly two centuries endogenous biological processes have been well characterized as following [Gompertz failure processes](https://en.wikipedia.org/wiki/Gompertz-Makeham_law_of_mortality) on average, as a function of age $a$:

$$
h = \beta e^{\alpha \cdot a}
$$

The ageing acceleration parameter $\alpha$ can be either positive or negative. Heuristically, exacerbation processes are positive, while recovery processes are negative. For example, hospitalizations increase exponentially with age, while discharges decrease exponentially with age, that is, the length of hospital stay increases. Gompertz processes theoretically arise in the asymptotic limit of infinitesimal stochastic accelerations of time. Accelerated failure time as a model of biological response to environmental stresses has been rigorously validated in high throughput basic science experiments on [Caenorhabditis elegans](https://www.nature.com/articles/nature16550). In turn, the experimental validation of accelerated failure times gives a parsimonious model for transient infections. Given a background hazard rate $h\left(a\right)$:

$$
\mathbb{P}\left[ A = a \parallel A \ge a \right] = h\left(a\right)
$$

The statistical impact of an illness is to accelerate the stopping time by a factor of $\gamma \gt 1$, which by differentiating the cumulative probability by age $a$ yields the relationship:

$$
\mathbb{P}\left[ A = \gamma a \parallel A \ge \gamma a \right] = \gamma h\left(\gamma a\right)
$$

The combination of Gompertz processes and accelerated failure time models greatly simplifies the construction of communicable disease models because, given a priori estimates of the background parameters, only the single acceleration parameter $\gamma$ needs to be estimated, and then applied to the background dynamics as an age $a$ and hazard $h$ multiplier.
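
To make the multiplier concrete, here is a short Julia sketch of a Gompertz hazard and its accelerated counterpart $\gamma h\left(\gamma a\right)$. The function names are hypothetical, the background mortality values are the round numbers from the parameter table in the Materials section, and $\gamma = 1.25$ is only an illustrative value.

```julia
# Gompertz hazard h(a) = β * exp(α * a), with age a in years.
gompertz(a, α, β) = β * exp(α * a)

# Accelerated failure time: an acceleration γ maps h(a) to γ * h(γ * a).
accelerated(h, γ) = a -> γ * h(γ * a)

# Background mortality: doubling every 7 years from 1/32000 per person year.
h_age(a) = gompertz(a, log(2) / 7, 1 / 32000)

# Mortality hazard while infected, for an illustrative acceleration.
h_inf = accelerated(h_age, 1.25)

background_at_35 = h_age(35)     # about 1 death per 1000 person years
infected_at_80 = h_inf(80)       # equals 1.25 * h_age(100)
```

Only `accelerated` depends on $\gamma$, so the same wrapper can be applied to the hospitalization and discharge hazards.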

In contrast to the simplicity of Gompertz processes and accelerated failure time models as phenomenological theories of endogenous biological processes, exogenous biological processes, particularly the logistic growth of social interactions through puberty, demand a much more complex theory. In general the rate of interaction between any two ages is given by a bivariate age-age scattering cross section, which can be spectrally decomposed into a sum of products of univariate uniform normalized eigenfunctions:

$$
\sigma = \sum_{\lambda} \lambda \hat{\sigma}^\dagger_\lambda \hat{\sigma}_\lambda
$$

While the eigenfunctions are dimensionless, the eigenvalues $\lambda$ carry inverse volume units of $\left(\text{person} \times \text{person} \times \text{time}\right)^{-1}$. From empirical observations the dominant first order summand of the age-age scattering cross section is logistic, and can be interpreted as representing the fraction of the demographic of a particular age that is instantaneously available for social interactions:

$$
\hat{\sigma} = \frac{1}{1 + e^{-\alpha_\text{pub} \left(a - \beta_\text{pub} \right)}}
$$

Even with the scattering cross section in hand we cannot naively place the term into the canonical epidemiological infection rate to determine the hazard rate because social interactions are highly serialized. Instead we will work through a combinatoric-probabilistic argument to derive the hazard of infection under serialized social interactions. For a single person consider a short period of time $\delta t$ during which that person interacts with $\kappa$ people. Provided the number of infections does not grow appreciably during $\delta t$, the hazard $h$ of infection is roughly constant. Assuming transmission is guaranteed when in social contact with an infected person, the probability of remaining uninfected beyond time $\delta t$, that is $T \gt \delta t$, is then the probability that none of the $\kappa$ contacts is infected $I$ during the time $\delta t$:

$$
\begin{array}{rcl}
    e^{-\int_0^{\delta t} h da} & = & e^{-h \delta t}\\
                                & = & \mathbb{P}\left[T \gt \delta t\right]\\
                                & = & \left(1 - \mathbb{P}\left[I \parallel T = 0 \right]\right)^\kappa
\end{array}
$$

To estimate the probability of encountering an infected person $\mathbb{P}\left[I \parallel T = 0 \right]$ we weight the integral of each state density function with the dominant term from the age-age scattering cross section, essentially counting the fractional contributions by age, and noting that any multiplicative constants cancel out:

$$
\mathbb{P}\left[I \parallel T = 0 \right] = \frac{\int_0^\infty n_\text{I} \hat{\sigma} da}{\int_0^\infty \left(n_\text{S} + n_\text{I} + n_\text{R} + n_\text{D}\right)\hat{\sigma} da}
$$

We introduce the inverse probability weight:

$$
\hat{\omega} = \frac
    {\int_0^\infty \left(n_\text{S} + n_\text{I} + n_\text{R} + n_\text{D}\right)\hat{\sigma} da}
    {\int_0^\infty \left(n_\text{S} + n_\text{R} + n_\text{D}\right)\hat{\sigma} da}
$$

and solve for the hazard rate $h$, recognizing the number of social contacts $\kappa$ for an age will be proportional to the dominant term from the age-age scattering cross section $\kappa \propto \hat{\sigma}$, and that the ratio of social contacts $\kappa$ to the time $\delta t$ is the rate of social contact $\eta$:

$$
\begin{array}{rcl}
    h & = & \frac{\kappa}{\delta t} \ln\hat{\omega}\\
      & = & \eta\hat{\sigma} \ln\hat{\omega}
\end{array}
$$
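
The derivation can be sketched numerically in Julia. The function names are hypothetical, the age integrals are replaced by sums over a uniform age grid (the grid spacing cancels in the ratio), and the default puberty parameters are the round numbers from the parameter table in the Materials section.

```julia
# Logistic fraction of an age group available for social contact (age in years).
availability(a; α_pub = log(2) / 2, β_pub = 16.0) = 1 / (1 + exp(-α_pub * (a - β_pub)))

# Riemann sum stand-in for the weighted age integrals on a uniform grid.
weighted(n, ages) = sum(n .* availability.(ages))

# Inverse probability weight: mixing population over the non-infected mixing population.
function inverse_weight(nS, nI, nR, nD, ages)
    mixing = weighted(nS .+ nI .+ nR .+ nD, ages)
    return mixing / weighted(nS .+ nR .+ nD, ages)
end

# Hazard of infection at age a for a maximum contact rate η.
infection_hazard(a, η, ω) = η * availability(a) * log(ω)

# Illustrative population: one percent infected at every age from 0 to 90.
ages = collect(0:90)
nS = fill(99.0, length(ages))
nI = fill(1.0, length(ages))
nR = zeros(length(ages))
nD = zeros(length(ages))

ω = inverse_weight(nS, nI, nR, nD, ages)       # 100 / 99 for this flat example
h16 = infection_hazard(16, 8.0, ω)             # availability is one half at 16
```

With no infected people the weight is one, its logarithm is zero, and the hazard of infection vanishes, as expected.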

In balancing the physical plausibility of the model with the availability of data that provides a priori estimates of background parameters we will make seven simplifications and assumptions to expedite the formulation of the model:

* People are contagious within 6 hours of infection, and are contagious until fully recovered or discharged. This is justified both by qualitative case studies published in peer reviewed journals, and anecdotal accounts published by journalists.
* Hospitalization acts as an effective quarantine.
* Mortality due to infection proceeds fast enough that fatalities due to infection are the only immediate absorbing state for both infected and hospitalized. This eliminates the transitions from infected and hospitalized to background fatality due to ageing.
* The impact of infection is an equal acceleration to the hospitalization, recovery, discharge, and fatality hazards.
* The impact of infection is fully transient, so that fatalities due to ageing among the recovered and discharged proceed at the same rate as those among the susceptible.
* The fatality rate of the infected is the same as the fatality rate of the hospitalized. This is the most dubious of all the assumptions and simplifications, as it ignores the strong gatekeeper effects in hospitalization that a priori select for more severely ill patients.
* The rate of recovery of the infected is the same as the rate of discharge of the hospitalized. This is primarily because we have well calibrated hospital discharge rates, but lack the equivalent rates for recovery in the community.

Fully labeling all the states and the allowed transitions we then have the following diagram:

![State Transition Diagram](../imgs/vonFoersterHazards.png "State Transition Diagram")

Reading off the transitions from the diagram yields the lower triangular hazard rate matrix:

$$
- H =
\begin{pmatrix}
  -\beta_\text{age} e^{\alpha_\text{age} a} - \eta\hat{\sigma} \ln\hat{\omega} & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
  \eta\hat{\sigma} \ln\hat{\omega} & - \gamma\beta_\text{age} e^{\gamma\alpha_\text{age} a} - \gamma\beta_\text{dis} e^{-\gamma\alpha_\text{dis} a} - \gamma\beta_\text{hos} e^{\gamma\alpha_\text{hos} a} & 0 & 0 & 0 & 0 & 0 & 0\\
  0 & \gamma\beta_\text{hos} e^{\gamma\alpha_\text{hos} a} & - \gamma\beta_\text{age} e^{\gamma\alpha_\text{age} a} - \gamma\beta_\text{dis} e^{-\gamma\alpha_\text{dis} a} & 0 & 0 & 0 & 0 & 0\\
  0 & \gamma\beta_\text{dis} e^{-\gamma\alpha_\text{dis} a} & 0 & - \beta_\text{age} e^{\alpha_\text{age} a} & 0 & 0 & 0 & 0\\
  0 & 0 & \gamma\beta_\text{dis} e^{-\gamma\alpha_\text{dis} a} & 0 & -\beta_\text{age} e^{\alpha_\text{age} a} & 0 & 0 & 0\\
  \beta_\text{age} e^{\alpha_\text{age} a} & 0 & 0 & \beta_\text{age} e^{\alpha_\text{age} a} & \beta_\text{age} e^{\alpha_\text{age} a} & 0 & 0 & 0\\
  0 & \gamma\beta_\text{age} e^{\gamma\alpha_\text{age} a} & 0 & 0 & 0 & 0 & 0 & 0\\
  0 & 0 & \gamma\beta_\text{age} e^{\gamma\alpha_\text{age} a} & 0 & 0 & 0 & 0 & 0\\
\end{pmatrix}
$$

Our model thus falls within the general class of [Phase-Type Distributions](https://en.wikipedia.org/wiki/Phase-type_distribution), because a lower triangular matrix represents a process that does not backtrack through states.
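
As a consistency check, the Julia sketch below assembles $-H$ at a single age and confirms that every column sums to zero and that the matrix is lower triangular. The function `neg_hazard_matrix`, the parameter tuple `p`, and the value `γ = 1.25` are hypothetical. The sketch assumes all rates share one time unit, takes the hospitalization exponent as positive to match hospitalizations increasing with age, and lets the recovered and discharged age at the background rate per the transience assumption.

```julia
using LinearAlgebra

# Assemble -H at age a (years). `infect` is the infection hazard η σ̂ ln ω̂,
# computed elsewhere; `p` is a hypothetical NamedTuple of model parameters.
function neg_hazard_matrix(a, infect, p)
    mort     = p.β_age * exp(p.α_age * a)               # background mortality
    mort_inf = p.γ * p.β_age * exp(p.γ * p.α_age * a)   # accelerated mortality
    hosp     = p.γ * p.β_hos * exp(p.γ * p.α_hos * a)   # accelerated hospitalization
    disc     = p.γ * p.β_dis * exp(-p.γ * p.α_dis * a)  # accelerated recovery or discharge
    M = zeros(8, 8)                                     # order: S I H R D FA FI FH
    M[1, 1] = -mort - infect;  M[2, 1] = infect;  M[6, 1] = mort
    M[2, 2] = -mort_inf - disc - hosp
    M[3, 2] = hosp;  M[4, 2] = disc;  M[7, 2] = mort_inf
    M[3, 3] = -mort_inf - disc;  M[5, 3] = disc;  M[8, 3] = mort_inf
    M[4, 4] = -mort;  M[6, 4] = mort
    M[5, 5] = -mort;  M[6, 5] = mort
    return M
end

# Round numbers from the parameter table, in units of years.
p = (α_age = log(2) / 7,  β_age = 1 / 32000,
     α_hos = log(2) / 14, β_hos = 1 / 320,
     α_dis = log(2) / 49, β_dis = 365 / 16,   # per day rate converted, assuming 365 days per year
     γ = 1.25)                                # illustrative acceleration

M = neg_hazard_matrix(50.0, 0.01, p)
column_sums = sum(M, dims = 1)                # conservation: all close to zero
```

The three fatality columns are identically zero because those states are absorbing.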

## Materials

We will use our previous work on the [Hazard Rate Zoo](https://public.tableau.com/profile/gompertz.makeham#!/vizhome/AlbertaMortalityandUtilizationHazardRatesbyAgeandSex/Welcome) to set the boundary conditions and to determine credible estimates for the ranges of the parameters that describe the background evolution dynamics. To determine the ageing acceleration factor we will use the age dependent case fatality rates compiled on [Our World in Data](https://ourworldindata.org/coronavirus#current-data-across-countries-suggests-that-the-elderly-are-most-at-risk), and assume the average case lasts for approximately one month.

|Parameter          |Name                             |Estimate          |Units                                               |Notes                                                 |
|-------------------|---------------------------------|------------------|----------------------------------------------------|------------------------------------------------------|
|$\alpha_\text{age}$|Background ageing mortality      |$\frac{\ln 2}{7}$ |$\frac{1}{\text{years}}$                            |Mortality doubles every $7$ years.                    |
|$\beta_\text{age}$ |Background mortality rate        |$\frac{1}{32000}$ |$\frac{\text{deaths}}{\text{person year}}$          |$1$ death every $1000$ person years at $35$.          |
|$\alpha_\text{hos}$|Background ageing hospitalization|$\frac{\ln 2}{14}$|$\frac{1}{\text{years}}$                            |Hospitalizations double every $14$ years.             |
|$\beta_\text{hos}$ |Background hospitalization rate  |$\frac{1}{320}$   |$\frac{\text{hospitalizations}}{\text{person year}}$|$1$ hospitalization every $40$ person years at $42$.  |
|$\alpha_\text{dis}$|Background ageing discharge      |$\frac{\ln 2}{49}$|$\frac{1}{\text{years}}$                            |Hospital stay doubles every $49$ years.               |
|$\beta_\text{dis}$ |Background discharge rate        |$\frac{1}{16}$    |$\frac{\text{discharges}}{\text{person day}}$       |$1$ discharge every $8$ person days at $49$.          |
|$\alpha_\text{pub}$|Puberty rate                     |$\frac{\ln 2}{2}$ |$\frac{1}{\text{years}}$                            |Social contacts doubling every $2$ years.             |
|$\beta_\text{pub}$ |Pubescent mid-point              |$16$              |$\text{years}$                                      |Doubling peaks at $16$.                               |
|$\eta$             |Maximum contact rate             |$2$ to $32$       |$\frac{\text{contacts}}{\text{person day}}$         |Maximum transmissible contacts in a day.              |
|$\gamma$           |Infection ageing acceleration    |$1.1$ to $1.5$    |Dimensionless                                       |$15\%$ mortality at $80$ after $1$ month of infection.|

Having downloaded the hazard rate data set and the demographic pyramid data set from the Hazard Rate Zoo hosted on Tableau Public we will estimate the parameters for ageing fatality rates, hospital admission rates, and hospital discharge rates, using the Gadfly, DataFrames, and CSV Julia packages.

## Estimating the Infection Ageing Acceleration

From our work on the Hazard Rate Zoo we know that the background hazard rate for fatalities due to ageing, in round numbers, is a doubling of mortality every $7$ years, with $1$ fatality in $1000$ person years at $35$ years of age:

$$
h = \frac{e^{\frac{\ln 2}{7}a}}{32000}
$$

Furthermore during infection at $80$ years of age there is a $15\%$ fatality rate over a month of infection, and thus $1.8$ fatalities per person year. To estimate the infection ageing acceleration parameter $\gamma$ we have to solve the equation:

$$
\frac{\gamma e^{\gamma\frac{\ln 2}{7}80}}{32000} = 1.8
$$

We can find the root by a poor man's fixed point recursion, with the trial solution $\gamma=1.34615$:

$$
57600 e^{-\gamma\frac{\ln 2}{7}80} = \gamma
$$

The root is unique because the left hand side is decreasing in $\gamma$ and the right hand side is increasing in $\gamma$. This yields a reasonable range of a $20\%$ to $40\%$ increase in the rate of ageing while infected. Rounding off to $25\%$ we have the interpretation that while infected a $30$ year old faces roughly the mortality risk of a healthy person of about $38$, and more gravely an infected $80$ year old faces roughly the mortality risk of a healthy $100$ year old.

```julia
# Evaluate the left hand side at the trial solution γ = 1.34615
57600*exp(-1.34615*log(2)*80/7)
# 1.346495830673008
```
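
For readers who want to automate the search, the following Julia sketch iterates a rearranged form of the equation. Repeatedly applying $\gamma \mapsto 57600 e^{-\gamma\frac{\ln 2}{7}80}$ does not settle, because the slope of that map at the root is about $-10.7$, so the sketch instead iterates the equivalent form $\gamma \mapsto \ln\left(57600 / \gamma\right) \cdot \frac{7}{80 \ln 2}$, whose slope at the root is about $-0.09$. The function name `solve_gamma`, the starting value, and the tolerance are assumptions.

```julia
# Solve γ * exp(c * γ) = K for γ, with c = 80 * ln(2) / 7 and K = 1.8 * 32000.
function solve_gamma(; K = 57600.0, c = 80 * log(2) / 7, γ0 = 1.0, tol = 1e-12)
    γ = γ0
    for _ in 1:100
        γ_next = log(K / γ) / c               # rearranged, contractive map
        abs(γ_next - γ) < tol && return γ_next
        γ = γ_next
    end
    return γ                                  # best estimate if not converged
end

γ = solve_gamma()

# Residual of the original equation, which should be close to zero at the root.
residual = 57600 * exp(-γ * log(2) * 80 / 7) - γ
```

The result agrees with the trial solution above to about four decimal places.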