# Models

Here is a description of the models under consideration. Each model adds a feature to the previous one.


## Summary of Models

1. **Baseline**. Random effects at the county level for a quadratic trend.
2. **Intervention**. Adds an indicator term for the "intervention", which starts 12 days after the deaths threshold is reached. This model adds flexibility that can "bend" the linear and quadratic terms. It doesn't allow a mean shift, so it's a *hockey stick* model.
3. **Earliness**. Adds an interaction term between the intervention and the number of days between the threshold (the time at which $t=0$) and the intervention. It can alter the intensity of the intervention trend shift based on how early the intervention took place.
4. **NCHS simple**. Adds an indicator variable for the NCHS (National Center for Health Statistics) category of each county. The NCHS variable has 6 levels.
5. **NCHS interaction**. Adds an interaction term between the intervention and the NCHS indicator. It can alter the intensity of the intervention trend shift based on NCHS category.

## Mathematical Descriptions

All models use a log link for the negative binomial model with an offset for population. Under this formulation, a model predicts the *log of the per capita rate of deaths/cases*. Formally, denote
- $X_{it}$ the covariates of county $i$ at time $t$
- $y_{it}$ the observed deaths/cases of county $i$ at time $t$.

We then assume that
$$
y_{it} \sim \mathrm{Negbin}(\lambda_{it}, r)
$$
where $N_i$ is the population of the $i$-th county and $\log(\lambda_{it}/N_i)=f(X_{it}, t)$ is obtained from a time trend and covariates. Equivalently, $\log \lambda_{it} = \log N_i + f(X_{it}, t)$, so $\log N_i$ enters as the offset. The negative binomial is parameterized so that $E[y_{it}] = \lambda_{it}$ and $\mathrm{Var}[y_{it}]=\lambda_{it}(1 + \lambda_{it}/r)$.
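As a quick sanity check of this parameterization, the sketch below simulates from base R's `rnbinom` using its `mu`/`size` arguments, which follow the same mean and variance formulas with `size` playing the role of $r$. The values of `lambda` and `r` are arbitrary and chosen only for illustration; they are not estimates from any of the models.

```r
# Simulation check of E[y] = lambda and Var[y] = lambda * (1 + lambda / r).
# lambda and r are illustrative values only (assumptions, not fitted quantities).
set.seed(1)
lambda <- 20   # mean of the negative binomial
r      <- 5    # dispersion; called `size` in rnbinom

y <- rnbinom(n = 200000, mu = lambda, size = r)

empirical   <- c(mean = mean(y), var = var(y))
theoretical <- c(mean = lambda, var = lambda * (1 + lambda / r))

# The two rows should agree up to simulation noise
print(rbind(empirical, theoretical))
```

As $r$ grows, the variance approaches $\lambda_{it}$, so a large $r$ corresponds to nearly Poisson counts.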

### 1. Baseline

The mathematical model is
$$
\log(\lambda_{it} / N_i) = (\mu_0 + r_{i,0}) + 
(\mu_1 + r_{i,1}) t + (\mu_2  + r_{i, 2})t^2.
$$

where the $r_{i,p}$ are called the random effects. They are so called because they are assigned a prior
$$
(r_{i,0}, r_{i,1}, r_{i,2})^T \sim N(\mathbf{0}, \Sigma)
$$
which intuitively is a form of regularization: the random effects are pulled towards zero but still allow each county to fit its own curve well. The terms $\mu_p$ are the population averages and the random effects are deviations from those averages.

The formulation within `rstanarm` is
```r
y ~ t + I(t^2) + (t + I(t^2) | county)
```
The bar or "given" notation `|` specifies the random effects as mathematically defined above. In practice, `t` and `t^2` can be very correlated and can create numerical difficulties, particularly for the random effects, which have a coefficient for each county. For this reason, an equivalent model is
```r
y ~ poly(t, 2) + (poly(t, 2) | county)
```
which is based on orthogonal polynomials. The documentation of `poly` contains some explanation. Throughout this documentation I will not use the `poly` notation because it can be confusing.
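The following sketch illustrates both points on a hypothetical time grid `t = 0:30` with simulated data: the raw columns `t` and `t^2` are strongly correlated, the `poly` columns are not, and the two parameterizations span the same curves. It uses ordinary least squares via `lm` only to keep the example fast; in the Bayesian fit the equivalence also depends on how the priors are set on each parameterization.

```r
# Hypothetical time grid (an assumption for illustration)
t <- 0:30

raw  <- cbind(t = t, t2 = t^2)   # raw linear and quadratic columns
orth <- poly(t, 2)               # orthogonal polynomial columns

cor_raw  <- cor(raw)[1, 2]       # close to 1
cor_orth <- cor(orth)[1, 2]      # zero up to rounding error

# Simulated response with made-up coefficients
set.seed(1)
y <- 1 + 0.3 * t - 0.01 * t^2 + rnorm(length(t), sd = 0.1)

fit_raw  <- lm(y ~ t + I(t^2))
fit_orth <- lm(y ~ poly(t, 2))

# Same fitted curve under both parameterizations
max_diff <- max(abs(fitted(fit_raw) - fitted(fit_orth)))

print(c(cor_raw = cor_raw, cor_orth = cor_orth, max_diff = max_diff))
```

The coefficients themselves differ between the two fits, which is one reason the `poly` version is harder to read.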

### 2. Intervention

The mathematical model is
\begin{align}
\log(\lambda_{it} / N_i) = \text{[..baseline..]} + (\beta^\text{inter}_1 t + \beta^\text{inter}_2 t^2) J_{it}
\end{align}
where $J_{it}$ is an indicator of intervention in county $i$ at time $t$. Formally,
$$
J_{it}=\begin{cases}
1 &\text{if $t$ is more than 12 days after the intervention began} \\
0 &\text{otherwise}
\end{cases}
$$
We don't allow a mean shift term (one without $t$ involved) so that when the intervention starts only the rate of increase or decrease changes, but the line is still continuous. One way to see this is to compute the derivative with respect to time
$$
\frac{d}{dt} \log(\lambda_{it} / N_i) = (\mu_1 + r_{i,1} + J_{it}\beta^\text{inter}_1) +  2(\mu_2 + r_{i,2} + J_{it}\beta^\text{inter}_2) t
$$
So the moment $J_{it}=1$ there is a sudden change in the derivative.
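To make the size of that change explicit, subtract the derivative with $J_{it}=0$ from the derivative with $J_{it}=1$ at the same $t$. The baseline terms cancel and only the intervention coefficients remain:
$$
\left.\frac{d}{dt} \log(\lambda_{it} / N_i)\right|_{J_{it}=1} - \left.\frac{d}{dt} \log(\lambda_{it} / N_i)\right|_{J_{it}=0} = \beta^\text{inter}_1 + 2\beta^\text{inter}_2 t.
$$
As a purely illustrative example (these numbers are made up, not estimates), with $\beta^\text{inter}_1=-0.1$ and $\beta^\text{inter}_2=-0.002$ the slope at $t=20$ changes by $-0.1 + 2(-0.002)(20) = -0.18$.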

In `rstanarm` we can write

```r
y ~ [..baseline..] +
    t:intervention + I(t^2):intervention    
```
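To see which columns these terms produce, the sketch below builds the fixed-effects design matrix with base R's `model.matrix` for a single hypothetical county. It assumes `intervention` is stored as a numeric 0/1 column and that it switches on at `t = 5`; both are assumptions made for illustration. The random-effects part is omitted because `model.matrix` only handles the fixed effects.

```r
# Hypothetical single-county data (assumption: intervention is numeric 0/1)
d <- data.frame(t = 0:9)
d$intervention <- as.numeric(d$t >= 5)

# Fixed-effects part of the baseline plus the intervention terms
X <- model.matrix(~ t + I(t^2) + t:intervention + I(t^2):intervention,
                  data = d)

# No standalone `intervention` column: there is no mean-shift term
print(colnames(X))

# The interaction columns are t and t^2, zeroed out before the intervention
print(X[, c("t:intervention", "I(t^2):intervention")])
```

If `intervention` were stored as a factor or logical instead, R would label the interaction columns differently, so it is worth checking `colnames` on the real data.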

### 3. Earliness

The mathematical model is
\begin{align}
\log(\lambda_{it} / N_i) = 
\text{[..intervention..]} + (\beta_{1}^\text{early}t + \beta_{2}^\text{early}t^2)E_{i}J_{it}
\end{align}
where $E_i$ is the difference in days between the deaths/cases threshold (the moment when $t=0$) and the intervention time.

In `rstanarm`
```r
y ~ [..intervention..] +
    t:earliness:intervention + I(t^2):earliness:intervention
```
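One way to read the added terms is through the same derivative comparison used for the intervention model. Differentiating the equation above with respect to $t$ and comparing $J_{it}=1$ against $J_{it}=0$ at the same $t$ gives
$$
\left.\frac{d}{dt} \log(\lambda_{it} / N_i)\right|_{J_{it}=1} - \left.\frac{d}{dt} \log(\lambda_{it} / N_i)\right|_{J_{it}=0} = (\beta^\text{inter}_1 + \beta^\text{early}_1 E_i) + 2(\beta^\text{inter}_2 + \beta^\text{early}_2 E_i) t,
$$
so the change in slope caused by the intervention is itself linear in $E_i$. A county with $E_i=0$ has the same slope change as in the intervention model.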

### 4. NCHS Simple

The mathematical model is
\begin{align}
\log(\lambda_{it} / N_i) = 
\text{[..earliness..]} + \sum_{j=2}^6 \left(\beta^\text{nchs}_{j,0} + \beta^\text{nchs}_{j, 1} t + \beta^\text{nchs}_{j, 2} t^2\right)I(\text{NCHS}_i = j)
\end{align}
To interpret this model, note that $\text{NCHS}_i\in\{1,...,6\}$ is the NCHS category of county $i$. The group $j=1$ is left as the control group; for every other group, exactly one term in the sum has $I(\text{NCHS}_i = j)=1$. As a reminder, it is always necessary to leave one group out as the control group in linear models: otherwise the group indicators add up to one for every observation, which duplicates the intercept, and the model does not have a unique solution.

So another way to write the model is
\begin{align}
\log(\lambda_{it} / N_i) = 
\text{[..earliness..]} + \beta^\text{nchs}_{\text{NCHS}_i,0} + \beta^\text{nchs}_{\text{NCHS}_i, 1} t + \beta^\text{nchs}_{\text{NCHS}_i, 2} t^2
\end{align}
where we fix $\beta^\text{nchs}_{1,p}=0$ for $p=0,1,2$.

The notation for `rstanarm` is simple.

```r
y ~ [..earliness..] +
    nchs + t:nchs + I(t^2):nchs    
```
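The sketch below checks this coding with base R's `model.matrix` on a small hypothetical grid. It assumes `nchs` is stored as a factor with levels 1 to 6 (R then applies treatment contrasts by default, dropping level 1) and it leaves out the intervention, earliness and random-effects terms to keep the output short.

```r
# Hypothetical grid: 5 time points for each of the 6 NCHS categories
d <- expand.grid(t = 0:4, nchs = factor(1:6))

# Baseline fixed effects plus the NCHS terms
X <- model.matrix(~ t + I(t^2) + nchs + t:nchs + I(t^2):nchs, data = d)

# 3 baseline columns + 3 coefficients for each of the 5 non-reference groups
print(ncol(X))
print(colnames(X))

# Why one group must be dropped: with all six indicators plus an intercept,
# the indicators sum to the intercept column and the matrix loses rank
Z <- cbind(intercept = 1, model.matrix(~ 0 + nchs, data = d))
rank_Z <- qr(Z)$rank
print(c(columns = ncol(Z), rank = rank_Z))
```

The column names contain `nchs2` through `nchs6` only, matching the sum over $j=2,\dots,6$ in the equation.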

### 5. NCHS Interaction

The mathematical model is
\begin{align}
\log(\lambda_{it} / N_i) = 
\text{[..nchs_simple..]} + \sum_{j=2}^6 \left(\beta^\text{nchs_interv}_{j, 1} t + \beta^\text{nchs_interv}_{j, 2} t^2\right)I(\text{NCHS}_i = j)J_{it}
\end{align}

The added terms follow the same logic as in the previous model, but they are multiplied by the intervention indicator $J_{it}$. The model includes only linear and quadratic terms, which preserves the *hockey stick* logic explained before.

The notation for `rstanarm` is simple.

```r
y ~ [..nchs_simple..] +
    t:nchs:intervention + I(t^2):nchs:intervention
```