Recasts to a non-linear model #42

Merged · seabbs merged 7 commits into main from non-linear on Nov 30, 2021
Conversation

@seabbs (Contributor) commented on Jun 24, 2021

This PR recasts the model as a non-linear model and adds uncertainty for Rt by assuming the estimates are normally distributed. Not sure this is strictly better than what was there previously, but it is perhaps interesting, as there is lots of scope for using this for other problems (i.e. this is close to an Rt model implemented on top of brms). I also looked at adding uncertainty on the SGTF proportion using a dummy standard error, but this caused some fairly major fitting issues (maybe this could be done with a separate equation fit to SGTF-positive tests; something like the measurement-error term sketched below).
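A rough sketch of what such a dummy-standard-error term could look like, using brms's me() measurement-error term (prop_sgtf_se is a hypothetical name for a standard-error column, not one in the data here):

  # treat prop_sgtf as noisily observed with known SD prop_sgtf_se
  var ~ 0 + me(prop_sgtf, prop_sgtf_se)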

Noticed the normalize argument, which is new to Stan 2.25 and allows dropping normalizing constants (i.e. what I think the ~ sampling-statement approach does); a sketch of using it is below. The .devcontainer stuff is the automated Docker setup from VS Code that I am trying out, but if this is being merged it can be gitignored.
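A minimal sketch of using it, reusing the model, priors, and data defined in the code below (normalize is an argument of brm(); availability depends on the brms version):

fit <- brm(
  model,
  data = utla_rt_with_covariates %>%
    rename(rt_mean = rt_mean_long_gt,
           rt_sd = rt_sd_long_gt),
  prior = priors,
  # drop additive constants from the log density, as the ~ syntax does;
  # requires Stan >= 2.25
  normalize = FALSE,
  control = list(adapt_delta = 0.95))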

Not entirely clear if this branch is up to date and/or what the most recent data is, so the model fit is largely for show.

Model formula:

library(brms)
library(dplyr)

model <- bf(
  rt_mean | se(rt_sd, sigma = TRUE) ~ (1 + var) * exp(logRt),
  var ~ 0 + prop_sgtf,
  logRt ~ 1,
  family = student(),
  nl = TRUE
  # this is the change that allows future fancy stuff
  # but as it stands tries to do a dot product :( (may need to open an issue)
  #loop = FALSE
)

priors <- c(prior(normal(0, 0.5), nlpar = "var"),
            prior(student_t(3, 0, 0.5), nlpar = "logRt"))

fit <- brm(
  model,
  data = utla_rt_with_covariates %>%
    rename(rt_mean = rt_mean_long_gt,
           rt_sd = rt_sd_long_gt),
  prior = priors,
  control = list(adapt_delta = 0.95))
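Reading the non-linear formula, the fitted mean is Rt = (1 + beta * prop_sgtf) * exp(logRt), so the var coefficient (var_prop_sgtf in the fit below) should be interpretable as the proportional increase in Rt when moving from 0% to 100% SGTF.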

Generated Stan code:

// generated with brms 2.15.0
functions {
  /* compute the logm1 link 
   * Args: 
   *   y: a positive scalar
   * Returns: 
   *   a scalar in (-Inf, Inf)
   */ 
   real logm1(real y) { 
     return log(y - 1);
   }
  /* compute the inverse of the logm1 link 
   * Args: 
   *   y: a scalar in (-Inf, Inf)
   * Returns: 
   *   a positive scalar
   */ 
   real expp1(real y) { 
     return exp(y) + 1;
   }
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  vector<lower=0>[N] se;  // known sampling error
  int<lower=1> K_var;  // number of population-level effects
  matrix[N, K_var] X_var;  // population-level design matrix
  int<lower=1> K_logRt;  // number of population-level effects
  matrix[N, K_logRt] X_logRt;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  vector<lower=0>[N] se2 = square(se);
}
parameters {
  vector[K_var] b_var;  // population-level effects
  vector[K_logRt] b_logRt;  // population-level effects
  real<lower=0> sigma;  // residual SD
  real<lower=1> nu;  // degrees of freedom or shape
}
transformed parameters {
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] nlp_var = X_var * b_var;
    // initialize linear predictor term
    vector[N] nlp_logRt = X_logRt * b_logRt;
    // initialize non-linear predictor term
    vector[N] mu;
    for (n in 1:N) {
      // compute non-linear predictor values
      mu[n] = (1 + nlp_var[n]) * exp(nlp_logRt[n]);
    }
    target += student_t_lpdf(Y | nu, mu, sqrt(square(sigma) + se2));
  }
  // priors including constants
  target += normal_lpdf(b_var | 0, 0.5);
  target += student_t_lpdf(b_logRt | 3, 0, 0.5);
  target += student_t_lpdf(sigma | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += gamma_lpdf(nu | 2, 0.1)
    - 1 * gamma_lccdf(1 | 2, 0.1);
}
generated quantities {
}
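(For reference, this Stan code can be regenerated without fitting via brms::make_stancode(model, data, prior = priors), or extracted from the fitted object with brms::stancode(fit).)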

Basic fit with no predictors:

 Family: student 
  Links: mu = identity; sigma = identity; nu = identity 
Formula: rt_mean | se(rt_sd, sigma = TRUE) ~ (1 + var) * exp(logRt) 
         var ~ 0 + prop_sgtf
         logRt ~ 1
   Data: utla_rt_with_covariates %>% rename(rt_mean = rt_me (Number of observations: 2194) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
var_prop_sgtf       0.11      0.01     0.09     0.14 1.00     2184     2383
logRt_Intercept     0.03      0.01     0.02     0.04 1.00     2240     2600

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.21      0.00     0.21     0.22 1.00     2603     2722
nu       41.44     15.14    20.58    79.98 1.00     2575     2642

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

@seabbs (Contributor, Author) commented on Jun 25, 2021

I went a bit extra (closing the stable door after the horse has bolted) and extended this to include the proportion SGTF as a latent variable (learning some interesting brms hacks along the way). It has quite a big impact on run time, but is still within a feasible range I think. I am fairly confident this is the correct approach, but it's certainly not something people seem to be doing much of (except here: paul-buerkner/brms#304).

Code:
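A sketch consistent with the formula and generated Stan code below (the column names samples, sgtf_samples, and sgtf, and the constant(1) priors implied by the fixed parameters in the transformed parameters block, are inferred rather than copied, so treat this as approximate):

library(brms)

model <- bf(sgtf_samples | trials(samples) ~ 0 + mi(sgtf), family = binomial()) +
  bf(sgtf | mi() ~ 1, family = Beta()) +
  bf(rt_mean | se(rt_sd, sigma = TRUE) ~ (1 + var) * exp(logRt),
     var ~ 0 + mi(sgtf),
     logRt ~ 1,
     family = student(),
     nl = TRUE)

# fix the measurement-model parameters to constants, matching the
# generated Stan code (an inferred detail)
priors <- c(prior(normal(0, 0.5), nlpar = "var", resp = "rtmean"),
            prior(student_t(3, 0, 0.5), nlpar = "logRt", resp = "rtmean"),
            prior(constant(1), class = "b", resp = "sgtfsamples"),
            prior(constant(1), class = "Intercept", resp = "sgtf"),
            prior(constant(1), class = "phi", resp = "sgtf"))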

Generated Stan code:

// generated with brms 2.15.0
functions {
  /* compute the logm1 link 
   * Args: 
   *   y: a positive scalar
   * Returns: 
   *   a scalar in (-Inf, Inf)
   */ 
   real logm1(real y) { 
     return log(y - 1);
   }
  /* compute the inverse of the logm1 link 
   * Args: 
   *   y: a scalar in (-Inf, Inf)
   * Returns: 
   *   a positive scalar
   */ 
   real expp1(real y) { 
     return exp(y) + 1;
   }
}
data {
  int<lower=1> N;  // total number of observations
  int<lower=1> N_sgtfsamples;  // number of observations
  int Y_sgtfsamples[N_sgtfsamples];  // response variable
  int trials_sgtfsamples[N_sgtfsamples];  // number of trials
  int<lower=1> Ksp_sgtfsamples;  // number of special effects terms
  int<lower=1> N_sgtf;  // number of observations
  vector[N_sgtf] Y_sgtf;  // response variable
  int<lower=0> Nmi_sgtf;  // number of missings
  int<lower=1> Jmi_sgtf[Nmi_sgtf];  // positions of missings
  int<lower=1> N_rtmean;  // number of observations
  vector[N_rtmean] Y_rtmean;  // response variable
  vector<lower=0>[N_rtmean] se_rtmean;  // known sampling error
  int<lower=1> Ksp_rtmean_var;  // number of special effects terms
  int<lower=1> K_rtmean_logRt;  // number of population-level effects
  matrix[N_rtmean, K_rtmean_logRt] X_rtmean_logRt;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  vector<lower=0>[N_rtmean] se2_rtmean = square(se_rtmean);
}
parameters {
  vector<lower=0,upper=1>[Nmi_sgtf] Ymi_sgtf;  // estimated missings
  vector[Ksp_rtmean_var] bsp_rtmean_var;  // special effects coefficients
  vector[K_rtmean_logRt] b_rtmean_logRt;  // population-level effects
  real<lower=0> sigma_rtmean;  // residual SD
  real<lower=1> nu_rtmean;  // degrees of freedom or shape
}
transformed parameters {
  vector[Ksp_sgtfsamples] bsp_sgtfsamples;  // special effects coefficients
  real Intercept_sgtf;  // temporary intercept for centered predictors
  real<lower=0> phi_sgtf;  // precision parameter
  bsp_sgtfsamples = rep_vector(1, rows(bsp_sgtfsamples));
  Intercept_sgtf = 1;
  phi_sgtf = 1;
}
model {
  // likelihood including constants
  if (!prior_only) {
    // vector combining observed and missing responses
    vector[N_sgtf] Yl_sgtf = Y_sgtf;
    // initialize linear predictor term
    vector[N_sgtfsamples] mu_sgtfsamples = rep_vector(0.0, N_sgtfsamples);
    // initialize linear predictor term
    vector[N_sgtf] mu_sgtf = Intercept_sgtf + rep_vector(0.0, N_sgtf);
    // initialize linear predictor term
    vector[N_rtmean] nlp_rtmean_var = rep_vector(0.0, N_rtmean);
    // initialize linear predictor term
    vector[N_rtmean] nlp_rtmean_logRt = X_rtmean_logRt * b_rtmean_logRt;
    // initialize non-linear predictor term
    vector[N_rtmean] mu_rtmean;
    Yl_sgtf[Jmi_sgtf] = Ymi_sgtf;
    for (n in 1:N_sgtfsamples) {
      // add more terms to the linear predictor
      mu_sgtfsamples[n] += (bsp_sgtfsamples[1]) * Yl_sgtf[n];
    }
    for (n in 1:N_rtmean) {
      // add more terms to the linear predictor
      nlp_rtmean_var[n] += (bsp_rtmean_var[1]) * Yl_sgtf[n];
    }
    for (n in 1:N_sgtf) {
      // apply the inverse link function
      mu_sgtf[n] = inv_logit(mu_sgtf[n]);
    }
    for (n in 1:N_rtmean) {
      // compute non-linear predictor values
      mu_rtmean[n] = (1 + nlp_rtmean_var[n]) * exp(nlp_rtmean_logRt[n]);
    }
    target += binomial_lpmf(Y_sgtfsamples | trials_sgtfsamples, mu_sgtfsamples);
    target += beta_lpdf(Yl_sgtf | mu_sgtf * phi_sgtf, (1 - mu_sgtf) * phi_sgtf);
    target += student_t_lpdf(Y_rtmean | nu_rtmean, mu_rtmean, sqrt(square(sigma_rtmean) + se2_rtmean));
  }
  // priors including constants
  target += normal_lpdf(bsp_rtmean_var | 0, 0.5);
  target += student_t_lpdf(b_rtmean_logRt | 3, 0, 0.5);
  target += student_t_lpdf(sigma_rtmean | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  target += gamma_lpdf(nu_rtmean | 2, 0.1)
    - 1 * gamma_lccdf(1 | 2, 0.1);
}
generated quantities {
  // actual population-level intercept
  real b_sgtf_Intercept = Intercept_sgtf;
}

Output (no covariates and winter 2020 data):

 Family: MV(binomial, beta, student) 
  Links: mu = identity
         mu = logit; phi = identity
         mu = identity; sigma = identity; nu = identity 
Formula: sgtf_samples | trials(samples) ~ 0 + mi(sgtf) 
         sgtf | mi() ~ 1 
         rt_mean | se(rt_sd, sigma = TRUE) ~ (1 + var) * exp(logRt) 
         var ~ 0 + mi(sgtf)
         logRt ~ 1
   Data: structure(list(week_infection = structure(c(18540, (Number of observations: 2194) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sgtf_Intercept             1.00      0.00     1.00     1.00 1.00     4000     4000
rtmean_logRt_Intercept     0.03      0.01     0.02     0.04 1.00     6830     2933
rtmean_var_misgtf          0.12      0.02     0.09     0.15 1.00     7278     2577
sgtfsamples_misgtf         1.00      0.00     1.00     1.00 1.00     4000     4000

Family Specific Parameters: 
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi_sgtf         1.00      0.00     1.00     1.00 1.00     4000     4000
sigma_rtmean     0.21      0.00     0.21     0.22 1.00     7000     3206
nu_rtmean       41.28     14.69    20.39    76.94 1.00     6992     3434

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

One of the cool things with this approach is that you can prototype using the point estimates, and then set them to NA in your data to model their uncertainty for the final model, as sketched below.
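A minimal sketch of that workflow (assuming the data frame is utla_rt_with_covariates with an sgtf column, as in the formula above):

library(dplyr)

# prototype: keep the observed sgtf point estimates, which mi() uses directly
prototype_data <- utla_rt_with_covariates

# final model: blank out sgtf so mi() treats every value as latent and
# estimates it from the binomial sgtf_samples | trials(samples) submodel
final_data <- mutate(utla_rt_with_covariates, sgtf = NA_real_)

Same fit without uncertainty (and 10% of the runtime):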

 Family: MV(binomial, beta, student) 
  Links: mu = identity
         mu = logit; phi = identity
         mu = identity; sigma = identity; nu = identity 
Formula: sgtf_samples | trials(samples) ~ 0 + mi(sgtf) 
         sgtf | mi() ~ 1 
         rt_mean | se(rt_sd, sigma = TRUE) ~ (1 + var) * exp(logRt) 
         var ~ 0 + mi(sgtf)
         logRt ~ 1
   Data: structure(list(week_infection = structure(c(18540, (Number of observations: 2194) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sgtf_Intercept             1.00      0.00     1.00     1.00 1.00     4000     4000
rtmean_logRt_Intercept     0.03      0.01     0.02     0.04 1.00     2787     2774
rtmean_var_misgtf          0.11      0.01     0.08     0.14 1.00     2687     2192
sgtfsamples_misgtf         1.00      0.00     1.00     1.00 1.00     4000     4000

Family Specific Parameters: 
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi_sgtf         1.00      0.00     1.00     1.00 1.00     4000     4000
sigma_rtmean     0.21      0.00     0.21     0.22 1.00     3192     2444
nu_rtmean       40.97     14.26    20.72    76.13 1.00     2773     2913

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Base automatically changed from may-update to main July 27, 2021 15:42
@seabbs (Contributor, Author) left a comment:

LGTM (self review)

@seabbs seabbs merged commit 8ab1189 into main Nov 30, 2021
@seabbs seabbs deleted the non-linear branch November 30, 2021 17:32