Support all pp_check types for censored y #1327

hdrab127 · 2022-03-22T20:08:53Z

Hopefully addresses #1326.

The diff got a bit confused, it's mostly just moving the ppc_arg assignments, like group,lw,x,psis above the censor check, then subsetting them there if they are used.

Also allows censored values for the type = "km_overlay" plot (https://mc-stan.org/bayesplot/reference/PPC-censoring.html).

sysdata.rda just has brms:::brmsfit_example7 included for the censored tests, I didn't re-run the other example fits.

Re-running some of the examples in the issue:

library(brms)
data(kidney)
fit <- brm(
  formula = time | cens(censored) ~ age * sex + disease + (1 | patient),
  data = kidney, family = lognormal(),
  prior = c(set_prior("normal(0, 5)", class = "b"),
            set_prior("cauchy(0, 2)", class = "sd")),
  seed = 0, warmup = 1000, iter = 2000, chains = 4,
  control = list(adapt_delta = 0.95)
)
#> Compiling Stan program...
#> Start sampling
#> ...

pp_check(fit, type = 'dens_overlay_grouped', group = 'sex')
#> Using 10 posterior draws for ppc type 'dens_overlay_grouped' by default.
#> Warning: Censored responses are not shown in 'pp_check'.

pp_check(fit, status_y = kidney$censored, type = 'km_overlay')
#> Using 10 posterior draws for ppc type 'km_overlay' by default.

pp_check(fit, x = 'patient', group = 'sex', type = 'intervals_grouped')
#> Using all posterior draws for ppc type 'intervals_grouped' by default.
#> Warning: Censored responses are not shown in 'pp_check'.

pp_check(fit, type = 'loo_intervals')
#> Using all posterior draws for ppc type 'loo_intervals' by default.
#> Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.

#> Warning: Censored responses are not shown in 'pp_check'.
#> Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.

pp_check(fit, type = 'loo_pit_overlay')
#> Using 10 posterior draws for ppc type 'loo_pit_overlay' by default.
#> Warning: Not enough tail samples to fit the generalized Pareto distribution in some or all columns of matrix of log importance ratios. Skipping the following columns: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ... [66 more not printed].

#> Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
#> Warning: Censored responses are not shown in 'pp_check'.
#> NOTE: The kernel density estimate assumes continuous observations and is not optimal for discrete observations.

pp_check(fit, type = 'loo_pit_qq')
#> Using 10 posterior draws for ppc type 'loo_pit_qq' by default.
#> Warning: Not enough tail samples to fit the generalized Pareto distribution in some or all columns of matrix of log importance ratios. Skipping the following columns: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ... [66 more not printed].
#> Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
#> Warning: Censored responses are not shown in 'pp_check'.

^{Created on 2022-03-23 by the reprex package (v2.0.1)}

paul-buerkner

Thank you for this PR! I think it looks good mostly and I have just some minor things to change.

R/pp_check.R

paul-buerkner · 2022-03-23T07:27:44Z

tests/brmsfit_examples.R

@@ -84,16 +84,24 @@ brmsfit_example6 <- brm(
  stan_model_args = stan_model_args, rename = FALSE
 )

+brmsfit_example7 <- brm(


These internal models take quite some space so I would like to avoid saving another one. Can you add this test to the local tests in tests/local/? Perhaps there is already a censored model.

Good idea thanks, have reverted those changes and moved. Though when running the local tests beforehand I noticed the add_rstan_model function fails for that kidney survival example. Is this the same on your computer, and if so, is it okay to ignore for this PR and handle separately?

Test fail:

-- Error (tests.models_new.R:89:3): Survival model from brm doc works correctly --------------------------- Error in `stanc(file = file, model_code = model_code, model_name = model_name, verbose = verbose, obfuscate_model_name = obfuscate_model_name, allow_undefined = allow_undefined, isystem = isystem)`: failed to parse Stan model '530b15df4662e9b168e0f021510d8a97' due to the above error. Backtrace: 1. brms::add_rstan_model(fit3) at tests.models_new.R:89:2 4. rstan::stan(...) at brms/R/brmsfit-helpers.R:761:4 5. rstan::stan_model(...) 6. rstan::stanc(...)

Checking internally looks like the arrays from the cmdstanr backend:

rstan::stan( + model_code = stancode(x, threads = threading()), + data = standata(x), chains = 0 + ) SYNTAX ERROR, MESSAGE(S) FROM PARSER: error in 'modeldb4faf53a2_530b15df4662e9b168e0f021510d8a97' at line 8, column 2 ------------------------------------------------- 6: int<lower=1> N; // total number of observations 7: vector[N] Y; // response variable 8: array[N] int<lower=-1, upper=2> cens; // indicates censoring ^ 9: int<lower=1> K; // number of population-level effects ------------------------------------------------- PARSER EXPECTED: <one of the following: a variable declaration, beginning with type, (int, real, vector, row_vector, matrix, unit_vector, simplex, ordered, positive_ordered, corr_matrix, cov_matrix, cholesky_corr, cholesky_cov or '}' to close variable declarations> Error in stanc(file = file, model_code = model_code, model_name = model_name, : failed to parse Stan model '530b15df4662e9b168e0f021510d8a97' due to the above error.

Seems to fix if I add rtsan backend to the recompiling code (

brms/R/brmsfit-helpers.R

Line 762 in 9410572

model_code = stancode(x, threads = threading()),

):

model_code = stancode(x, threads = threading(), backend = 'rstan'), ^^^^^^^^^^^^^^^^^^^

But I can submit that in different request? Or you check/change? Whatever's easiest

This is just because of new array syntax in the latest version of Stan (>= 2.29.0). You can safely ignore that here.

Just fixed it on master.

R/pp_check.R

hdrab127 · 2022-03-23T19:40:47Z

Sorry I wasn't sure the cleanest way to revert the sysdata.rda file back to how it was before so did a reset then force push. Can see the requested changes from your last check to now here.

codecov-commenter · 2022-03-23T19:59:22Z

Codecov Report

Merging #1327 (a75cce3) into master (cffd336) will decrease coverage by 3.15%.
The diff coverage is 76.33%.

@@            Coverage Diff             @@
##           master    #1327      +/-   ##
==========================================
- Coverage   86.31%   83.15%   -3.16%     
==========================================
  Files          42       70      +28     
  Lines       15511    18958    +3447     
==========================================
+ Hits        13388    15765    +2377     
- Misses       2123     3193    +1070

Impacted Files	Coverage Δ
R/brmsfit-methods.R	`77.72% <ø> (-6.21%)`	⬇️
R/families.R	`87.03% <ø> (-0.71%)`	⬇️
R/family-lists.R	`100.00% <ø> (ø)`
R/formula-ac.R	`87.61% <ø> (ø)`
R/formula-ad.R	`93.75% <ø> (ø)`
R/formula-cs.R	`100.00% <ø> (ø)`
R/formula-gp.R	`72.95% <ø> (ø)`
R/formula-re.R	`92.87% <ø> (ø)`
R/formula-sm.R	`94.87% <ø> (ø)`
R/formula-sp.R	`89.84% <ø> (ø)`
... and 89 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9410572...a75cce3. Read the comment docs.

paul-buerkner · 2022-03-24T07:44:28Z

Thank you! Looks good to me! Merging now!

paul-buerkner requested changes Mar 23, 2022

View reviewed changes

support all pp_check types for censored y

a75cce3

hdrab127 force-pushed the issue-pp-check-grouped-cens branch from 912072e to a75cce3 Compare March 23, 2022 19:35

update NEWS

bdc701a

paul-buerkner merged commit 5b26151 into paul-buerkner:master Mar 24, 2022

paul-buerkner mentioned this pull request Mar 24, 2022

pp_check types with censored data #1326

Closed

hdrab127 deleted the issue-pp-check-grouped-cens branch March 24, 2022 20:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support all pp_check types for censored y #1327

Support all pp_check types for censored y #1327

hdrab127 commented Mar 22, 2022

paul-buerkner left a comment

paul-buerkner Mar 23, 2022

hdrab127 Mar 23, 2022

paul-buerkner Mar 24, 2022

paul-buerkner Mar 24, 2022

hdrab127 commented Mar 23, 2022 •

edited

codecov-commenter commented Mar 23, 2022

paul-buerkner commented Mar 24, 2022

Support all pp_check types for censored y #1327

Support all pp_check types for censored y #1327

Conversation

hdrab127 commented Mar 22, 2022

paul-buerkner left a comment

Choose a reason for hiding this comment

paul-buerkner Mar 23, 2022

Choose a reason for hiding this comment

hdrab127 Mar 23, 2022

Choose a reason for hiding this comment

paul-buerkner Mar 24, 2022

Choose a reason for hiding this comment

paul-buerkner Mar 24, 2022

Choose a reason for hiding this comment

hdrab127 commented Mar 23, 2022 • edited

codecov-commenter commented Mar 23, 2022

Codecov Report

paul-buerkner commented Mar 24, 2022

hdrab127 commented Mar 23, 2022 •

edited