Clarify stat_max
#237
The need to clarify this argument came up today in the following tutorial. A participant asked, "Why do some chains produce more cumulative cases than what `stat_max` is?" @joshwlambert and @avallecam, do you have thoughts on how to clarify this further? I'm tagging you because we had a brief discussion about it after the tutorial session.
Why do some chains produce more cumulative cases than what `stat_max` is?
Would an alternative be to ensure it doesn't, i.e. select cases in the final generation such that the stat is `stat_max`?
My preference would be to either 1) keep the name This is where I think the confusion stemmed from in the tutorial because it is a max that does not behave as a max.
FYI we have the same functionality dilemma in {simulist} and have chosen to have a soft limit so the "maximum" can be exceeded and the function returns a warning to let the user know the "maximum" (either default https://github.com/epiverse-trace/simulist/blob/main/R/sim_linelist.R#L70 This discussion also makes it clear that the documentation in
I think this might be the desirable behaviour. We could treat the last generation of offspring as potential offspring and sample the number needed to reach
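The idea of treating the final generation as potential offspring can be sketched in base R. This is a minimal illustration, not the {epichains} implementation: `simulate_capped_chain_size()` and its `offspring` argument are hypothetical names, and the tracked statistic is assumed to be chain size, so keeping a subset of the potential offspring reduces to truncating the count.

```r
# Hypothetical helper (not part of {epichains}): chain size with a hard cap.
# `offspring` is an assumed interface: a function taking the number of
# infectious cases `n` and returning `n` offspring counts.
simulate_capped_chain_size <- function(stat_max, offspring) {
  size <- 1        # the index case
  infectious <- 1  # cases in the current generation
  while (infectious > 0 && size < stat_max) {
    # Offspring of the whole generation are only "potential" at this point.
    potential <- sum(offspring(infectious))
    # Keep only as many as are needed to reach the cap.
    infectious <- min(potential, stat_max - size)
    size <- size + infectious
  }
  size
}

set.seed(1)
capped_sizes <- vapply(
  seq_len(100),
  function(i) {
    simulate_capped_chain_size(
      stat_max = 10,
      offspring = function(n) rpois(n = n, lambda = 1.5)
    )
  },
  numeric(1)
)
```

With this rule no chain can exceed the cap, at the cost of silently discarding some offspring in the final generation.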
Thanks, Josh. I think option 1 is better because it can be used to simulate desirable behavior. Allowing simulations to go beyond the limit set will make the results difficult to explain.
I may be misinterpreting your definition of "soft" but from the code & output, it does seem like you don't return values above the max, so I would still interpret it as a "hard" limit.
Sounds good to me.
Here's a reprex to show what I mean. The …

```r
set.seed(2)
library(simulist)
linelist <- sim_linelist(
  contact_distribution = function(x) dpois(x = x, lambda = 2),
  infect_period = function(x) dgamma(x = x, shape = 3, scale = 3),
  prob_infect = 0.55,
  onset_to_hosp = function(x) dgamma(x = x, shape = 2, scale = 2),
  onset_to_death = function(x) dgamma(x = x, shape = 2, scale = 2),
  outbreak_size = c(5, 20)
)
#> Warning: Number of cases exceeds maximum outbreak size.
#> Returning data early with 29 cases and 44 total contacts (including cases).
```

Created on 2024-05-03 with reprex v2.1.0
Ah! Thanks for this. That makes sense. I don't think we should have a hard lower limit in {epichains} as it is by definition the seeing cases, if the outbreak doesn't take off. The soft upper limit is what is already implemented as My concern is more about whether it is desirable behaviour. For example, if a user sets
Upon further discussion with Seb, we've decided it might be better to rename
This PR closes #193 by renaming `stat_max` to `stat_threshold` and improves the documentation to help clarify some nuances between `stat_threshold` in `simulate_chains()` and `simulate_summary()`. It also renames the `infinite` argument (old name of `stat_max`) in `rborel()` to `censor_at` and improves its documentation. In particular,

- `stat_threshold` in `simulate_chains()` is a stopping criterion for chains. When the cumulative statistic of chains reaches or surpasses it, they end.
- `stat_threshold` in `simulate_summary()` is both a stopping criterion for chains and a censoring limit. When the cumulative statistics of chains reach or surpass it, they end. Additionally, chain statistic values that are >= `stat_threshold` are set to `Inf`.
- `censor_at` in the borel functions is passed to `stat_threshold` in `simulate_summary()` as a stopping criterion and for censoring chain sizes but does not refer to chain sizes as used in `simulate_summary()`, hence the name change and dedicated documentation.
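The difference between a stopping criterion and a censoring limit can be illustrated with a small base-R sketch. It does not call {epichains}: `simulate_chain_size()`, `censor_sizes()`, and the `offspring` argument are hypothetical names, and the tracked statistic is assumed to be chain size.

```r
# Hypothetical helper (not part of {epichains}): the threshold acts only as a
# stopping criterion. `offspring` is an assumed interface: a function taking
# the number of infectious cases `n` and returning `n` offspring counts.
simulate_chain_size <- function(threshold, offspring) {
  size <- 1        # the index case
  infectious <- 1  # cases in the current generation
  # Stop once the chain dies out or the cumulative size reaches the threshold.
  while (infectious > 0 && size < threshold) {
    # The whole generation is added at once, so the final size can
    # overshoot the threshold.
    infectious <- sum(offspring(infectious))
    size <- size + infectious
  }
  size
}

# Hypothetical helper mirroring the censoring described for the summary
# function: values at or above the threshold are reported as Inf.
censor_sizes <- function(sizes, threshold) {
  sizes[sizes >= threshold] <- Inf
  sizes
}

# Deterministic offspring (3 per case) make the overshoot easy to follow:
# sizes grow 1 -> 4 -> 13, and the chain stops only after passing 10.
raw_size <- simulate_chain_size(
  threshold = 10,
  offspring = function(n) rep(3, n)
)
censored_size <- censor_sizes(raw_size, threshold = 10)
```

Here `raw_size` is 13, which is why the threshold does not behave as a maximum, while `censored_size` is `Inf`, which is what a summary with censoring would report for the same chain.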