-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop BD simulation from hanging in situations with many constraints #453
Conversation
ntries ++; | ||
} while (t < min && ntries < 1000); | ||
|
||
if(t < min) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this is safe or would it be better to throw an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my thinking was that this is primarily used to initialize the run so it doesn't matter too much what the ages are as long as they work with the constraints...
if we just throw an error then we're back with the initial issue which is that some analyses cannot start because it can't find a valid tree
I think the real solution here is to separate initialization from sampling.
So we need a function that randomly initializes a variable, but maybe not
from the correct distribution.
One solution that might not be too hard here would be to create a virtual
function for sampling an initial point, where the default implementation is
to sample from the true distribution. Then for the birth death process,
provide an alternate implementation that prioritizes termination over
correctness.
Then I guess you would need to call the initialization sampling function in
the right place...
…On Wed, Apr 17, 2024, 11:51 AM bjoelle ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In
src/core/distributions/phylogenetics/tree/birthdeath/AbstractBirthDeathProcess.cpp
<#453 (comment)>:
> t = simulateDivergenceTime(origin, present);
- } while ( t < min );
+ ntries ++;
+ } while (t < min && ntries < 1000);
+
+ if(t < min) {
my thinking was that this is primarily used to initialize the run so it
doesn't matter too much what the ages are as long as they work with the
constraints...
if we just throw an error then we're back with the initial issue which is
that some analyses cannot start because it can't find a valid tree
—
Reply to this email directly, view it on GitHub
<#453 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABCH3HS2DXAAKRL73IMEQDY52K6JAVCNFSM6AAAAABGLEX4ZKVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDAMBWGQZTIOJVGY>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
I would agree that this may a good idea to separate simulation and initialization, since a lot of BD models don't have good simulation functions under certain constraints (fossils, or like here topological constraints), but I don't know if it's possible to distinguish the two uses from the Rev side of things at the moment |
as an alternative I just thought of, I could add a warning to the simulation if we get into the fallback behaviour |
This is just my 5 cents, but I'm personally not too concerned about the fact that the initial state might not be a valid draw from the prior. After all, we already allow users to supply their own initial trees, which (presumably) don't constitute valid draws, either. I also wanted to point out that this discussion is relevant to issue #384 as well, i.e., the fact that it's currently impossible to find an initial state for non-clock trees with backbone constraints. This one, too, should be easy to fix if we're willing to decouple initialization from simulation (e.g., we could take all the tips that are not in the backbone, and randomly drop them onto the backbone). |
@bjoelle Hmm... perhaps the warning might be the way to go for now. The rest of RevBayes is not set up to handle the distinction between simulation and initialization yet, but perhaps logically separating the two different things in the api is a step in the right direction? @davidcerny I think allowing users to set the value is different. Presumably if a user sets the value, they know that it doesn't come from the distribution. But people also use RevBayes to simulate stuff, so it would be problematic if it silently simulates from the wrong distribution. Long term, here are three things that might be possible:
|
That would be a very elegant solution, yes! |
Perhaps it might be worth looking into SimulationCondition (
From my perspective it would be super nice if we could adopt their usage. Currently there are two options: (1) MCMC which only requires a valid state (non-zero probability), and (2) VALIDATION, which should make the validation scripts work, for example, in BD models not condition the simulation in n taxa. |
ok as suggested I added an argument to all tree simulations to decide whether we're ok with returning a state that is not drawn from the true distribution - note that I haven't checked the behaviour of all simulation functions, so I just assumed that they did the correct thing unless explicitly specified |
This is a fix for issue #159 where the combination of many topological constraints made it hard to find good ages for the simulated tree and led to the BD simulator hanging in an infinite loop
Now if a good time cannot be found after several tries the age will simply be set based on the minimum
To be honest I don't think the trials would be needed at all, but removing them would require changing a lot of test output