Stop BD simulation from hanging in situations with many constraints #453

Merged
merged 7 commits into from
Apr 24, 2024

Conversation

bjoelle

@bjoelle bjoelle commented Apr 17, 2024

This is a fix for issue #159, where the combination of many topological constraints made it hard to find valid ages for the simulated tree and caused the BD simulator to hang in an infinite loop.
Now, if a valid age cannot be found after a bounded number of tries, the age is simply set based on the minimum.

To be honest, I don't think the trials are needed at all, but removing them would require changing a lot of test output.
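
A minimal sketch of the bounded-retry idea described above, not the code from this PR. The helper name `drawAgeWithFallback`, the injected `draw` callback, and the choice to set the age exactly equal to the minimum are all assumptions made for illustration; the actual change only says the age is set "based on the minimum".

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not a RevBayes function): keep drawing an age until it
// is at least `min_age`, but give up after `max_tries` attempts instead of
// looping forever.
double drawAgeWithFallback(const std::function<double()>& draw,
                           double min_age,
                           int max_tries = 1000)
{
    assert(max_tries > 0);

    double t = 0.0;
    int ntries = 0;
    do {
        t = draw();
        ntries++;
    } while (t < min_age && ntries < max_tries);

    // Assumption: when every attempt failed, fall back to the minimum itself.
    if (t < min_age) {
        t = min_age;
    }
    return t;
}
```

With a callback that can never satisfy the constraint, the function returns after `max_tries` draws rather than hanging, which is the behaviour the fix is after.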

@bjoelle bjoelle requested a review from bredelings April 17, 2024 11:47
ntries ++;
} while (t < min && ntries < 1000);

if(t < min) {

Do you think this is safe or would it be better to throw an error?


My thinking was that this is primarily used to initialize the run, so it doesn't matter too much what the ages are as long as they satisfy the constraints.
If we just throw an error, then we're back to the initial issue, which is that some analyses cannot start because the simulator can't find a valid tree.

@bredelings

bredelings commented Apr 17, 2024 via email

@bjoelle

bjoelle commented Apr 18, 2024

I agree that it may be a good idea to separate simulation and initialization, since a lot of BD models don't have good simulation functions under certain constraints (fossils, or, as here, topological constraints), but I don't know if it's possible to distinguish the two uses from the Rev side of things at the moment.
Since this is a case where we cannot simulate properly anyway (because of the constraints on topology), I think we could do what I suggested as a temporary fix while someone (me, if I find the time) looks into it further?
Especially since that issue has existed since 2021.

@bjoelle

bjoelle commented Apr 18, 2024

As an alternative I just thought of, I could add a warning to the simulation if we get into the fallback behaviour.
That way it doesn't block the analysis, but the user can decide whether it's OK that the simulation is not fully BD.
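
The warning idea could look roughly like the sketch below. Everything here is hypothetical: the `AgeDraw` struct, the `resolveAge` function, the warning text, and the choice to collect warnings in a vector instead of printing them are assumptions for illustration, not RevBayes code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: the age that will be used, plus a flag telling
// the caller whether it came from the fallback rather than from a real draw.
struct AgeDraw {
    double age;
    bool used_fallback;
};

// Hypothetical helper: accept `proposed` if it respects `min_age`; otherwise
// fall back to `min_age` and record a warning instead of aborting.
AgeDraw resolveAge(double proposed, double min_age,
                   std::vector<std::string>& warnings)
{
    assert(min_age >= 0.0);  // assumption: ages are non-negative

    if (proposed >= min_age) {
        return AgeDraw{proposed, false};
    }
    warnings.push_back(
        "Could not draw an age compatible with the constraints; "
        "using the minimum age, so the tree is not a true BD draw.");
    return AgeDraw{min_age, true};
}
```

Returning the flag alongside the value keeps the analysis running while still letting the caller decide how loudly to report the fallback.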

@davidcerny

This is just my 5 cents, but I'm personally not too concerned about the fact that the initial state might not be a valid draw from the prior. After all, we already allow users to supply their own initial trees, which (presumably) don't constitute valid draws, either.

I also wanted to point out that this discussion is relevant to issue #384 as well, i.e., the fact that it's currently impossible to find an initial state for non-clock trees with backbone constraints. This one, too, should be easy to fix if we're willing to decouple initialization from simulation (e.g., we could take all the tips that are not in the backbone, and randomly drop them onto the backbone).
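
As a rough illustration of "dropping" a tip onto a backbone, the sketch below attaches one new tip to a uniformly chosen edge of a topology stored as a parent-index vector. The representation, the function name `attachTipAtRandomEdge`, and the uniform choice of edge are assumptions made for this example; it ignores branch lengths and any constraints beyond the backbone itself.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical topology encoding: parent[i] is the index of node i's parent,
// and the root has parent -1.
using ParentVector = std::vector<int>;

// Attach one new tip to a uniformly chosen edge and return the tip's index.
// An edge is identified by its child node, so the root is never picked.
int attachTipAtRandomEdge(ParentVector& parent, std::mt19937& rng)
{
    assert(parent.size() >= 2);  // need at least one edge

    std::vector<int> candidates;
    for (std::size_t i = 0; i < parent.size(); ++i) {
        if (parent[i] != -1) {
            candidates.push_back(static_cast<int>(i));
        }
    }
    std::uniform_int_distribution<std::size_t> pick(0, candidates.size() - 1);
    const int v = candidates[pick(rng)];

    // Split the edge (parent[v], v) with a new internal node u.
    const int u = static_cast<int>(parent.size());
    parent.push_back(parent[v]);
    parent[v] = u;

    // Hang the new tip w off u.
    const int w = static_cast<int>(parent.size());
    parent.push_back(u);
    return w;
}
```

Calling this once per tip that is missing from the backbone yields a tree that still contains the backbone as a subtree, since existing nodes are only ever separated by newly inserted internal nodes.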

@bredelings

@bjoelle Hmm... perhaps the warning might be the way to go for now. The rest of RevBayes is not set up to handle the distinction between simulation and initialization yet, but perhaps logically separating the two in the API is a step in the right direction?

@davidcerny I think allowing users to set the value is different. Presumably if a user sets the value, they know that it doesn't come from the distribution. But people also use RevBayes to simulate stuff, so it would be problematic if it silently simulates from the wrong distribution.

Long term, here are three things that might be possible:

  • allow distributions/variables to have an empty value, and simulate the value when it is first needed/requested. Currently we do weird stuff because we try to simulate after every parameter change, even before all the parameters are initialized.
  • allow distributions/variables to have a "can't simulate" value. When someone writes x ~ dnVeryComplex( ) and then writes x at the prompt, we could say "x does not have a value; cannot be directly simulated" or something like that. The idea is that some distributions can only be simulated from via MCMC.
  • MCMC routines would start by looking for variables with a "can't simulate" value and then initialize them using the initialization routine
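
The three ideas in this list can be sketched together as a small state machine. This is only an illustration of the proposal: `ValueState`, `LazyVariable`, and `initialize` are hypothetical names, the value is a plain `double`, and nothing like this is claimed to exist in RevBayes.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical states for a stochastic variable's value.
enum class ValueState { Empty, HasValue, CannotSimulate };

class LazyVariable {
public:
    // `simulate` returns std::nullopt when the distribution cannot be
    // simulated from directly.
    explicit LazyVariable(std::function<std::optional<double>()> simulate)
        : simulate_(std::move(simulate)) {}

    ValueState state() const { return state_; }

    // Simulate only when the value is first requested, not on every
    // parameter change.
    const std::optional<double>& value()
    {
        if (state_ == ValueState::Empty) {
            value_ = simulate_();
            state_ = value_ ? ValueState::HasValue : ValueState::CannotSimulate;
        }
        return value_;
    }

    // What an MCMC routine might call for variables left in CannotSimulate.
    void initialize(double v)
    {
        value_ = v;
        state_ = ValueState::HasValue;
    }

private:
    std::function<std::optional<double>()> simulate_;
    std::optional<double> value_;
    ValueState state_ = ValueState::Empty;
};
```

An MCMC setup step could then scan for variables whose state is `CannotSimulate` after the first `value()` request and hand them to a separate initialization routine, while a prompt query on such a variable could report that it has no directly simulated value.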

@davidcerny

That would be a very elegant solution, yes!

@hoehna

hoehna commented Apr 21, 2024

Perhaps it might be worth looking into SimulationCondition:

virtual void redrawValue(SimulationCondition c = SimulationCondition::MCMC) = 0; //!< Draw a new random value from the distribution

From my perspective, it would be super nice if we could adopt their usage. Currently there are two options: (1) MCMC, which only requires a valid state (non-zero probability), and (2) VALIDATION, which should make the validation scripts work, for example by not conditioning the simulation on n taxa in BD models.
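
One way a distribution might honour the two conditions is sketched below. Only the `redrawValue` signature and the names `MCMC` and `VALIDATION` come from the comment above; the enum is redeclared here so the example compiles on its own, and `ToyDistribution`, `ToyConstrainedAge`, the injected `draw` callback, and the decision to throw under `VALIDATION` are assumptions.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Stand-in for the RevBayes enum, assumed here to be a scoped enum.
enum class SimulationCondition { MCMC, VALIDATION };

// Hypothetical minimal interface carrying the quoted signature.
class ToyDistribution {
public:
    virtual ~ToyDistribution() = default;
    virtual void redrawValue(SimulationCondition c = SimulationCondition::MCMC) = 0;
};

// Hypothetical distribution whose value must be at least `min_age`.
class ToyConstrainedAge : public ToyDistribution {
public:
    ToyConstrainedAge(std::function<double()> draw, double min_age, int max_tries)
        : draw_(std::move(draw)), min_age_(min_age), max_tries_(max_tries),
          value_(min_age)
    {
        assert(max_tries_ > 0);
    }

    void redrawValue(SimulationCondition c = SimulationCondition::MCMC) override
    {
        for (int i = 0; i < max_tries_; ++i) {
            const double t = draw_();
            if (t >= min_age_) { value_ = t; return; }
        }
        // VALIDATION needs a genuine draw, so a fallback is not acceptable.
        if (c == SimulationCondition::VALIDATION) {
            throw std::runtime_error("could not draw a value satisfying the constraint");
        }
        // MCMC only needs a state with non-zero probability.
        value_ = min_age_;
    }

    double value() const { return value_; }

private:
    std::function<double()> draw_;
    double min_age_;
    int max_tries_;
    double value_;
};
```

The point of the sketch is that the caller, not the distribution, states how faithful the draw has to be, so the same simulation routine can serve both MCMC start-up and validation scripts.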

@bjoelle

bjoelle commented Apr 22, 2024

OK, as suggested, I added an argument to all tree simulations to decide whether we're OK with returning a state that is not drawn from the true distribution. Note that I haven't checked the behaviour of all simulation functions, so I just assumed that they do the correct thing unless explicitly specified.
A second note: a lot of the constructor functions use draws from the simulation as well. In that case I assumed that we don't care about the true distribution (so alwaysReturn = true), but I can change it if that's not correct.
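
The split between constructor draws and explicit simulation could be pictured as follows. Only the argument name `alwaysReturn` comes from the comment above; the class `ToyTreePrior`, its members, the single-age "tree", and the choice to throw when `alwaysReturn` is false are hypothetical and are not taken from the merged code.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a tree prior that stores a single age.
class ToyTreePrior {
public:
    ToyTreePrior(std::function<double()> draw, double min_age, int max_tries = 1000)
        : draw_(std::move(draw)), min_age_(min_age), max_tries_(max_tries), age_(0.0)
    {
        assert(max_tries_ > 0);
        // The constructor only needs some valid starting value.
        age_ = simulateAge(true);
    }

    // An explicit redraw is expected to come from the true distribution.
    void redraw() { age_ = simulateAge(false); }

    double age() const { return age_; }

private:
    double simulateAge(bool alwaysReturn) const
    {
        for (int i = 0; i < max_tries_; ++i) {
            const double t = draw_();
            if (t >= min_age_) return t;
        }
        if (!alwaysReturn) {
            throw std::runtime_error("no valid age found within the allowed tries");
        }
        return min_age_;  // assumption: fall back to the minimum
    }

    std::function<double()> draw_;
    double min_age_;
    int max_tries_;
    double age_;
};
```

If the explicit redraw fails, the previously stored age is left untouched, so an object that was constructed with a fallback value stays usable.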

@hoehna hoehna merged commit 303865b into development Apr 24, 2024
20 checks passed
@hoehna hoehna deleted the issue159_fix branch April 24, 2024 07:19