Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relative operator weights are not applied correctly which can degrade performance #633

Closed
genomescale opened this issue Nov 5, 2016 · 6 comments

Comments

@genomescale
Copy link
Contributor

Currently BEAUTi adjusts the sum of weights of species tree operators (those with IDs ending in "Species") to equal 20% of the total operator weights. This is done whenever "scrubAll" is called, which causes the following behaviour:

(1) User adds a bunch of gene trees with a total operator weight of 800.
(2) If there are 4 species tree operators of 100 weight each, BEAUTi will change their weights to 50 each (to sum to 20% of the total weight).
(3) User changes the model causing a new operator to be added (e.g. switching clock, population or substitution models).
(4) If this new operator is of weight 100, BEAUTi will reduce its weight to 66.666, and reduce the other species tree operators to weight 33.3333 (again to sum to 20% of the total weight).

Obviously all the species tree operators with original weights of 100 should end up with equal final weights, but you can see that the later additions end up with much height weight. If instead the original species tree weights are too low, the inverse will occur (later operators will end up with much lower weights)

This causes degraded performance either because steps are wasted when new operator weights are too high or mixing suffers when they are too low.

One way to ameliorate the behaviour is to change the weights only when an XML file is saved and then immediately revert them in memory. However this still doesn't completely fix the behaviour because if an XML file is saved and reloaded new operators will still get incorrect weights.

My guess is the only way to fix this properly is to finally implement the proposal by @tgvaughan #17

@tgvaughan
Copy link
Contributor

(Can't take credit for that suggestion - that was a port from the google code issue tracker. :-) )

@genomescale
Copy link
Contributor Author

Well then props to Tim for porting the google code issues across ;-)

@rbouckaert
Copy link
Member

As noted in #17, this could be solved by having a CompoundOperator that collects all partition specific operators, and ensure the total weight for all partitions does not exceed X%. This will have consequences for BEAUti templates, since connector rules need to be aware of this. There may be an issue with backward compatibility, investigating the consequences.

@rbouckaert
Copy link
Member

I parked an implementation of a CompoundOperator in BEASTLabs, but am now thinking that doing the reweighting in the OperatorSchedule instead of BeastDoc may be the best way to go. The OperatorSchedule determines which operators to choose when running the MCMCM so won't interfere with any BEAUti related stuff, nor do we need to update templates (as is required when using the CompoundOperator). Any issues I am overlooking?

@alexeid
Copy link
Member

alexeid commented Nov 8, 2016

The top-leve operator schedule shouldn’t know anything about specific models or even that such thing as a phylogenetic tree exists. What is your design proposal? One option is to allow an operator schedule to contain sub-schedules. Then StarBEAST could provide a sub-schedule.

On 9/11/2016, at 10:13 am, Remco Bouckaert notifications@github.com wrote:

I parked an implementation of a CompoundOperator in BEASTLabs, but am now thinking that doing the reweighting in the OperatorSchedule instead of BeastDoc may be the best way to go. The OperatorSchedule determines which operators to choose when running the MCMCM so won't interfere with any BEAUti related stuff, nor do we need to update templates (as is required when using the CompoundOperator). Any issues I am overlooking?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub #633 (comment), or mute the thread https://github.com/notifications/unsubscribe-auth/AA3WSdPKLVtFyDCvolZrjB3sWfRugZQiks5q8OX6gaJpZM4KqKer.

@rbouckaert
Copy link
Member

Option 1: The immediate problem at hand (reweighting *BEAST operators robustly) can be fixed by moving the current reweighting code to the OperatorSchedule class. What it currently does is identify the set of operators that work on the species tree by their ID (must end in "Species") and make sure they get assigned 20% of the total weight. So, that is a very *BEAST specific hack.

Option 2: A more generic scheme could allow identification of sets of operators by say a regular expressions matching operator IDs, and distribute a percentage of the weight among these operators. Flexible, but complex to take care of all boundary cases (multiple matches, weights adding up too much, etc.).

Option 3: The CompoundOperator identifies operators by having them as inputs, which is perhaps more explicit, but much more cumbersome for BEAUti templates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants