Relative operator weights are not applied correctly which can degrade performance #633
(Can't take credit for that suggestion - that was a port from the Google Code issue tracker. :-) )

Well then props to Tim for porting the Google Code issues across ;-)
As noted in #17, this could be solved by having a CompoundOperator that collects all partition-specific operators and ensures the total weight for all partitions does not exceed X%. This will have consequences for BEAUti templates, since connector rules need to be aware of this. There may be an issue with backward compatibility; I am investigating the consequences.
I parked an implementation of a CompoundOperator in BEASTLabs, but am now thinking that doing the reweighting in the OperatorSchedule instead of BeautiDoc may be the best way to go. The OperatorSchedule determines which operators to choose when running the MCMC, so it won't interfere with any BEAUti related stuff, nor do we need to update templates (as is required when using the CompoundOperator). Any issues I am overlooking?
The top-level operator schedule shouldn't know anything about specific models, or even that such a thing as a phylogenetic tree exists. What is your design proposal? One option is to allow an operator schedule to contain sub-schedules. Then StarBEAST could provide a sub-schedule.
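For concreteness, here is a minimal Python sketch of what the sub-schedule idea could look like (class and operator names are made up for illustration; this is not BEAST code). The point is that a sub-schedule's internal weights are private to it, so the top-level schedule only needs one weight per sub-schedule:

```python
import random

class Schedule:
    """Sketch of a two-level operator schedule.

    Each entry is (weight, item), where item is either a leaf
    operator (any object) or another Schedule. Selection picks an
    entry in proportion to its weight, then recurses, so a
    sub-schedule's internal weights never leak into the top level.
    """

    def __init__(self, entries):
        self.entries = entries  # list of (weight, operator_or_schedule)

    def select(self, rng):
        total = sum(w for w, _ in self.entries)
        r = rng.random() * total  # uniform in [0, total)
        for w, item in self.entries:
            r -= w
            if r < 0:
                break
        # Recurse into sub-schedules; return leaf operators directly.
        return item.select(rng) if isinstance(item, Schedule) else item

# Hypothetical usage: StarBEAST provides a sub-schedule that gets a
# fixed 20% of the top-level weight, regardless of how many
# species-tree operators it contains internally.
species_sub = Schedule([(1.0, "speciesTreeScaler"), (1.0, "speciesUpDown")])
top =Edge = Schedule([(20.0, species_sub), (80.0, "geneTreeOperator")])
```

With this design, adding an operator inside `species_sub` changes only how the 20% is split internally, which is exactly the property the flat renormalization scheme lacks.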
Option 1: The immediate problem at hand (robustly reweighting *BEAST operators) can be fixed by moving the current reweighting code to the OperatorSchedule class. The current code identifies the set of operators that work on the species tree by their ID (which must end in "Species") and makes sure they are assigned 20% of the total weight. So that is a very *BEAST-specific hack.

Option 2: A more generic scheme could identify sets of operators by, say, a regular expression matching operator IDs, and distribute a percentage of the total weight among those operators. Flexible, but complex to take care of all the boundary cases (multiple matches, weights summing to more than 100%, etc.).

Option 3: The CompoundOperator identifies operators by taking them as inputs, which is perhaps more explicit, but much more cumbersome for BEAUti templates.
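A sketch of the core of Option 2, assuming operators are represented as a plain ID-to-weight mapping (the function name and data layout are hypothetical, not BEAST's API). It scales the matched operators so they jointly receive a fixed fraction of the total, while preserving their relative weights:

```python
import re

def reweight_matching(operators, pattern, fraction):
    """Give operators whose ID matches `pattern` a fixed `fraction`
    of the total weight, preserving their relative weights.

    `operators` maps operator ID -> weight and is modified in place.
    """
    matched = {op_id for op_id in operators if re.search(pattern, op_id)}
    if not matched:
        return operators  # boundary case: pattern matched nothing

    other_total = sum(w for op_id, w in operators.items()
                      if op_id not in matched)
    matched_total = sum(operators[op_id] for op_id in matched)

    # Solve target / (target + other_total) == fraction for target.
    target = fraction / (1.0 - fraction) * other_total

    scale = target / matched_total
    for op_id in matched:
        operators[op_id] *= scale
    return operators
```

The boundary cases mentioned above still need real handling: overlapping patterns from multiple rules, `fraction` values summing past 100%, and a match set that covers all operators (which makes `other_total` zero and the target degenerate).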
Currently BEAUti adjusts the sum of the weights of the species tree operators (those with IDs ending in "Species") to equal 20% of the total operator weight. This happens whenever "scrubAll" is called, which causes the following behaviour:
(1) User adds a bunch of gene trees with a total operator weight of 800.
(2) If there are 4 species tree operators of weight 100 each, BEAUti will change their weights to 50 each (so they sum to 20% of the total weight).
(3) User changes the model causing a new operator to be added (e.g. switching clock, population or substitution models).
(4) If this new operator has weight 100, BEAUti will reduce its weight to 66.666, and reduce the other species tree operators to weight 33.333 (again so they sum to 20% of the total weight).
Obviously all the species tree operators with original weights of 100 should end up with equal final weights, but you can see that later additions end up with much higher weight. If instead the original species tree weights are too low, the inverse will occur (later operators will end up with much lower weights).
This causes degraded performance either because steps are wasted when new operator weights are too high, or because mixing suffers when they are too low.
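The arithmetic in steps (1)-(4) can be reproduced with a small sketch of the renormalization (the 20% target is from the scrubAll behaviour described above; the operator IDs are made up):

```python
def renormalize_species_weights(weights, species_ids, fraction=0.20):
    """Mimic scrubAll: rescale species-tree operator weights so they
    sum to `fraction` of the total operator weight."""
    other_total = sum(w for op_id, w in weights.items()
                      if op_id not in species_ids)
    species_total = sum(weights[op_id] for op_id in species_ids)
    target = fraction / (1.0 - fraction) * other_total
    scale = target / species_total
    for op_id in species_ids:
        weights[op_id] *= scale

# Steps (1)-(2): 800 of gene tree weight, 4 species operators at 100.
weights = {"speciesOp1": 100.0, "speciesOp2": 100.0,
           "speciesOp3": 100.0, "speciesOp4": 100.0,
           "geneTreeOps": 800.0}
species = [op_id for op_id in weights if op_id.endswith("Species") or
           op_id.startswith("speciesOp")]
renormalize_species_weights(weights, species)
# Each species operator is now 50 (200 out of a 1000 total = 20%).

# Steps (3)-(4): a model change adds a new species-side operator at 100.
weights["newClockSpeciesOp"] = 100.0
species.append("newClockSpeciesOp")
renormalize_species_weights(weights, species)
# The new operator ends up at ~66.67 while the original four drop
# to ~33.33, even though all five started at 100.
```

The second pass scales the already-halved weights again, which is why the order in which operators were added leaks into their final weights.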
One way to ameliorate the behaviour is to change the weights only when an XML file is saved, and then immediately revert them in memory. However, this still doesn't completely fix the problem: if the XML file is saved and reloaded, new operators will still get incorrect weights.
My guess is the only way to fix this properly is to finally implement the proposal by @tgvaughan in #17.