Relative operator weights are not applied correctly which can degrade performance #633
(Can't take credit for that suggestion - that was a port from the Google Code issue tracker. :-) )

Well then props to Tim for porting the Google Code issues across ;-)
As noted in #17, this could be solved by having a CompoundOperator that collects all partition-specific operators and ensures the total weight for all partitions does not exceed X%. This will have consequences for BEAUti templates, since connector rules need to be aware of this. There may be an issue with backward compatibility; I am investigating the consequences.
I parked an implementation of a CompoundOperator in BEASTLabs, but am now thinking that doing the reweighting in the OperatorSchedule instead of BeautiDoc may be the best way to go. The OperatorSchedule determines which operators to choose when running the MCMC, so it won't interfere with any BEAUti related stuff, nor do we need to update templates (as is required when using the CompoundOperator). Any issues I am overlooking?
The top-level operator schedule shouldn't know anything about specific models, or even that such a thing as a phylogenetic tree exists. What is your design proposal? One option is to allow an operator schedule to contain sub-schedules. Then StarBEAST could provide a sub-schedule.
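For concreteness, here is a minimal Python sketch of what the sub-schedule idea could look like (class and operator names are made up for illustration; this is not BEAST code). The point is that a sub-schedule's internal weights are private to it, so the top-level schedule only needs one weight per sub-schedule:

```python
import random

class Schedule:
    """Sketch of a two-level operator schedule.

    Each entry is (weight, item), where item is either a leaf
    operator (any object) or another Schedule. Selection picks an
    entry in proportion to its weight, then recurses, so a
    sub-schedule's internal weights never leak into the top level.
    """

    def __init__(self, entries):
        self.entries = entries  # list of (weight, operator_or_schedule)

    def select(self, rng):
        total = sum(w for w, _ in self.entries)
        r = rng.random() * total  # uniform in [0, total)
        for w, item in self.entries:
            r -= w
            if r < 0:
                break
        # Recurse into sub-schedules; return leaf operators directly.
        return item.select(rng) if isinstance(item, Schedule) else item

# Hypothetical usage: StarBEAST provides a sub-schedule that gets a
# fixed 20% of the top-level weight, regardless of how many
# species-tree operators it contains internally.
species_sub = Schedule([(1.0, "speciesTreeScaler"), (1.0, "speciesUpDown")])
top =Edge = Schedule([(20.0, species_sub), (80.0, "geneTreeOperator")])
```

With this design, adding an operator inside `species_sub` changes only how the 20% is split internally, which is exactly the property the flat renormalization scheme lacks.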
Option 1: The immediate problem at hand (robustly reweighting *BEAST operators) can be fixed by moving the current reweighting code to the OperatorSchedule class. The current code identifies the set of operators that work on the species tree by their ID (which must end in "Species") and makes sure they are assigned 20% of the total weight. So that is a very *BEAST-specific hack.

Option 2: A more generic scheme could identify sets of operators by, say, a regular expression matching operator IDs, and distribute a percentage of the total weight among those operators. Flexible, but complex to take care of all the boundary cases (multiple matches, weights summing to more than 100%, etc.).

Option 3: The CompoundOperator identifies operators by taking them as inputs, which is perhaps more explicit, but much more cumbersome for BEAUti templates.
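A sketch of the core of Option 2, assuming operators are represented as a plain ID-to-weight mapping (the function name and data layout are hypothetical, not BEAST's API). It scales the matched operators so they jointly receive a fixed fraction of the total, while preserving their relative weights:

```python
import re

def reweight_matching(operators, pattern, fraction):
    """Give operators whose ID matches `pattern` a fixed `fraction`
    of the total weight, preserving their relative weights.

    `operators` maps operator ID -> weight and is modified in place.
    """
    matched = {op_id for op_id in operators if re.search(pattern, op_id)}
    if not matched:
        return operators  # boundary case: pattern matched nothing

    other_total = sum(w for op_id, w in operators.items()
                      if op_id not in matched)
    matched_total = sum(operators[op_id] for op_id in matched)

    # Solve target / (target + other_total) == fraction for target.
    target = fraction / (1.0 - fraction) * other_total

    scale = target / matched_total
    for op_id in matched:
        operators[op_id] *= scale
    return operators
```

The boundary cases mentioned above still need real handling: overlapping patterns from multiple rules, `fraction` values summing past 100%, and a match set that covers all operators (which makes `other_total` zero and the target degenerate).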
Currently BEAUti adjusts the sum of the weights of the species tree operators (those with IDs ending in "Species") to equal 20% of the total operator weight. This happens whenever "scrubAll" is called, which causes the following behaviour:
(1) User adds a bunch of gene trees with a total operator weight of 800.
(2) If there are 4 species tree operators of weight 100 each, BEAUti will change their weights to 50 each (so they sum to 20% of the total weight).
(3) User changes the model causing a new operator to be added (e.g. switching clock, population or substitution models).
(4) If this new operator has weight 100, BEAUti will reduce its weight to 66.666, and reduce the other species tree operators to weight 33.333 (again so they sum to 20% of the total weight).
Obviously all the species tree operators with original weights of 100 should end up with equal final weights, but you can see that later additions end up with much higher weight. If instead the original species tree weights are too low, the inverse will occur (later operators will end up with much lower weights).
This causes degraded performance either because steps are wasted when new operator weights are too high, or because mixing suffers when they are too low.
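The arithmetic in steps (1)-(4) can be reproduced with a small sketch of the renormalization (the 20% target is from the scrubAll behaviour described above; the operator IDs are made up):

```python
def renormalize_species_weights(weights, species_ids, fraction=0.20):
    """Mimic scrubAll: rescale species-tree operator weights so they
    sum to `fraction` of the total operator weight."""
    other_total = sum(w for op_id, w in weights.items()
                      if op_id not in species_ids)
    species_total = sum(weights[op_id] for op_id in species_ids)
    target = fraction / (1.0 - fraction) * other_total
    scale = target / species_total
    for op_id in species_ids:
        weights[op_id] *= scale

# Steps (1)-(2): 800 of gene tree weight, 4 species operators at 100.
weights = {"speciesOp1": 100.0, "speciesOp2": 100.0,
           "speciesOp3": 100.0, "speciesOp4": 100.0,
           "geneTreeOps": 800.0}
species = [op_id for op_id in weights if op_id.endswith("Species") or
           op_id.startswith("speciesOp")]
renormalize_species_weights(weights, species)
# Each species operator is now 50 (200 out of a 1000 total = 20%).

# Steps (3)-(4): a model change adds a new species-side operator at 100.
weights["newClockSpeciesOp"] = 100.0
species.append("newClockSpeciesOp")
renormalize_species_weights(weights, species)
# The new operator ends up at ~66.67 while the original four drop
# to ~33.33, even though all five started at 100.
```

The second pass scales the already-halved weights again, which is why the order in which operators were added leaks into their final weights.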
One way to ameliorate the behaviour is to change the weights only when an XML file is saved, and then immediately revert them in memory. However, this still doesn't completely fix the problem: if the XML file is saved and reloaded, new operators will still get incorrect weights.
My guess is the only way to fix this properly is to finally implement the proposal by @tgvaughan in #17.