On many datasets I notice warnings like:

WARNING: Log-likelihood -7226.94 of K2P+R4 worse than K2P+R3 -7059.91
Obviously this shouldn't happen: R3 is nested within R4, so the lnL of R4 should always be at least as good as R3. I'm guessing this is just a limitation of the current optimiser, and in many cases it seems like a fairly big one. E.g. in the example above the gap is >150 log-likelihood units.
So, I have a suggestion. When we optimise RN+1 (e.g. R4), we should do an initialisation step where we start with the ML rate parameters from RN (e.g. R3) and add one extra rate, while holding the initial N parameters constant. We can then optimise this constrained model, e.g. by sliding the new parameter from the minimum bound up to double the maximum rate from RN. My bet is that this will often give us an RN+1 model with a better likelihood. But even if it doesn't, we can then pass these RN+1 rates to the BFGS or EM optimiser to optimise them all together.
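To make the suggestion concrete, here is a minimal sketch of the constrained initialisation step. This is not the IQ-TREE optimiser interface; `scanNewRate` and the `lnL` callback are hypothetical names, and a real implementation would call the tree likelihood instead of a stand-in function:

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical 1-D scan: hold the N rates from the R_N fit constant and
// slide only the new (N+1)-th rate from the lower bound up to twice the
// largest R_N rate, keeping whichever candidate scores best.
double scanNewRate(const std::vector<double>& fixedRates,
                   const std::function<double(const std::vector<double>&)>& lnL,
                   double lowerBound, int steps = 200) {
    double upperBound =
        2.0 * *std::max_element(fixedRates.begin(), fixedRates.end());
    std::vector<double> rates = fixedRates;
    rates.push_back(lowerBound);       // candidate slot for the new rate
    double bestRate = lowerBound;
    double bestLnL = lnL(rates);
    for (int i = 1; i <= steps; ++i) {
        rates.back() = lowerBound + (upperBound - lowerBound) * i / steps;
        double score = lnL(rates);
        if (score > bestLnL) {
            bestLnL = score;
            bestRate = rates.back();
        }
    }
    return bestRate;  // starting point for the full BFGS/EM pass
}
```

The returned rate, together with the unchanged R_N rates, would then be handed to the existing BFGS or EM optimiser as the starting point for the joint optimisation.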
Thoughts @bqminh and @thomaskf? This is really just a constrained EM step to start with. And maybe we already do something like this.
Either way, it seems like there's room for improvement here.
This is done already, i.e. R4 parameters are initialised from R3. In the code it's the function RateFree::initFromCatMinusOne() in model/ratefree.cpp.
My suggestion is not quite the same: initially hold the CatMinusOne parameters constant and optimise only the new parameter. Once that's done, optimise all of them together.
E.g. if R2 gave: 0.1, 2.0
Then we initialise R3 with 0.1, 2.0, New
And we hold 0.1 and 2.0 constant while finding the optimum value of New, allowing it to be anything from the minimum to the maximum bound (i.e. smaller than 0.1, between 0.1 and 2.0, or larger than 2.0).
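The point about the bounds can be sketched as a plain scan over the whole allowed range. The bounds and the `bestNewRate` helper below are illustrative assumptions, not IQ-TREE code; the idea is only that the search is free to place the new rate in any of the three regions relative to 0.1 and 2.0:

```cpp
// Scan the new third rate over the full [minRate, maxRate] interval,
// so the optimum may land below 0.1, between 0.1 and 2.0, or above 2.0.
// lnL is a stand-in for the tree likelihood as a function of the new rate,
// with the R2 rates 0.1 and 2.0 held fixed.
double bestNewRate(double minRate, double maxRate, int steps,
                   double (*lnL)(double)) {
    double best = minRate, bestScore = lnL(minRate);
    for (int i = 1; i <= steps; ++i) {
        double cand = minRate + (maxRate - minRate) * i / steps;
        double score = lnL(cand);
        if (score > bestScore) {
            bestScore = score;
            best = cand;
        }
    }
    return best;
}
```

Depending on where the toy likelihood peaks, the scan returns a value in a different region, which is exactly the freedom the constrained step is meant to exploit.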