BEAUti Template for SubstBMA plugin #9

Closed
tgvaughan opened this issue Jan 19, 2014 · 4 comments

Comments

@tgvaughan
Contributor

This is crucial for allowing the SubstBMA plugin (published in 2012) to be used and cited in BEAST2.

Comments from GC issue:

Project Member #3 higgs.ml

subst-bma plugin needs to be compilable

Oct 2, 2013
Project Member #4 alexei.drummond

Subst-bma compiles against BEAST2.0.2 -- if BEAST2.1 has changed the API such that subst-bma doesn't compile then I guess this issue depends on first changing it so it compiles again?

Oct 2, 2013
Project Member #5 higgs.ml

That is what I would expect.
If you foresee any development for v2.0.2, just branch off in SVN for that version only.

@rbouckaert
Member

rbouckaert commented Jun 21, 2016

This is now in the package at https://github.com/jessiewu/substBMA.

@tgvaughan
Contributor Author

This hasn't been completely solved: the template currently only supports inference of serially sampled data sets under the SDPM1 model, in which DP categories are shared by both clock rates and substitution models. The SDPM2 model, in which these two sets of parameters are governed by independent DPPs, is still unsupported, as is the special case of contemporaneous data, which requires tree height rescaling and the use of a special tree height dummy prior.

That said, we have a template that works with arbitrary viral datasets under arbitrary clock models and tree priors. We also have a (rough) tutorial in place at https://github.com/jessiewu/substBMA/wiki. What's missing is a project web page.

@alexeid
Member

alexeid commented Jun 21, 2016

Great!

P.S. the “special case” of contemporaneous data is actually the most common scenario in the wild ;)


@tgvaughan
Contributor Author

It's also a problem if you don't have a viral dataset: the model needs very well-tuned hyperpriors on the base distributions. The defaults were constructed by analysing ~20 different viral alignments using PhyML and fitting an MVN (multivariate normal) to the results. In principle you can set these to whatever you like in the template, but from what I understand (which isn't much) constructing these hyperpriors needs a lot of thought.
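
For anyone wanting to build hyperpriors for a non-viral dataset, the "fit an MVN" step could be sketched roughly as below. This is only an illustration, not the procedure actually used for the defaults: the function name `fit_mvn`, the assumption that estimates are log-transformed, and the placeholder numbers are all hypothetical, and no real PhyML output is involved.

```python
import numpy as np

def fit_mvn(estimates):
    """Fit a multivariate normal to per-alignment parameter estimates.

    estimates: array-like of shape (n_alignments, n_params), for example
    log-transformed substitution parameters (an assumption, see lead-in).
    Returns the sample mean vector and the sample covariance matrix.
    """
    x = np.asarray(estimates, dtype=float)
    if x.ndim != 2 or x.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two alignments (rows)")
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)  # unbiased (n - 1) sample covariance
    return mean, cov

# Placeholder numbers only: random draws standing in for ~20 sets of estimates.
rng = np.random.default_rng(0)
fake_log_estimates = rng.normal(size=(20, 3))
mean, cov = fit_mvn(fake_log_estimates)
```

The resulting `mean` and `cov` would then have to be entered by hand as the base-distribution hyperparameters in the template; whether a plain sample fit like this is adequate is exactly the part that needs thought.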

One of the reasons the contemporaneous case is unimplemented is that I still don't fully understand it. In this case there are actually three distinct temporal scaling procedures in play: the clock rates generated by the DPP, the branch-specific clock rate distribution, and a third one that scales everything so that the height of the tree is (roughly) 1. Exactly how all this interacts with coalescent priors etc is still confusing me.

I spent a while trying to outdo Jessie (silly me) and come up with a way of avoiding the need for this mess. From conversation with Jessie, I gather that setting the mean of the base distribution for the rates DP to 1 isn't enough to avoid identifiability problems. This makes sense to me, as it still leaves the true average rate across sites unspecified. I think fixing this average, or even just one of the rate categories, to 1 would solve the problem, but doing this seems to destroy the exchangeability of the DP...
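
The identifiability problem can be illustrated with a toy example. This is a sketch under simplifying assumptions, not SubstBMA code: it uses the JC69 model on a single branch, and the rates, branch length, site counts and helper names are all hypothetical. Because the probabilities depend only on rate × time, multiplying every rate by c and dividing times by c changes nothing, which is why contemporaneous data cannot pin down the overall scale.

```python
import math

def jc69_p_same(rate, time):
    """JC69 probability that a site shows the same base after `time` at `rate`."""
    return 0.25 + 0.75 * math.exp(-4.0 / 3.0 * rate * time)

def site_probs(rates, time):
    """Per-category probabilities for one branch of length `time`."""
    return [jc69_p_same(r, time) for r in rates]

def normalise(rates, counts):
    """Rescale rates so the site-weighted average rate is exactly 1."""
    avg = sum(r * n for r, n in zip(rates, counts)) / sum(counts)
    return [r / avg for r in rates]

rates = [0.5, 1.0, 2.5]   # hypothetical per-category rates
counts = [10, 5, 5]       # hypothetical number of sites in each category
height = 0.1              # hypothetical branch length
c = 3.0                   # arbitrary rescaling factor

original = site_probs(rates, height)
rescaled = site_probs([r * c for r in rates], height / c)  # same probabilities

fixed = normalise(rates, counts)  # pins the realised average rate to 1
```

`normalise` shows the "fix the average to 1" idea in its crudest form; it says nothing about whether the constrained rates can still be treated as exchangeable draws from the DP, which is the open question above.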
