Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FreeRate model instead of gamma #79

Closed
pwkooij opened this issue Jun 8, 2020 · 6 comments
Closed

FreeRate model instead of gamma #79

pwkooij opened this issue Jun 8, 2020 · 6 comments

Comments

@pwkooij
Copy link

pwkooij commented Jun 8, 2020

Is it possible to use/implement the FreeRate model (Soubrier et al., 2012) for heterogeneity across sites?

If it is already possible, how do you script it?

@bredelings
Copy link
Contributor

bredelings commented Jun 8, 2020

Something like this code fragment should work:

Q := fnHKY(kappa,pi)
rates ~ dnDirichlet([2.0,2.0,2.0,2.0])
freqs ~ dnDirichlet([2.0,2.0,2.0,2.0])
total = 0
for(i in 1:4){
   total += freqs[i]*rates[i]
}

seq ~ dnPhyloCTMC(tree=tau, Q=Q, siteRates=rates, siteRatesProbs=freqs, branchRates=1.0/total,type="DNA")

@pwkooij
Copy link
Author

pwkooij commented Jun 8, 2020

Thanks! I will give that a try. Any chance this can be build in the RevScripter?

@bredelings
Copy link
Contributor

Maybe make an issue on https://github.com/revbayes/revscripter/ ?

@Chjulian
Copy link

Thanks for this @bredelings, I have a follow up Question.

Don't we need to set the mixing specs for the parameters of the FreeRate model? (Otherwise the mcmc will use one single value for every rate/freq).

I have tried the specs below but Im not sure is these are the right specs as all four rates and frequencies converge to the same values regardless of the dataset.

moves[mvi++] = mvBetaSimplex(rates, weight=2.0)
moves[mvi++] = mvDirichletSimplex(rates, weight=1.0)
moves[mvi++] = mvBetaSimplex(freqs, weight=2.0)
moves[mvi++] = mvDirichletSimplex(freqs, weight=1.0)

@bredelings
Copy link
Contributor

bredelings commented Feb 16, 2021

Yes, you do need to add MCMC moves for the rates and freqs. It sounds like the moves that you chose are fine. I think the issue you are seeing is probably a different one: the categories are exchangeable -- the prior does not ensure that smaller rate comes first, etc. This makes it harder to do logging. Here is an example:

Suppose that your data set supports the rates [0.1, 0.3, 0.9, 3.0] and (for the sake of simplicity) the frequencies [0.25, 0.25, 0.25, 0.25]. In this case the likelihood will be the same if you interchange two of the rates: [3.0, 0.3, 0.9, 0.1]. Therefore, if your mixing is good, the first rate position will eventually average over all of the 4 rate values. And your second rate position will average over all of the 4 rate values. So, the posterior distribution of the first, second, third, and fourth rate positions will be the same. But if you look at an individual sample, you will see 4 different rate values. They will not be in increasing order though.

In this case, the model is working fine. The problem is actually with how you log and summarize the data. What I have done in this case is to sort the rates before logging them, so that the smallest rate always occurs first. Then it is meaningful to look at the posterior distribution of the first rate. So, something conceptually like this:

order = getOrder(rates)
sorted_rates = v(rates[order[1]], rates[order[2]], rates[order[3]], rates[order[4]])
sorted_freqs = v(freqs[order[1]], freqs[order[2]]. freqs[order[3]], freqs[order[4]])

then you can log the sorted rates. However, you might have to add the getOrder function to RevBayes.

Hmm..... actually, you might want to take a simpler approach: read the log file in python, and use python to reorder the rates and frequencies, then write a new log file with sorted rates. Just be careful that freqs[1] stays connected to rates[1] -- you have to sort the frequencies in the order of the rates.

Does that make sense?

@Chjulian
Copy link

Yes, it does makes sense. After inspecting the log, I suspected exchangeability could be the issue but I was not sure if this was an improper moves set-up (to be ordered by default, etc). I will try your suggestions to log the result. Many thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants