Run make.simmap in parallel #22

Closed
flashton2003 opened this issue Nov 28, 2017 · 5 comments

Comments

@flashton2003

It takes about 40 minutes to run one iteration of make.simmap, is it possible to run make.simmap in parallel and combine the outputs somehow? Thanks for making nice software with good documentation!

@ddiazescandon

I was wondering the same thing. I just ran cophylo with 2000 terminals in each tree and it took quite a long time; even with 1000 it takes about 5 to 10 minutes. That isn't much, but it adds up after a few runs. I was wondering if running it in parallel would work, but I don't have much expertise.

@liamrevell
Owner

liamrevell commented Nov 28, 2017

flashton, have you updated phytools from GitHub? Evidently, when I replaced the matrix exponentiation function that is used internally, make.simmap got really slow for some models. I believe I have it fixed now. I posted about this on my blog: http://blog.phytools.org/2017/11/small-but-important-update-to-fitmk-and.html.

With regard to parallelization, the forking used by mclapply is not supported in R for Windows; on other systems, however, I think something like this would work:

## load packages
library(phytools)
library(parallel)
## fit the model once so that Q is not re-estimated in every make.simmap call
fit<-fitMk(tree,x,model="ARD")
## rebuild the fitted Q matrix from the estimated rates and the index matrix
fittedQ<-matrix(NA,length(fit$states),length(fit$states))
fittedQ[]<-c(0,fit$rates)[fit$index.matrix+1]
diag(fittedQ)<-0
diag(fittedQ)<- -rowSums(fittedQ)
colnames(fittedQ)<-rownames(fittedQ)<-fit$states
## run 10 jobs of 10 simulations each (mclapply uses 2 cores unless mc.cores is set)
trees<-mclapply(1:10,function(n,tree,x,fixedQ) make.simmap(tree,x,Q=fixedQ,nsim=10),
	tree=tree,x=x,fixedQ=fittedQ)
## combine the results into a single object of 100 stochastic maps
trees<-do.call(c,trees)
if(!inherits(trees,"multiSimmap")) class(trees)<-c("multiSimmap",class(trees))
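
For anyone puzzled by the lines that rebuild Q, here is a minimal base R sketch of the same indexing trick on a toy example. The names states, rates and index.matrix are hypothetical stand-ins for fit$states, fit$rates and fit$index.matrix, the numbers are made up, and the use of 0 to mark a disallowed transition is an assumption of this sketch rather than something taken from a real fit.

```r
## hypothetical stand-ins for fit$states, fit$rates and fit$index.matrix
states <- c("a", "b", "c")
rates <- c(0.1, 0.2, 0.3, 0.4, 0.5)
## entry k means "use rates[k]"; NA marks the diagonal;
## 0 marks a disallowed transition (an assumption of this toy example)
index.matrix <- matrix(c(NA, 1, 2,
                          3, NA, 4,
                          0, 5, NA), 3, 3, byrow = TRUE)

## prepending 0 and adding 1 to the index maps index 0 to a rate of 0
## and index k to rates[k]; NA indices stay NA
Q <- matrix(NA, length(states), length(states))
Q[] <- c(0, rates)[index.matrix + 1]

## zero the diagonal first so that rowSums() adds only off-diagonal rates,
## then set each diagonal element to minus its row sum
diag(Q) <- 0
diag(Q) <- -rowSums(Q)
dimnames(Q) <- list(states, states)

print(Q)
## every row of a valid Q matrix sums to zero
stopifnot(isTRUE(all.equal(unname(rowSums(Q)), c(0, 0, 0))))
```

The final check only confirms that each row sums to zero, which is the property the two diag() lines are there to enforce.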

With regard to cophylo, that's a different story. I don't see how it could be easily parallelized, because what takes all the computation there is the iterative optimization of the left tree, then the right tree, then the left tree, and so on.

@liamrevell
Owner

It turns out this can also be done in R for Windows using a package called snow.

Here are two examples:

  1. Without snow on a Unix-like system: http://blog.phytools.org/2017/11/running-makesimmap-in-parallel.html.

  2. With snow in R for Windows: http://blog.phytools.org/2017/11/running-makesimmap-in-parallel-in-r-for.html.
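
As a rough idea of what the non-forking approach can look like, here is a minimal sketch that uses the socket-cluster functions in the parallel package (makeCluster, clusterEvalQ, parLapply), which are derived from snow. It is not the code from either blog post. The tree, the trait vector and the Q matrix are simulated stand-ins, and the argument names tr, dat and Q are hypothetical; in practice Q would be the fittedQ computed from fitMk above.

```r
library(phytools)
library(parallel)

## simulated stand-ins for the user's tree and discrete character
set.seed(1)
tree <- pbtree(n = 40)
Q <- matrix(c(-1, 1, 1, -1), 2, 2,
            dimnames = list(c("a", "b"), c("a", "b")))
x <- sim.Mk(tree, Q)
x <- setNames(as.character(x), names(x))

## socket cluster: no forking involved, so this also runs in R for Windows
cl <- makeCluster(2)
## each worker is a fresh R session and has to load phytools itself
clusterEvalQ(cl, library(phytools))

## the data argument is called dat rather than x so that it cannot clash
## with argument names used inside parLapply
trees <- parLapply(cl, 1:2,
                   function(i, tr, dat, Q) make.simmap(tr, dat, Q = Q, nsim = 5),
                   tr = tree, dat = x, Q = Q)
stopCluster(cl)

## combine the per-worker results into one multiSimmap object
trees <- do.call(c, trees)
if (!inherits(trees, "multiSimmap"))
    class(trees) <- c("multiSimmap", class(trees))
length(trees)
```

With two workers and nsim = 5 each, the combined object should hold 10 stochastic maps; scaling up is a matter of changing the number of workers, the number of jobs and nsim.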

@flashton2003
Author

Thanks Liam. Updating to the latest version from GitHub sped up the SYM model, but not the ARD model. I'm not sure whether this is expected behaviour; I got a bit lost in your blog post!

Will try out the parallel options once I figure out which model to use. Thanks again!

@flashton2003
Author

The parallel solution works like a dream. Thanks Liam!
