
Proposal: NEW MPI splitting #38

Closed
vivianmiranda opened this issue Sep 6, 2019 · 6 comments

Comments

@vivianmiranda

Hi

This is just a proposal, and I can happily help you implement it if you think it is OK. We are planning to make the merge between CosmoLike and the Cobaya framework an official DES pipeline. This means we are going to use this new framework to test modes in CosmoLike that are slow, like the non-Limber calculation.

The way Cobaya splits MPI processes right now in Metropolis-Hastings is one MPI process per walker. However, the way Metropolis-Hastings works in your code is the following (I learned this from the CosmoMC notes): for each step you create an orthonormal basis with a random orientation and cycle proposal evaluations along the basis vectors. Therefore, you could split MPI processes up to number_walkers x number_dimensions, which can be >> number_walkers. This new splitting would require putting all proposed points in a pool and then distributing them over MPI. Of course, MPI processes = number_walkers x number_dimensions is a bit extreme, but the point is that the user could allocate more MPI processes than walkers to achieve faster convergence.

With this change, convergence can happen a lot faster when CosmoLike needs to evaluate things like a non-Limber approximation. This would also help a lot in cases where the user adds many parameters, as when Prof. Dvorkin and I worked with 20 PCAs in w(z).
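To make the proposal concrete, here is a minimal sketch (not Cobaya's actual API; the function name and fixed step size are hypothetical) of building the pool of speculative proposal points along a randomly oriented orthonormal basis, which could then be scattered over MPI ranks for likelihood evaluation:

```python
import numpy as np

def proposal_pool(x, scale=0.5, rng=None):
    """Build a pool of speculative proposal points along a randomly
    oriented orthonormal basis, as described in the proposal above.
    Each point could be sent to a different MPI rank for a speculative
    likelihood evaluation.  (Hypothetical sketch, not Cobaya code.)"""
    rng = np.random.default_rng() if rng is None else rng
    d = len(x)
    # Random orthonormal basis: QR-decompose a Gaussian matrix.
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    # One candidate per basis vector (in practice step sizes would be
    # drawn from the proposal distribution, not fixed).
    return [x + scale * q[:, i] for i in range(d)]

x0 = np.zeros(4)
pool = proposal_pool(x0)
# With n_walkers chains, the total pool has n_walkers * d points,
# which is what would be distributed over the MPI processes.
```

The pool size is what motivates the number_walkers x number_dimensions upper bound on useful MPI processes quoted above.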

@cmbant
Collaborator

cmbant commented Sep 6, 2019

I think this would significantly increase the total numerical cost, as you'd be evaluating locations that actually end up being rejected. The sequential ordering of vectors just determines the proposals, but the evaluation order is still Markovian.

The best way to test slow variants is usually to importance sample; that way you only need to evaluate the likelihood on the final accepted (and possibly thinned) samples. Using importance sampling can also avoid Monte Carlo noise when comparing different theory calculations on the same samples.
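The importance-sampling idea can be sketched as follows: run the chain with the fast likelihood, then reweight each accepted sample by the ratio of the slow to the fast likelihood. The toy Gaussian log-likelihoods below are stand-ins for the actual CosmoLike calculations, chosen only to make the example self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=5000)   # chain drawn from the fast model N(0, 1)

def loglike_fast(x):                        # model used to generate the chain
    return -0.5 * x**2

def loglike_slow(x):                        # slightly shifted "slow" model N(0.1, 1)
    return -0.5 * (x - 0.1)**2

# Reweight accepted samples by the likelihood ratio (self-normalized IS).
w = np.exp(loglike_slow(samples) - loglike_fast(samples))
w /= w.sum()
mean_new = np.sum(w * samples)              # reweighted posterior mean

# The effective sample size indicates whether IS is trustworthy here;
# a shift of more than a sigma or two would make it collapse.
ess = 1.0 / np.sum(w**2)
```

The small effective-sample-size collapse under large shifts is exactly the failure mode discussed later in this thread.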

@vivianmiranda
Author

Hi

Thank you.

"The sequential ordering of vectors just determines the proposals, but the evaluation order is still Markovian." -> I'm not sure I understood that. In any case, while it is true that IS can be quite useful, when testing shifts in scales you don't know whether a new effect (such as including non-Limber, RSD, or baryonic effects) will shift things by more than a sigma or two, and in that case IS can fail. We see that in DES when we include, for example, RSD and baryonic effects with specific scale cuts. A higher numerical cost but a smaller convergence time is a good deal for us. In any case, this is just a proposal based on a sampler I am testing that does this.

Best
Vivian

@cmbant
Collaborator

cmbant commented Sep 7, 2019

That’s true, though if IS does not work, that already tells you there is a problem (the result is not numerically stable).

By Markovian I meant that the next step depends on the previous point. So while you can propose N points at once, typically one of the first few will be accepted, at which point you have to throw away the calculations for the remaining points and generate a new set of proposals based at the new point. You can increase the step distance to decrease the acceptance rate so fewer are wasted, but overall efficiency will go down. Usually one is running a bunch of runs in parallel anyway, so overall it’s more efficient to run each run on a modest number of CPUs and the different runs in parallel. Things like non-Limber should parallelize well with OpenMP (e.g. calculated using CAMB) on each MPI process.
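The waste described above can be quantified with a toy calculation: if each proposal is accepted with probability a, the number of speculative evaluations actually consumed before the first acceptance follows a truncated geometric distribution, and everything past that point is discarded. The numbers below are illustrative, not measurements:

```python
import numpy as np

def expected_used(batch, accept_rate):
    """Expected number of proposals consumed from a pre-evaluated
    batch, up to and including the first acceptance (geometric model)."""
    k = np.arange(1, batch + 1)
    p = (1 - accept_rate) ** (k - 1) * accept_rate   # geometric pmf
    p_none = (1 - accept_rate) ** batch              # whole batch rejected
    return np.sum(k * p) + batch * p_none

batch, a = 16, 0.25                  # 16 speculative points, 25% acceptance
used = expected_used(batch, a)       # roughly 1/a of the batch gets used
wasted_fraction = 1 - used / batch   # fraction of parallel work discarded
```

With a typical ~25% acceptance rate, around three quarters of a 16-point speculative batch is thrown away, which is the efficiency loss being weighed against faster wall-clock convergence.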

@JesusTorrado
Contributor

JesusTorrado commented Sep 7, 2019

Hi both,

I am afraid I agree with Antony that what you are proposing is likely not to increase performance significantly, and that you would have some trouble making it behave in a Markovian way; I also agree with both of you that IS can miss something new.

I remember we discussed internal caching of different parts of your calculation at some point, which could very significantly improve performance now that Cobaya has manual parameter blocking. Did you get to implement that sort of caching where possible?

Another tip would be computing the initial covariance matrix from the Limber case (even if it's a little different, it will be faster than guessing a full one from the slower non-Limber likelihood). (In terms of increasing the acceptance rate, maybe emcee could also be a good approach? We could try a quick implementation.)

Or you could even use PolyChord with a small number of live points and a small n_repeats; that would not provide a good approximation of the evidence, but the Monte Carlo sample would be as good as MCMC's, and it would take better advantage of MPI parallelisation for slow likelihoods (since, at least to me, it looks like you should worry more about that at this point than about the dimensionality, but maybe I am wrong).

When is this particular project due? I have something in the pipeline that may be useful for it (using machine learning), but it's a more long-term approach. We can discuss it privately, if you want.

@vivianmiranda
Author

vivianmiranda commented Sep 7, 2019 via email

@JesusTorrado
Contributor

Hi @vivianmiranda

Closing this for book-keeping reasons (there is nothing clearly defined to be done), but feel free to write to us privately to discuss how to approach the problem.
