
Proposal: NEW MPI splitting #38

Closed
vivianmiranda opened this issue Sep 6, 2019 · 6 comments

Comments

@vivianmiranda

Hi

This is just a proposal, and I can happily help you implement it if you think it is OK. We are planning to make the merge between CosmoLike and the Cobaya framework an official DES pipeline. This means we are going to use this new framework to test modes in CosmoLike that are slow, like the non-Limber calculation.

The way Cobaya splits MPI processes right now in Metropolis-Hastings is one MPI process per walker. However, the way Metropolis-Hastings works in your code is the following (I learned this from the CosmoMC notes): for each step you create an orthonormal basis with a random orientation and cycle proposal evaluations along the basis vectors. Therefore, you could split MPI processes up to number_walkers x number_dimensions, which can be >> number_walkers. This new splitting would require putting all proposed points in a pool and then distributing them over MPI. Of course, MPI processes = number_walkers x number_dimensions is a bit extreme, but the point is that the user could allocate more MPI processes than walkers to achieve faster convergence.

With this change, convergence can happen a lot faster when CosmoLike needs to evaluate things like a non-Limber approximation. This would also help a lot in cases where the user adds many parameters, as when Prof. Dvorkin and I worked with 20 PCAs in w(z).
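To make the proposal concrete, here is a minimal sketch (not Cobaya's actual API; the function name and fixed step size are hypothetical) of building the pool of speculative proposal points along a randomly oriented orthonormal basis, which could then be scattered over MPI ranks for likelihood evaluation:

```python
import numpy as np

def proposal_pool(x, scale=0.5, rng=None):
    """Build a pool of speculative proposal points along a randomly
    oriented orthonormal basis, as described in the proposal above.
    Each point could be sent to a different MPI rank for a speculative
    likelihood evaluation.  (Hypothetical sketch, not Cobaya code.)"""
    rng = np.random.default_rng() if rng is None else rng
    d = len(x)
    # Random orthonormal basis: QR-decompose a Gaussian matrix.
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    # One candidate per basis vector (in practice step sizes would be
    # drawn from the proposal distribution, not fixed).
    return [x + scale * q[:, i] for i in range(d)]

x0 = np.zeros(4)
pool = proposal_pool(x0)
# With n_walkers chains, the total pool has n_walkers * d points,
# which is what would be distributed over the MPI processes.
```

The pool size is what motivates the number_walkers x number_dimensions upper bound on useful MPI processes quoted above.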

@cmbant
Collaborator

cmbant commented Sep 6, 2019

I think this would significantly increase the total numerical cost, as you'd be evaluating locations that actually end up being rejected. The sequential ordering of vectors just determines the proposals, but the evaluation order is still Markovian.

The best way to test slow variants is usually to importance sample; that way you only need to evaluate the likelihood on the final accepted (and possibly thinned) samples. Using importance sampling can also avoid Monte Carlo noise when comparing different theory calculations on the same samples.
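The importance-sampling idea can be sketched as follows: run the chain with the fast likelihood, then reweight each accepted sample by the ratio of the slow to the fast likelihood. The toy Gaussian log-likelihoods below are stand-ins for the actual CosmoLike calculations, chosen only to make the example self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=5000)   # chain drawn from the fast model N(0, 1)

def loglike_fast(x):                        # model used to generate the chain
    return -0.5 * x**2

def loglike_slow(x):                        # slightly shifted "slow" model N(0.1, 1)
    return -0.5 * (x - 0.1)**2

# Reweight accepted samples by the likelihood ratio (self-normalized IS).
w = np.exp(loglike_slow(samples) - loglike_fast(samples))
w /= w.sum()
mean_new = np.sum(w * samples)              # reweighted posterior mean

# The effective sample size indicates whether IS is trustworthy here;
# a shift of more than a sigma or two would make it collapse.
ess = 1.0 / np.sum(w**2)
```

The small effective-sample-size collapse under large shifts is exactly the failure mode discussed later in this thread.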

@vivianmiranda
Author

Hi

Thank you.

"The sequential ordering of vectors just determines the proposals, but the evaluation order is still Markovian." -> I'm not sure I understood that. In any case, while it is true that IS can be quite useful, when testing shifts in scales you don't know whether a new effect (such as including non-Limber, RSD, or baryonic effects) will shift things by more than a sigma or two, and in that case IS can fail. We see that in DES when we include, for example, RSD and baryonic effects with specific scale cuts. A higher numerical cost but a smaller convergence time is a good deal for us. In any case, this is just a proposal based on a sampler I am testing that does this.

Best
Vivian

@cmbant
Collaborator

cmbant commented Sep 7, 2019

That’s true, though if IS does not work, that already tells you there is a problem (the result is not numerically stable).

By Markovian I meant that the next step depends on the previous point. So while you can propose N points at once, typically one of the first few will be accepted, at which point you have to throw away the calculations for the remaining points and generate a new set of proposals based at the new point. You can increase the step distance to decrease the acceptance rate so fewer are wasted, but overall efficiency will go down. Usually one is running a bunch of runs in parallel anyway, so overall it’s more efficient to run each run on a modest number of CPUs and the different runs in parallel. Things like non-Limber should parallelize well with OpenMP (e.g. calculated using CAMB) on each MPI process.
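The waste described above can be quantified with a toy calculation: if each proposal is accepted with probability a, the number of speculative evaluations actually consumed before the first acceptance follows a truncated geometric distribution, and everything past that point is discarded. The numbers below are illustrative, not measurements:

```python
import numpy as np

def expected_used(batch, accept_rate):
    """Expected number of proposals consumed from a pre-evaluated
    batch, up to and including the first acceptance (geometric model)."""
    k = np.arange(1, batch + 1)
    p = (1 - accept_rate) ** (k - 1) * accept_rate   # geometric pmf
    p_none = (1 - accept_rate) ** batch              # whole batch rejected
    return np.sum(k * p) + batch * p_none

batch, a = 16, 0.25                  # 16 speculative points, 25% acceptance
used = expected_used(batch, a)       # roughly 1/a of the batch gets used
wasted_fraction = 1 - used / batch   # fraction of parallel work discarded
```

With a typical ~25% acceptance rate, around three quarters of a 16-point speculative batch is thrown away, which is the efficiency loss being weighed against faster wall-clock convergence.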

@JesusTorrado
Contributor

JesusTorrado commented Sep 7, 2019

Hi both,

I am afraid I agree with Antony that what you are proposing is likely not to increase performance significantly, and that you would have some trouble making it behave in a Markovian way; I also agree with both of you that IS can miss something new.

I remember we discussed internal caching of different parts of your calculation at some point, which could very significantly improve performance now that Cobaya has manual parameter blocking. Did you get to implement that sort of caching where possible?

Another tip would be computing the initial covariance matrix from the Limber case (even if it's a little different, it will be faster than guessing a full one from the slower non-Limber likelihood). (In terms of increasing the acceptance rate, maybe emcee could also be a good approach? We could try a quick implementation.)

Or you could even use PolyChord with a small number of live points and a small n_repeats; that would not provide a good approximation of the evidence, but the Monte Carlo sample would be as good as MCMC's, and it would take better advantage of MPI parallelisation for slow likelihoods (since, at least to me, it looks like you should worry more about that at this point than about the dimensionality, but maybe I am wrong).

When is this particular project due? I have something in the pipeline that may be useful for it (using machine learning), but it's a more long-term approach. We can discuss it privately, if you want.

@vivianmiranda
Author

vivianmiranda commented Sep 7, 2019 via email

@JesusTorrado
Contributor

Hi @vivianmiranda

Closing this for book-keeping reasons (there is nothing clearly defined to be done), but feel free to write to us privately to discuss how to approach the problem.
