Low performance compared to dynesty? (I'm probably doing something wrong) #149

asmasca commented Feb 24, 2022

I recently found out about nessai and I'm trying to port some of my radial-velocity time-series analyses (for exoplanets) to it, to compare its performance against dynesty, as I have a few problems that would hugely benefit from a 2x reduction in computing time.

To give some context, I'm working with the data from this article (https://ui.adsabs.harvard.edu/abs/2022A%26A...658A.115F/abstract). So basically two time series fitted simultaneously, with Gaussian processes and a custom likelihood function (written below).

On my first try it runs much slower than dynesty (parallel). My test case takes around 3x longer (on a 4-core/8-thread laptop). I am also getting a few warnings that I have trouble linking to configuration parameters. The final distributions (and logZ) are basically the same.

I have trouble figuring out whether I am doing something wrong with nessai, or whether this is just a problem that dynesty is particularly good at. I'm also not sure the parallelization is scaling correctly. The CPU load seems quite low (compared to dynesty).

I'm using the sampler as:

from nessai.flowsampler import FlowSampler

cpu = 8

fs = FlowSampler(
    TestModel(),
    output=output,
    resume=False,
    seed=1235,
    nlive=live,
    max_threads=cpu,
    n_pool=cpu - 1,
    max_poolsize_scale=100,
)

The model is written at the end of the message.

I'm getting this warning nonstop:
02-24 15:16 nessai.proposal.flowproposal WARNING : Rejection sampling accepted less than 1 percent of samples!

Could the default configuration of the flow proposal have issues with this problem? Or maybe it's just not the right kind of problem for it?

Here's the model, in case there's something that shouldn't be written the way it is. The priors are a mix of Gaussian and uniform, with the distributions created with scipy.stats.

import numpy as np
import celerite2
from celerite2 import terms as terms2  # terms2 = celerite2.terms, as used below
from nessai.model import Model

global bjd_rv, rv, erv, bjd_fwhm, fwhm, efwhm  # This is the data
global E18R, E19R, E21R  # Some chunks with independent properties
global E18F, E19F, E21F  # Same for the secondary time series

class TestModel(Model):
    def __init__(self):
        self.names = labels
        self.bounds = dict_labels

    def log_prior(self, x):

        log_p = np.log(self.in_bounds(x))

        for n in self.names:
            log_p += prior_dict[n].logpdf(x[n])
        return log_p

    def log_likelihood(self, x):

        A1   = np.exp(x["LN_A1_RV"])
        A1_2 = np.exp(x["LN_A2_RV"])
        A2   = np.exp(x["LN_A1_FW"])
        A2_2 = np.exp(x["LN_A2_FW"])
        PROT = x["PROT"]
        TS1  = np.exp(x["LN_TS1"])
        TS2  = np.exp(x["LN_TS2"])

        zero_rv = rv.copy()
        zero_rv[E18R] = x['V0_E18_RV']
        zero_rv[E19R] = x['V0_E19_RV']
        zero_rv[E21R] = x['V0_E21_RV']

        res_rv = rv - zero_rv

        erv1 = erv.copy()
        erv1[E18R] = np.sqrt(erv[E18R]*erv[E18R] + np.exp(2*x['LJ_E18_RV']))
        erv1[E19R] = np.sqrt(erv[E19R]*erv[E19R] + np.exp(2*x['LJ_E19_RV']))
        erv1[E21R] = np.sqrt(erv[E21R]*erv[E21R] + np.exp(2*x['LJ_E21_RV']))



        K1 = terms2.SHOTerm(sigma=A1, rho=PROT, tau=TS1) + \
                terms2.SHOTerm(sigma=A1_2,rho=0.5 * PROT, tau=TS2)
        GP1 = celerite2.GaussianProcess(K1)
        GP1.compute(bjd_rv, yerr=erv1)
        lnlike_rv = GP1.log_likelihood(res_rv)



        zero_fw = fwhm.copy()
        zero_fw[E18F] = x['V0_E18_FW']
        zero_fw[E19F] = x['V0_E19_FW']
        zero_fw[E21F] = x['V0_E21_FW']

        res_fw = fwhm - zero_fw

        efwhm1 = efwhm.copy()
        efwhm1[E18F] = np.sqrt(efwhm[E18F]*efwhm[E18F] + np.exp(2*x['LJ_E18_FW']))
        efwhm1[E19F] = np.sqrt(efwhm[E19F]*efwhm[E19F] + np.exp(2*x['LJ_E19_FW']))
        efwhm1[E21F] = np.sqrt(efwhm[E21F]*efwhm[E21F] + np.exp(2*x['LJ_E21_FW']))

        K2 = terms2.SHOTerm(sigma=A2, rho=PROT, tau=TS1) + \
                terms2.SHOTerm(sigma=A2_2, rho=0.5 * PROT, tau=TS2)
        GP2 = celerite2.GaussianProcess(K2)
        GP2.compute(bjd_fwhm, yerr=efwhm1)
        lnlike_fw = GP2.log_likelihood(res_fw)

        return lnlike_rv + lnlike_fw
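
For reference, the priors in prior_dict are built with scipy.stats roughly like this (a sketch; the parameter names and numbers here are illustrative, not the actual values I use):

from scipy import stats

# Illustrative priors only: one Gaussian and one uniform, built with scipy.stats
prior_dict = {
    "PROT": stats.norm(loc=35.0, scale=5.0),          # Gaussian prior
    "LN_A1_RV": stats.uniform(loc=-5.0, scale=10.0),  # uniform prior on [-5, 5]
}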
mj-will commented Feb 25, 2022

Hi @asmasca,

Firstly, thanks for trying nessai out!

As for the slow performance you're seeing, I think there are a few possible causes. The warning you're getting is an indication that the process of drawing new samples (before the likelihood is computed) is inefficient. This isn't necessarily a problem (it becomes more common as the dimensionality of the problem increases) and it won't bias the results, but it can be a sign that, as you suggested, the flow proposal isn't configured correctly, and it will slow things down.

Just so I have a better handle on things, which version of nessai are you running? The latest release (v0.4.0) changed some of the default settings and included a new default mode that is more stable.

If you have the latest release, I think the possible issues/causes of the slow performance could be:

  • The number of live points - because of the normalising flows, nessai doesn't perform very well with low numbers of live points. What value of nlive are you using in this analysis?
  • max_poolsize_scale - I'd generally only recommend changing this setting if you're seeing issues with nessai over-constraining the posterior distributions, since it will tend to slow things down. I'd recommend leaving it at the default, which should be faster (see the sketch after this list).
  • The model - I can't see any obvious issues but see my comment below.
  • The default configuration of flow proposal - this is rather problem-dependent, I've explained a bit more about this below.
  • nessai is ill-suited to this particular problem - this is possible; we know that there are certain problems where the normalising flows in nessai don't work very well. The most common case is highly multi-modal likelihoods like the eggbox: nessai can sample this distribution but it will be very inefficient, whereas dynesty handles it pretty easily (https://dynesty.readthedocs.io/en/latest/examples.html#eggbox).
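
For reference, here is a sketch of the call from your first message with max_poolsize_scale dropped so the default is used; the nlive value is only illustrative, not a recommendation for your specific problem:

fs = FlowSampler(
    TestModel(),
    output=output,
    resume=False,
    seed=1235,
    nlive=2000,       # illustrative value; see the question about nlive above
    max_threads=cpu,
    n_pool=cpu - 1,
)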

Model

I can't see any obvious issues with how you've configured the model, though one possible issue could be the Gaussian priors. Nessai can handle Gaussian priors but it requires a little more configuration. Are the Gaussian priors you mentioned defined on (-inf, inf) or are they truncated on some bounds? If it's the former then I would expect you to get an error when the sampler starts. I can give you an example of the additions needed to the model.
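
In the meantime, here's a rough sketch of the truncated case using scipy.stats.truncnorm (the parameter name and numbers are made up); this works with the model as you've written it because the bounds stay finite:

from scipy import stats

# Hypothetical Gaussian prior on PROT truncated to finite bounds; the same
# bounds go into the dictionary used for self.bounds so in_bounds() still works.
mu, sigma = 35.0, 5.0
lower, upper = 20.0, 50.0
prior_dict["PROT"] = stats.truncnorm(
    (lower - mu) / sigma, (upper - mu) / sigma, loc=mu, scale=sigma
)
dict_labels["PROT"] = [lower, upper]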

Diagnosis plots

It's probably worth having a look at the diagnostic plots nessai produces. In particular, state.png can be very insightful: it includes the overall acceptance and the proposal acceptance, and shows when the proposal is being trained. Would you be able to post state.png for me to have a look at? It should allow me to give you some more definite answers about why nessai is being slow.

trace.png might also be helpful, since it would show if the space was highly multi-modal.

Configuration of flow proposal

The configuration of the flow can make or break things, and most of this relates to the complexity of the normalising flow: the more complex the space, the more complex the flow needs to be (and the default values are not designed for especially complex problems).

The flow is configured using flow_config. Without knowing more about the problem, I'd recommend something like this:

flow_config = dict(
    model_config=dict(
        n_neurons=32,
        n_blocks=8,
    )
)

And then just pass that to FlowSampler, as shown in this example.
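
In other words, something along these lines (a sketch based on the call in your first message):

fs = FlowSampler(
    TestModel(),
    output=output,
    nlive=live,
    n_pool=cpu - 1,
    flow_config=flow_config,
)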

There are other settings that can help speed things up, such as telling the proposal about particular types of parameters like angles, but I'm hesitant to recommend anything without checking the things I've already mentioned.

Parallelisation

"I'm also not sure the parallelization is scaling correctly. The CPU load seems quite low (compared to dynesty)."

So the parallelisation in nessai is very different to that in dynesty, since drawing new samples is inherently parallelised. In the current version, the multiprocessing pool is only used for evaluating the likelihood, and this happens in large batches when new points are being drawn. This means that while other parts of the sampling are happening, such as training the normalising flow, the pool isn't being used, so the usage will vary depending on what stage of sampling the algorithm is at. I suspect this is what you're seeing. This could no doubt be improved to make more constant use of the pool, but my current focus has been on other, more fundamental bottlenecks in the algorithm.

I realise this is a lot of information, but I hope it's helpful.

Thanks,
Michael
