BOLFI + NUTS #135

Merged: 23 commits, May 13, 2017

Conversation

@vuolleko (Member) commented Apr 20, 2017

  • Implementation of MCMC sampling using the No-U-Turn Sampler + convergence diagnostics
  • Implementation of BOLFI on top of BayesianOptimization
  • Result object for BOLFI, including some integrated plotting
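
For orientation, a minimal hedged sketch of how these pieces might be used together. The `BOLFI` constructor and `get_posterior` signatures come from this diff; the fitting and sampling call names (`infer`, `sample`) and the model setup are assumptions for illustration only:

```python
import elfi

# 'model' is assumed to be an ElfiModel with a discrepancy node 'd'.
bolfi = elfi.BOLFI(model, batch_size=1, discrepancy='d',
                   bounds=[(-2., 2.)])          # constructor signature from this diff

bolfi.infer(n_acq=200)                          # fitting call name is assumed
posterior = bolfi.get_posterior(threshold=0.1)  # signature from this diff
samples = bolfi.sample(1000)                    # NUTS-based sampling; name assumed
```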

self.threshold = minval
logger.info("Using minimum value of discrepancy estimate mean (%.4f) as threshold" % (self.threshold))
- self.priors = [None] * model.input_dim
+ self.priors = priors or [None] * model.input_dim
self.ML, self.ML_val = stochastic_optimization(self._neg_unnormalized_loglikelihood_density, self.model.bounds, max_opt_iters)
self.MAP, self.MAP_val = stochastic_optimization(self._neg_unnormalized_logposterior_density, self.model.bounds, max_opt_iters)
Contributor

The above two will require some execution time and I'm not sure these are always needed. I think it would be cleaner to have this class only define the posterior distribution. Users can then optimize the distribution if they like.
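
A hedged sketch of that separation, reusing the optimizer call and density names visible in this diff (the `Posterior` constructor arguments and the iteration count are assumptions):

```python
# The class would only define the (unnormalized) posterior...
post = Posterior(model, threshold=0.05)

# ...and the user runs the potentially expensive optimizations on demand,
# with the same optimizer and density functions as in this diff.
ML, ML_val = stochastic_optimization(
    post._neg_unnormalized_loglikelihood_density, post.model.bounds, 1000)
MAP, MAP_val = stochastic_optimization(
    post._neg_unnormalized_logposterior_density, post.model.bounds, 1000)
```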

@@ -635,40 +635,59 @@ class BOLFI(InferenceMethod):

"""

def __init__(self, model, batch_size=1, discrepancy=None, bounds=None, **kwargs):
"""
def get_posterior(self, threshold=None):
Contributor

Perhaps this could be renamed to something that better reflects the fact that we do quite a bit of work here to infer the posterior, e.g. infer_posterior?

n_accepted += 1

logger.info("{}: Total acceptance ratio: {:.3f}".format(__name__, float(n_accepted) / n_samples))
return samples[1:, :]
Contributor

Nice work :)

@@ -10,7 +10,7 @@
logger = logging.getLogger(__name__)

class Posterior():
@jlintusaari (Contributor) commented Apr 20, 2017

Do we necessarily need a class for the Posterior distribution yet?

Member Author

True, this may just be confusing here.

"""
grad = self._grad_unnormalized_loglikelihood_density(x) + self._grad_logprior_density(x)
return grad[0]

def __getitem__(self, idx):
return tuple([[v]*len(idx) for v in self.MAP])

def _unnormalized_loglikelihood_density(self, x):
Contributor

I think _unnormalized_loglikelihood would be more proper since likelihood is a function.

x : np.array
norm : boolean
Whether to normalize (unsupported).

@jlintusaari (Contributor) commented Apr 20, 2017

Maybe just remove these normalization flags? I don't know if there is any general (enough) way to normalize these.

pdf = sp.stats.norm.pdf(term)
cdf = sp.stats.norm.cdf(term)

return factor * pdf / cdf

def _unnormalized_likelihood_density(self, x):
Contributor

Same here about density.

- return sp.stats.norm.logcdf(self.threshold, mean, std)
+ return sp.stats.norm.logcdf(self.threshold, mean, np.sqrt(var))

def _grad_unnormalized_loglikelihood_density(self, x):
Contributor

Same here about density.
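
For context on the `np.sqrt(var)` fix above: `scipy.stats.norm` is parameterized by the standard deviation, not the variance, so the GP's predictive variance must be square-rooted before it is passed as the scale. The quantity being computed is, roughly in the notation of Gutmann & Corander (2016),

$$\log \tilde{L}(\theta) = \log \Phi\!\left(\frac{\epsilon - \mu(\theta)}{\sqrt{v(\theta)}}\right),$$

where $\mu(\theta)$ and $v(\theta)$ are the GP posterior mean and variance of the discrepancy at $\theta$, and $\epsilon$ is the threshold.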

@@ -618,14 +618,14 @@ def _report_batch(self, batch_index, params, distances):
logger.debug(str)


- class BOLFI(InferenceMethod):
+ class BOLFI(BayesianOptimization):
Contributor

There is a test for BOLFI in test_inference.py. That should be updated now as this is implemented (it's currently skipped).

# some heuristics to choose kernel parameters based on the initial data
length_scale = (np.max(x) - np.min(x)) / 3.
kernel_var = (np.max(y) / 3.)**2.
bias_var = kernel_var / 4.
@jlintusaari (Contributor) commented Apr 20, 2017

I think a certain amount of data needs to be available before these make sense. E.g. if we only have one observation, these may not work very well.

Member Author

True, but what would be better?

@jlintusaari (Contributor) commented Apr 28, 2017

I think just using ones. Of course one cannot say whether they would be better, but e.g. length_scale here would become 0 if there were just one observation.

Member Author

Ok, so now there's a check for a really small length_scale, and a general check for a reasonably large initialization set.

Contributor

Very good!
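
A sketch of the guards agreed on above; the exact cutoff values are assumptions:

```python
import numpy as np

MIN_INIT_POINTS = 10      # assumed minimum size of the initialization set
MIN_LENGTH_SCALE = 1e-6   # assumed floor for the length scale

if len(x) < MIN_INIT_POINTS:
    raise ValueError("Too few initialization points for the kernel heuristics")

# Heuristics based on the initial data, with a floor so that a degenerate
# data set (e.g. a single observation) cannot collapse the length scale to 0.
length_scale = max((np.max(x) - np.min(x)) / 3., MIN_LENGTH_SCALE)
kernel_var = (np.max(y) / 3.)**2.
bias_var = kernel_var / 4.
```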

"""Bayesian Optimization for Likelihood-Free Inference (BOLFI).

Approximates the discrepancy function by a stochastic regression model.
Discrepancy model is fit by sampling the discrepancy function at points decided by
the acquisition function.

- The implementation follows that of Gutmann & Corander, 2016.
+ The implementation (mostly) follows that of Gutmann & Corander, 2016.
Contributor

Maybe list differences?

def __init__(self, model, batch_size=1, discrepancy=None, bounds=None, **kwargs):
"""
def get_posterior(self, threshold=None):
"""Returns the posterior.
@jlintusaari (Contributor) commented Apr 20, 2017

...Returns an approximation for the posterior based on a GP regression.

@jlintusaari (Contributor) left a comment

Great work! Some minor comments.

@jlintusaari (Contributor) left a comment

A lot of great work done!

"""Returns the next batch of acquisition points.

Parameters
----------
n_values : int
Number of values to return.
pending_locations : None or numpy 2d array
- If given, asycnhronous acquisition functions may
+ If given, asynchronous acquisition functions may
Contributor

Just noticed this now, but this actually applies to sync mode as well when max_parallel_batches > 1 (and furthermore we don't have an async mode atm :)

@@ -42,15 +42,15 @@ def evaluate(self, x, t=None):
"""
return NotImplementedError

- def acquire(self, n_values, pending_locations=None, t=None):
+ def acquire(self, n_values, pending_locations=None, t=0):
Contributor

Why t=0?

Member Author

It could be None as well, but then there should be a check. Otherwise it crashes if t is not given.

Contributor

Yes, a check is fine. I guess it was None at first because not all methods use it. The check can be implemented in the evaluate function though.

Member Author

But what's wrong with 0?

Contributor

Nothing much, just to be safe. Since it is supposed to be an iteration id coming from an algorithm, setting it to a fixed valid value can cause hidden bugs: you cannot know whether it was truly set by the caller or was just 0 by accident.
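
A sketch of the alternative discussed in this thread: keep t=None as the default and validate it where it is actually consumed, as suggested above (method bodies elided):

```python
class AcquisitionBase:
    def acquire(self, n_values, pending_locations=None, t=None):
        # Pass t through untouched; no fixed fallback value here.
        ...

    def evaluate(self, x, t=None):
        # Validate where t is actually consumed, so a missing t fails loudly
        # instead of silently behaving like iteration 0.
        if t is None:
            raise ValueError("The iteration index t must be given")
        ...
```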

@@ -95,23 +97,55 @@ class LCBSC(AcquisitionBase):
would be t**(2d + 2).
"""

- def __init__(self, *args, delta=.1, **kwargs):
+ def __init__(self, *args, delta=.1, noise_cov=0., **kwargs):
Contributor

What about moving this to the parent acquisition class? That way we could avoid having to implement it again in other subclasses.

Member Author

Ok, if a similar noise structure can be expected for all acquisition methods?

super(LCBSC, self).__init__(*args, **kwargs)
if delta <= 0 or delta >= 1:
raise ValueError('Parameter delta must be in the interval (0,1)')
self.delta = delta
if isinstance(noise_cov, float) or isinstance(noise_cov, int):
Contributor

isinstance(noise_cov, (float, int)). This does not, however, detect numpy scalars, but that may not be an issue here.

Member Author

Oh, good, didn't know that.

I suppose we should generally add more argument checking for user-facing routines.
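
On the numpy-scalar caveat above: `numbers.Number` covers both built-in and numpy scalar types, so a slightly more general check could look like this (interpreting a scalar as isotropic noise is an assumed semantic):

```python
import numbers

import numpy as np

if isinstance(noise_cov, numbers.Number):
    # A plain or numpy scalar becomes an isotropic noise covariance;
    # model.input_dim follows the naming used elsewhere in this diff.
    noise_cov = np.eye(model.input_dim) * noise_cov
```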

#class BolfiAcquisition(SecondDerivativeNoiseMixin, LCB):
# """Default acquisition function for BOLFI.
# """
# pass

Contributor

We can just remove these I think.

self._rbf_woodbury_inv = self._gp.posterior.woodbury_inv
self._rbf_woodbury_chol = self._gp.posterior.woodbury_chol
self._rbf_x2sum = np.sum(self._gp.X**2., 1)[None, :]
self._rbf_is_cached = True
Contributor

When the GP model is updated and hence its hyperparameters change, should these be updated as well?

@vuolleko (Member Author) commented May 11, 2017

They will be, since self.is_sampling=False causes self._rbf_is_cached=False.
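
In other words, the cache is tied to sampling mode. A hedged sketch of the pattern (the `update` hook is illustrative; the cached attributes are the ones from the diff):

```python
import numpy as np

def update(self, x, y):
    # Any update to the GP drops out of sampling mode and invalidates the cache.
    self.is_sampling = False
    self._rbf_is_cached = False
    # ... refit the GP with the new evidence ...

def _cache_rbf_terms(self):
    # Recomputed lazily on the next gradient evaluation while sampling.
    self._rbf_woodbury_inv = self._gp.posterior.woodbury_inv
    self._rbf_woodbury_chol = self._gp.posterior.woodbury_chol
    self._rbf_x2sum = np.sum(self._gp.X**2., 1)[None, :]
    self._rbf_is_cached = True
```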

kernel = self.gp_params.get('kernel')
self._kernel_is_default = False

noise_var = self.gp_params.get('noise_var') or np.max(y)**2. / 100.
mean_function = self.gp_params.get('mean_function')
Contributor

Are noise_var and mean_function also taken care of when computing the gradient? It seems that only the kernel is checked to set the self._kernel_is_default flag, but does that also cover these?

Member Author

True, they may not be constant. I'll add a requirement that they be None when using the direct gradients.

bounds : list
The region where to estimate the posterior for each parameter in
model.parameters.
`[(lower, upper), ... ]`
initial_evidence : int, dict
Number of initial evidence or a precomputed batch dict containing parameter
and discrepancy values
- n_evidence : int
+ n_acq: int
Contributor

I noticed I had some inconsistencies here with n_evidence and n_acq. I wonder if we should only use n_evidence; it would then be easy for the user to know immediately how many evidence points they will have in the end (handy for estimating the time required to run the experiment). initial_evidence could also then default to None and be set by a heuristic depending on the dimensions, rather than being a fixed quantity.

Contributor

We also then wouldn't need _n_initial_runs, since the only thing that matters is n_evidence. In essence, all the evidence you gather becomes initial_evidence for the future, when you wish to continue from where you are now.

Member Author

Yes, sounds reasonable.
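
A sketch of those proposed semantics, with hypothetical names: n_evidence is the total budget, initial_evidence defaults to a dimension-based heuristic, and everything already gathered counts toward the budget:

```python
def fit(self, n_evidence):
    # Total evidence budget: initial points plus acquisitions.
    if self.initial_evidence is None:
        # Heuristic default scaling with dimension (assumed form).
        self.initial_evidence = 10 * self.model.input_dim
    # Acquire only what is still missing from the budget; evidence gathered
    # in an earlier run simply acts as initial evidence for this one.
    n_acq = max(n_evidence - self.n_gathered_evidence, 0)
    ...
```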

@jlintusaari (Contributor) left a comment

A lot of work put into this, great job!

"""Bayesian Optimization for Likelihood-Free Inference (BOLFI).

Approximates the discrepancy function by a stochastic regression model.
Discrepancy model is fit by sampling the discrepancy function at points decided by
the acquisition function.

- The implementation follows that of Gutmann & Corander, 2016.
+ The implementation loosely follows that of Gutmann & Corander, 2016.
@jlintusaari (Contributor) commented May 12, 2017

What about:

the method implements the framework introduced in ...

Only needed if model is an ElfiModel
- discrepancy_model : GPRegression, optional
+ target_model : GPyRegression, optional
Contributor

Maybe we could say here that in BOLFI the target model is the discrepancy model.


# We should be able to carry out the inference in less than six batches
assert res.populations[-1].n_batches < 6


@slow
@pytest.mark.usefixtures('with_all_clients')
- def test_bayesian_optimization():
+ def test_BOLFI():
Contributor

I think we should have a separate test for Bayesian optimization; now the method is untested. Maybe test_bayesian_optimization_and_bolfi: first run the BO test as it was, but then give the GP model to BOLFI and do the sampling with BOLFI. Then a separate quick test for running everything with BOLFI.
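
A sketch of that split, with assumed fixture and call names (ma2, infer, sample); target_model follows the naming from this diff:

```python
@slow
@pytest.mark.usefixtures('with_all_clients')
def test_bayesian_optimization_and_bolfi(ma2):
    # First exercise BayesianOptimization on its own, as before...
    bo = elfi.BayesianOptimization(ma2['d'], initial_evidence=20, batch_size=5)
    bo.infer(n_evidence=40)

    # ...then hand the fitted GP to BOLFI and do only the sampling part here.
    bolfi = elfi.BOLFI(ma2['d'], target_model=bo.target_model)
    post = bolfi.get_posterior(threshold=0.1)
    samples = bolfi.sample(200)
```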
