
Is BOLFI robust to scaling of the parameters? #292

Closed
coschroeder opened this issue Aug 30, 2018 · 4 comments
coschroeder commented Aug 30, 2018

Summary:

Running BOLFI on the MA2 example with one scaled parameter does not give meaningful results.

Description:

If I run the BOLFI fitting algorithm on the MA2 toy example, with one parameter scaled by a factor of 10**4 for sampling and rescaled inside the MA2 function, the fitting breaks down. The acquisition function is not meaningful, and after running NUTS the posterior does not look like it should.
If I understand correctly, the fitted posterior should be independent of the order of magnitude of the parameters, and all quantities in the GP should scale accordingly. Is this correct?
I ran into this issue while trying to fit a more complex model.

Reproducible Steps:

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
import logging
logging.basicConfig(level=logging.INFO)

%matplotlib inline
import elfi
seed = 1

def MA2(t1, t2, n_obs=100, batch_size=1, random_state=None):
    # Make inputs 2d arrays for numpy broadcasting with w
    t1 = np.asanyarray(t1).reshape((-1, 1))/10000 # rescale parameter t1
    t2 = np.asanyarray(t2).reshape((-1, 1))
    random_state = random_state or np.random

    w = random_state.randn(batch_size, n_obs+2)  # i.i.d. sequence ~ N(0,1)
    x = w[:, 2:] + t1*w[:, 1:-1] + t2*w[:, :-2]
    return x

# true parameters
t1_true = 0.6 * 10000 # scale true parameter t1
t2_true = 0.2

y_obs = MA2(t1_true, t2_true)

# Plot the observed sequence
#plt.figure(figsize=(11, 6));
#plt.plot(y_obs.ravel());

# To illustrate the stochasticity, let's plot a couple of more observations with the same true parameters:
plt.plot(MA2(t1_true, t2_true).ravel());
plt.plot(MA2(t1_true, t2_true).ravel());

# a node is defined by giving a distribution from scipy.stats together with any arguments (here 0 and 2)
t1 = elfi.Prior(scipy.stats.uniform, 0, 2 * 10000)  # scale prior distribution of parameter t1
t2 = elfi.Prior('uniform', 0, 2)

Y = elfi.Simulator(MA2, t1, t2, observed=y_obs)

def autocov(x, lag=1):
    C = np.mean(x[:,lag:] * x[:,:-lag], axis=1)
    return C

S1 = elfi.Summary(autocov, Y)
S2 = elfi.Summary(autocov, Y, 2)  # the optional keyword lag is given the value 2

# Finish the model with the final node that calculates the squared distance (S1_sim-S1_obs)**2 + (S2_sim-S2_obs)**2
d = elfi.Distance('euclidean', S1, S2)

#elfi.draw(d)  # just give it a node in the model, or the model itself (d.model)

log_d = elfi.Operation(np.log, d.model['d'])
# scale the bounds and the acquisition noise variance accordingly
bolfi = elfi.BOLFI(log_d, batch_size=1, initial_evidence=20, update_interval=10,
                   bounds={'t1':(-20000, 20000), 't2':(-1, 1)}, acq_noise_var=[0.1*10000**2, 0.1], seed=seed)

#%%time
post = bolfi.fit(n_evidence=200)

#%%time 
result_BOLFI = bolfi.sample(1000, info_freq=100)
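
For reference, plots like the ones below can be produced with ELFI's built-in helpers (a sketch; helper names as in ELFI 0.7, check your version):

bolfi.plot_state()             # GP mean/variance and acquisition surface
bolfi.plot_discrepancy()       # realized discrepancies vs. each parameter
result_BOLFI.plot_marginals()  # marginal histograms of the posterior sample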

Current Output:

The acquisition function looks broken:
[screenshot: acquisition surface]

and also the marginals are uninformative:

[screenshot: posterior marginals]

Expected Output:

I expected the acquisition function and the posterior to look similar to those of the unscaled model.

ELFI Version: 0.7

Python Version: 3.5

Operating System:

vuolleko (Member) commented

I'm not an expert in GPs, but I think you should use a different kernel: the default kernel has just one lengthscale shared by all parameters. So, for example, something like:

import GPy

# ARD=True gives each input dimension its own lengthscale
kernel = GPy.kern.RBF(2, lengthscale=[100, 1], ARD=True)
kernel += GPy.kern.Bias(2)
bounds = {'t1': (-20000, 20000), 't2': (-1, 1)}
gp = elfi.GPyRegression(['t1', 't2'], bounds=bounds, kernel=kernel)
bolfi = elfi.BOLFI(log_d, batch_size=1, initial_evidence=20, update_interval=10,
                   bounds=bounds, target_model=gp, acq_noise_var=[100, 0.1], seed=seed)
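
Fitting and sampling then proceed exactly as in your script, only with the custom surrogate:

post = bolfi.fit(n_evidence=200)
result_BOLFI = bolfi.sample(1000, info_freq=100)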

I hope this helps you in the right direction.

Also, please update ELFI to the newest version 0.7.3. :)

coschroeder (Author) commented

Thanks for your fast response. I tried your suggestion, and it seems to help with the acquisition function.
Do you know if I have to take a similar scaling into account for the NUTS sampling? As you can see in the plots below, the chains get stuck in the larger dimension, and the posteriors are still not great.
The four chains:
[screenshot: traces of the four chains]
And the resulting marginals of the posterior (true values are (6000, 0.2)):
[screenshot: posterior marginals]

vuolleko (Member) commented Sep 4, 2018

I suppose very different scales can be problematic for NUTS as well; we have considered implementing some kind of normalization. You could also try the basic Metropolis sampler, which allows (requires, actually) manually setting the "scale" of each parameter.
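
A minimal sketch of the Metropolis option, assuming BOLFI.sample accepts algorithm='metropolis' and per-parameter proposal standard deviations via sigma_proposals (check the signature in your ELFI version):

# hypothetical proposal scales, roughly 1/10 of each parameter's range
result_mh = bolfi.sample(1000,
                         algorithm='metropolis',
                         sigma_proposals={'t1': 2000, 't2': 0.2})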

Have you considered trying the opposite in your case: keeping the parameter scales roughly equal and upscaling inside the simulator?
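
A sketch of that idea; T1_SCALE and run_model are illustrative stand-ins, not from your code:

import numpy as np
import elfi

T1_SCALE = 1e4  # illustrative: the physical magnitude of t1

def simulator_unit(t1_unit, t2, batch_size=1, random_state=None):
    # BOLFI and its GP surrogate only ever see t1_unit on an O(1) scale;
    # the physical value is recovered inside the simulator.
    t1_physical = np.asanyarray(t1_unit) * T1_SCALE
    return run_model(t1_physical, t2, batch_size, random_state)  # your actual simulator

# the prior, GP bounds and acquisition noise can then stay on an O(1) scale
t1 = elfi.Prior('uniform', 0, 2)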

coschroeder (Author) commented

Yeah, I already do a normalization/rescaling in my simulator, and it seems to work.
I just thought this is an issue other users might be interested in, since differently scaled parameters are quite common in real-world problems. Maybe it would be nice to drop a hint in the documentation; reading the BOLFI paper, this issue wasn't clear to me.
