## Diagnostics

In [1]:
%load_ext autoreload
%autoreload 2

import numpy as np
import scipy.stats as ss

import elfi

import logging
logging.basicConfig(level=logging.INFO)

seed = 37

### Summary-statistics Selection

ELFI implements the Two-Stage Procedure proposed by Nunes & Balding (2010) for selecting summary statistics. The code below walks through an example run.

For the example, we define a Gaussian-noise model and three summary statistics, one of which is uninformative.

In [2]:
def fn_simulator(mu, sigma, batch_size=1, random_state=None):
    mu, sigma = np.atleast_1d(mu, sigma)
    res = ss.norm.rvs(mu[:, None], sigma[:, None], size=(batch_size, 5), random_state=random_state)
    return res

def mean(y):
    return np.mean(y, axis=1)

def var(y):
    return np.var(y, axis=1)

def uninformative(y):
    return 1

To carry out the Two-Stage Procedure, ELFI requires a list of the summary statistics to be assessed and a simulator initialised with the observations.

In [3]:
mean_obs = 1
std_obs = 3

# Obtain the observations.
y_obs = fn_simulator(mean_obs, std_obs, random_state=np.random.RandomState(seed))

prior_mu = elfi.Prior('uniform', -2, 4, name='mu')
prior_sigma = elfi.Prior('uniform', 1, 4, name='sigma')

# Initialise the simulator.
simulator = elfi.Simulator(fn_simulator, prior_mu, prior_sigma, observed=y_obs)

# Initialise the list of summary statistics.
list_ss = [mean, var, uninformative]

We are now ready to initialise the Two-Stage Selection. Note that we can choose the distance metric used to evaluate the summary statistics; here it is `'euclidean'`.
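
To make the role of the distance metric concrete, here is a minimal sketch that computes Euclidean distances between simulated and observed summary statistics with SciPy's `cdist`. The arrays and their values are hypothetical stand-ins rather than ELFI internals, and how ELFI evaluates the metric internally is an assumption here. The sketch also shows that a constant summary, such as `uninformative`, adds nothing to the distance.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.RandomState(0)

# Hypothetical summaries of 6 simulated datasets: columns are (mean, var).
sim_summaries = rng.normal(size=(6, 2))
# Hypothetical summaries of the observed dataset, as a single row.
obs_summaries = np.array([[0.5, 1.0]])

# Euclidean distance of every simulation to the observation.
dist = cdist(sim_summaries, obs_summaries, metric='euclidean').ravel()

# Append a constant third summary (always 1) to both simulated and observed rows.
sim_const = np.hstack([sim_summaries, np.ones((6, 1))])
obs_const = np.hstack([obs_summaries, np.ones((1, 1))])
dist_const = cdist(sim_const, obs_const, metric='euclidean').ravel()

# The constant column contributes zero to every distance.
same = np.allclose(dist, dist_const)
```

If distances are computed this way, extending a combination with the constant summary leaves the ranking of simulations unchanged, which is consistent with the identical entropies logged for `['mean']` and `['mean', 'uninformative']`.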

In [4]:
from elfi.methods.diagnostics import TwoStageSelection

diagnostics = TwoStageSelection(list_ss, simulator, 'euclidean', seed)

In this implementation of the Two-Stage Procedure, we can choose `n_sim`, the number of simulations generated for every assessed combination of summary statistics, and `n_acc`, the number of those simulations that are accepted.
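
As a rough illustration of these two settings, the sketch below enumerates every non-empty combination of the three candidate statistics and keeps the `n_acc` closest of `n_sim` simulations. The helper `accept_closest` and the random parameter draws and distances are hypothetical; this is an assumed rejection-style acceptance step, not ELFI's actual code.

```python
from itertools import combinations

import numpy as np

names = ['mean', 'var', 'uninformative']

# All non-empty combinations of the candidate summary statistics.
combos = [c for r in range(1, len(names) + 1) for c in combinations(names, r)]


def accept_closest(thetas, distances, n_acc):
    """Keep the parameters of the n_acc simulations with the smallest distances."""
    idx = np.argsort(distances)[:n_acc]
    return thetas[idx]


# Hypothetical simulation results: n_sim parameter draws and their distances.
rng = np.random.RandomState(0)
n_sim, n_acc = 1000, 100
thetas = rng.uniform(size=(n_sim, 2))
distances = rng.uniform(size=n_sim)

accepted = accept_closest(thetas, distances, n_acc)
```

With three candidates there are 2^3 - 1 = 7 combinations, which matches the seven entropy lines in the log output of the run.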

The Two-Stage Procedure first finds the combination with the minimum entropy (Stage 1). It then finds the combination with the minimum mean root sum of squared errors (MRSSE), computed from the parameters corresponding to the `n_closest` datasets of the minimum-entropy combination (Stage 2).
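
The two criteria can be sketched in a few lines of NumPy and SciPy. The entropy function below is a k-nearest-neighbour estimator of the kind used by Nunes & Balding (2010), and the `k` argument passed to `run` is assumed to play the same role. The function names `knn_entropy`, `rsse` and `mrsse` are hypothetical, and the exact estimator and normalisation inside `elfi.methods.diagnostics` may differ.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def knn_entropy(thetas, k=4):
    """k-NN estimate of the differential entropy of the rows of `thetas`."""
    n, p = thetas.shape
    # Distance from each point to its k-th nearest neighbour
    # (column 0 of the query result is the point itself).
    dist, _ = cKDTree(thetas).query(thetas, k=k + 1)
    r_k = dist[:, -1]
    # Log-volume of the p-dimensional unit ball.
    log_unit_ball = (p / 2) * np.log(np.pi) - gammaln(p / 2 + 1)
    return log_unit_ball - digamma(k) + np.log(n) + (p / n) * np.sum(np.log(r_k))


def rsse(accepted_thetas, theta_true):
    """Root of the mean squared Euclidean error of accepted parameters."""
    sq_err = np.sum((accepted_thetas - theta_true) ** 2, axis=1)
    return np.sqrt(np.mean(sq_err))


def mrsse(accepted_per_dataset, true_thetas):
    """Mean RSSE over several datasets whose true parameters are known."""
    return np.mean([rsse(a, t) for a, t in zip(accepted_per_dataset, true_thetas)])


# Stage 1 intuition: a more diffuse set of accepted parameters has higher entropy.
rng = np.random.RandomState(0)
sample = rng.normal(size=(500, 2))
h_narrow = knn_entropy(sample)
h_wide = knn_entropy(3.0 * sample)

# Stage 2 intuition: two toy datasets with known true parameters (0, 0).
accepted_sets = [np.array([[1.0, 0.0], [0.0, 1.0]]),
                 np.array([[3.0, 4.0], [3.0, 4.0]])]
true_thetas = [np.zeros(2), np.zeros(2)]
score = mrsse(accepted_sets, true_thetas)
```

In this sketch, a lower entropy indicates a more concentrated set of accepted parameters, and a lower MRSSE indicates that accepted parameters lie closer to the known parameters of the reference datasets.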

In [5]:
diagnostics.run(k=4, n_sim=1000, n_acc=100, n_closest=20)

INFO:elfi.methods.diagnostics:Combination ['mean'] shows the entropy of 2.276587
INFO:elfi.methods.diagnostics:Combination ['var'] shows the entropy of 2.331251
INFO:elfi.methods.diagnostics:Combination ['uninformative'] shows the entropy of 2.390263
INFO:elfi.methods.diagnostics:Combination ['mean', 'var'] shows the entropy of 2.183706
INFO:elfi.methods.diagnostics:Combination ['mean', 'uninformative'] shows the entropy of 2.276587
INFO:elfi.methods.diagnostics:Combination ['var', 'uninformative'] shows the entropy of 2.331251
INFO:elfi.methods.diagnostics:Combination ['mean', 'var', 'uninformative'] shows the entropy of 2.183706
INFO:elfi.methods.diagnostics:
The minimum entropy of 2.183706 was found in ['mean', 'var'].

INFO:elfi.methods.diagnostics:Combination ['mean'] shows the MRSSE of 17.146237
INFO:elfi.methods.diagnostics:Combination ['var'] shows the MRSSE of 18.754292
INFO:elfi.methods.diagnostics:Combination ['uninformative'] shows the MRSSE of 18.906968
INFO:elfi.methods.d

(<function __main__.mean>, <function __main__.var>)