
Investigate using dimensionality-reduction methods on outputs #45

Closed

mstimberg opened this issue May 26, 2021 · 7 comments
@mstimberg
Member

Instead of asking the user to provide metrics to extract features, might it be possible to automatically reduce the dimensionality of the output (e.g., the voltage trace)?

@mstimberg mstimberg added this to To do in sbi integration (GSoC 2021) via automation May 26, 2021
@akapet00
Member

akapet00 commented Jun 1, 2021

It is possible; see the figure below (Fig. 1.B in [1]):
[Figure: Fig. 1.B from [1]]
The authors in [1] claim, "A minimally invasive extension of (approximate Bayesian computation and classical density estimation-based inference methods) is to first learn summary statistics that have certain optimality properties, before running a standard inference algorithm..."
There are different methods to automatically construct summary statistics, see [2, 3] for example.
On the other hand, there is an interesting, brief discussion in [4], where the authors claim that simulation-based inference algorithms such as sequential neural posterior estimation (SNPE) can be applied directly to raw data, or to high-dimensional summary features.
Additionally, one of the examples in the official sbi documentation covers learning summary statistics by placing an embedding neural network in front of the density estimator, but there the data of interest are not time series like those encountered when fitting electrophysiological data. A minimal sketch of this approach is given below.
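
This is a minimal sketch of the embedding-network approach, loosely following the sbi documentation example; the prior bounds, shapes, and the random placeholder data are illustrative only (a real application would use simulated voltage traces):

```python
import torch
from torch import nn
from sbi.inference import SNPE
from sbi.utils import BoxUniform, posterior_nn

# Placeholder prior over 2 parameters and placeholder "simulations".
prior = BoxUniform(low=torch.zeros(2), high=torch.ones(2))
theta = prior.sample((500,))
x = torch.randn(500, 1000)  # stand-in for 500 simulated traces of 1000 samples

# Small fully connected network that compresses each trace to 8 learned features.
embedding_net = nn.Sequential(nn.Linear(1000, 64), nn.ReLU(), nn.Linear(64, 8))

# The embedding network is trained jointly with the density estimator.
density_estimator_builder = posterior_nn(model='maf', embedding_net=embedding_net)
inference = SNPE(prior=prior, density_estimator=density_estimator_builder)
density_estimator = inference.append_simulations(theta, x).train()
posterior = inference.build_posterior(density_estimator)
```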

I would say that it is a good idea to stick with an expert-defined set of summary features for now; it would be straightforward to add this automatic approach later if necessary.
We could make use of the existing function in brian2modelfitting, calc_eFEL, which takes advantage of the eFEL package (a sketch of using eFEL directly follows below).
We could also manually compute additional features if requested by the user.
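
As a minimal sketch of what eFEL does under the hood (calc_eFEL in brian2modelfitting wraps a similar call); the trace below is a random placeholder for a recorded voltage trace:

```python
import efel
import numpy as np

t = np.linspace(0.0, 100.0, 1000)       # time in ms
v = -70.0 + 5.0 * np.random.rand(1000)  # voltage in mV (placeholder data)

trace = {'T': t, 'V': v, 'stim_start': [10.0], 'stim_end': [90.0]}
features = efel.getFeatureValues([trace], ['voltage_base', 'Spikecount'])
print(features[0])  # dict mapping each feature name to a NumPy array
```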

[1] Cranmer, Brehmer and Louppe. PNAS (2020) 117:30055-30062
[2] Jiang et al. Statistica Sinica (2017) 1595–1618
[3] Izbicki, Lee and Pospisil. Journal of Computational and Graphical Statistics (2019) 28:481-492
[4] Gonçalves et al. eLife (2020) 9:e56261

@mstimberg
Member Author

Thanks for looking into this and for the references; I'll try to have a closer look soon. Since this is non-trivial (but very interesting!), I agree that we should focus on user-provided summary features first. Interesting note about applying the network directly to the high-dimensional data; this might be something that we can try out rather soon.

@mstimberg mstimberg moved this from To do to In progress in sbi integration (GSoC 2021) Jun 9, 2021
@jcalvaradop777

Best regards

I am trying to compare the incremental versions of several dimensionality-reduction methods. Could someone tell me where I can find the code for Incremental Locally Linear Embedding, Incremental Multidimensional Scaling, or Incremental Laplacian Eigenmaps?

Thank you.

@akapet00
Member

Hi @endimeon777,

Regarding the code for the techniques you mentioned, I really have no idea. However, I am not sure whether those methods are even applicable to the data we are handling here.

Best,
Ante

@akapet00
Member

Hi @mstimberg,

I've decided to play around with the idea of using raw data directly, without feature extraction (see my first comment):

On the other hand, there is an interesting, brief discussion in [4], where the authors claim that simulation-based inference algorithms such as sequential neural posterior estimation (SNPE) can be applied directly to raw data, or to high-dimensional summary features.

The authors of the mentioned study [Gonçalves et al., 2020] state the following:

SNPE can be applied to, and might benefit from the use of summary features, but it also makes use of the ability of neural networks to automatically learn informative features in high-dimensional data. Thus, SNPE can also be applied directly to raw data (e.g. using recurrent neural networks [Lueckmann et al., 2017]), or to high-dimensional summary features which are challenging for ABC approaches (...).

The procedure is as follows: when SNPE is used in combination with a mixture density network (MDN), the MDN is augmented with an RNN which runs along the recorded voltage trace to learn appropriate features and thus constrain the model parameters. I am not sure whether this happens automatically when the output dimension is large, but it works.
Check out this notebook, where I compared the inference procedure for g_Na and g_K when using summary features (the same thing you did in the official brian2 examples here) and when using the raw output directly. A rough sketch of such a recurrent embedding network is given below.
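
This is only a rough sketch of the idea, not the exact code from the notebook: a small GRU runs along the raw voltage trace, and its final hidden state, projected to a few dimensions, serves as the learned summary features on which the MDN is conditioned. All sizes are arbitrary.

```python
import torch
from torch import nn
from sbi.utils import posterior_nn

class GRUEmbedding(nn.Module):
    def __init__(self, hidden_size=16, n_features=8):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, n_features)

    def forward(self, x):
        # x: (batch, n_timesteps) raw voltage traces
        _, h = self.gru(x.unsqueeze(-1))  # h: (1, batch, hidden_size)
        return self.out(h.squeeze(0))     # (batch, n_features)

# Plug the recurrent embedding into the MDN-based density estimator.
density_estimator_builder = posterior_nn(model='mdn', embedding_net=GRUEmbedding())
```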

In the model fitting toolbox, we could probably check whether the neural posterior class is SNPE and whether the density estimator is an MDN. If so, users would not have to provide a list (or dictionary) of features with respect to which the inference is performed, as long as they are only interested in "fitting" the parameters and no particular features are of interest to them.

@akapet00
Member

akapet00 commented Jul 18, 2021

A continuation of the discussion started in the previous comment (sbi-dev/sbi#527), plus some additional info.

Since I will deal with #53, I can also enable an empty list/dict for the features argument in the constructor of the Inferencer in the same pull request. If the list/dict of features is empty or set to None, SNPE will extract features automatically, either by using a user-provided embedding network or by using the simple MLP provided in sbi by default. For simple problems, like the one showcased here, the default MLP does more than a good job. For more complex problems, the user will probably have to use more sophisticated recurrent neural networks, e.g., an LSTM or a GRU. A sketch of the intended usage is given below.
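
This sketch is illustrative only: the argument names follow the Inferencer as it is shaping up in the sbi_support branch and may still change, and the model equations and recorded traces are placeholders.

```python
import numpy as np
from brian2 import amp, ms, nS, volt
from brian2modelfitting import Inferencer

# Toy leaky model with one free parameter (gl); I is the injected current.
eqs = '''
dv/dt = (gl*(-70*mV - v) + I) / (1*nfarad) : volt
gl : siemens (constant)
'''
inp_traces = np.zeros((1, 1000))*amp    # placeholder recorded input currents
out_traces = np.zeros((1, 1000))*volt   # placeholder recorded voltage traces

inferencer = Inferencer(dt=0.1*ms, model=eqs,
                        input={'I': inp_traces},
                        output={'v': out_traces},
                        features=None)  # no hand-crafted summary features

# With features=None, SNPE learns the summary statistics itself through the
# default MLP embedding (or through an embedding network passed by the user).
posterior = inferencer.infer(n_samples=1000,
                             inference_method='SNPE',
                             density_estimator_model='mdn',
                             gl=[10*nS, 100*nS])
```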

@akapet00
Member

akapet00 commented Aug 6, 2021

With the last PR merged into sbi_support, we are able to do automatic feature extraction without providing a list of features to the Inferencer. Feature extraction happens automatically by training an MLP, unless a different embedding network is provided to the infer method.
I will close this issue now.

@akapet00 akapet00 closed this as completed Aug 6, 2021
sbi integration (GSoC 2021) automation moved this from In progress to Done Aug 6, 2021