Feature/tf probability #110

Merged

benoitLebreton-perso

@benoitLebreton-perso benoitLebreton-perso commented Nov 2, 2021

This pull request is linked to #104.
The changes are optional: users need to run pip install melusine[tf-probability] to use this feature.
This way, the base Melusine installation is not impacted.
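
As an illustration of the opt-in behaviour, here is a minimal sketch of how code can check whether the optional dependency is installed before using it. The helper name `tfp_available` is hypothetical and is not part of Melusine; only the package name `tensorflow_probability` comes from this pull request.

```python
import importlib.util


def tfp_available() -> bool:
    """Return True if tensorflow_probability can be imported (hypothetical helper)."""
    # find_spec only looks the package up; it does not import it.
    return importlib.util.find_spec("tensorflow_probability") is not None


if tfp_available():
    print("TFP-based models can be used.")
else:
    print("Install the optional extra: pip install melusine[tf-probability]")
```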

Description

New models are available with a feature for uncertainty estimation using TFP (tensorflow-probability).
The outputs of the TFP-based models are not point estimates but distributions over probabilities.
In other words, for each prediction, a distribution over the estimated probability is computed.
We propose to use this distribution to get:

  • a point estimate (as the regular Melusine models provide)
  • an uncertainty estimate around this point estimate, based on the dispersion of the distribution
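
The two quantities above can be sketched as follows. This is not the Melusine implementation: it assumes the distribution for one class is represented by a handful of sampled probabilities (the values below are made up), and the function name `summarize` is hypothetical. It uses only the standard library; real code would more likely rely on NumPy or on the TFP distribution object itself.

```python
from statistics import mean, stdev


def summarize(samples):
    """Return (point_estimate, dispersion) for sampled probabilities of one class."""
    # Point estimate: the mean of the sampled probabilities.
    # Dispersion: the sample standard deviation, used as the uncertainty.
    return mean(samples), stdev(samples)


# Illustrative samples of the predicted probability for a single class.
samples = [0.81, 0.78, 0.85, 0.80, 0.76]
point, spread = summarize(samples)
print(f"point estimate={point:.3f}, dispersion={spread:.3f}")
```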

Fixes #104

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

I plan to test it on an open source dataset to check that the classification performance (point estimate) is the same.

Test Configuration:

  • OS:
  • Python version: 3.7
  • Melusine version: 2.3.1

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Use tensorflow_probability to build probabilistic models. The goal is to provide better uncertainty estimates.
Also needed to change X_meta to X_meta.to_numpy() to allow the __call__ function on the list of inputs.
…darray. Now we cast X_meta to a np.array in any case. This fixes a bug with the __call__ method of TensorFlow models.
WIP: clean it up to make a readable tutorial instead of an exploration
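
The X_meta cast mentioned in the commit messages above could look roughly like this. It is a sketch under assumptions, not the code from melusine/models/train.py: the helper name `as_numpy` is hypothetical, and it assumes X_meta is either an object exposing `to_numpy()` (such as a pandas DataFrame) or something `np.asarray` can convert.

```python
import numpy as np


def as_numpy(x_meta):
    """Return x_meta as a NumPy array in every case (hypothetical helper)."""
    # DataFrame-like objects expose to_numpy(); use it when present.
    if hasattr(x_meta, "to_numpy"):
        return x_meta.to_numpy()
    # Lists and existing arrays go through np.asarray (no copy for arrays).
    return np.asarray(x_meta)


print(type(as_numpy([[0.0, 1.0], [1.0, 0.0]])))
```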

@benoitLebreton-perso benoitLebreton-perso left a comment

I cleaned up this pull request and managed to test it on a larger dataset (the movies dataset: 100,000 movie plots to classify).
This pull request is now ready for review, please :)

melusine/models/train.py (review thread resolved)
@benoitLebreton-perso benoitLebreton-perso marked this pull request as ready for review August 24, 2022 15:02
@benoitLebreton-perso

Ideas to improve (as a next step):
Limit the upper/lower bounds to 0 and 1.
Since we use mean ± 2*std (Gaussian), a bound may be greater than 1 or lower than 0. These are estimated probabilities, so we may clip the bounds to [0, 1].
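
A minimal sketch of this clipping idea, assuming the mean and standard deviation of the predicted probability are already available as plain floats. The function name `clipped_bounds` and the parameter `k` are hypothetical, not part of Melusine.

```python
def clipped_bounds(mean_p, std_p, k=2.0):
    """Return (lower, upper) = mean -/+ k*std, clipped to the interval [0, 1]."""
    lower = max(0.0, mean_p - k * std_p)
    upper = min(1.0, mean_p + k * std_p)
    return lower, upper


# Without clipping, the upper bound here would be 1.05.
print(clipped_bounds(0.95, 0.05))
```

One consequence of clipping is that the interval is no longer symmetric around the mean near 0 or 1, which is expected for a probability.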

Another idea is to change how we compute the lower/upper bounds: we could also sample the outputs thousands of times and determine a bootstrap 95% uncertainty interval.
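
A rough sketch of the sampling idea, under stated assumptions: the model output is simulated here with clipped Gaussian draws (in practice the samples would come from repeated calls to the probabilistic model), and the interval is an empirical percentile interval over those samples. The function name `percentile_interval` is hypothetical.

```python
import math
import random


def percentile_interval(samples, level=0.95):
    """Return the empirical (lower, upper) interval covering `level` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    alpha = (1.0 - level) / 2.0
    lower_idx = int(alpha * (n - 1))                 # floor of the lower rank
    upper_idx = math.ceil((1.0 - alpha) * (n - 1))   # ceil of the upper rank
    return ordered[lower_idx], ordered[upper_idx]


# Simulated stand-in for thousands of sampled output probabilities.
rng = random.Random(0)
samples = [min(1.0, max(0.0, rng.gauss(0.9, 0.08))) for _ in range(5000)]
lower, upper = percentile_interval(samples)
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```

Because the bounds are taken from the samples themselves, they stay inside [0, 1] whenever the sampled probabilities do, so no extra clipping step is needed.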

@hugo-quantmetry hugo-quantmetry merged commit 5a7e621 into MAIF:master Dec 5, 2022