Issue in learning the variance for discontinuous training data #8

Closed
coolbuddyabdur opened this issue Nov 10, 2019 · 2 comments


coolbuddyabdur commented Nov 10, 2019

Thank you very much for your implementation. It's very helpful.

I'm attempting to learn different functions using the BNN model, restricting the range of training values so that it is discontinuous. For example, to learn a polynomial function (2*x^2 + 2), I slightly change the number of neurons and layers for better prediction and train the network with equally spaced values from -4 to -1 and from 1 to 4. When testing the network with equally spaced values from -6 to 6, I would expect the epistemic uncertainty to be high in the ranges the network has not seen during training, i.e., from -6 to -4, from -1 to 1 and from 4 to 6.
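For concreteness, here is a minimal sketch of the data setup described above (the sample counts and array shapes are illustrative placeholders, not the exact values from my experiment):

```python
import numpy as np

def f(x):
    return 2 * x**2 + 2

# Training inputs: equally spaced values from two disjoint ranges,
# [-4, -1] and [1, 4].
x_train = np.concatenate([np.linspace(-4, -1, 50),
                          np.linspace(1, 4, 50)]).reshape(-1, 1)
y_train = f(x_train)

# Test inputs cover the wider range [-6, 6], which includes the regions
# never seen during training: [-6, -4], [-1, 1] and [4, 6].
x_test = np.linspace(-6, 6, 200).reshape(-1, 1)
y_test = f(x_test)
```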

Observations after testing:
1. The predicted mean is close to the expected test values in the trained ranges ([-4,-1], [1,4]), as expected.
2. The variance of the predicted values in the untrained ranges ([-6,-4], [4,6]) is high, as expected.
3. The variance of the predicted values in the untrained range ([-1,1]) is low, which contradicts the concept of capturing epistemic uncertainty with a BNN. Could there be an explanation for this, and is there a way to make the model produce high variance in all ranges of untrained data? (A sketch of how the predictive mean and variance are estimated follows below.)
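For clarity, this is roughly how the predictive mean and the epistemic variance can be estimated (a generic sketch: `model` stands for any BNN whose forward pass samples a fresh set of weights on every call; the function name and the number of samples are placeholders):

```python
import numpy as np

def predict_with_uncertainty(model, x, n_samples=100):
    """Summarize n_samples stochastic forward passes of a BNN.

    Assumes model.predict(x) draws a new set of weights on every call,
    as in a network trained with variational inference or MC dropout.
    """
    preds = np.stack([model.predict(x) for _ in range(n_samples)])
    mean = preds.mean(axis=0)           # predictive mean per test point
    epistemic_var = preds.var(axis=0)   # spread across weight samples
    return mean, epistemic_var
```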

[Figure: Polynomial function]

I have also trained the exact model you presented for the sin() function with a discontinuous range of training data and observed the same issue:

[Figure: Sin_incomplete]

Thanks a lot in advance!


krasserm commented Nov 11, 2019

That's an interesting observation, and I have to take a closer look to understand why this is the case. From another experiment I found that the learnable prior might be an issue, but I'm not sure. I'd also try implementing it directly with tensorflow-probability to see whether similar issues arise. I have other priorities at the moment but will definitely dedicate some time to this later. Thanks for reporting!


krasserm commented Sep 19, 2020

Hi @coolbuddyabdur, first of all, sorry for the late follow-up; I hope my findings are still of interest to you. I wrote a notebook (still a draft) that hopefully sheds some light on the issue you reported. It also demonstrates how noise contrastive priors (NCPs) can help to obtain better uncertainty estimates for neural network predictions. Bayesian neural networks with priors in weight space are sometimes over-confident in regions outside the training data distribution (as you already reported):

[Figure: bnn]

NCPs are defined in data space, which makes it easier to define a prior for out-of-distribution regions and results in more reliable uncertainty estimates:

[Figure: bnn+ncp]

NCPs are only one of many possible ways to get better uncertainty estimates, but they are a good fit for this specific example.
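To make the idea concrete, here is a rough sketch of an NCP-style regularizer as it could be added to a training loss. This is a simplification, not the exact code from the notebook; the noise scale, the prior width and the assumption that the model returns a Normal predictive distribution are illustrative:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def ncp_regularizer(model, x_train, y_train, input_noise=0.5, prior_scale=5.0):
    """Noise contrastive prior term (sketch).

    Assumes model(x) returns a tfd.Normal predictive distribution
    (e.g. a network with mean and scale output heads). Constants are
    illustrative, not the values used in the notebook.
    """
    # Out-of-distribution inputs: training inputs perturbed with noise.
    x_ood = x_train + tf.random.normal(tf.shape(x_train), stddev=input_noise)

    # Wide data prior at the OOD points, expressing high uncertainty
    # away from the training data.
    data_prior = tfd.Normal(loc=y_train, scale=prior_scale)

    # Pull the model's predictive distribution at OOD inputs towards
    # the wide prior; this term is added to the usual training loss.
    return tf.reduce_mean(tfd.kl_divergence(data_prior, model(x_ood)))
```

During training, a term like this would be added (with some trade-off weight) to the usual negative log-likelihood on the training data; the notebook contains the actual implementation.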

krasserm closed this as completed Oct 2, 2020