Thank you very much for your implementation. It's highly helpful.
I'm attempting to learn different functions with the BNN model, restricting the training inputs to a discontinuous range. For example, to learn the polynomial function 2*x^2 + 2, I slightly vary the number of neurons and layers for better predictions and train the network on equally spaced values from -4 to -1 and from 1 to 4. When testing the network on equally spaced values from -6 to 6, I would expect the epistemic uncertainty to be high in the ranges the network has not seen during training, i.e. from -6 to -4, from -1 to 1, and from 4 to 6.
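For reference, the setup above can be sketched as a small data-generation helper. This is a minimal sketch, assuming equally spaced samples per segment plus a little observation noise; the function name, sample counts, and noise scale are my own choices, not from the original code:

```python
import numpy as np

def make_data(n_per_segment=40, noise_std=0.1, seed=0):
    """Sample the target f(x) = 2*x**2 + 2 on the discontinuous
    training range [-4, -1] U [1, 4], leaving a gap around 0."""
    rng = np.random.default_rng(seed)
    x_left = np.linspace(-4.0, -1.0, n_per_segment)
    x_right = np.linspace(1.0, 4.0, n_per_segment)
    x_train = np.concatenate([x_left, x_right])
    y_train = 2.0 * x_train**2 + 2.0 + rng.normal(0.0, noise_std, x_train.shape)
    # Test points cover the full range, including the unseen
    # regions [-6, -4], [-1, 1] and [4, 6].
    x_test = np.linspace(-6.0, 6.0, 200)
    return x_train, y_train, x_test
```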
Observations after testing:
1. The predicted mean is close to the expected test value in the trained ranges ([-4, -1], [1, 4]), as expected.
2. The variance of the predicted values in the untrained ranges ([-6, -4], [4, 6]) is high, as expected.
3. The variance of the predicted values in the untrained range ([-1, 1]) is low, which contradicts the idea of capturing epistemic uncertainty with a BNN. Could there be an explanation for this, and is there a way to make the model produce high variance in all ranges of untrained data?
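For clarity on what "variance of predicted values" means here, epistemic uncertainty is usually estimated by running several stochastic forward passes (one per weight sample) and taking the variance of the predicted means. A minimal sketch, assuming each pass returns a predicted mean and a predicted noise variance per test point (the helper name and the decomposition convention are assumptions on my part):

```python
import numpy as np

def decompose_uncertainty(pred_means, pred_vars):
    """Decompose predictive uncertainty from T stochastic forward
    passes of a BNN (e.g. T weight samples). Both inputs are (T, N).
    Epistemic part = variance of the means across weight samples;
    aleatoric part = average predicted noise variance."""
    epistemic = pred_means.var(axis=0)
    aleatoric = pred_vars.mean(axis=0)
    total = epistemic + aleatoric
    return epistemic, aleatoric, total
```

If the epistemic part stays near zero in the gap [-1, 1], the weight samples agree there even though the model never saw that region, which is exactly the over-confidence described above.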
I have also trained the exact model you presented for the sin() function with a discontinuous training range and observed the same issue.
Thanks a lot in advance!
That's an interesting observation, and I'll have to take a closer look to understand why this is the case. From another experiment I found that the learnable prior might be an issue, but I'm not sure. I'd also try implementing it with tensorflow-probability directly to see whether similar issues occur. I have other priorities at the moment but will definitely dedicate some time to this later. Thanks for reporting!
Hi @coolbuddyabdur, first of all, sorry for the late follow-up, hope my findings are still of interest to you. I wrote a notebook (still a draft) that hopefully sheds some light on the issue you reported. It also demonstrates how noise contrastive priors (NCPs) can help to get better uncertainty estimates for neural network predictions. Bayesian neural networks with priors in weight space are sometimes over-confident in regions outside the training data distribution (as you already reported):
- NCPs are defined in data space, which makes it easier to define a prior for out-of-distribution regions, resulting in more reliable uncertainty estimates.
- NCPs are only one of many possible ways to get better uncertainty estimates, but they are a good fit for this specific example.
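To make the NCP idea concrete: roughly following Hafner et al.'s noise contrastive priors, pseudo out-of-distribution inputs are generated by perturbing training inputs with noise, and an extra loss term penalizes the model when its predictive distribution at those points diverges from a wide data-space prior. A minimal sketch with NumPy; the helper names, noise scale, prior width, and the KL direction shown here are my assumptions, not the notebook's exact implementation:

```python
import numpy as np

def ncp_ood_inputs(x_train, input_noise_std=0.5, seed=0):
    """Generate pseudo out-of-distribution inputs by adding noise
    to the training inputs (the core trick behind NCPs)."""
    rng = np.random.default_rng(seed)
    return x_train + rng.normal(0.0, input_noise_std, x_train.shape)

def ncp_kl_penalty(pred_mean, pred_std, prior_mean=0.0, prior_std=3.0):
    """KL divergence between the model's predictive Gaussian at the
    OOD inputs and a wide data-space prior N(prior_mean, prior_std**2).
    Added to the usual likelihood loss with some weight; it discourages
    confident (small pred_std) predictions away from the data."""
    var_ratio = (pred_std / prior_std) ** 2
    mean_term = ((pred_mean - prior_mean) / prior_std) ** 2
    return 0.5 * np.sum(var_ratio + mean_term - 1.0 - np.log(var_ratio))
```

The penalty is zero when the predictive distribution already matches the prior, and grows as the model becomes over-confident or drifts from the prior mean at the perturbed inputs.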