About the eq.4 #3

ykwon0407 · 2018-11-10T10:30:41Z

Could you please let me know whether the eq. 4 in the paper is applicable for multi label segmentation or just the binary segmentation?

Originally posted by @redsadaf in #1 (comment)

ykwon0407 · 2018-11-10T10:31:31Z

Dear redsadaf,

Thank you for your interest! Eq. 4 in the paper is defined for multi-label segmentation, so you can apply it not only to binary segmentation but also to multi-class segmentation problems. Please note that eq. 4 yields a K by K matrix if there are K categories in your dataset.

In the case of binary classification (K=2), eq. 4 produces a 2 by 2 matrix. However, the two diagonal elements are equal to each other, and the two off-diagonal elements are also equal to each other. Thus, we obtain numeric values, not matrices, for the uncertainty maps.

Please let me know if you have any further questions. I hope this is informative!

(I copied and pasted this reply from #1.)
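
To illustrate the K=2 case described above, here is a minimal sketch (not code from the repository). It assumes the single-draw term has the multinomial form diag(p) - p p^T that appears in the code later in this thread; the helper name `multinomial_cov` and the value 0.7 are made up for illustration.

```python
import numpy as np

def multinomial_cov(p):
    """Return diag(p) - p p^T for a probability vector p (hypothetical helper)."""
    p = np.asarray(p, dtype=float)
    return np.diag(p) - np.outer(p, p)

p1 = 0.7  # illustrative probability of the first class
cov = multinomial_cov([p1, 1.0 - p1])
print(cov)
# Both diagonal entries equal p1 * (1 - p1);
# both off-diagonal entries equal -p1 * (1 - p1).
```

Because every entry is plus or minus p1 * (1 - p1), a single number carries all the information in the binary case.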

mongoose54 · 2019-01-08T07:20:11Z

@ykwon0407 Thank you for the reply. I am posting here my reply to follow the thread properly.

A couple of clarification questions:

1. For multi-label segmentation with K classes, do we then need to carry out the matrix algebra, i.e. the transpose?
2. In your paper, the proof of eq. 4 is in paragraph A of the Appendix. Is that correct?

ykwon0407 · 2019-01-08T08:44:44Z

@mongoose54 Hi~ Here is the point-by-point response.

1. Yes, it is. You need to use the transpose. Please note that each resulting uncertainty matrix is a K by K matrix.
2. Quite close, but not exactly. Appendix A shows the derivation of equation (2), which is the population version of the uncertainties. In contrast, equation (4) is an estimator of equation (2). That is, equation (4) is what we can actually compute from data, and it converges to equation (2), the variance of the variational predictive distribution.

mongoose54 · 2019-01-09T00:12:01Z

@ykwon0407 Thanks again for the wonderful explanation.
Regarding the KxK matrix for K classes, what does each element represent (is it the degree of uncertainty between 2 classes), and what is the best way to get a single uncertainty value?

ykwon0407 · 2019-01-09T01:30:41Z

@mongoose54 Hello! :)

1. The K by K matrix can be considered a proxy for the covariance matrix of a multinomial distribution. So each element of the uncertainty matrix is nothing but a variance (on the diagonal) or a covariance (off the diagonal) of two components of an outcome.
[Additional information for 1.] If you write the dependent variable Y with one-hot encoding, then Y is a vector of length K and can be assumed to follow a multinomial distribution. I believe you can find almost the same thing on the following wiki page: Multinomial Wiki
2. First of all, as explained in the previous point, each element has its own meaning, so picking a specific element may give you some information. I guess many examples could be made, but the most interesting one is 'the sum of the diagonal elements of the aleatoric uncertainty matrix', which can be shown to be very similar to Shannon's entropy.
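
As a rough illustration of the comparison with Shannon's entropy, the sketch below evaluates both quantities for a single probability vector. It assumes the per-draw aleatoric term is diag(p) - p p^T, whose trace is sum_k p_k (1 - p_k); the function names and the three example vectors are invented for this note and do not come from the paper or the repository.

```python
import numpy as np

def aleatoric_trace(p):
    """Trace of diag(p) - p p^T, i.e. sum_k p_k * (1 - p_k)."""
    p = np.asarray(p, dtype=float)
    return float(np.trace(np.diag(p) - np.outer(p, p)))

def shannon_entropy(p):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)))

# Illustrative probability vectors, from most to least confident.
candidates = {
    "one-hot": [1.0, 0.0, 0.0],
    "peaked": [0.8, 0.1, 0.1],
    "uniform": [1 / 3, 1 / 3, 1 / 3],
}
for name, p in candidates.items():
    print(name, aleatoric_trace(p), shannon_entropy(p))
```

Both measures are zero for a one-hot vector and largest for the uniform vector, which is the sense in which they behave alike here; this is a numerical check on three examples, not a proof.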

ShellingFord221 · 2019-01-17T08:51:16Z

Sorry, why does Eq. 4 provide a K*K matrix when there are K classes?

ShellingFord221 · 2019-01-17T08:55:53Z

Besides, p_hat is just a list of 10 probabilities of a certain class (according to line 63 in /retina/utils.py), so why does it have a diagonal matrix?

ykwon0407 · 2019-01-17T09:58:49Z

@ShellingFord221 Hi~~ Here is the point-by-point response.

Sorry, why does Eq. 4 provide a K*K matrix when there are K classes?
-> If you are solving a K-class classification problem, a probability estimate (p_hat) is represented as a vector of length K. The proposed uncertainties, which can be considered a naive variance, are then nothing but K by K matrices.

Besides, p_hat is just a list of 10 probabilities of a certain class (according to line 63 in /retina/utils.py), so why does it have a diagonal matrix?
-> If you run with the number of random draws T set to 10, then p_hat will be a (10, ) numpy array. I have no idea about the diagonal matrix...

ShellingFord221 · 2019-01-17T10:01:31Z

The diagonal matrix is mentioned in Eq. 4 in your paper, diag(p_hat)

ShellingFord221 · 2019-01-17T10:04:31Z

emmm... p_hat should be a matrix of size (num_samples, num_classes) (i.e. (10, 3) in my settings)?

ykwon0407 · 2019-01-17T10:26:02Z

@ShellingFord221

The diagonal matrix is mentioned in Eq. 4 in your paper, diag(p_hat)
-> Ah, I got it. The diagonal matrix comes from the covariance matrix of the multinomial distribution. Please find the link.

emmm... p_hat should be a matrix of size (num_samples, num_classes) (i.e. (10, 3) in my settings)?
-> Yes, it is. Sorry for my binary classification code... (it assumes a lot..)

Let me clarify all the details below.

In /retina/utils.py

p_hat = np.array(p_hat) # line number 64
prediction = np.mean(p_hat, axis=0) # line number 67

p_hat should be a numpy array of size (num_samples, num_classes)
prediction should be a numpy array of size (num_classes, )

Then the aleatoric and epistemic matrix will be as follows.

aleatoric = np.diag(prediction) - p_hat.T.dot(p_hat)/p_hat.shape[0] # 3 by 3 matrix # I corrected an error after the discussion with ShellingFord221
tmp = p_hat - prediction  # 10 by 3 matrix
epistemic = tmp.T.dot(tmp)/tmp.shape[0]

Hope this information helps you!
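
The snippet above can be wrapped into a self-contained sketch for trying it out. The function name `uncertainty_matrices` and the random softmax inputs are assumptions made for illustration; in the repository, p_hat would come from T stochastic forward passes of the network.

```python
import numpy as np

def uncertainty_matrices(p_hat):
    """Return (prediction, aleatoric, epistemic) for p_hat of shape (T, K)."""
    p_hat = np.asarray(p_hat, dtype=float)
    T = p_hat.shape[0]                        # number of random draws
    prediction = p_hat.mean(axis=0)           # shape (K,)
    # Average over draws of diag(p_t) - p_t p_t^T.
    aleatoric = np.diag(prediction) - p_hat.T.dot(p_hat) / T
    # Average over draws of (p_t - mean)(p_t - mean)^T.
    centered = p_hat - prediction
    epistemic = centered.T.dot(centered) / T
    return prediction, aleatoric, epistemic

# Made-up inputs: 10 softmax outputs over 3 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(10, 3))
p_hat = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

prediction, aleatoric, epistemic = uncertainty_matrices(p_hat)
print(aleatoric.shape, epistemic.shape)  # (3, 3) (3, 3)
```

One quick sanity check under these definitions: the two matrices add up to diag(prediction) - prediction prediction^T, the multinomial covariance evaluated at the mean prediction.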

ShellingFord221 · 2019-01-17T11:52:28Z

Thank you so much!!! But I still have a small question. In Eq. 4 of your paper, the diag is about p_hat, but in your code above, it seems that diag is about prediction (the mean of p_hat).

ShellingFord221 · 2019-01-17T11:57:34Z

And why should the dot product of the matrix be divided by shape[0]? (p_hat.T.dot(p_hat)/prediction.shape[0])

ykwon0407 · 2019-01-17T13:37:08Z

@ShellingFord221 You're welcome! :)
In Eq. 4 of your paper, the diag is about p_hat, but in your code above, it seems that diag is about prediction (the mean of p_hat).
-> Because the average of diag(p_hat) over the T samples equals the diagonal matrix of prediction.

And why should the dot product of the matrix be divided by shape[0]? (p_hat.T.dot(p_hat)/prediction.shape[0])
-> In Eq. 4, we need to divide by the number of random samples T, so I divide p_hat.T.dot(p_hat) by prediction.shape[0].
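
The first point can be checked numerically. The sketch below uses made-up Dirichlet samples as a stand-in for real model outputs and compares the average of the per-draw diagonal matrices with the diagonal matrix of the averaged prediction.

```python
import numpy as np

# Made-up probability vectors: T = 10 draws over K = 3 classes.
rng = np.random.default_rng(1)
p_hat = rng.dirichlet(np.ones(3), size=10)

# Left side: build diag(p_t) for every draw, then average the matrices.
mean_of_diags = np.mean([np.diag(p_t) for p_t in p_hat], axis=0)
# Right side: average the vectors first, then build one diagonal matrix.
diag_of_mean = np.diag(p_hat.mean(axis=0))

print(np.allclose(mean_of_diags, diag_of_mean))  # True
```

The two agree because diag is a linear operation, so it commutes with averaging.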

ShellingFord221 · 2019-01-18T05:45:05Z

Thanks again for your kind reply! But prediction.shape[0] should be the number of classes, not the number of samples.

ykwon0407 · 2019-01-18T06:43:26Z

@ShellingFord221 You are right! My bad.
It should be p_hat.shape[0], not prediction.shape[0].
I corrected the above code as well. Thanks!!!
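
To see why the divisor matters, here is a small sketch with made-up uniform predictions (T = 10 draws, K = 3 classes). Dividing by prediction.shape[0] uses K instead of T and, in this example, even produces negative diagonal entries, which cannot be variances.

```python
import numpy as np

T, K = 10, 3
p_hat = np.full((T, K), 1.0 / K)   # every draw predicts the uniform distribution
prediction = p_hat.mean(axis=0)

# prediction.shape[0] is K, p_hat.shape[0] is T.
wrong = np.diag(prediction) - p_hat.T.dot(p_hat) / prediction.shape[0]
right = np.diag(prediction) - p_hat.T.dot(p_hat) / p_hat.shape[0]

print(np.diag(wrong))   # negative entries
print(np.diag(right))   # positive entries, each 1/3 - 1/9
```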

ShellingFord221 · 2019-01-18T09:44:02Z

The sum of the diagonal elements of the aleatoric uncertainty matrix is meaningful. Is the sum of the diagonal elements of the epistemic uncertainty matrix meaningful, too?
Besides, does aleatoric uncertainty mean the uncertainty about the test data, and epistemic uncertainty mean the uncertainty about the model?

ykwon0407 · 2019-01-18T10:36:05Z

@ShellingFord221

The sum of the diagonal elements of the aleatoric uncertainty matrix is meaningful. Is the sum of the diagonal elements of the epistemic uncertainty matrix meaningful, too?
-> I guess.. somehow yes.

Besides, does aleatoric uncertainty mean the uncertainty about the test data, and epistemic uncertainty mean the uncertainty about the model?
-> They are not exactly separable, but they can be viewed that way.

ShellingFord221 · 2019-01-18T11:08:01Z

The claim that aleatoric uncertainty is the uncertainty about the test data and epistemic uncertainty is the uncertainty about the model also appears in this paper, Bayesian Convolutional Neural Networks with Variational Inference (the paragraph above Section 6, Experiments). But I have read its code, and it confuses the uncertainty calculation for binary classification with that for multi-label classification, so its result for these two uncertainties is a number rather than a K*K matrix (Table 2 in that paper).

ShellingFord221 · 2019-01-18T12:23:20Z

Besides, if I want to calculate the whole uncertainty (i.e. the sum of two uncertainties), should I:

1. first calculate the sum of the diagonal elements of the aleatoric matrix and the sum of the diagonal elements of the epistemic matrix, then add these two numbers, or
2. directly add the aleatoric matrix and the epistemic matrix to get a final matrix, then calculate the sum of the diagonal elements of this final matrix?

ykwon0407 · 2019-01-18T13:06:15Z

@ShellingFord221 Either way is fine!
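
Both options give the same number because the trace is linear: tr(A) + tr(E) = tr(A + E). A minimal check, using made-up Dirichlet samples in place of real model outputs:

```python
import numpy as np

# Made-up probability vectors: T = 20 draws over K = 4 classes.
rng = np.random.default_rng(2)
p_hat = rng.dirichlet(np.ones(4), size=20)
T = p_hat.shape[0]
prediction = p_hat.mean(axis=0)

aleatoric = np.diag(prediction) - p_hat.T.dot(p_hat) / T
centered = p_hat - prediction
epistemic = centered.T.dot(centered) / T

option_1 = np.trace(aleatoric) + np.trace(epistemic)  # sum the two traces
option_2 = np.trace(aleatoric + epistemic)            # trace of the summed matrix
print(np.isclose(option_1, option_2))  # True
```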

ShellingFord221 · 2019-08-22T13:04:01Z

Hi, after half a year, it seems that I am confused again about the code above o(╥﹏╥)o .
The diag of p_hat is averaged, but the p_hat.T.dot(p_hat) part seems to be only divided by the number of samples. In Eq. 4, this part should first be summed and then divided (i.e. first \sum_(t=1)^T p_hat.T.dot(p_hat), then divide by the number of samples). The same applies to the tmp.T.dot(tmp) part (it is also divided by the number of samples in the code above, but there is no sum over the T terms).

ykwon0407 · 2019-08-23T07:16:46Z

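@ShellingFord221 Hi again! The dot product already sums over the samples: p_hat.T.dot(p_hat) multiplies a (num_classes, num_samples) matrix by a (num_samples, num_classes) matrix, so each entry of the result is a sum over the T draws. Please also see this link: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html.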
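
A short sketch makes the implicit sum explicit. It compares a loop over the T draws, adding up the outer products p_t p_t^T, with the single matrix product used in the code above; the Dirichlet samples are made up for illustration.

```python
import numpy as np

# Made-up probability vectors: T = 10 draws over K = 3 classes.
rng = np.random.default_rng(3)
p_hat = rng.dirichlet(np.ones(3), size=10)
T = p_hat.shape[0]

# Explicit form: sum the outer products over the draws, then divide by T.
explicit = sum(np.outer(p_t, p_t) for p_t in p_hat) / T
# Vectorized form from the thread: one matrix product, then divide by T.
vectorized = p_hat.T.dot(p_hat) / T

print(np.allclose(explicit, vectorized))  # True
```

The same reasoning applies to tmp.T.dot(tmp), with the centered rows p_t - prediction in place of p_t.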

ykwon0407 closed this as completed Nov 11, 2018

ykwon0407 mentioned this issue Jan 17, 2019

about p_hat #4

Closed

ShellingFord221 mentioned this issue Oct 8, 2019

about the calculation of uncertainty kumar-shridhar/PyTorch-BayesianCNN#32

Closed
