Why don't you apply a softmax function before the final prediction? #41

Closed
zwx8981 opened this issue Sep 4, 2018 · 2 comments

Comments

@zwx8981

zwx8981 commented Sep 4, 2018

Hi, thank you for the great work. I have a small question. In a classification task, we usually apply a softmax function to convert the output of a model into a probability vector, each entry of which represents the probability that the input belongs to the corresponding category. However, it seems that in your code the output of the Mutan model (the output of the second multimodal fusion followed by only a linear transformation, without a softmax) is fed directly into the loss function. Is there any special consideration?

x = self.linear_classif(x)

@zwx8981 zwx8981 changed the title Why don't you apply a softmax function after the second multimodel fusion? Why don't you apply a softmax function before the final prediction? Sep 4, 2018
@Cadene
Owner

Cadene commented Sep 4, 2018

@zwx8981 Here we use nn.CrossEntropyLoss, which combines nn.LogSoftmax and nn.NLLLoss. Thus, we don't need to add a softmax; it is already included.
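
For reference, here is a minimal PyTorch sketch (independent of the repository's code; the batch size and class count are illustrative) verifying that nn.CrossEntropyLoss applied to raw logits matches nn.LogSoftmax followed by nn.NLLLoss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 10)          # batch of 4 samples, 10 classes (illustrative)
target = torch.tensor([1, 0, 3, 7])  # ground-truth class indices

# CrossEntropyLoss applied directly to raw logits...
ce = nn.CrossEntropyLoss()(logits, target)

# ...equals LogSoftmax followed by NLLLoss.
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)

print(torch.allclose(ce, nll))  # True
```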

@Cadene Cadene closed this as completed Sep 4, 2018
@zwx8981
Author

zwx8981 commented Sep 5, 2018

@Cadene Thank you!
