What I understand from the quotes below is that they use only the logits (scores) to compare answer spans, instead of the probabilities obtained after applying the softmax function.
to allow comparison and aggregation of results from different segments, we remove the final softmax layer over different answer spans.
We apply our trained DOCUMENT READER for each single paragraph that appears in the top 5 Wikipedia articles and it predicts an answer span with a confidence score. To make scores compatible across paragraphs in one or several retrieved documents, we use the unnormalized exponential and take argmax over all considered paragraph spans for our final prediction. This is just a very simple heuristic and there are better ways to aggregate evidence over different paragraphs.
See: https://export.arxiv.org/pdf/1902.01718
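To illustrate why this matters, here is a minimal sketch (not the authors' code) comparing the two ways of ranking spans across paragraphs. The logit values, the paragraph names, and the `best_span` helper are all made up for illustration. A softmax taken inside each paragraph forces every paragraph's scores to sum to 1, so a weak paragraph with one relatively dominant span can outrank a strong paragraph. Raw logits, or their unnormalized exponential (which is monotonic in the logit and so gives the same argmax), keep the scale shared across paragraphs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one paragraph's span scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(paragraph_scores):
    """Return (paragraph_id, span_index, score) of the highest-scoring span."""
    best = None
    for pid, scores in paragraph_scores.items():
        for idx, score in enumerate(scores):
            if best is None or score > best[2]:
                best = (pid, idx, score)
    return best


# Hypothetical span logits: para_a contains a much stronger candidate (9.0)
# than anything in para_b (3.0).
paragraph_logits = {
    "para_a": [9.0, 2.0, 1.0],
    "para_b": [3.0, -6.0, -7.0],
}

# Option 1: compare raw logits across paragraphs.
by_logit = best_span(paragraph_logits)

# Option 2: apply softmax inside each paragraph, then compare probabilities.
per_paragraph_probs = {pid: softmax(s) for pid, s in paragraph_logits.items()}
by_prob = best_span(per_paragraph_probs)

# Unnormalized exponential: same ranking as the raw logits.
unnormalized = {pid: [math.exp(x) for x in s] for pid, s in paragraph_logits.items()}
by_exp = best_span(unnormalized)

print("by logit:", by_logit[0], by_logit[1])
print("by per-paragraph softmax:", by_prob[0], by_prob[1])
print("by unnormalized exp:", by_exp[0], by_exp[1])
```

With these made-up numbers, the raw logits and the unnormalized exponential both pick the span from `para_a`, while the per-paragraph softmax picks the span from `para_b`, because its competitors inside that paragraph are weaker.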