About the final prediction score s #4
Sorry for the misleading description in our paper. In practice, we find that using (1) s_iou, (2) sigmoid(s_iou), or (3) sigmoid(s_iou) * (s_mm/2 + 0.5) as the final prediction score makes no significant difference.

For (1) vs. (2): since the sigmoid function is monotonic, they produce identical top-1/top-5 selections under the evaluation protocol.

For (2) vs. (3): we also observe no notable difference. Our explanation is that s_iou has accurate supervision (the scaled IoU), while s_mm only has binary supervision (positive or negative, for both moment and sentence), so it is natural that s_iou yields the better final performance at inference. In our experiments, s_iou alone already improves very significantly over 2D-TAN, because training s_mm has already improved the encoding ability of the two feature extractors; s_mm itself is not important at inference.

In practice, our code uses (3), because we find it may bring a small improvement (<0.5%) on the R@5 metrics. For implementation details, please refer to L53 and L67 of https://github.com/MCG-NJU/MMN/blob/main/mmn/modeling/mmn/mmn.py and L48 of https://github.com/MCG-NJU/MMN/blob/main/mmn/data/datasets/evaluation.py. You may also notice a 'CONTRASTIVE_SCORE_POW' option there, which makes the final prediction score sigmoid(s_iou) * torch.pow((s_mm/2 + 0.5), CONTRASTIVE_SCORE_POW). It likewise brings no notable difference, and we considered it too trivial to include in the paper.

For any further questions, please feel free to comment in this issue.
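As a minimal sketch of form (3) with the exponent (the function name, tensor names, and the default exponent value here are assumptions for illustration; in the actual repo the exponent comes from the config as CONTRASTIVE_SCORE_POW):

```python
import torch

def final_score(s_iou: torch.Tensor, s_mm: torch.Tensor,
                contrastive_score_pow: float = 0.5) -> torch.Tensor:
    # s_iou: raw IoU-regression scores (passed through sigmoid below)
    # s_mm: cosine-similarity matching scores, so each value lies in [-1, 1]
    # (s_mm / 2 + 0.5) remaps [-1, 1] to [0, 1] before applying the exponent
    return torch.sigmoid(s_iou) * torch.pow(s_mm / 2 + 0.5,
                                            contrastive_score_pow)
```

Because sigmoid is monotonic, ranking candidates by sigmoid(s_iou) is the same as ranking them by s_iou, which is why (1) and (2) give identical top-1/top-5 results; the s_mm factor only mildly reweights that ranking.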
Thank you for your prompt reply!
Hi, thanks for sharing your great work.
In the paper, the final prediction score for a candidate moment is the product of s_iou and s_mm, but s is the cosine similarity mentioned in Section 3.3, so the range of s is [-1, 1]. Do you actually mean the final matching score is s_iou * s_mm, or are you saying the final score is the product of the scores after a sigmoid function?