Evaluation Metric of WxBS Benchmark #64
Comments
Yes, that should be right. The results should be exactly the ones reported by Mishkin, so if you have further questions about the metrics, please ask him. I remember that when rerunning the experiment, the results can fluctuate by about 1%.
Thanks for the prompt reply. I did observe some fluctuation across different runs, and the best result I got is 79.97%, slightly below the one reported in your paper (80.1). May I ask whether, in such a case, you report the best run or the average of multiple runs? What is the common practice?
Reporting the average is reasonable, I think. Otherwise you can inflate the score just by doing more runs :D
It's likely, though, that my original numbers are from a single run.
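Reporting the mean and standard deviation over repeated runs, as suggested above, can be sketched as follows (the run scores below are made-up illustration values, not actual benchmark results):

```python
import statistics

# Hypothetical scores from repeated evaluation runs (made-up numbers)
runs = [79.97, 79.50, 80.30, 79.80]

mean = statistics.mean(runs)
std = statistics.stdev(runs)  # sample standard deviation

# Report as mean +/- std rather than the single best run
print(f"{mean:.2f} +/- {std:.2f}")
```

This avoids the selection bias of reporting only the best of many runs.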
The original evaluation metric for the WxBS benchmark is the recall on GT correspondences at various pixel thresholds. Your paper mentions that "The metric is mean average precision on ground truth correspondences consistent with the estimated fundamental matrix at a 10 pixel threshold".
In my experiment, I compute this metric by:
where result_dic is the original evaluation results output by the WxBS benchmark (https://ducha-aiki.github.io/wide-baseline-stereo-blog/2021/07/30/Reviving-WxBS-benchmark).
May I check whether this is the same as yours? Thanks.
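For concreteness, a minimal sketch of such a computation, assuming `result_dic` maps image-pair names to per-threshold recall values (this layout, the key names, and the numbers below are assumptions for illustration, not the benchmark's documented output format):

```python
from statistics import mean

def mean_recall_at_threshold(result_dic, threshold=10.0):
    """Average per-pair recall at one pixel threshold.

    Assumes result_dic maps image-pair names to dicts of
    {pixel_threshold: recall}; this is a guess at the structure,
    not the WxBS benchmark's documented format.
    """
    return mean(pair[threshold] for pair in result_dic.values())

# Toy usage with made-up recall values:
scores = {
    "pair_a": {5.0: 0.60, 10.0: 0.80},
    "pair_b": {5.0: 0.55, 10.0: 0.78},
}
print(mean_recall_at_threshold(scores))  # mean of the 10 px recalls
```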