Reproducing results with SuperPoint + MNN on MegaDepth-1500 #56
Hi @guipotje, sorry for the late reply. The pipeline is: resize to 1600px -> inference -> rescale keypoints to the original image size -> estimate the relative pose. We use the top 2048 keypoints and set the detection threshold to 0. The threshold range is correct; we found the best results at th=1 for SP+NN. Here is the pose estimation code for OpenCV:

```python
import cv2
import numpy as np


def estimate_relative_pose(
    kpts0, kpts1, K0, K1, thresh, conf=0.99999, solver=cv2.RANSAC
):
    if len(kpts0) < 5:
        return None
    # Express the pixel threshold in normalized coordinates via the mean focal length.
    f_mean = np.mean([K0[0, 0], K1[0, 0], K0[1, 1], K1[1, 1]])
    norm_thresh = thresh / f_mean
    # Convert keypoints to normalized image coordinates.
    kpts0 = (kpts0 - K0[[0, 1], [2, 2]][None]) / K0[[0, 1], [0, 1]][None]
    kpts1 = (kpts1 - K1[[0, 1], [2, 2]][None]) / K1[[0, 1], [0, 1]][None]
    E, mask = cv2.findEssentialMat(
        kpts0, kpts1, np.eye(3), threshold=norm_thresh, prob=conf, method=solver
    )
    if E is None:
        return None
    # findEssentialMat may return several stacked 3x3 candidates;
    # keep the decomposition with the most inliers in front of both cameras.
    best_num_inliers = 0
    ret = None
    for _E in np.split(E, len(E) // 3):
        n, R, t, _ = cv2.recoverPose(_E, kpts0, kpts1, np.eye(3), 1e9, mask=mask)
        if n > best_num_inliers:
            best_num_inliers = n
            ret = (R, t[:, 0], mask.ravel() > 0)
    return ret
```

For the best results (LO-RANSAC) we used the excellent PoseLib, which provides Python bindings. There, we tested thresholds in the range [0.5, 3.0]. Here is a small script for PoseLib:
```python
import poselib


def intrinsics_to_camera(K):
    # Build a PoseLib pinhole camera dict from K; width/height are recovered
    # by assuming the principal point lies at the image center.
    px, py = K[0, 2], K[1, 2]
    fx, fy = K[0, 0], K[1, 1]
    return {
        "model": "PINHOLE",
        "width": int(2 * px),
        "height": int(2 * py),
        "params": [fx, fy, px, py],
    }


M, info = poselib.estimate_relative_pose(
    kpts0, kpts1,
    intrinsics_to_camera(K0),
    intrinsics_to_camera(K1),
    {"max_epipolar_error": th},
)
R, t, inl = M.R, M.t, info["inliers"]
```
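For readers reproducing the numbers: the estimated (R, t) is then scored by angular pose error and summarized as AUC at 5/10/20 degrees, as in the standard SuperGlue-style two-view evaluation. A minimal sketch of that scoring (the helper names are illustrative, not code from this repository):

```python
import numpy as np


def relative_pose_error(R_gt, t_gt, R, t, eps=1e-8):
    # Translation direction error in degrees; sign-agnostic because the
    # translation recovered from an essential matrix has arbitrary scale/sign.
    t = t / (np.linalg.norm(t) + eps)
    t_gt = t_gt / (np.linalg.norm(t_gt) + eps)
    t_err = np.rad2deg(np.arccos(np.clip(np.abs(t @ t_gt), 0.0, 1.0)))
    # Rotation error in degrees from the trace of the relative rotation.
    cos_r = np.clip((np.trace(R.T @ R_gt) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.rad2deg(np.arccos(cos_r))
    return t_err, r_err


def pose_auc(errors, thresholds=(5, 10, 20)):
    # Area under the recall-vs-error curve, normalized per threshold.
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = (np.arange(len(errors)) + 1) / len(errors)
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    aucs = []
    for th in thresholds:
        last = np.searchsorted(errors, th)
        r = np.concatenate((recall[:last], [recall[last - 1]]))
        e = np.concatenate((errors[:last], [th]))
        aucs.append(np.trapz(r, x=e) / th)
    return aucs
```

The per-pair pose error is max(t_err, r_err), and pose_auc over all 1500 pairs yields the AUC@5/10/20 values compared in Table 2.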
Hello @Phil26AT, thank you very much for the detailed answer! PoseLib indeed provides impressive gains in pose accuracy. Following the suggestions, I was able to reproduce the SuperPoint results by running the suggested pipeline, but only after using MNN + a ratio test with r = 0.95 (31.4 AUC@5); with MNN alone, the results I obtain are worse than those reported in Table 2 (24.3 AUC@5). However, I think this is sufficient to validate the baseline. Thanks a lot!
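For reference, a minimal sketch of the MNN + ratio-test matching mentioned above, assuming L2-normalized SuperPoint descriptors stacked as (N, 256) arrays; the helper name and interface are illustrative:

```python
import numpy as np


def mnn_ratio_match(desc0, desc1, ratio=0.95):
    # Euclidean distances via the dot product; for L2-normalized descriptors
    # ||a - b||^2 = 2 - 2 * a.b
    dist = np.sqrt(np.clip(2.0 - 2.0 * (desc0 @ desc1.T), 0.0, None))
    nn01 = dist.argmin(axis=1)   # best match in image 1 for each keypoint of image 0
    nn10 = dist.argmin(axis=0)   # best match in image 0 for each keypoint of image 1
    ids0 = np.arange(len(desc0))
    mutual = nn10[nn01] == ids0  # keep mutual nearest neighbors only
    # Lowe-style ratio test: best distance must beat the second best by `ratio`.
    two_best = np.partition(dist, 1, axis=1)[:, :2]
    ratio_ok = two_best[:, 0] <= ratio * two_best[:, 1]
    keep = mutual & ratio_ok
    return np.stack([ids0[keep], nn01[keep]], axis=1)  # (M, 2) index pairs
```

With ratio=1.0 the test always passes, which reduces this to plain MNN matching.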
Hello, first of all, thank you for the great work and for making the license permissive; it will surely boost research in image matching!
I am trying to reproduce SuperPoint + MNN as a baseline. For that, I follow the protocol of the paper, trying to achieve results as close as possible to the values reported in Table 2 of the LightGlue paper. I am doing the following steps:
I also attempted to run LO-RANSAC instead of cv2.RANSAC, since it gives a great boost in AUC in Table 2, but without success. I tested the implementations from both pydegensac and the cv2.USAC variants, with results very far from the reported AUC@5 of 0.51, despite trying several inlier thresholds and flags. Could you kindly provide more details on the SuperPoint parameters and on the RANSAC implementation and hyperparameters used to achieve these results, specifically for SuperPoint + MNN matching (Table 2)?
Thank you in advance!
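Regarding the cv2.USAC attempts above: the USAC variants plug into the estimate_relative_pose helper from the first comment through its solver argument. A sketch, assuming OpenCV >= 4.5 and precomputed keypoints and intrinsics:

```python
import cv2

# MAGSAC++ via OpenCV's USAC interface; other flags such as cv2.USAC_ACCURATE
# can be swapped in the same way.
ret = estimate_relative_pose(kpts0, kpts1, K0, K1, thresh=1.0, solver=cv2.USAC_MAGSAC)
if ret is not None:
    R, t, inliers = ret
```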