Reproducing results with SuperPoint + MNN on MegaDepth-1500 #56
Hi @guipotje, sorry for the late reply. The pipeline is: resize to 1600px -> inference -> rescale keypoints to the original image size -> estimate the relative pose. We use the top 2048 keypoints and set the detection threshold to 0. The threshold range is correct; we found the best results at th=1 for SP+NN. Here is the pose estimation code for OpenCV:

```python
import cv2
import numpy as np


def estimate_relative_pose(
    kpts0, kpts1, K0, K1, thresh, conf=0.99999, solver=cv2.RANSAC
):
    if len(kpts0) < 5:
        return None
    # Express the pixel threshold in normalized coordinates via the mean focal length.
    f_mean = np.mean([K0[0, 0], K1[0, 0], K0[1, 1], K1[1, 1]])
    norm_thresh = thresh / f_mean
    # Convert keypoints to normalized image coordinates.
    kpts0 = (kpts0 - K0[[0, 1], [2, 2]][None]) / K0[[0, 1], [0, 1]][None]
    kpts1 = (kpts1 - K1[[0, 1], [2, 2]][None]) / K1[[0, 1], [0, 1]][None]
    E, mask = cv2.findEssentialMat(
        kpts0, kpts1, np.eye(3), threshold=norm_thresh, prob=conf, method=solver
    )
    if E is None:
        return None
    # findEssentialMat may return several stacked 3x3 candidates;
    # keep the decomposition with the most inliers in front of both cameras.
    best_num_inliers = 0
    ret = None
    for _E in np.split(E, len(E) // 3):
        n, R, t, _ = cv2.recoverPose(_E, kpts0, kpts1, np.eye(3), 1e9, mask=mask)
        if n > best_num_inliers:
            best_num_inliers = n
            ret = (R, t[:, 0], mask.ravel() > 0)
    return ret
```

For the best results (LO-RANSAC) we used the excellent PoseLib, which provides Python bindings. There, we tested thresholds in the range [0.5, 3.0]. Here is a small script for PoseLib:
```python
import poselib


def intrinsics_to_camera(K):
    # Build a PoseLib pinhole camera dict from K; width/height are recovered
    # by assuming the principal point lies at the image center.
    px, py = K[0, 2], K[1, 2]
    fx, fy = K[0, 0], K[1, 1]
    return {
        "model": "PINHOLE",
        "width": int(2 * px),
        "height": int(2 * py),
        "params": [fx, fy, px, py],
    }


M, info = poselib.estimate_relative_pose(
    kpts0, kpts1,
    intrinsics_to_camera(K0),
    intrinsics_to_camera(K1),
    {"max_epipolar_error": th},
)
R, t, inl = M.R, M.t, info["inliers"]
```
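For readers reproducing the numbers: the estimated (R, t) is then scored by angular pose error and summarized as AUC at 5/10/20 degrees, as in the standard SuperGlue-style two-view evaluation. A minimal sketch of that scoring (the helper names are illustrative, not code from this repository):

```python
import numpy as np


def relative_pose_error(R_gt, t_gt, R, t, eps=1e-8):
    # Translation direction error in degrees; sign-agnostic because the
    # translation recovered from an essential matrix has arbitrary scale/sign.
    t = t / (np.linalg.norm(t) + eps)
    t_gt = t_gt / (np.linalg.norm(t_gt) + eps)
    t_err = np.rad2deg(np.arccos(np.clip(np.abs(t @ t_gt), 0.0, 1.0)))
    # Rotation error in degrees from the trace of the relative rotation.
    cos_r = np.clip((np.trace(R.T @ R_gt) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.rad2deg(np.arccos(cos_r))
    return t_err, r_err


def pose_auc(errors, thresholds=(5, 10, 20)):
    # Area under the recall-vs-error curve, normalized per threshold.
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = (np.arange(len(errors)) + 1) / len(errors)
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    aucs = []
    for th in thresholds:
        last = np.searchsorted(errors, th)
        r = np.concatenate((recall[:last], [recall[last - 1]]))
        e = np.concatenate((errors[:last], [th]))
        aucs.append(np.trapz(r, x=e) / th)
    return aucs
```

The per-pair pose error is max(t_err, r_err), and pose_auc over all 1500 pairs yields the AUC@5/10/20 values compared in Table 2.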
Hello @Phil26AT, thank you very much for the detailed answer! PoseLib indeed provides impressive gains in pose accuracy. Following the suggestions, I was able to reproduce the SuperPoint results by running the suggested pipeline, but only after using MNN + a ratio test with r = 0.95 (31.4 AUC@5); with MNN alone, the results I obtain are worse than those reported in Table 2 (24.3 AUC@5). However, I think this is sufficient to validate the baseline. Thanks a lot!
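For reference, a minimal sketch of the MNN + ratio-test matching mentioned above, assuming L2-normalized SuperPoint descriptors stacked as (N, 256) arrays; the helper name and interface are illustrative:

```python
import numpy as np


def mnn_ratio_match(desc0, desc1, ratio=0.95):
    # Euclidean distances via the dot product; for L2-normalized descriptors
    # ||a - b||^2 = 2 - 2 * a.b
    dist = np.sqrt(np.clip(2.0 - 2.0 * (desc0 @ desc1.T), 0.0, None))
    nn01 = dist.argmin(axis=1)   # best match in image 1 for each keypoint of image 0
    nn10 = dist.argmin(axis=0)   # best match in image 0 for each keypoint of image 1
    ids0 = np.arange(len(desc0))
    mutual = nn10[nn01] == ids0  # keep mutual nearest neighbors only
    # Lowe-style ratio test: best distance must beat the second best by `ratio`.
    two_best = np.partition(dist, 1, axis=1)[:, :2]
    ratio_ok = two_best[:, 0] <= ratio * two_best[:, 1]
    keep = mutual & ratio_ok
    return np.stack([ids0[keep], nn01[keep]], axis=1)  # (M, 2) index pairs
```

With ratio=1.0 the test always passes, which reduces this to plain MNN matching.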
Hello, first of all, thank you for the great work and for making the license permissive; it will surely boost research in image matching!
I am trying to reproduce SuperPoint + MNN as a baseline. For that, I follow the protocol of the paper, trying to achieve results as close as possible to the values reported in Table 2 of the LightGlue paper. I am doing the following steps:
I also attempted to run LO-RANSAC instead of cv2.RANSAC, since it gives a great boost in AUC in Table 2, but without success. I tested the implementations from both pydegensac and the cv2.USAC variants, with results very far from the reported AUC@5 of 0.51, despite trying several inlier thresholds and flags. Could you kindly provide more details on the SuperPoint parameters and on the RANSAC implementation and hyperparameters used to achieve these results, specifically for SuperPoint + MNN matching (Table 2)?
Thank you in advance!
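Regarding the cv2.USAC attempts above: the USAC variants plug into the estimate_relative_pose helper from the first comment through its solver argument. A sketch, assuming OpenCV >= 4.5 and precomputed keypoints and intrinsics:

```python
import cv2

# MAGSAC++ via OpenCV's USAC interface; other flags such as cv2.USAC_ACCURATE
# can be swapped in the same way.
ret = estimate_relative_pose(kpts0, kpts1, K0, K1, thresh=1.0, solver=cv2.USAC_MAGSAC)
if ret is not None:
    R, t, inliers = ret
```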