GIM: Learning Generalizable Image Matcher From Internet Videos

ICLR 2024 Spotlight


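| Category | Method | Mean AUC@5° (%) ↑ | GL3 | BLE | ETI | ETO | KIT | WEA | SEA | NIG | MUL | SCE | ICL | GTA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Handcrafted | RootSIFT | 31.8 | 43.5 | 33.6 | 49.9 | 48.7 | 35.2 | 21.4 | 44.1 | 14.7 | 33.4 | 7.6 | 14.8 | 35.1 |
| Sparse matching | SuperGlue (in) | 21.6 | 19.2 | 16.0 | 38.2 | 37.7 | 22.0 | 20.8 | 40.8 | 13.7 | 21.4 | 0.8 | 9.6 | 18.8 |
| Sparse matching | SuperGlue (out) | 31.2 | 29.7 | 24.2 | 52.3 | 59.3 | 28.0 | 28.4 | 48.0 | 20.9 | 33.4 | 4.5 | 16.6 | 29.3 |
| Sparse matching | GIM_SuperGlue (50h) | 34.3 | 43.2 | 34.2 | 58.7 | 61.0 | 29.0 | 28.3 | 48.4 | 18.8 | 34.8 | 2.8 | 15.4 | 36.5 |
| Sparse matching | LightGlue | 31.7 | 28.9 | 23.9 | 51.6 | 56.3 | 32.1 | 29.5 | 48.9 | 22.2 | 37.4 | 3.0 | 16.2 | 30.4 |
| Sparse matching | GIM_LightGlue (100h) | 38.3 | 46.6 | 38.1 | 61.7 | 62.9 | 34.9 | 31.2 | 50.6 | 22.6 | 41.8 | 6.9 | 19.0 | 43.4 |
| Semi-dense matching | LoFTR (in) | 10.7 | 5.6 | 5.1 | 11.8 | 7.5 | 17.2 | 6.4 | 9.7 | 3.5 | 22.4 | 1.3 | 14.9 | 23.4 |
| Semi-dense matching | LoFTR (out) | 33.1 | 29.3 | 22.5 | 51.1 | 60.1 | 36.1 | 29.7 | 48.6 | 19.4 | 37.0 | 13.1 | 20.5 | 30.3 |
| Semi-dense matching | GIM_LoFTR (50h) | 39.1 | 50.6 | 43.9 | 62.6 | 61.6 | 35.9 | 26.8 | 47.5 | 17.6 | 41.4 | 10.2 | 25.6 | 45.0 |
| Semi-dense matching | 🟩 GIM_LoFTR (100h) | TODO | | | | | | | | | | | | |
| Dense matching | DKM (in) | 46.2 | 44.4 | 37.0 | 65.7 | 73.3 | 40.2 | 32.8 | 51.0 | 23.1 | 54.7 | 33.0 | 43.6 | 55.7 |
| Dense matching | DKM (out) | 45.8 | 45.7 | 37.0 | 66.8 | 75.8 | 41.7 | 33.5 | 51.4 | 22.9 | 56.3 | 27.3 | 37.8 | 52.9 |
| Dense matching | GIM_DKM (50h) | 49.4 | 58.3 | 47.8 | 72.7 | 74.5 | 42.1 | 34.6 | 52.0 | 25.1 | 53.7 | 32.3 | 38.8 | 60.6 |
| Dense matching | GIM_DKM (100h) | 51.2 | 63.3 | 53.0 | 73.9 | 76.7 | 43.4 | 34.6 | 52.5 | 24.5 | 56.6 | 32.2 | 42.5 | 61.6 |
| Dense matching | RoMa (in) | 46.7 | 46.0 | 39.3 | 68.8 | 77.2 | 36.5 | 31.1 | 50.4 | 20.8 | 57.8 | 33.8 | 41.7 | 57.6 |
| Dense matching | RoMa (out) | 48.8 | 48.3 | 40.6 | 73.6 | 79.8 | 39.9 | 34.4 | 51.4 | 24.2 | 59.9 | 33.7 | 41.3 | 59.2 |
| Dense matching | 🟩 GIM_RoMa | TODO | | | | | | | | | | | | |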

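The data in this table come from ZEB: Zero-shot Evaluation Benchmark for Image Matching, proposed in the paper. The benchmark consists of 12 public datasets covering a variety of scenes, weather conditions, and camera models; these correspond to the 12 test-sequence columns in the table, starting from GL3. We will release ZEB as soon as possible.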
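For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC@5° score and the mean column could be computed. It assumes the common convention in the image-matching literature (area under the recall-versus-pose-error curve, cut off at the threshold and normalized); this README does not spell out the exact ZEB protocol, so `pose_auc` and `mean_auc` are hypothetical helper names and not part of the gim code base.

```python
def pose_auc(errors_deg, threshold_deg=5.0):
    """Normalized area under the recall-vs-error curve up to threshold_deg.

    Assumption: errors_deg holds one pose error (in degrees) per image pair.
    """
    errors = sorted(float(e) for e in errors_deg)
    n = len(errors)
    if n == 0:
        return 0.0
    # Curve points (error, recall), starting from the origin.
    xs, ys = [0.0], [0.0]
    for i, e in enumerate(errors, start=1):
        if e >= threshold_deg:
            break
        xs.append(e)
        ys.append(i / n)
    # Close the curve at the threshold with the last recall value reached.
    xs.append(threshold_deg)
    ys.append(ys[-1])
    # Trapezoidal integration, then normalize so a perfect score is 1.0.
    area = sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2
               for i in range(len(xs) - 1))
    return area / threshold_deg


def mean_auc(per_dataset_scores):
    """Average of the per-dataset scores, rounded to one decimal place."""
    return round(sum(per_dataset_scores) / len(per_dataset_scores), 1)


# Per-dataset values copied from the table above (GL3 ... GTA).
ROOTSIFT = [43.5, 33.6, 49.9, 48.7, 35.2, 21.4, 44.1, 14.7, 33.4, 7.6, 14.8, 35.1]
GIM_DKM_100H = [63.3, 53.0, 73.9, 76.7, 43.4, 34.6, 52.5, 24.5, 56.6, 32.2, 42.5, 61.6]
```

Averaging the 12 per-dataset values for RootSIFT and for GIM_DKM (100h) reproduces the mean column of the table (31.8 and 51.2), which suggests the mean is an unweighted average over the 12 datasets.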

✅ TODO List

  • Inference code
    • gim_roma
    • gim_dkm
    • gim_loftr
    • gim_lightglue
  • Training code

We are working hard on the remaining open-source release. Thank you for your interest.

🤗 Online Demo

Try our models online on Huggingface.

⚙️ Environment

I used the following commands to set up the environment on a new server.

conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install albumentations==1.0.1 --no-binary=imgaug,albumentations
pip install pytorch-lightning==1.5.10
pip install opencv-python==4.5.3.56
pip install imagesize==1.2.0
pip install kornia==0.6.10
pip install einops==0.3.0
pip install loguru==0.5.3
pip install joblib==1.0.1
pip install yacs==0.1.8
pip install h5py==3.1.0
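After installation, a quick check of the installed versions against the pins above can catch a mismatched environment early. The sketch below is only an illustration: it assumes Python 3.8+ (for `importlib.metadata`) and that each package registers metadata under the distribution name listed in `PINNED` (for example `torch` for the conda `pytorch` package). `find_mismatches` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions copied from the install commands above.
# Assumption: these are the distribution names the packages register under.
PINNED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "torchaudio": "0.10.1",
    "albumentations": "1.0.1",
    "pytorch-lightning": "1.5.10",
    "opencv-python": "4.5.3.56",
    "imagesize": "1.2.0",
    "kornia": "0.6.10",
    "einops": "0.3.0",
    "loguru": "0.5.3",
    "joblib": "1.0.1",
    "yacs": "0.1.8",
    "h5py": "3.1.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {name: installed_version_or_None} for packages that differ from the pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu113" when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches

# Usage: print(find_mismatches(PINNED))  # an empty dict means every pin matches
```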

🔨 Usage

Clone this repository

git clone https://github.com/xuelunshen/gim.git
cd gim

Download the gim_dkm model weights from Google Drive

Put the model weights in the weights folder

Run the following command

python demo.py --model gim_dkm

or

python demo.py --model gim_lightglue
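If you prefer to drive the demo from Python, a thin wrapper such as the one below could check that the weights folder is populated before launching. This is a sketch under assumptions: it only uses the `--model` flag and the two model names shown above, it must be run from the repository root, and `has_weights`, `demo_command`, and `run_demo` are hypothetical helpers that do not exist in the repository.

```python
import subprocess
import sys
from pathlib import Path

# The two --model values shown in the commands above.
MODELS = ("gim_dkm", "gim_lightglue")


def has_weights(weights_dir="weights"):
    """True if the weights folder exists and contains at least one file."""
    folder = Path(weights_dir)
    return folder.is_dir() and any(p.is_file() for p in folder.iterdir())


def demo_command(model):
    """Build the demo.py command line for one of the documented models."""
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
    return [sys.executable, "demo.py", "--model", model]


def run_demo(model, weights_dir="weights"):
    """Run demo.py for the given model; raises if the demo exits with an error."""
    if not has_weights(weights_dir):
        raise FileNotFoundError(f"no model weights found in {weights_dir}/")
    subprocess.run(demo_command(model), check=True)

# Usage (from the repository root): run_demo("gim_dkm")
```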

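The code matches a1.png and a2.png in assets/demo
and outputs a1_a2_match.png and a1_a2_warp.png.

Click here to view a1.png and a2.png.

Click here to view a1_a2_match.png.

a1_a2_match.png visualizes the matches between the two images.

Click here to view a1_a2_warp.png.

a1_a2_warp.png shows image a2 projected onto image a1 with a homography.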
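To illustrate what projecting a2 onto a1 with a homography involves, here is a minimal NumPy sketch that estimates a homography from matched points with the direct linear transform (DLT) and applies it to a2 coordinates. It is not the code used by demo.py: the function names and the sample points are made up for illustration, and a practical pipeline would more likely use a robust estimator such as OpenCV's `cv2.findHomography` with RANSAC followed by `cv2.warpPerspective`.

```python
import numpy as np


def estimate_homography(pts_a2, pts_a1):
    """DLT estimate of H such that pts_a1 ~ H @ pts_a2 (needs >= 4 matches)."""
    rows = []
    for (x, y), (u, v) in zip(np.asarray(pts_a2, float), np.asarray(pts_a1, float)):
        # Each match contributes two linear constraints on the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale so that H[2, 2] == 1


def apply_homography(H, pts):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(pts, float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


# Made-up example: points in a2 and where a known homography sends them in a1.
H_true = np.array([[1.1, 0.02, 5.0],
                   [-0.03, 0.95, -3.0],
                   [1e-4, 2e-4, 1.0]])
pts_a2 = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [50, 40]], float)
pts_a1 = apply_homography(H_true, pts_a2)
H_est = estimate_homography(pts_a2, pts_a1)
```

With noise-free matches as in this example, `H_est` recovers `H_true` up to numerical precision; real matches contain outliers, which is why a RANSAC-based estimator is the usual choice.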

There are more images in the assets/demo folder; feel free to try matching them.

Click here to view more images

📌 Citation

If our code helps your research, please cite our paper ❤️ and give the gim repository a star ⭐️. Thank you!

@inproceedings{
xuelun2024gim,
title={GIM: Learning Generalizable Image Matcher From Internet Videos},
author={Xuelun Shen and Zhipeng Cai and Wei Yin and Matthias Müller and Zijun Li and Kaixuan Wang and Xiaozhi Chen and Cheng Wang},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024}
}

License

This repository is under the MIT License. This content/model is provided here for research purposes only. Any use beyond this is your sole responsibility and subject to your securing the necessary rights for your purpose.