Skip to content

Commit cfd5d3a

Browse files
authored
Release MM-GroundingDINO SwinB and SwinL Weights (#11460)
2 parents 44ebd17 + 2390ebc commit cfd5d3a

12 files changed

+802
-26
lines changed

README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -101,6 +101,8 @@ Apart from MMDetection, we also released [MMEngine](https://github.com/open-mmla
101101

102102
## What's New
103103

104+
💎 **We have released the pre-trained weights for MM-Grounding-DINO Swin-B and Swin-L, welcome to try and give feedback.**
105+
104106
### Highlight
105107

106108
**v3.3.0** was released in 5/1/2024:

README_zh-CN.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -100,6 +100,8 @@ MMDetection 是一个基于 PyTorch 的目标检测开源工具箱。它是 [Ope
100100

101101
## 最新进展
102102

103+
💎 **我们已经发布了 MM-Grounding-DINO Swin-B 和 Swin-L 预训练权重,欢迎试用和反馈.**
104+
103105
### 亮点
104106

105107
**v3.3.0** 版本已经在 2024.1.5 发布:

configs/mm_grounding_dino/README.md

Lines changed: 34 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -20,22 +20,33 @@ Grounding-DINO is a state-of-the-art open-set detection model that tackles multi
2020

2121
Please refer to [dataset_prepare.md](dataset_prepare.md) or [中文版数据准备](dataset_prepare_zh-CN.md)
2222

23+
## ✨ What's New
24+
25+
💎 **We have released the pre-trained weights for Swin-B and Swin-L, welcome to try and give feedback.**
26+
2327
## Usage
2428

2529
Please refer to [usage.md](usage.md) or [中文版用法说明](usage_zh-CN.md)
2630

2731
## Zero-Shot COCO Results and Models
2832

29-
| Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
30-
| :--------: | :------: | :-------: | :--------: | :-------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
31-
| GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
32-
| GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
33-
| GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [config](../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
34-
| MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [config](grounding_dino_swin-t_pretrain_obj365.py) | |
35-
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [config](grounding_dino_swin-t_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json) |
36-
| MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json) |
37-
| MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json) |
38-
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json) |
33+
| Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
34+
| :----------: | :------: | :-------: | :--------: | :----------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
35+
| GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
36+
| GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
37+
| GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [config](../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
38+
| MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [config](grounding_dino_swin-t_pretrain_obj365.py) | |
39+
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [config](grounding_dino_swin-t_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json) |
40+
| MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json) |
41+
| MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json) |
42+
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json) |
43+
| MM-GDINO-B | Swin-B | Zero-shot | 52.5 | O365,GoldG,V3Det | [config](grounding_dino_swin-b_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-b_pretrain_obj365_goldg_v3det/grounding_dino_swin-b_pretrain_obj365_goldg_v3de-f83eef00.pth) \| [log](<>) |
44+
| MM-GDINO-B\* | Swin-B | - | 59.5 | O365,ALL | [config](grounding_dino_swin-b_pretrain_all.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-b_pretrain_all/grounding_dino_swin-b_pretrain_all-f9818a7c.pth) \| [log](<>) |
45+
| MM-GDINO-L | Swin-L | Zero-shot | 53.0 | O365V2,OpenImageV6,GoldG | [config](grounding_dino_swin-l_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_obj365_goldg/grounding_dino_swin-l_pretrain_obj365_goldg-34dcdc53.pth) \| [log](<>) |
46+
| MM-GDINO-L\* | Swin-L | - | 60.3 | O365V2,OpenImageV6,ALL | [config](grounding_dino_swin-l_pretrain_all.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_all/grounding_dino_swin-l_pretrain_all-56d69e78.pth) \| [log](<>) |
47+
48+
- This * indicates that the model has not been fully trained yet. We will release the final weights in the future.
49+
- ALL: GoldG,V3det,COCO2017,LVISV1,COCO2014,GRIT,RefCOCO,RefCOCO+,RefCOCOg,gRefCOCO.
3950

4051
## Zero-Shot LVIS Results
4152

@@ -361,3 +372,16 @@ Note:
361372
| MM-GDINO | Swin-T | 5e | 45.1 | 64.7 | 42.5 | 65.5 | 40.3 | 63.2 |
362373

363374
- The MM-GDINO-T config file is [here](refcoco/grounding_dino_swin-t_finetune_8xb4_5e_grefcoco.py)
375+
376+
## Citation
377+
378+
If you find this project useful in your research, please consider citing:
379+
380+
```latex
381+
@article{zhao2024open,
382+
title={An Open and Comprehensive Pipeline for Unified Object Grounding and Detection},
383+
author={Zhao, Xiangyu and Chen, Yicheng and Xu, Shilin and Li, Xiangtai and Wang, Xinjiang and Li, Yining and Huang, Haian},
384+
journal={arXiv preprint arXiv:2401.02361},
385+
year={2024}
386+
}
387+
```

0 commit comments

Comments
 (0)