@@ -20,22 +20,33 @@ Grounding-DINO is a state-of-the-art open-set detection model that tackles multi
Please refer to [dataset_prepare.md](dataset_prepare.md) or [中文版数据准备](dataset_prepare_zh-CN.md)

+ ## ✨ What's New
+
+ 💎 **We have released the pre-trained weights for Swin-B and Swin-L. You are welcome to try them and give feedback.**
+
## Usage

Please refer to [usage.md](usage.md) or [中文版用法说明](usage_zh-CN.md)

## Zero-Shot COCO Results and Models

- | Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
- | :--------: | :------: | :-------: | :--------: | :-------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
- | GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
- | GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
- | GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [config](../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
- | MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [config](grounding_dino_swin-t_pretrain_obj365.py) | |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [config](grounding_dino_swin-t_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json) |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json) |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json) |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json) |
+ | Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
+ | :----------: | :------: | :-------: | :--------: | :----------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+ | GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
+ | GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
+ | GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [config](../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [config](grounding_dino_swin-t_pretrain_obj365.py) | |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [config](grounding_dino_swin-t_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json) |
+ | MM-GDINO-B | Swin-B | Zero-shot | 52.5 | O365,GoldG,V3Det | [config](grounding_dino_swin-b_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-b_pretrain_obj365_goldg_v3det/grounding_dino_swin-b_pretrain_obj365_goldg_v3de-f83eef00.pth) \| [log](<>) |
+ | MM-GDINO-B\* | Swin-B | - | 59.5 | O365,ALL | [config](grounding_dino_swin-b_pretrain_all.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-b_pretrain_all/grounding_dino_swin-b_pretrain_all-f9818a7c.pth) \| [log](<>) |
+ | MM-GDINO-L | Swin-L | Zero-shot | 53.0 | O365V2,OpenImageV6,GoldG | [config](grounding_dino_swin-l_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_obj365_goldg/grounding_dino_swin-l_pretrain_obj365_goldg-34dcdc53.pth) \| [log](<>) |
+ | MM-GDINO-L\* | Swin-L | - | 60.3 | O365V2,OpenImageV6,ALL | [config](grounding_dino_swin-l_pretrain_all.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_all/grounding_dino_swin-l_pretrain_all-56d69e78.pth) \| [log](<>) |
+
+ - \* indicates that the model has not been fully trained yet. We will release the final weights in the future.
+ - ALL: GoldG,V3Det,COCO2017,LVISV1,COCO2014,GRIT,RefCOCO,RefCOCO+,RefCOCOg,gRefCOCO.

## Zero-Shot LVIS Results
@@ -361,3 +372,16 @@ Note:
| MM-GDINO | Swin-T | 5e | 45.1 | 64.7 | 42.5 | 65.5 | 40.3 | 63.2 |

- The MM-GDINO-T config file is [here](refcoco/grounding_dino_swin-t_finetune_8xb4_5e_grefcoco.py)
+
+ ## Citation
+
+ If you find this project useful in your research, please consider citing:
+
+ ```latex
+ @article{zhao2024open,
+   title={An Open and Comprehensive Pipeline for Unified Object Grounding and Detection},
+   author={Zhao, Xiangyu and Chen, Yicheng and Xu, Shilin and Li, Xiangtai and Wang, Xinjiang and Li, Yining and Huang, Haian},
+   journal={arXiv preprint arXiv:2401.02361},
+   year={2024}
+ }
+ ```