We only use the original data provided by the NTIRE 2025 CDFSOD Challenge. Please organize the data as follows (a shell snippet for creating this layout is given after the tree):
|--dataset
| |--dataset1
| | |--annotations
| | |--test
| | |--train
| |--dataset2
| | |--annotations
| | |--test
| | |--train
| |--dataset3
| | |--annotations
| | |--test
| | |--train
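For convenience, the empty skeleton can be created in one line (a minimal sketch; copy the challenge data into the corresponding folders afterwards):
mkdir -p dataset/dataset{1,2,3}/{annotations,test,train}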
Our solution is built upon LLMDet; please refer to LLMDet for more details.
Download the original checkpoints of LLMDet from huggingface or modelscope and put them under the weights
folder. In addition, you should download the parameters of BERT and SigLIP.
Then, please download the pretrained parameters of our model from huggingface and put them in the model
folder (example download commands follow the tree below):
|--weights
| |--fushh7
| | |--LLMDet
| |--google
| | |--siglip-so400m-patch14-384
| |--google-bert
| | |--bert-base-uncased
|--model
| |--dataset1
| | |--1shot_weights.pth
| | |--5shot_weights.pth
| | |--10shot_weights.pth
| |--dataset2
| | |--weights.pth
| |--dataset3
| | |--1shot_weights.pth
| | |--5shot_weights.pth
| | |--10shot_weights.pth
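As an illustration, the public checkpoints can be fetched with huggingface-cli (a sketch assuming huggingface_hub is installed, e.g. via pip install -U huggingface_hub; the repository IDs mirror the folder names in the tree above):
huggingface-cli download fushh7/LLMDet --local-dir weights/fushh7/LLMDet
huggingface-cli download google/siglip-so400m-patch14-384 --local-dir weights/google/siglip-so400m-patch14-384
huggingface-cli download google-bert/bert-base-uncased --local-dir weights/google-bert/bert-base-uncased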
Please use the following command to perform inference:
python inference.py configs/grounding_dino_swin_l.py '{SUB_DATASET_DIR}/test' '{SUB_DATASET_DIR}/annotations/test.json' 'catagory/{DATASET_CAT_JSON}' 'output/{OUTPUT_DATASET_DIR}' --weights {WEIGHTS_FOR_SUB_DATASET} -c
{SUB_DATASET_DIR} denotes the path of each dataset; you can replace it with dataset/dataset1, dataset/dataset2, or dataset/dataset3. {WEIGHTS_FOR_SUB_DATASET} is the pretrained weights of the corresponding dataset and k-shot setting. {DATASET_CAT_JSON} is the category json generated by Qwen2.5-VL for each class. Because the categories generated by Qwen2.5-VL are the same across different k-shot settings, we use one file for all k-shot settings. {OUTPUT_DATASET_DIR} is the output folder for each dataset. It is worth noting that we do not fine-tune the model for dataset2, so only one weight file is provided for it.
For example, you can use the following command to evaluate dataset1 under the 1-shot setting:
python inference.py configs/grounding_dino_swin_l.py 'dataset/dataset1/test' 'dataset/dataset1/annotations/test.json' 'catagory/dataset1.json' 'output/dataset1/' --weights 'model/dataset1/1shot_weights.pth' -c
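To cover every dataset and k-shot setting, you can wrap the command in a loop. This is only a convenience sketch: the per-shot output folders (output/${DS}_${K}shot/) are an assumption made here to keep runs from overwriting each other, and dataset2 reuses its single weight file as noted above.
# run all k-shot settings for the fine-tuned datasets
for DS in dataset1 dataset3; do
  for K in 1 5 10; do
    python inference.py configs/grounding_dino_swin_l.py "dataset/${DS}/test" "dataset/${DS}/annotations/test.json" "catagory/${DS}.json" "output/${DS}_${K}shot/" --weights "model/${DS}/${K}shot_weights.pth" -c
  done
done
# dataset2 is not fine-tuned, so all k-shot settings share one weight file
python inference.py configs/grounding_dino_swin_l.py 'dataset/dataset2/test' 'dataset/dataset2/annotations/test.json' 'catagory/dataset2.json' 'output/dataset2/' --weights 'model/dataset2/weights.pth' -c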
After the inference, please run the following commands to perform the post-processing:
python clean/dataset1/clean_d1_result.py
python clean/dataset2/clean_d2_result.py
python clean/dataset3/clean_d3_result.py
Because we utilize Qwen2.5-VL to obtain the object categories of each picture, we have provided the pre-computed category files, so you do not need to generate them by yourself.
Finally, please run obtain_submit.py to generate the final file for submission.
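Assuming the script takes no required arguments (check obtain_submit.py for any options), the invocation is simply:
python obtain_submit.py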