This repository is the code implementation of the paper *Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection*, and is built on the MMDetection project.
- Linux (Windows is not tested)
- Python 3.8+, recommended 3.10
- PyTorch 2.0 or higher, recommended 2.0.0
- CUDA 11.7 or higher, recommended 11.8
- MMCV 2.0 or higher, recommended 2.1.0
- MMDetection 3.0 or higher, recommended 3.3.0
- diffusers 0.30.0 or higher, recommended 0.30.0
We recommend using conda for installation. The following commands create a virtual environment named GDD and install PyTorch and MMCV. The installation steps below assume CUDA 11.8; if your CUDA version differs, adjust the commands accordingly.
Note: If you are experienced with PyTorch and have already installed it, you can skip to the next section. Otherwise, you can follow the steps below.
Step 1: Create a virtual environment named GDD and activate it.
```shell
conda create -n GDD python=3.10 -y
conda activate GDD
```

Step 2: Install PyTorch 2.x.
Linux/Windows:
```shell
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
```

Step 3: Install MMDetection 3.x.
```shell
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.1.0"
mim install mmdet==3.3.0
```
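As an optional sanity check (not part of the original instructions), the following minimal Python sketch verifies that PyTorch, MMCV, and MMDetection are importable and that CUDA is visible:

```python
# Optional sanity check: confirm PyTorch, MMCV, and MMDetection are installed
# and that a CUDA device is visible.
import torch
import mmcv
import mmdet

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("MMCV:", mmcv.__version__)
print("MMDetection:", mmdet.__version__)
```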
Step 4: Prepare Stable Diffusion 1.5 with diffusers.

```shell
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```

If you prefer to clone only the repository pointers first and fetch the large weight files later, use:

```shell
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```

Move the downloaded stable-diffusion-v1-5 directory into the same directory as GDD. Then:
```shell
pip install diffusers==0.30.0
```

The configuration steps for SD-2.1 and SD-3-M follow the same procedure as described above.
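To confirm that the local Stable Diffusion 1.5 checkpoint is usable with the installed diffusers version, a minimal sketch (the relative path `./stable-diffusion-v1-5` is an assumption based on the cloning step above; adjust it to wherever you placed the model):

```python
# Minimal check that the local Stable Diffusion 1.5 checkpoint loads with diffusers.
from diffusers import StableDiffusionPipeline

# Path is an assumption: the repository cloned alongside GDD in the previous step.
pipe = StableDiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5",
    safety_checker=None,  # skip the safety checker; not needed for feature extraction
)
print("Loaded UNet with", sum(p.numel() for p in pipe.unet.parameters()), "parameters")
```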
- Image and annotation download link: Cityscapes, FoggyCityscapes, RainyCityscapes.
- Image and annotation download link: BDD 100k.
- Image and annotation download link: SIM10k.
- Image and annotation download link: VOC 07+12.
- Image and annotation download link: Clipart, Comic, Watercolor.
- Image and annotation download link: Diverse Weather Benchmark.
Important code directories:
- `DG`: Root directory of the GDD config files.
- `DG/_base_/dg_setting`: Training schedule and learning-rate configs of GDD.
- `DG/_base_/datasets`: Dataset configs of GDD.
- `DG/Ours`: Detector configs of GDD.
- `mmdet/models/backbones/dift_encoder.py`: Code and settings of the diffusion backbone.
- `mmdet/models/detectors/Z_domain_detector.py`: Main code of GDD.
- `mmdet/datasets/transforms/albu_domain_adaption.py`: Domain augmentation code.
The models are trained for 20,000 iterations on two RTX 3090 GPUs with a total batch size of 16 (for the Diverse Weather Benchmark, we use eight RTX 3090 GPUs with a total batch size of 16). If your setup differs from ours, please adjust the number of training iterations and the default learning rate in the training config.
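For reference, the iteration count and learning rate in an MMDetection 3.x config typically live in fields like the ones below; the concrete values are illustrative assumptions, so check the actual files under `DG/_base_/dg_setting`.

```python
# Illustrative MMDetection 3.x config fields for adjusting schedule and learning rate.
# The values below are placeholders, not the exact settings shipped with GDD.
train_cfg = dict(type='IterBasedTrainLoop', max_iters=20000, val_interval=2000)

optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001))

# With a different total batch size, scale the learning rate linearly,
# e.g. lr = base_lr * (your_total_batch_size / 16).
auto_scale_lr = dict(enable=True, base_batch_size=16)
```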
```shell
sh ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}  # CONFIG_FILE is the config file you want to use, GPU_NUM is the number of GPUs to use
```

For example:
```shell
sh ./tools/dist_train.sh DG/Ours/cityscapes/diffusion_guided_detector_cityscapes.py 2
```

Note: Before testing, please set detector.dift_model.config and detector.dift_model.pretrained_model to None, so that the diffusion-model settings and weights are not loaded.
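In config terms, that note might look like the sketch below; the exact nesting of the `dift_model` keys is an assumption inferred from the dotted path in the note, so mirror it against the actual config in `DG/Ours`.

```python
# Sketch: disable the diffusion-model settings before testing.
# The key nesting below is an assumption based on the note above;
# check the corresponding config file in DG/Ours for the real structure.
detector = dict(
    dift_model=dict(
        config=None,            # do not load the diffusion model config
        pretrained_model=None,  # do not load the diffusion model weights
    ))
```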
We provide a convenient way to quickly perform DG testing.
```shell
sh ./tools/dist_test_dg.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM}  # CONFIG_FILE is the config file you want to use, CHECKPOINT_FILE is the checkpoint to evaluate, GPU_NUM is the number of GPUs to use
```

We provide all trained models via Google Drive and a Baidu link (code: kbe2); each corresponds to its respective config file in DG/Ours.