Diffusion Model as Representation Learner

This repository contains the official implementation of the ICCV 2023 paper

Diffusion Model as Representation Learner
Xingyi Yang, Xinchao Wang

[arxiv] [code]


In this paper, we conduct an in-depth investigation of the representation power of DPMs, and propose a novel knowledge transfer method that leverages the knowledge acquired by generative DPMs for recognition tasks. We introduce a novel knowledge transfer paradigm named RepFusion. Our paradigm extracts representations at different time steps from off-the-shelf DPMs and dynamically employs them as supervision for student networks, in which the optimal time is determined through reinforcement learning.
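The idea can be illustrated with a minimal, self-contained sketch (NumPy only; the random projections below stand in for the actual teacher DPM and student networks, and the scalar policy is a toy stand-in for the paper's reinforcement-learned time selector): the teacher's features at a sampled time step supervise the student, and a REINFORCE-style update shifts the time-step policy toward the step whose features are easiest to match.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 10          # number of diffusion time steps (toy scale)
FEAT_DIM = 8    # feature dimension shared by teacher and student

# Stand-ins for the teacher DPM's per-time-step features and the
# student's features; in the paper these are real networks.
teacher_W = rng.normal(size=(T, FEAT_DIM, FEAT_DIM))
student_W = rng.normal(size=(FEAT_DIM, FEAT_DIM))

def teacher_features(x, t):
    return np.tanh(x @ teacher_W[t])

def student_features(x):
    return np.tanh(x @ student_W)

# Policy over time steps, updated with REINFORCE: the reward is the
# negative distillation loss, so probability mass drifts toward the
# time step whose features best supervise the student.
logits = np.zeros(T)

def step(x, lr=0.5):
    global logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    t = rng.choice(T, p=probs)                # sample a time step
    loss = np.mean((student_features(x) - teacher_features(x, t)) ** 2)
    reward = -loss
    grad = -probs
    grad[t] += 1.0                            # d log pi(t) / d logits
    logits += lr * reward * grad              # REINFORCE update
    return t, loss

x = rng.normal(size=(4, FEAT_DIM))
for _ in range(100):
    t, loss = step(x)
```

In the full method the student is also trained on the distillation loss itself; here only the time-step policy is updated, to keep the sketch short.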

File Organization

This repository contains the code for distillation and the three downstream tasks: classification, segmentation, and facial landmark detection.

             
├── classification_distill/ 
    # code for image classification 
    # and knowledge distillation
    ├── configs/
        ├── <DATASET>-<DISTILL_LOSS>/
            ddpm-<BACKBONE>_<DISTILL_LOSS>.py
            # config file for RepFusion on <DATASET> 
            # with <DISTILL_LOSS> as loss function 
            # and <BACKBONE> as architecture
        ├── baseline/
            <BACKBONE>_<BATCHSIZE>_<DATASET>_finetune.py
    ├── mmcls/
        ├── models/
            ├── guided_diffusion/
                # code taken from the guided diffusion repo
            ├── classifiers/
                ├── kd.py
                    # distillation baselines
                ├── repfusion.py
                    # core code for distillation from diffusion model


├── landmark/
    # code for facial landmark detection 
    ├── configs/face/2d_kpt_sview_rgb_img/topdown_heatmap/wflw
        <BACKBONE>_wflw_256x256_baseline_<BATCHSIZE>.py
        <BACKBONE>_wflw_256x256_<BATCHSIZE>_repfussion.py
            

├── segmentation/
    # code for face parsing
    ├── configs/
        ├── celebahq_mask/
            bisenetv1_<BACKBONE>_lr5e-3_2x8_448x448_160k_coco-celebahq_mask_baseline.py
            bisenetv1_<BACKBONE>_lr5e-3_2x8_448x448_160k_coco-celebahq_mask_repfusion.py

Installation

We mainly depend on 4 packages, namely

  1. mmclassification. Please install the environment following INSTALL.
  2. mmsegmentation. Please install the environment following INSTALL.
  3. mmpose. Please install the environment following INSTALL.
  4. diffusers. Install via pip install --upgrade diffusers[torch], or see the official repo for help.
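Since the four packages are installed separately, a quick importability check can save a failed training run later. This small convenience script is not part of the repo; note that mmclassification installs as `mmcls` and mmsegmentation as `mmseg`.

```python
from importlib import util

def has_package(name: str) -> bool:
    """Return True if `name` is importable in the current environment."""
    return util.find_spec(name) is not None

# mmclassification -> mmcls, mmsegmentation -> mmseg
required = ["mmcls", "mmseg", "mmpose", "diffusers"]
missing = [pkg for pkg in required if not has_package(pkg)]
if missing:
    print("missing packages:", ", ".join(missing))
else:
    print("all required packages found")
```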

Data Preparation

We use 4 datasets in our paper. Please put them all under data/<DATASET>.

  1. CelebAMask-HQ. Please follow the guideline in the official repo.
  2. WFLW. Please download the images from the WFLW Dataset and the annotation files from wflw_annotations.
  3. TinyImageNet. Please download the dataset using this script.
  4. CIFAR-10. mmcls will download it automatically for you.
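The data/<DATASET> layout can be verified with a short script. The folder names below are placeholders following the convention above, not names the repo mandates; match them to what you actually downloaded.

```python
from pathlib import Path

def check_data_root(root, datasets):
    """Return the datasets missing under `root` (data/<DATASET> layout)."""
    root_path = Path(root)
    return [d for d in datasets if not (root_path / d).is_dir()]

# Placeholder folder names; adjust to your actual dataset directories.
datasets = ["CelebAMask-HQ", "WFLW", "tiny-imagenet-200", "cifar10"]
for name in check_data_root("data", datasets):
    print(f"missing: data/{name}")
```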

Teacher Checkpoints

  • For DPMs hosted on huggingface, the model will be downloaded automatically; just make sure you specify the correct model id.

  • For the DPM on Tiny-ImageNet, we download the checkpoint (weight) from the guided-diffusion repo.

Training

  • We first distill from a trained DPM:
# <CONFIG_NAME>: config path for distillation 
# <GPU_NUMS>: num of gpus for training
cd classification_distill
bash tools/dist_train.sh <CONFIG_NAME> <GPU_NUMS>
  • Use the saved checkpoint as initialization for downstream training by setting it in the config. For example:
model = dict(
    ...
    backbone=dict(
        ...
        backbone_cfg=dict(
            ...
            init_cfg=dict(
                type='Pretrained', 
                checkpoint=<CHECKPOINT_PATH>,
                # put the distilled checkpoint here
                prefix='student.backbone.')
            )
        ), 
    )
  • Run downstream training:
# <CONFIG_NAME>: config path for the downstream task
# <GPU_NUMS>: num of gpus for training
# <TASK_NAME>: either 'classification_distill', 'segmentation' or 'landmark'
cd <TASK_NAME>
bash tools/dist_train.sh <CONFIG_NAME> <GPU_NUMS>

Citation

@inproceedings{yang2023diffusion,
    author    = {Xingyi Yang and Xinchao Wang},
    title     = {Diffusion Model as Representation Learner},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year      = {2023},
}
