[ICLR'24 Poster] AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation

Zihao Tang¹, Zheqi Lv¹, Shengyu Zhang¹*, Yifan Zhou², Xinyu Duan³, Fei Wu¹, Kun Kuang¹*

¹ZJU, ²SJTU, ³Huawei Cloud
*Corresponding Authors

Official PyTorch implementation of the paper "AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation".

Installation

Clone this repository and install the required packages:

git clone https://github.com/IshiKura-a/AuG-KD.git
cd AuG-KD

conda create -n AuG-KD python=3.7
conda activate AuG-KD
conda install pytorch torchvision torchaudio pytorch-cuda=12.0 -c pytorch -c nvidia

pip install -r requirements.txt

Download datasets:
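The training commands in this README read the Office domains from `${root_dir}/dataset/office/<domain>/images`. A minimal sketch of a check that this layout is in place before training; the helper name `missing_office_domains` is hypothetical, and the layout is inferred from the `--t_data`, `--target`, and `--test` paths rather than verified against `main.py`.

```shell
# Hypothetical helper: prints each Office domain whose image folder is missing.
# Layout assumed from the paths passed to --t_data / --target in this README.
missing_office_domains() {
  local root_dir="$1" domain
  for domain in amazon webcam dslr; do
    if [ ! -d "${root_dir}/dataset/office/${domain}/images" ]; then
      echo "$domain"
    fi
  done
}
```

Calling `missing_office_domains "$root_dir"` prints nothing when all three domain folders exist.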

Train

For example, to reproduce results similar to ours on Office (Amazon, Webcam → DSLR), first train the teacher model with the command below:

model=resnet34
root_dir=xxx
CUDA_VISIBLE_DEVICES=0 python main.py \
  --batch_size 2048 \
  --teacher "${model}" \
  --dataset office \
  --log_dir ${root_dir}/model/"${model}"/office_AW \
  --seed 2023 \
  train_teacher \
  --t_lr 1e-3 \
  --t_wd 1e-4 \
  --t_epoch 400 \
  --t_mode auto_split \
  --t_data ${root_dir}/dataset/office/amazon/images ${root_dir}/dataset/office/webcam/images
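The student run reads the teacher checkpoint through `--teacher_dir`, at `${root_dir}/model/${teacher}/office_AW/${teacher}_ckpt.pt`. A minimal sketch of a guard that confirms the file exists before launching the student; the helper names `teacher_ckpt_path` and `require_teacher_ckpt` are hypothetical, and the path pattern is copied from the `--teacher_dir` argument, not checked against what `main.py` actually writes.

```shell
# Hypothetical helper: builds the checkpoint path passed to --teacher_dir.
teacher_ckpt_path() {
  local root_dir="$1" teacher="$2"
  printf '%s/model/%s/office_AW/%s_ckpt.pt\n' "$root_dir" "$teacher" "$teacher"
}

# Hypothetical helper: returns 0 if the checkpoint file exists,
# otherwise prints a message to stderr and returns 1.
require_teacher_ckpt() {
  local ckpt
  ckpt="$(teacher_ckpt_path "$1" "$2")"
  if [ ! -f "$ckpt" ]; then
    echo "teacher checkpoint not found: $ckpt" >&2
    return 1
  fi
}
```

For instance, `require_teacher_ckpt "$root_dir" "$teacher" || exit 1` placed before the student command stops early when the teacher name or log directory does not match the teacher run.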

Then, use the trained teacher to train its out-of-domain (OOD) student:

root_dir=xxx
teacher=resnet34
student=mobilenet_v3_small
seed=2023  # used by --seed and --postfix below; 2023 matches the teacher run
CUDA_VISIBLE_DEVICES=0 python main.py \
    --teacher ${teacher} \
    --teacher_dir ${root_dir}/model/${teacher}/office_AW/${teacher}_ckpt.pt \
    --student ${student} \
    --latent_size 100 \
    --dataset office \
    --target ${root_dir}/dataset/office/dslr/images \
    --test ${root_dir}/dataset/office/dslr/images \
    --lr 1e-3 \
    --wd 1e-4 \
    --epoch 200 \
    --batch_size 2048 \
    --seed ${seed} \
    --d_lr 1e-3 \
    --e_lr 1e-4 \
    --g_epoch 200 \
    --a_lr 1e-3 \
    --a_epoch 200 \
    --invariant 0.25 \
    --log_dir ${root_dir}/model/GenericKD/ \
    --postfix s${seed}_AW \
    --a 0.6 \
    --b 0.2
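The student command derives both `--seed` and `--postfix` from `${seed}`, which suggests one run per seed with a distinct log suffix. A minimal sketch of how the seed-dependent arguments could be generated for several runs; the helper name `seed_args` and the seed values other than 2023 are assumptions, not settings from the paper.

```shell
# Hypothetical helper: prints the seed-dependent arguments for one student run,
# following the "--seed ${seed}" and "--postfix s${seed}_AW" pattern above.
seed_args() {
  local seed="$1"
  printf -- '--seed %s --postfix s%s_AW\n' "$seed" "$seed"
}

# Example seed list (assumed; only 2023 appears in this README).
for seed in 2023 2024 2025; do
  # Dry run: show the arguments that would be appended for this seed.
  echo "seed ${seed}: $(seed_args "$seed")"
done
```

Keeping the postfix tied to the seed keeps the logs of different runs apart under the same `--log_dir`.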

For other settings, see Appendix B of the paper for the hyperparameters needed to replicate our results.

Acknowledgement

This work was supported by the National Key R&D Program of China (No. 2022ZD0119100), the National Natural Science Foundation of China (62376243, 62037001, U20A20387), the Scientific Research Fund of Zhejiang Provincial Education Department (Y202353679), and the StarryNight Science Fund of Zhejiang University Shanghai Institute for Advanced Study (SN-ZJU-SIAS-0010).

Citation

@inproceedings{
tang2024augkd,
title={AuG-{KD}: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation},
author={Zihao TANG and Shengyu Zhang and Zheqi Lv and Yifan Zhou and Xinyu Duan and Kun Kuang and Fei Wu},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=fcqWJ8JgMR}
}
