
H-AT: Hybrid Attention Transfer for Knowledge Distillation

PyTorch training code for "H-AT: Hybrid Attention Transfer for Knowledge Distillation".

Original code forked from:

Coming soon:

  • Scenes and CUB code
  • ImageNet code

Requirements

First install PyTorch, then install torchnet:

pip install git+https://github.com/pytorch/tnt.git@master

Then install the other Python packages:

pip install -r requirements.txt

Experiments

CIFAR

Refer to cifar_train.sh.

cifar_train.py includes:

  • CIFAR Wide ResNet training code
  • Hybrid Attention Transfer (H-AT) implementation
  • Activation-based spatial attention transfer implementation
  • Knowledge distillation implementation
  • Similarity-preserving knowledge distillation implementation
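
For reference, below is a minimal PyTorch sketch of the three transfer losses listed above, following their published formulations (Zagoruyko & Komodakis for activation-based AT, Hinton et al. for KD, Tung & Mori for SP). The function names attention_map, at_loss, kd_loss, and sp_loss are illustrative; the actual implementations live in cifar_train.py and may differ in detail.

import torch
import torch.nn.functional as F

def attention_map(fm):
    # Activation-based spatial attention: channel-wise mean of squared
    # activations, flattened and L2-normalized per sample. fm: (N, C, H, W).
    am = fm.pow(2).mean(dim=1).view(fm.size(0), -1)
    return F.normalize(am, dim=1)

def at_loss(student_fm, teacher_fm):
    # Mean squared distance between normalized attention maps.
    return (attention_map(student_fm) - attention_map(teacher_fm)).pow(2).mean()

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Hinton-style distillation: KL divergence between temperature-softened
    # distributions, scaled by T^2 to keep gradient magnitudes comparable.
    log_p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction='batchmean') * (T * T)

def sp_loss(student_fm, teacher_fm):
    # Similarity-preserving KD: match row-normalized batch similarity matrices.
    def gram(fm):
        f = fm.view(fm.size(0), -1)
        return F.normalize(f @ f.t(), p=2, dim=1)
    return (gram(student_fm) - gram(teacher_fm)).pow(2).mean()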

Train teacher

$PATHTOPYTHON/python3 cifar_train.py \
--save logs/resnet_40_1_teacher \
--depth 40 \
--width 1 \
--gpu_id 4

Train student

TEANET="40_1"
DEPTH=16
WIDTH=1
ALPHA=0 # 0.9
BETA=0 # 1000
GAMMA=0 # 3000
DELTA=10  # the delta[] values mentioned in the paper equal [0.01, 0.1, 1] * DELTA
EPOCHSTEP="[60,120,160]"
GPUID=1
METHOD=CSHAT

STUNET="$DEPTH"_"$WIDTH"
LOGPATH="$METHOD"_"$STUNET"_"$TEANET"
TEACHER="resnet"_"$TEANET"_"teacher"
TRAIN_FILE="cifar_train.py"

for i in $(seq 1 5)
do
$PATHTOPYTHON/python3 $TRAIN_FILE \
--save logs/$LOGPATH#$i \
--teacher_id $TEACHER \
--epoch_step $EPOCHSTEP \
--depth $DEPTH \
--width $WIDTH \
--alpha $ALPHA \
--beta $BETA \
--gamma $GAMMA \
--delta $DELTA \
--gpu_id $GPUID
done
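
A plausible reading of the four weights, based on the flag names and the feature list above (this mapping is an assumption, not stated in the repository): alpha scales the KD term, beta the activation-based AT term, gamma the SP term, and delta the hybrid attention term, all added to the standard cross-entropy loss. A hypothetical sketch:

# Hypothetical combination of the loss terms; the real weighting lives in
# cifar_train.py and may differ.
def total_loss(ce, kd, at, sp, hat, alpha, beta, gamma, delta):
    # ce: cross-entropy on labels; kd/at/sp/hat: distillation terms
    return ce + alpha * kd + beta * at + gamma * sp + delta * hat

Under this reading, the defaults above (ALPHA=0, BETA=0, GAMMA=0, DELTA=10) would leave only the hybrid attention term active alongside cross-entropy.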
