LAUDNet

This is the official PyTorch implementation of "Latency-aware Unified Dynamic Networks for Efficient Image Recognition", an extension of our NeurIPS 2022 paper, "Latency-Aware Spatial-wise Dynamic Networks" (LASNet). The original LASNet code is at this URL.

Introduction

We present Latency-aware Unified Dynamic Networks (LAUDNet), a unified framework that consolidates three representative dynamic paradigms (spatial-wise adaptive computation, dynamic layer skipping, and dynamic channel skipping) within a single formulation. To evaluate the practical latency of our model accurately, we present a latency predictor that jointly considers algorithms, scheduling strategies, and hardware properties when estimating the inference latency of dynamic operators. LAUDNet shows a superior latency-accuracy tradeoff on a range of tasks (ImageNet classification, COCO object detection, and instance segmentation) and hardware devices (V100, RTX3090, RTX3060, TX2, and Nano).
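As a rough illustration of how three paradigms can share one formulation, the sketch below treats each of them as a binary mask of a different granularity applied to the residual branch of a block. This is a simplifying assumption made here for exposition, not the repository's implementation: the names `expand_mask` and `dynamic_residual` are hypothetical, tensors are plain nested lists, and the real code may mask intermediate features and uses dedicated operators that skip the computation instead of multiplying by zero.

```python
from typing import List

Tensor3D = List[List[List[float]]]  # indexed as [channel][row][col]


def expand_mask(kind: str, mask, channels: int, height: int, width: int) -> Tensor3D:
    """Broadcast a paradigm-specific 0/1 mask to a full [C][H][W] mask."""
    if kind == "spatial":   # mask is [H][W]: one decision per location
        return [[[mask[i][j] for j in range(width)] for i in range(height)]
                for _ in range(channels)]
    if kind == "layer":     # mask is a single 0/1: run or skip the whole block
        return [[[mask for _ in range(width)] for _ in range(height)]
                for _ in range(channels)]
    if kind == "channel":   # mask is [C]: one decision per channel
        return [[[mask[c] for _ in range(width)] for _ in range(height)]
                for c in range(channels)]
    raise ValueError(f"unknown paradigm: {kind}")


def dynamic_residual(x: Tensor3D, fx: Tensor3D, kind: str, mask) -> Tensor3D:
    """Compute x + M * F(x), where M is the broadcast mask for the chosen paradigm."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    m = expand_mask(kind, mask, channels, height, width)
    return [[[x[c][i][j] + m[c][i][j] * fx[c][i][j] for j in range(width)]
             for i in range(height)]
            for c in range(channels)]
```

With a layer mask of `0` the block reduces to the identity, while a spatial or channel mask keeps the residual branch only at the selected locations or channels.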

Usage

This repo consists of three components: code for ImageNet classification, code for detection and segmentation with MMDetection, and the latency predictor.

ImageNet classification

CNNs

Main dependencies:

  • Python: 3.9
  • PyTorch: 1.13.1
  • Torchvision: 0.14.1
  • Timm: 0.6.12
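A quick way to confirm that an environment matches the versions listed above is to compare them against the installed package metadata. The sketch below is one possible check, not part of the repository: the function name `find_mismatches` is hypothetical, and the distribution names `torch`, `torchvision`, and `timm` are assumed to be the ones used on PyPI.

```python
from importlib import metadata

# Versions listed in this README; the distribution names are assumptions.
REQUIRED = {"torch": "1.13.1", "torchvision": "0.14.1", "timm": "0.6.12"}


def find_mismatches(required, get_version=metadata.version):
    """Return {package: (expected, found)} for missing or mismatched packages."""
    mismatches = {}
    for name, expected in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Drop local build tags such as "+cu117" before comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means the three packages match; the Python version itself can be checked separately with `sys.version_info`.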

See a sample training script for training details.

Vision Transformers

We implement the three dynamic-inference paradigms (i.e. token skipping, layer (block) skipping, and head (channel) skipping) based on the AdaViT repo.

MMDetection detection & segmentation

RetinaNet, Faster-RCNN and MaskRCNN

Prerequisites:

  1. Prepare an ImageNet-pretrained LAUDNet model.
  2. Set up an MMDetection-2.21.0 environment.
  3. Replace the corresponding files in your mmcv environment with the files in mmcv_replace_file.
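The file replacement in step 3 can be scripted. The sketch below assumes, without confirmation from this README, that the mmcv_replace_file directory mirrors the layout of the installed mmcv package; `overlay_files` is a hypothetical helper, and it keeps a backup of every file it overwrites so the change can be undone.

```python
import shutil
from pathlib import Path


def overlay_files(src_root, dst_root, backup_suffix=".orig"):
    """Copy every file under src_root onto the same relative path under dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    replaced = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        backup = dst.with_name(dst.name + backup_suffix)
        if dst.exists() and not backup.exists():
            shutil.copy2(dst, backup)  # keep the original file for rollback
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        replaced.append(dst)
    return replaced


# Example usage (assumes the directory layout described above):
#   import importlib.util
#   mmcv_dir = importlib.util.find_spec("mmcv").submodule_search_locations[0]
#   overlay_files("mmcv_replace_file", mmcv_dir)
```

Check the list of returned paths before training; if the directory layout differs from this assumption, copy the files by hand instead.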

See a sample training script for training details.

DDQ-DETR and Mask2Former

Prerequisites:

  1. Prepare an ImageNet-pretrained LAUDNet model.
  2. Set up an MMDetection-3.3.0 environment.

See a sample training script for training details.

Latency predictor

See a sample evaluation script for evaluation details.

Performance

(Figure: fig1)

Model Zoo

Model                                     Checkpoint Link
LAUD-ResNet101 channel-2222 target-0.5    Tsinghua Cloud
LAUD-ResNet101 layer target-0.5           Tsinghua Cloud

Citation

@misc{han2024latencyaware,
      title={Latency-aware Unified Dynamic Networks for Efficient Image Recognition}, 
      author={Yizeng Han and Zeyu Liu and Zhihang Yuan and Yifan Pu and Chaofei Wang and Shiji Song and Gao Huang},
      year={2024},
      eprint={2308.15949},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

If you have any questions, please feel free to contact the authors.

Yizeng Han: hanyz18@mails.tsinghua.edu.cn, yizeng38@gmail.com.

Zeyu Liu: liuzeyu20@mails.tsinghua.edu.cn, liuzeyu0020@gmail.com.

Zhihang Yuan: hahnyuan@gmail.com.

Yifan Pu: puyf23@mails.tsinghua.edu.cn.