Skip to content
master
Go to file
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020
Oct 8, 2020

README.md

GFNet-Pytorch (NeurIPS 2020)

This repo contains the official code and pre-trained models for the glance and focus network (GFNet).

Citation

@inproceedings{NeurIPS2020_7866,
        title = {Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification},
       author = {Wang, Yulin and Lv, Kangchen and Huang, Rui and Song, Shiji and Yang, Le and Huang, Gao},
    booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
         year = {2020},
}

Update on 2020/10/08: Release Pre-trained Models and the Inference Code on ImageNet.

Introduction

Inspired by the fact that not all regions in an image are task-relevant, we propose a novel framework that performs efficient image classification by processing a sequence of relatively small inputs, which are strategically cropped from the original image. Experiments on ImageNet show that our method consistently improves the computational efficiency of a wide variety of deep models. For example, it further reduces the average latency of the highly efficient MobileNet-V3 on an iPhone XS Max by 20% without sacrificing accuracy.

Results

  • Top-1 accuracy on ImageNet v.s. Multiply-Adds

  • Top-1 accuracy on ImageNet v.s. Inference Latency (ms) on an iPhone XS Max

  • Visualization

Pre-trained Models

Backbone CNNs Patch Size T Links
ResNet-50 96x96 5 Tsinghua Cloud / Google Drive
ResNet-50 128x128 5 Tsinghua Cloud / Google Drive
DenseNet-121 96x96 5 Tsinghua Cloud / Google Drive
DenseNet-169 96x96 5 Tsinghua Cloud / Google Drive
DenseNet-201 96x96 5 Tsinghua Cloud / Google Drive
RegNet-Y-600MF 96x96 5 Tsinghua Cloud / Google Drive
RegNet-Y-800MF 96x96 5 Tsinghua Cloud / Google Drive
RegNet-Y-1.6GF 96x96 5 Tsinghua Cloud / Google Drive
MobileNet-V3-Large (1.00) 96x96 3 Tsinghua Cloud / Google Drive
MobileNet-V3-Large (1.00) 128x128 3 Tsinghua Cloud / Google Drive
MobileNet-V3-Large (1.25) 128x128 3 Tsinghua Cloud / Google Drive
EfficientNet-B2 128x128 4 Tsinghua Cloud / Google Drive
EfficientNet-B3 128x128 4 Tsinghua Cloud / Google Drive
EfficientNet-B3 144x144 4 Tsinghua Cloud / Google Drive
  • What are contained in the checkpoints:
**.pth.tar
├── model_name: name of the backbone CNNs (e.g., resnet50, densenet121)
├── patch_size: size of image patches (i.e., H' or W' in the paper)
├── model_prime_state_dict, model_state_dict, fc, policy: state dictionaries of the four components of GFNets
├── model_flops, policy_flops, fc_flops: Multiply-Adds of inferring the encoder, patch proposal network and classifier for once
├── flops: a list containing the Multiply-Adds corresponding to each length of the input sequence during inference
├── anytime_classification: results of anytime prediction (in Top-1 accuracy)
├── dynamic_threshold: the confidence thresholds used in budgeted batch classification
├── budgeted_batch_classification: results of budgeted batch classification (a two-item list, [0] and [1] correspond to the two coordinates of a curve)

Requirements

  • python 3.7.7
  • torch 1.3.1
  • torchvision 0.4.2
  • pyyaml 5.3.1 (for RegNets)

Get Started

Read the evaluation results saved in pre-trained models

CUDA_VISIBLE_DEVICES=0 python inference.py --checkpoint_path PATH_TO_CHECKPOINTS  --eval_mode 0

Read the confidence thresholds saved in pre-trained models and infer the model on the validation set

CUDA_VISIBLE_DEVICES=0 python inference.py --data_url PATH_TO_DATASET --checkpoint_path PATH_TO_CHECKPOINTS  --eval_mode 1

Determine confidence thresholds on the training set and infer the model on the validation set

CUDA_VISIBLE_DEVICES=0 python inference.py --data_url PATH_TO_DATASET --checkpoint_path PATH_TO_CHECKPOINTS  --eval_mode 2

The dataset is expected to be prepared as follows:

ImageNet
├── train
│   ├── folder 1 (class 1)
│   ├── folder 2 (class 1)
│   ├── ...
├── val
│   ├── folder 1 (class 1)
│   ├── folder 2 (class 1)
│   ├── ...

Contact

If you have any question, please feel free to contact the authors. Yulin Wang: wang-yl19@mails.tsinghua.edu.cn.

Acknowledgment

Our code of MobileNet-V3 and EfficientNet is from here. Our code of RegNet is from here.

To Do

Update the code for training and visualizing.

About

A general framework for inferring CNNs efficiently. Reduce the inference latency of MobileNet-V3 by 20% on an iPhone XS Max without sacrificing accuracy.

Resources

Releases

No releases published

Packages

No packages published
You can’t perform that action at this time.