Reusing Deep Neural Network Models through Model Re-engineering (ICSE'23)


Abstract

Training deep neural network (DNN) models, which has become an important task in today's software development, is often costly in terms of computational resources and time. Inspired by software reuse, building DNN models by reusing existing ones has gained increasing attention recently. Prior approaches to DNN model reuse have two main limitations: 1) reusing the entire model when only a small part of its functionalities (labels) is required causes considerable overhead (e.g., computational and time costs for inference), and 2) model reuse inherits the defects and weaknesses of the reused model, and hence exposes the new system to security attacks. To address these problems, we propose SeaM, a tool that re-engineers a trained DNN model to improve its reusability. Specifically, given a target problem and a trained model, SeaM uses a gradient-based search method to find the model's weights that are relevant to the target problem. The re-engineered model, which retains only the relevant weights, is then reused to solve the target problem. Evaluation results on widely-used models show that the re-engineered models produced by SeaM contain only 10.11% of the weights of the original models, resulting in a 42.41% reduction in inference time. For the target problem, the re-engineered models even outperform the original models in classification accuracy by 5.85%. Moreover, reusing the re-engineered models inherits an average of 57% fewer defects than reusing the entire model. We believe our approach to reducing reuse overhead and defect inheritance is an important step forward for practical model reuse.
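To make the idea of a gradient-based search over weights concrete, the toy sketch below learns a soft mask over the weights of a fixed logistic model while penalizing the number of kept weights. It illustrates a generic learned-mask technique under stated assumptions and is not SeaM's implementation: the function `search_mask`, the sigmoid relaxation, the loss (mean cross-entropy plus `alpha` times the mean mask value), and the reuse of the names `lr_mask` and `alpha` for the mask learning rate and the penalty weight are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def search_mask(weights, samples, lr_mask=1.0, alpha=0.1, steps=500):
    """Learn a binary keep/remove mask over the fixed weights of a toy logistic model.

    Hypothetical loss: mean cross-entropy + alpha * mean(sigmoid(score)).
    Only the mask scores are updated; the trained weights stay untouched.
    """
    n = len(weights)
    scores = [0.5] * n  # one real-valued mask score per weight
    for _ in range(steps):
        soft = [sigmoid(m) for m in scores]
        # Gradient of the sparsity penalty with respect to each soft mask value.
        grad_soft = [alpha / n] * n
        for x, y in samples:
            z = sum(w * s * xi for w, s, xi in zip(weights, soft, x))
            err = sigmoid(z) - y  # derivative of cross-entropy w.r.t. z
            for j in range(n):
                grad_soft[j] += err * weights[j] * x[j] / len(samples)
        for j in range(n):
            # Chain rule through the sigmoid relaxation of the mask.
            scores[j] -= lr_mask * grad_soft[j] * soft[j] * (1.0 - soft[j])
    # Binarize: keep a weight only if its score ended up positive.
    return [1 if m > 0 else 0 for m in scores]
```

In this toy setting the first input feature determines the label and the second is noise, so calling `search_mask([4.0, 0.5], samples)` on such data keeps the first weight and removes the second. A larger `alpha` pushes harder toward removing weights.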

Requirements

  • advertorch 0.2.3
  • fvcore 0.1.5.post20220512
  • matplotlib 3.4.2
  • numpy 1.19.2
  • python 3.8.10
  • pytorch 1.8.1
  • torchvision 0.9.0
  • tqdm 4.61.0
  • GPU with CUDA support is also needed
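
One way to confirm that an environment matches the versions listed above is to compare them against the installed distributions. The following is a minimal sketch and not part of the SeaM repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. Local build suffixes such as `+cu111` are ignored when comparing.

```python
from importlib import metadata

# Versions copied from the requirements list; "torch" is the pip name of PyTorch.
PINNED = {
    "advertorch": "0.2.3",
    "fvcore": "0.1.5.post20220512",
    "matplotlib": "3.4.2",
    "numpy": "1.19.2",
    "torch": "1.8.1",
    "torchvision": "0.9.0",
    "tqdm": "4.61.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each mismatching package to (wanted, found); found is None if absent."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Drop local build tags such as "+cu111" before comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` returns an empty dictionary when every pinned package is installed at the listed version.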

Structure of the directories

  |--- README.md                        :  user guidance
  |--- data/                            :  experimental data
  |--- src/                             :  source code of our work
  |------ global_config.py              :  setting the path
  |------ binary_class/                 :  direct reuse on binary classification problems
  |--------- model_reengineering.py     :  re-engineering a trained model and then reusing the re-engineered model
  |--------- calculate_flop.py          :  calculating the number of FLOPs required by reusing the re-engineered and original models
  |--------- calculate_time_cost.py     :  calculating the inference time required by reusing the re-engineered and original models
  |--------- ......                 
  |------ multi_class/                  :  direct reuse on multi-class classification problems
  |--------- ...... 
  |------ defect_inherit/               :  indirect reuse 
  |--------- reengineering_finetune.py  :  re-engineering a trained model and then fine-tuning the re-engineered model
  |--------- standard_finetune.py       :  using standard fine-tuning approach to fine-tune a trained model
  |--------- eval_robustness.py         :  calculating the defect inheritance rate
  |--------- ......

The following sections describe how to reproduce the experimental results in our paper.

Downloading experimental data

  1. We provide the trained models and datasets used in the experiments, as well as the corresponding re-engineered models.
    One can download data/ from here and then move it to SeaM/.
    The trained models will be downloaded automatically by PyTorch when running our project. If the download fails, please move the trained models we provide to the folder named in the error message printed by PyTorch.
    Due to the huge size of ImageNet, please download it from its webpage.
  2. Modify self.root_dir in src/global_config.py.
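
Before running the experiments, it can help to check that the directory chosen as the root actually contains the downloaded `data/` folder next to `src/`. The sketch below is a hypothetical helper, not code from the repository; the expected sub-directory names are taken from the directory listing in this README, and `missing_subdirs` is an illustrative name.

```python
from pathlib import Path

# Sub-directories taken from the directory listing in this README.
EXPECTED_SUBDIRS = ("data", "src")


def missing_subdirs(root_dir):
    """Return the expected sub-directories that are absent under root_dir."""
    root = Path(root_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list means both folders are in place under the directory you plan to set as `self.root_dir`.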

Direct model reuse

Re-engineering on binary classification problems

  1. Go to the directory of experiments related to the binary classification problems.
     cd src/binary_class
  2. Re-engineer VGG16-CIFAR10 on a binary classification problem.
     python model_reengineering.py --model vgg16 --dataset cifar10 --target_class 0 --lr_mask 0.01 --alpha 1
  3. Compute the number of FLOPs required by the original and re-engineered VGG16-CIFAR10, respectively. This command also reports the accuracy of the models.
     python calculate_flop.py --model vgg16 --dataset cifar10 --target_class 0 --lr_mask 0.01 --alpha 1
  4. Compute the inference time cost of the original and re-engineered VGG16-CIFAR10, respectively. This command also reports the number of weights of each model.
     python calculate_time_cost.py --model vgg16 --dataset cifar10 --target_class 0 --lr_mask 0.01 --alpha 1
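
The three commands above share the same arguments, so they can be chained from a small driver. This is a sketch under the assumption that the scripts are run from `src/binary_class` with exactly the flags shown above; `build_commands` and `run_all` are hypothetical helpers and not part of the repository.

```python
import subprocess
import sys

# Script names taken from the steps above, in the order they are run.
SCRIPTS = ["model_reengineering.py", "calculate_flop.py", "calculate_time_cost.py"]


def build_commands(model, dataset, target_class, lr_mask, alpha):
    """Build one command line per script, all sharing the same arguments."""
    shared = [
        "--model", model,
        "--dataset", dataset,
        "--target_class", str(target_class),
        "--lr_mask", str(lr_mask),
        "--alpha", str(alpha),
    ]
    return [[sys.executable, script] + shared for script in SCRIPTS]


def run_all(commands, cwd="src/binary_class"):
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

For example, `run_all(build_commands("vgg16", "cifar10", 0, 0.01, 1))` reproduces the VGG16-CIFAR10 sequence from the repository root.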

Re-engineering on multi-class classification problems

  1. Go to the directory of experiments related to the multi-class classification problems.
     cd src/multi_class
  2. Re-engineer ResNet20-CIFAR100 on a multi-class classification problem.
     python model_reengineering.py --model resnet20 --dataset cifar100 --target_superclass_idx 0 --lr_mask 0.1 --alpha 2
  3. Compute the number of FLOPs required by the original and re-engineered ResNet20-CIFAR100, respectively. This command also reports the accuracy of the models.
     python calculate_flop.py --model resnet20 --dataset cifar100 --target_superclass_idx 0 --lr_mask 0.1 --alpha 2
  4. Compute the inference time cost of the original and re-engineered ResNet20-CIFAR100, respectively. This command also reports the number of weights of each model.
     python calculate_time_cost.py --model resnet20 --dataset cifar100 --target_superclass 0 --lr_mask 0.1 --alpha 2

NOTE: When computing the time cost for inference, DeepSparse runs a model on several CPUs. Other active processes can interfere with the inference process and cause the measured inference time to fluctuate. In our experiments, we manually killed as many other processes as possible so that the inference process could occupy the CPUs exclusively.
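
When other processes cannot be stopped completely, repeating the measurement and reporting a robust statistic is a common way to dampen such fluctuations. The sketch below is a generic timing pattern and does not describe what `calculate_time_cost.py` does; `time_inference` and the `run_once` callable (standing in for a single inference call) are hypothetical.

```python
import statistics
import time


def time_inference(run_once, warmup=3, repeats=20):
    """Return (median, population standard deviation) of run_once's duration in seconds."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        run_once()
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations), statistics.pstdev(durations)
```

A large standard deviation relative to the median suggests that the machine was not idle during the measurement.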

Indirect model reuse

  1. Go to the directory of experiments related to defect inheritance.
     cd src/defect_inherit
  2. Re-engineer ResNet18-ImageNet and then fine-tune the re-engineered model on the target dataset Scenes.
     python reengineering_finetune.py --model resnet18 --dataset mit67 --lr_mask 0.05 --alpha 0.5 --prune_threshold 0.6
  3. Compute the defect inheritance rate of the fine-tuned re-engineered ResNet18-Scenes.
     python eval_robustness.py --model resnet18 --dataset mit67 --eval seam --lr_mask 0.05 --alpha 0.5 --prune_threshold 0.6
  4. Fine-tune the original ResNet18-ImageNet on the target dataset Scenes.
     python standard_finetune.py --model resnet18 --dataset mit67
  5. Compute the defect inheritance rate of the fine-tuned original ResNet18-Scenes.
     python eval_robustness.py --model resnet18 --dataset mit67 --eval standard

Supplementary experimental results

The values of major parameters

The following table shows the default hyperparameters. The settings used to re-engineer each trained model on each target problem can be read from the names of the experimental result files.
For instance, the values of learning rate and alpha for the re-engineered model file SeaM/data/multi_class_classification/resnet20_cifar100/predefined/tsc_0/lr_head_mask_0.1_0.05_alpha_1.0.pth are 0.05 and 1.0, respectively.
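
Reading these settings off many result files by hand is error-prone, so a small parser can help. The sketch below assumes that every result file follows the pattern of the example above; the regular expression, the function `parse_hyperparameters`, and the key names are hypothetical. In particular, labeling the first number as `lr_head` is a guess from the file name, since the text above only identifies the second number as the learning rate and the last as alpha.

```python
import re

# Pattern inferred from the single example file name in this README.
NUMBER = r"\d+(?:\.\d+)?"
PATTERN = re.compile(
    rf"lr_head_mask_(?P<lr_head>{NUMBER})_(?P<lr_mask>{NUMBER})_alpha_(?P<alpha>{NUMBER})\.pth$"
)


def parse_hyperparameters(path):
    """Extract the numeric settings encoded in a result file name."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError(f"unrecognized result file name: {path}")
    return {key: float(value) for key, value in match.groupdict().items()}
```

Applied to the example path above, the parser yields a learning rate (`lr_mask`) of 0.05 and an alpha of 1.0, matching the values stated in the text.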

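| Target Problem             | Model Name        | Learning Rate | Alpha |
|----------------------------|-------------------|---------------|-------|
| Binary Classification      | VGG16-CIFAR10     | 0.01          | 1.00  |
| Binary Classification      | VGG16-CIFAR100    | 0.05          | 1.50  |
| Binary Classification      | ResNet20-CIFAR10  | 0.05          | 1.00  |
| Binary Classification      | ResNet20-CIFAR100 | 0.12          | 1.50  |
| Multi-class Classification | ResNet20-CIFAR100 | 0.10          | 2.00  |
| Multi-class Classification | ResNet50-ImageNet | 0.05          | 2.00  |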

The impact of reducing the number of weights on ACC and DIR (for RQ3)

We investigate the impact of the reduction in the number of weights on the accuracy (ACC) and the defect inheritance rate (DIR). A threshold $t$ is used to stop model re-engineering early once the rate of removed weights reaches the threshold. The following figure shows the ACC and DIR of the fine-tuned ResNet18 with different values of $t$, where $t=0.3, 0.4, 0.5, 0.6, 0.7$. We find that, as the number of weights decreases, the DIR drops significantly, while the ACC remains stable overall. The results demonstrate the effectiveness of model re-engineering in reducing the DIR and the robustness of SeaM.

[Figure: ACC and DIR of the fine-tuned ResNet18 for $t = 0.3, 0.4, 0.5, 0.6, 0.7$]
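
The early-stopping rule described above can be sketched as a check on the fraction of removed weights. This is an illustration under assumptions, not the repository's code: the mask is represented as nested lists of 0/1 values (one list per layer), and `removed_rate` and `should_stop` are hypothetical names.

```python
def removed_rate(mask):
    """Fraction of weights whose mask bit is 0, across all layers."""
    flat = [bit for layer in mask for bit in layer]
    return flat.count(0) / len(flat)


def should_stop(mask, t):
    """Stop re-engineering once the removed-weight rate reaches the threshold t."""
    return removed_rate(mask) >= t
```

With a mask such as `[[1, 0, 0], [0, 1]]`, three of five weights are removed, so the check triggers for `t = 0.6` but not for `t = 0.7`.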
