Deep Networks with Fast Retraining

Abstract:

Recent work [1] used the Moore-Penrose (MP) inverse in deep convolutional neural network (DCNN) training and achieved better generalization than DCNNs trained with a stochastic gradient descent (SGD) pipeline. However, the MP technique cannot run in a GPU environment because of its high computational demands. This paper proposes a fast DCNN learning strategy with the MP inverse that improves testing performance without adding a large computational burden. We achieve this through a two-stage training procedure based on SGD and the MP inverse. In each training epoch, a random learning strategy controls the number of convolutional layers trained in the backward pass, and an MP inverse-based batch-by-batch learning strategy refines the parameters of the dense layer while allowing the network to run with GPU acceleration. Through experiments on image classification datasets with training sets ranging from 3,060 images (Caltech101) to 1,803,460 images (Places365), we empirically demonstrate that fast retraining is a unified strategy that can be applied to all DCNNs. Our method improves Top-1 testing accuracy by up to 1% over the state-of-the-art DCNN learning pipeline and reduces training time by 15% to 25% compared with the work in [1].

[1] Yimin Yang, Q. M. Jonathan Wu, et al., “Recomputation of dense layers for the performance improvement of DCNN,” IEEE Trans. Pattern Anal. Mach. Intell., 2019.

Full-text

arXiv: Deep Networks with Fast Retraining

Learning Structure:

Step 1 - Random learning with SGD. In each epoch, La convolutional layers are randomly activated, and the remaining Li convolutional layers are excluded from the backward pass. La and Li are determined by a predefined hyperparameter ra.
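The per-epoch layer selection in Step 1 can be sketched as follows. This is a minimal illustration, not the released implementation: the function name `select_active_layers`, the uniform sampling, and the mapping La = round(ra * L) (clamped so that at least one layer stays active) are all assumptions, since this README does not state how ra determines La and Li.

```python
import random


def select_active_layers(num_conv_layers, ra, rng=None):
    """Pick the convolutional layers that take part in the backward pass.

    Assumption (not from the paper): La = round(ra * L), clamped to [1, L],
    and the La layers are drawn uniformly at random without replacement.
    Returns (active, inactive) as sorted lists of layer indices.
    """
    rng = rng or random.Random()
    la = max(1, min(num_conv_layers, round(ra * num_conv_layers)))
    active = sorted(rng.sample(range(num_conv_layers), la))
    active_set = set(active)
    inactive = [i for i in range(num_conv_layers) if i not in active_set]
    return active, inactive


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    for epoch in range(3):
        active, inactive = select_active_layers(10, 0.6, rng)
        # In a real training loop, the layers in `inactive` would be frozen
        # (no gradient update) for this epoch.
        print(f"epoch {epoch}: La={len(active)} Li={len(inactive)} active={active}")
```

Drawing a fresh subset every epoch means each convolutional layer is still updated over the course of training, while each individual backward pass touches fewer layers.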

Step 2 - Retraining with an MP inverse-based batch-by-batch strategy. $\eta^n$ and $\eta^{n-1}$ are obtained by Procedure I, while $e^{n-1}$ and $e^{n-2}$ are obtained via Procedure II. Details of Procedures I and II are given in Algorithm 1.
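For intuition only, the idea of solving a dense layer with the MP inverse one batch at a time can be illustrated with the standard regularized recursive least-squares update. This is a generic textbook formulation, not Algorithm 1 of the paper, and it does not model the quantities $\eta$ and $e$ above. The symbols are assumptions introduced here: $H_n$ is the dense-layer input for batch $n$, $T_n$ the corresponding targets, $\beta_n$ the dense-layer weights, and $C$ a regularization constant.

$$
\begin{aligned}
K_0 &= \tfrac{1}{C} I, \qquad \beta_0 = 0, \\
K_n &= K_{n-1} + H_n^{\top} H_n, \\
\beta_n &= \beta_{n-1} + K_n^{-1} H_n^{\top} \left( T_n - H_n \beta_{n-1} \right).
\end{aligned}
$$

After $N$ batches this gives $\beta_N = \left( \tfrac{1}{C} I + \sum_{n=1}^{N} H_n^{\top} H_n \right)^{-1} \sum_{n=1}^{N} H_n^{\top} T_n$, which is the regularized least-squares solution over all data. Only one batch needs to be held in memory at a time, which is what makes a batch-by-batch scheme compatible with limited GPU memory.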

Downloads:

Caltech-256

Caltech-256 dataset: Caltech-256
Code (SGD): Code-for-Caltech
Code (Adam): Code-for-R-Caltech

Dependencies:

MATLAB R2020a
A workstation with 256 GB of memory and an E5-2650 processor

Reproduce the Experimental Results

In "Demo0.zip", run one of the following scripts:
"main_ResNet_ori.m": original SGD optimization
"main_ResNet_ori_RL.m": SGD plus random convolutional learning
"main_ResNet_FR.m": Yang's retraining strategy [1]
"main_ResNet_FR_RL.m": the proposed fast retraining algorithm (random SGD plus batch-by-batch MP)

Note: The code is currently released in content-obscured form (MATLAB P-files). If the manuscript is accepted, the source code will be made public.

