Advanced-optimizer-with-Gradient-Centralization

Advanced optimizers with Gradient Centralization (GC). Please refer to the Gradient Centralization paper for details.

Introduction

We embed GC into several advanced DNN optimizers, including SGD (SGD.py), Adam (Adam.py), AdamW, RAdam, Lookahead+SGD (Lookahead+SGD.py), Lookahead+Adam (Lookahead+Adam.py), and Ranger.

There are three hyper-parameters: use_gc, gc_conv_only, and gc_loc.

use_gc: if True, the optimizer applies the GC operation; if False, it does not.

gc_conv_only: if True, the optimizer applies GC only to Conv layers; if False, to both Conv and FC layers.

gc_loc: controls where the GC operation is applied in adaptive learning rate algorithms such as Adam, RAdam, and Ranger. There are two possible locations: on the original gradient, or on the generalized gradient, i.e., the variable that is directly used to update the weight. For adaptive learning rate algorithms, we suggest gc_loc=False. For SGD the two locations are equivalent, so we do not introduce the gc_loc hyper-parameter there.
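The GC operation itself is simple: for each multi-dimensional weight gradient, subtract the mean computed over all axes except the output-channel axis. A minimal sketch in NumPy (the repository's actual optimizers are PyTorch; the function name here is illustrative, not from the repo):

```python
import numpy as np

def centralize_gradient(grad, gc_conv_only=False):
    """Subtract the per-output-channel mean from a weight gradient.

    Conv kernels are 4D (out_ch, in_ch, kH, kW), FC weights are 2D
    (out_features, in_features); biases and other 1D parameters are
    left unchanged in either mode.
    """
    min_ndim = 4 if gc_conv_only else 2  # Conv-only vs. Conv and FC
    if grad.ndim >= min_ndim:
        reduce_axes = tuple(range(1, grad.ndim))
        grad = grad - grad.mean(axis=reduce_axes, keepdims=True)
    return grad
```

After centralization, the gradient slice for each output channel has zero mean, which is the invariant GC enforces.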

We also give an example of how to use these algorithms on CIFAR. For example:

# SGD
optimizer = SGD(net.parameters(), lr=args.lr, momentum=0.9, weight_decay=args.weight_decay, use_gc=True, gc_conv_only=False)
# Adam
optimizer = Adam(net.parameters(), lr=args.lr, weight_decay=args.weight_decay, use_gc=True, gc_conv_only=False, gc_loc=False)
# RAdam
optimizer = RAdam(net.parameters(), lr=args.lr, weight_decay=args.weight_decay, use_gc=True, gc_conv_only=False, gc_loc=False)
# Lookahead + SGD
base_opt = SGD(net.parameters(), lr=args.lr, momentum=0.9, weight_decay=args.weight_decay, use_gc=False, gc_conv_only=False)
optimizer = Lookahead(base_opt, k=5, alpha=0.5)
# Ranger
optimizer = Ranger(net.parameters(), lr=args.lr, weight_decay=args.weight_decay, use_gc=True, gc_conv_only=False, gc_loc=False)
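To make the gc_loc distinction concrete, here is a simplified single-tensor Adam step in NumPy showing the two places GC can be applied. This is an illustrative sketch, not the repository's PyTorch implementation; the function name and signature are hypothetical:

```python
import numpy as np

def gc(x):
    # Centralize multi-dimensional gradients over all but the first axis.
    if x.ndim > 1:
        x = x - x.mean(axis=tuple(range(1, x.ndim)), keepdims=True)
    return x

def adam_step_with_gc(w, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999),
                      eps=1e-8, gc_loc=False):
    """One bias-corrected Adam step with GC at one of two locations."""
    if gc_loc:
        grad = gc(grad)  # location 1: GC on the original gradient
    m = betas[0] * m + (1 - betas[0]) * grad
    v = betas[1] * v + (1 - betas[1]) * grad ** 2
    m_hat = m / (1 - betas[0] ** t)
    v_hat = v / (1 - betas[1] ** t)
    update = m_hat / (np.sqrt(v_hat) + eps)
    if not gc_loc:
        update = gc(update)  # location 2: GC on the generalized gradient
    return w - lr * update, m, v
```

With gc_loc=False (the suggested setting for adaptive optimizers), centralization happens after the second-moment normalization, so the final weight update itself has zero mean per output channel.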

