You can use a staged learning rate, i.e. a different learning rate for different layers. This is managed by three arguments: (a) `--staged-lr`: set this flag to enable the staged learning rate; (b) `--new-layers`: a list of layer names (strings); the listed layers use the default learning rate while the remaining (base) layers use a scaled one; (c) `--base-lr-mult`: the learning rate multiplier for the base layers. For example, when you train `resnet50`, if you want the randomly initialized `self.classifier` to use `--lr 0.1` and the rest to use a learning rate scaled by 0.1, add `--staged-lr --new-layers classifier --base-lr-mult 0.1` to the argument list. See here for more details.
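For intuition, here is a minimal sketch of how such staged grouping could be done with PyTorch optimizer parameter groups. `Net` and `build_optimizer` are hypothetical names for illustration, not the repo's actual code; only the quantities they mirror (`--lr`, `--new-layers`, `--base-lr-mult`) come from the argument list above.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # stand-in for a pretrained backbone
        self.base = nn.Sequential(
            nn.Conv2d(3, 64, 3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # randomly initialized head, analogous to self.classifier in resnet50
        self.classifier = nn.Linear(64, num_classes)

def build_optimizer(model, lr=0.1, base_lr_mult=0.1, new_layers=('classifier',)):
    base_params, new_params = [], []
    for name, param in model.named_parameters():
        # group by top-level module name: new layers keep the default lr,
        # all remaining (base) layers get lr scaled by base_lr_mult
        if name.split('.')[0] in new_layers:
            new_params.append(param)
        else:
            base_params.append(param)
    return torch.optim.SGD(
        [{'params': base_params, 'lr': lr * base_lr_mult},
         {'params': new_params, 'lr': lr}],
        lr=lr, momentum=0.9,
    )

optimizer = build_optimizer(Net(), lr=0.1, base_lr_mult=0.1)
```

With these settings the classifier trains at 0.1 while the base layers train at 0.01, which is the usual pattern when fine-tuning a pretrained backbone with a fresh head.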
In addition, when you do `--load-weights path_to_pth`, `load_pretrained_weights()` can handle keys prefixed with `module.`, i.e. weights previously saved with `nn.DataParallel`. Check the code for more details.
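The core idea is just key renaming before loading. The sketch below assumes a hypothetical `load_weights(model, weight_path)` helper; the repo's actual `load_pretrained_weights()` may do more (e.g. skipping mismatched layers), so check the code.

```python
from collections import OrderedDict

import torch

def load_weights(model, weight_path):
    checkpoint = torch.load(weight_path, map_location='cpu')
    # checkpoints may store weights under 'state_dict' or be a raw state dict
    state_dict = checkpoint['state_dict'] if 'state_dict' in checkpoint else checkpoint
    # strip the "module." prefix that nn.DataParallel prepends to every key
    cleaned = OrderedDict(
        (k[len('module.'):] if k.startswith('module.') else k, v)
        for k, v in state_dict.items()
    )
    model.load_state_dict(cleaned)
```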
Updates: `--augdata-re`.