Skip to content

LR Finder doesn't restore original model weights? #11

@AmarSaini

Description

@AmarSaini

Hey! I love this repo, thanks for making it 💯

Everything works well except for one thing, after some digging around/experimenting, here's what I've found:

Below are some figures for the training loss and training accuracy (on MNIST, using a resnet18).

Problem:

  1. Using LRFinder on a model, and then training with it afterwards appears to hurt the models learning (see pink curve below).

Solution:

  1. Using LRFinder on a model, and manually restoring the weights, appears to train the model optimally. (see green curve below).
  2. Using LRFinder on a clone of the model, and then using the original model for training, appears to train the model optimally. (see green curve below).

Regarding the figure/graphs below, both models used the same hyperparameters.

An in-code example of option 1) would be similar to what was given in the README.md:

from torch_lr_finder import LRFinder

model = ...
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-7, weight_decay=1e-2)
lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)
lr_finder.plot()

// Then use "model" for training

An in-code example of option 3) would be:

from torch_lr_finder import LRFinder

model = ...
temp_model = *create model with same architecture*
// copy weights over
temp_model.load_state_dict(model.state_dict)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-7, weight_decay=1e-2)
// use temp model in lr_finder
lr_finder = LRFinder(temp_model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)
lr_finder.plot()

image

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions