
learn.freeze_to and learn.fit_one_cycle will accept params that presuppose state of the model that is not correct #228

Closed
radekosmulski opened this issue Sep 28, 2019 · 2 comments

@radekosmulski
Contributor

The core of the issue here is 1) the lack of a mechanism to easily gauge the state of the model (number of param groups, which param groups are unfrozen), and 2) methods accepting, without complaint, params that are in some way incompatible with the model.

I don't know what is the correct fix here and this is more for me about sharing ideas and observations. Maybe 1) is completely not needed, but there were times even with fastai 1x where I would have appreciated having a mechanism that would print out to me whether param groups were frozen or not. But I think this is at best a nice to have - once everything works as intended and blows up when there are issues I don't think this will add that much value.

2 is a bigger issue. Right now, even if I have just a single param group, I can call learn.freeze_to(-2). If I don't realize that the cutting of the model didn't go as planned (as is the case right now, where the cutting doesn't seem to work), I will never be informed of the problem (I can still most likely infer that this is the case from the training time, etc., but that requires deeper understanding and paying attention).

Same for learn.fit. Right now I can call `learn.fit([1, 1, ...])` with an arbitrarily long list of lrs and the method will not complain, regardless of how many param groups there are.

Calling learn.freeze_to with an argument that is incompatible with the model should probably raise. With learn.fit and the lrs I am not sure how to handle this; the two options that come to mind are accepting a single lr, and, with multiple lrs, raising either when len(lrs) != len(param_groups) or when len(lrs) != len(trainable_param_groups).
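A rough sketch of what these two checks might look like, with illustrative names (`n_groups` stands in for the number of param groups on the learner; none of this is fastai's actual API):

```python
# Hypothetical validation sketch for the two proposals above.

def check_freeze_to(n, n_groups):
    """Raise if the freeze index refers to a nonexistent param group."""
    if not -n_groups <= n <= n_groups:
        raise ValueError(
            f"freeze_to({n}) is incompatible with a model that has "
            f"{n_groups} parameter group(s)")

def check_lrs(lrs, n_groups):
    """Accept a single lr, or require exactly one lr per param group."""
    if isinstance(lrs, (int, float)):
        return [lrs] * n_groups
    if len(lrs) != n_groups:
        raise ValueError(
            f"got {len(lrs)} learning rates for {n_groups} parameter group(s)")
    return list(lrs)

print(check_lrs(1e-3, 3))  # → [0.001, 0.001, 0.001]
```

With a single param group, `check_freeze_to(-2, 1)` would raise instead of silently succeeding, which is exactly the freeze_to(-2) case described above.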

@tianjianjiang

I have had a similar experience and ended up creating a data class to store them.
Besides the info about frozen layers and parameter groups, extensions from callbacks can be additional sources. For example, I have to remember the dynamic loss scale of fp16.
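A sketch of such a data class, with made-up field names; the fp16 loss scale is one example of callback state worth capturing alongside the frozen-group info:

```python
from dataclasses import dataclass

# Hypothetical container for training state that is otherwise scattered
# across the learner and its callbacks. Field names are illustrative.
@dataclass
class TrainState:
    n_param_groups: int
    frozen_to: int = 0                   # groups below this index are frozen
    fp16_loss_scale: float = 2.0 ** 16   # dynamic loss scale from the fp16 callback

state = TrainState(n_param_groups=3, frozen_to=2)
print(state)
```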

@sgugger
Contributor

sgugger commented Sep 30, 2019

Fixed in several ways:

  • The state of the model can be seen in learn.summary() (defined in 15). This will show which parameters are frozen and unfrozen, and contains a line `Model frozen to parameter group number xxx`
  • Trying to freeze up to a layer index bigger than the number of parameter groups will issue a warning that the whole model is frozen
  • Trying to set a hyper-parameter with a collection containing more items than there are parameter groups will raise an exception (we chose to use the number of parameter groups, not trainable parameter groups)
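The behavior of the last two points can be sketched as follows; the function and parameter names here are illustrative stand-ins, not fastai's actual implementation:

```python
import warnings

# Sketch of the fixed behavior described above (illustrative names).

def freeze_to(n, n_groups):
    """Freeze groups up to `n`; warn and clamp if `n` exceeds the model."""
    if n >= n_groups:
        warnings.warn("Freezing beyond the last parameter group: "
                      "the whole model is now frozen")
        n = n_groups
    return n  # index the model is actually frozen to

def set_hyper(name, values, n_groups):
    """Broadcast one value, or require at most one value per group."""
    if not isinstance(values, (list, tuple)):
        return [values] * n_groups
    if len(values) > n_groups:
        raise ValueError(
            f"Trying to set {len(values)} values for {name!r} but the model "
            f"has only {n_groups} parameter groups")
    return list(values)
```

So `freeze_to(5, 3)` warns and freezes everything, while `set_hyper('lr', [1, 1, 1], 2)` raises.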
