Support discriminative learning with OptimWrapper #2829

Closed
KushajveerSingh opened this issue Sep 23, 2020 · 9 comments

@KushajveerSingh
Contributor

Currently, the following code gives an error:

from fastai.vision.all import *

def SGD_opt(params, **kwargs): return OptimWrapper(torch.optim.SGD(params, **kwargs))

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=SGD_opt)
learn.fit_one_cycle(1)

The error is as follows:

TypeError                                 Traceback (most recent call last)
<ipython-input-133-20a3ebb82957> in <module>
     10     label_func=is_cat, item_tfms=Resize(224))
     11 
---> 12 learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=SGD_opt)
     13 learn.fit_one_cycle(1)

~/miniconda3/lib/python3.8/site-packages/fastcore/logargs.py in _f(*args, **kwargs)
     50         log_dict = {**func_args.arguments, **{f'{k} (not in signature)':v for k,v in xtra_kwargs.items()}}
     51         log = {f'{f.__qualname__}.{k}':v for k,v in log_dict.items() if k not in but}
---> 52         inst = f(*args, **kwargs) if to_return else args[0]
     53         init_args = getattr(inst, 'init_args', {})
     54         init_args.update(log)

~/miniconda3/lib/python3.8/site-packages/fastai/vision/learner.py in cnn_learner(dls, arch, loss_func, pretrained, cut, splitter, y_range, config, n_out, normalize, **kwargs)
    175     model = create_cnn_model(arch, n_out, ifnone(cut, meta['cut']), pretrained, y_range=y_range, **config)
    176     learn = Learner(dls, model, loss_func=loss_func, splitter=ifnone(splitter, meta['split']), **kwargs)
--> 177     if pretrained: learn.freeze()
    178     return learn
    179 

~/miniconda3/lib/python3.8/site-packages/fastai/learner.py in freeze(self)
    513 
    514 @patch
--> 515 def freeze(self:Learner): self.freeze_to(-1)
    516 
    517 @patch

~/miniconda3/lib/python3.8/site-packages/fastai/learner.py in freeze_to(self, n)
    508 @patch
    509 def freeze_to(self:Learner, n):
--> 510     if self.opt is None: self.create_opt()
    511     self.opt.freeze_to(n)
    512     self.opt.clear_state()

~/miniconda3/lib/python3.8/site-packages/fastai/learner.py in create_opt(self)
    139     def _bn_bias_state(self, with_bias): return norm_bias_params(self.model, with_bias).map(self.opt.state)
    140     def create_opt(self):
--> 141         self.opt = self.opt_func(self.splitter(self.model), lr=self.lr)
    142         if not self.wd_bn_bias:
    143             for p in self._bn_bias_state(True ): p['do_wd'] = False

<ipython-input-133-20a3ebb82957> in SGD_opt(params, **kwargs)
      1 from fastai.vision.all import *
      2 
----> 3 def SGD_opt(params, **kwargs): return OptimWrapper(torch.optim.SGD(params, **kwargs))
      4 
      5 path = untar_data(URLs.PETS)/'images'

~/miniconda3/lib/python3.8/site-packages/torch/optim/sgd.py in __init__(self, params, lr, momentum, dampening, weight_decay, nesterov)
     66         if nesterov and (momentum <= 0 or dampening != 0):
     67             raise ValueError("Nesterov momentum requires a momentum and zero dampening")
---> 68         super(SGD, self).__init__(params, defaults)
     69 
     70     def __setstate__(self, state):

~/miniconda3/lib/python3.8/site-packages/torch/optim/optimizer.py in __init__(self, params, defaults)
     49 
     50         for param_group in param_groups:
---> 51             self.add_param_group(param_group)
     52 
     53     def __getstate__(self):

~/miniconda3/lib/python3.8/site-packages/torch/optim/optimizer.py in add_param_group(self, param_group)
    208         for param in param_group['params']:
    209             if not isinstance(param, torch.Tensor):
--> 210                 raise TypeError("optimizer can only optimize Tensors, "
    211                                 "but one of the params is " + torch.typename(param))
    212             if not param.is_leaf:

TypeError: optimizer can only optimize Tensors, but one of the params is list

The error occurs because PyTorch optimizers expect the parameter list to be in the format list(dict(params=model_parameters)), whereas fastai's splitter produces a list of lists of parameters.

It was also verified that the error does not occur when discriminative learning is not used.
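
For reference, here is a minimal standalone sketch of the two formats (using a small nn.Sequential instead of the PETS model, purely for illustration):

import torch
import torch.nn as nn

m = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# fastai-style split: a list of lists of tensors, one inner list per group
fastai_groups = [list(m[0].parameters()), list(m[1].parameters())]

# PyTorch-style param groups: a list of dicts, one dict per group
pytorch_groups = [dict(params=ps) for ps in fastai_groups]

torch.optim.SGD(pytorch_groups, lr=0.1)   # works
# torch.optim.SGD(fastai_groups, lr=0.1)  # raises the TypeError shown above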

One possible solution is to update the splitter to output a list of dicts, as shown below:

def splitter(m):
    ps = L(m[0][:3], m[0][3:], m[1:]).map(params)
    # Wrap each parameter group in a dict, as torch.optim expects
    param_list = []
    for p in ps: param_list.append(dict(params=p))
    return param_list

I don't know if this is the best solution, but for now it gets the job done.

Working code

from fastai.vision.all import *

def SGD_opt(params, **kwargs): return OptimWrapper(torch.optim.SGD(params, **kwargs))

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=SGD_opt, splitter=splitter)
learn.fit_one_cycle(1)

Maybe this conversion could be made part of OptimWrapper, but I am not sure.
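
One way to sketch that idea (just an illustration, not the current API; pt_params is a made-up helper name, and it assumes the same from fastai.vision.all import * as above) is to do the conversion inside the opt_func, so existing splitters stay unchanged:

def pt_params(params):
    # pt_params is a hypothetical helper, not part of fastai.
    # Convert fastai's list-of-lists into PyTorch's list-of-dicts;
    # leave a flat list of tensors (the non-discriminative case) untouched.
    if len(params) > 0 and not isinstance(params[0], torch.Tensor):
        return [dict(params=p) for p in params]
    return params

def SGD_opt(params, **kwargs):
    return OptimWrapper(torch.optim.SGD(pt_params(params), **kwargs))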

@kevinbird15
Contributor

Here is another option, but it doesn't always work:

@delegates(torch.optim.AdamW)
def AdamW_opt(params, **kwargs): 
    param_list = []
    for p in params: param_list.append(dict(params=p))
    return OptimWrapper(torch.optim.AdamW(param_list, **kwargs))

This works fine when passed here:

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=AdamW_opt)

But when I tried the same thing with a second optimizer from PyTorch:

@delegates(torch.optim.LBFGS)
def LBFGS_opt(params, **kwargs): 
    param_list = []
    for p in params: param_list.append(dict(params=p))
    return OptimWrapper(torch.optim.LBFGS(param_list, **kwargs))

path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=LBFGS_opt)
learn._step = partial(_step_LBFGS,learn)

I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-44-c2be8a648cb1> in <module>
      6     label_func=is_cat, item_tfms=Resize(224))
      7 
----> 8 learn = cnn_learner(dls, resnet34, metrics=error_rate, opt_func=LBFGS_opt)
      9 learn._step = partial(_step_LBFGS,learn)

~/Environment_personal/development/fastcore/fastcore/logargs.py in _f(*args, **kwargs)
     50         log_dict = {**func_args.arguments, **{f'{k} (not in signature)':v for k,v in xtra_kwargs.items()}}
     51         log = {f'{f.__qualname__}.{k}':v for k,v in log_dict.items() if k not in but}
---> 52         inst = f(*args, **kwargs) if to_return else args[0]
     53         init_args = getattr(inst, 'init_args', {})
     54         init_args.update(log)

~/Environment_personal/development/fastai/fastai/vision/learner.py in cnn_learner(dls, arch, loss_func, pretrained, cut, splitter, y_range, config, n_out, normalize, **kwargs)
    175     model = create_cnn_model(arch, n_out, ifnone(cut, meta['cut']), pretrained, y_range=y_range, **config)
    176     learn = Learner(dls, model, loss_func=loss_func, splitter=ifnone(splitter, meta['split']), **kwargs)
--> 177     if pretrained: learn.freeze()
    178     return learn
    179 

~/Environment_personal/development/fastai/fastai/learner.py in freeze(self)
    513 
    514 @patch
--> 515 def freeze(self:Learner): self.freeze_to(-1)
    516 
    517 @patch

~/Environment_personal/development/fastai/fastai/learner.py in freeze_to(self, n)
    508 @patch
    509 def freeze_to(self:Learner, n):
--> 510     if self.opt is None: self.create_opt()
    511     self.opt.freeze_to(n)
    512     self.opt.clear_state()

~/Environment_personal/development/fastai/fastai/learner.py in create_opt(self)
    139     def _bn_bias_state(self, with_bias): return norm_bias_params(self.model, with_bias).map(self.opt.state)
    140     def create_opt(self):
--> 141         self.opt = self.opt_func(self.splitter(self.model), lr=self.lr)
    142         if not self.wd_bn_bias:
    143             for p in self._bn_bias_state(True ): p['do_wd'] = False

<ipython-input-34-dc56115c5594> in LBFGS_opt(params, **kwargs)
      3     param_list = []
      4     for p in params: param_list.append(dict(params=p))
----> 5     return OptimWrapper(torch.optim.LBFGS(param_list, **kwargs))

~/anaconda3/envs/fastai_dev/lib/python3.8/site-packages/torch/optim/lbfgs.py in __init__(self, params, lr, max_iter, max_eval, tolerance_grad, tolerance_change, history_size, line_search_fn)
    234 
    235         if len(self.param_groups) != 1:
--> 236             raise ValueError("LBFGS doesn't support per-parameter options "
    237                              "(parameter groups)")
    238 

ValueError: LBFGS doesn't support per-parameter options (parameter groups)

Using the splitter method, though, there are no issues.

@jph00
Member

jph00 commented Sep 23, 2020

Many thanks.

So I guess there might be two possible solutions: try to make fastai's optimizers have the same structure as PyTorch's, or else try to make OptimWrapper or something similar change the splitter, right?

@jph00 jph00 changed the title OptimWrapper cannot handle discriminative learning (cannot use pytorch optimizers) Support discriminative learning with OptimWrapper Sep 23, 2020
@KushajveerSingh
Contributor Author

KushajveerSingh commented Sep 23, 2020

I think it makes more sense to maintain consistency between PyTorch and fastai (as it might help in the long run). For the implementation, we can use this:

def l2d(ps): return L(dict(params=p) for p in ps)

def splitter(m): return l2d(L(m[0][:3], m[0][3:], m[1:]).map(params))

With this, we only need to add l2d and then wrap the existing splitters with it.
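
As a quick standalone sanity check (again a small nn.Sequential rather than the PETS model, purely for illustration), the l2d output is accepted directly by a PyTorch optimizer:

from fastai.vision.all import *

def l2d(ps): return L(dict(params=p) for p in ps)

m = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
groups = l2d(L(m[0], m[1]).map(params))  # two param groups, each a dict
opt = torch.optim.SGD(groups, lr=0.1)    # accepted, no TypeError
assert len(opt.param_groups) == 2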

@KushajveerSingh
Contributor Author

I will try to work on this tomorrow.

@KushajveerSingh
Contributor Author

@kevinbird15 Did the splitter function work for you in all cases? Some tests are failing for me. In vision.learner I cannot do learn.fit with the new splitter.

@jph00
Member

jph00 commented Nov 9, 2020

@KushajveerSingh why did you close this? I don't believe it's resolved yet.

@jph00 jph00 reopened this Nov 9, 2020
@KushajveerSingh
Contributor Author

Oh no. I was playing with gh and accidentally used the link for this issue.

@marii-moe
Collaborator

@KushajveerSingh Are you working on this? Tag me and I will assign you if so.

@KushajveerSingh
Contributor Author

Currently, I am not working on this.
