Make tutorial for enabling different learning rates #1183

Open
2 tasks done
dfdazac opened this issue Dec 7, 2022 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

dfdazac commented Dec 7, 2022

Problem Statement

First of all, thanks for the great work with the library!
It would be very useful to be able to specify different learning rates for different parts of the model. Right now, when running a pipeline, an instance of the optimizer is created by passing all of the model's trainable parameters:

optimizer_instance = optimizer_resolver.make(
    optimizer,
    optimizer_kwargs,
    params=model_instance.get_grad_params(),
)

However, in some cases we might also want to apply per-parameter options, for example:

optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
], lr=1e-2, momentum=0.9)

Describe the solution you'd like

A possible solution could be an optional argument passed when creating the pipeline, e.g. optimizer_params. If it's not provided, the pipeline would default to the current behavior; otherwise, the user could choose different learning rates for modules in a custom model:

optimizer_instance = optimizer_resolver.make(
    optimizer,
    optimizer_kwargs,
    params=optimizer_params if optimizer_params else model_instance.get_grad_params(),
)

Describe alternatives you've considered

I tried getting access to the optimizer via a TrainingCallback, and I considered modifying the learning rate for different modules in the pre_step method:

from pykeen.training.callbacks import TrainingCallback

class MultiLearningRateCallback(TrainingCallback):
    ...

    def pre_step(self, **kwargs):
        # Here we have access to the optimizer via self.optimizer
        ...

The problem is that at this point the optimizer has already been initialized and has been assigned Parameters, which are difficult to map to the original modules.
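
For illustration only (this helper is hypothetical and not part of PyKEEN), the mapping can be attempted with plain torch objects by matching parameter identities against named_parameters(); even with such a helper, regrouping the optimizer's already-constructed param_groups would still have to be done by hand:

from torch import nn, optim


def describe_param_groups(model: nn.Module, optimizer: optim.Optimizer) -> list[list[str]]:
    """Map the parameters in each optimizer param group back to their dotted names in the model."""
    # Parameters are only comparable by identity, so build an id -> name lookup first.
    id_to_name = {id(parameter): name for name, parameter in model.named_parameters()}
    return [
        [id_to_name.get(id(parameter), "<unknown>") for parameter in group["params"]]
        for group in optimizer.param_groups
    ]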

Additional information

No response

Issue Template Checks

  • This is not a bug report (use a different issue template if it is)
  • This is not a question (use the discussions forum instead)
dfdazac added the enhancement (New feature or request) label Dec 7, 2022
cthoyt (Member) commented Dec 7, 2022

I'm hesitant about this because the built-in pipeline is only supposed to cover the simplest use cases. Every addition makes it more difficult to maintain, to document, and to learn. Further, I don't see any obvious, simple way to configure this from a high level.

As an alternative, it's possible to roll your own pipeline that does exactly what you want. I'd suggest checking out https://pykeen.readthedocs.io/en/stable/tutorial/first_steps.html#beyond-the-pipeline for how to do that.
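
For reference, here is a minimal sketch of that route, loosely following the beyond-the-pipeline tutorial. The Nations/TransE choice is just an example, and the entity_representations/relation_representations attributes are assumptions that may be named differently in a custom model:

from torch.optim import SGD

from pykeen.datasets import Nations
from pykeen.models import TransE
from pykeen.training import SLCWATrainingLoop

# Assemble the components that pipeline() would otherwise create for you.
dataset = Nations()
model = TransE(triples_factory=dataset.training)

# Per-parameter options, as in the plain PyTorch example above: entity embeddings
# use the default learning rate, relation embeddings use a smaller one.
optimizer = SGD(
    [
        {"params": model.entity_representations[0].parameters()},
        {"params": model.relation_representations[0].parameters(), "lr": 1e-3},
    ],
    lr=1e-2,
    momentum=0.9,
)

training_loop = SLCWATrainingLoop(
    model=model,
    triples_factory=dataset.training,
    optimizer=optimizer,
)
training_loop.train(triples_factory=dataset.training, num_epochs=5, batch_size=256)

Because the optimizer is constructed outside of the training loop here, any torch.optim per-parameter configuration can be used unchanged.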

dfdazac (Author) commented Dec 8, 2022

Thank you @cthoyt, I understand. I wanted to figure out if there was an alternative, because we are using so many useful parts of the built-in pipeline right now. I'll give it a try; please feel free to close this issue if you think there's nothing further to discuss.

cthoyt (Member) commented Dec 8, 2022

@dfdazac if you create a minimal working example, we would love to include it in the documentation. Do you think you could do the following:

  1. Write 1-2 sentences about why you would want per-parameter options in KGEM training (beyond just more configurability, what's a concrete scenario where this would actually be helpful?)
  2. Give an end-to-end code example, maybe based on the beyond-the-pipeline section, that includes your updates

You could make your own RST document in https://github.com/pykeen/pykeen/tree/master/docs/source/tutorial in a PR that includes this.

cthoyt changed the title from "Enabling different learning rates" to "Make tutorial for enabling different learning rates" Dec 8, 2022
cthoyt added the documentation (Improvements or additions to documentation) label and removed the enhancement (New feature or request) label Dec 8, 2022
mberr (Member) commented Jan 9, 2023

I might be late to the party, but another option (still quite hacky) would be to create a custom subclass of Optimizer and register it with the resolver:

from collections.abc import Iterable

from pykeen.optimizers import optimizer_resolver
from torch import nn, optim


class ModifiedSGD(optim.SGD):
    def __init__(
        self,
        params: Iterable[nn.Parameter],
        custom_lrs: list[tuple[list[nn.Parameter], float]],
        **kwargs,
    ):
        # collect the ids of all parameters that receive a custom learning rate
        custom_param_ids = {id(p) for custom_params, _ in custom_lrs for p in custom_params}
        # everything else stays in the default parameter group
        default_params = [p for p in params if id(p) not in custom_param_ids]
        super().__init__(
            params=[
                {"params": default_params},
                *({"params": custom_params, "lr": custom_lr} for custom_params, custom_lr in custom_lrs),
            ],
            **kwargs,
        )


optimizer_resolver.register(ModifiedSGD)

You can now use this optimizer with the pipeline:

from pykeen.pipeline import pipeline

pipeline(
    optimizer=ModifiedSGD,
    optimizer_kwargs=dict(
        custom_lrs=[(list(model.classifier.parameters()), 1e-3)],
        lr=1e-2,
        momentum=0.9,
    ),
    ...
)
