Support optimizers with different parameters #96

Merged: 6 commits into alibaba:master on May 31, 2022

Conversation

DavdGao (Collaborator) commented May 20, 2022

  • This PR solves the issue "A problem when using Adam optimizer" #91.
  • Solution
    • Specify the parameters of the local optimizer by adding new parameters under the configs cfg.optimizer and cfg.fedopt.optimizer.
    • get_optimizer is then called as follows:
    optimizer = get_optimizer(model=model, **cfg.optimizer)
  • Example:
    • Taking cfg.optimizer as an example, the original config file is as follows
    # ------------------------------------------------------------------------ #
    # Optimizer related options
    # ------------------------------------------------------------------------ #
    cfg.optimizer = CN(new_allowed=True)

    cfg.optimizer.type = 'SGD'
    cfg.optimizer.lr = 0.1
  • By setting new_allowed=True in cfg.optimizer, we allow users to add new parameters according to the type of their optimizer. For example, to use an optimizer registered as myoptimizer together with its own parameters mylr and mynorm, I only need to write the yaml file as follows, and the new parameters will be added automatically (a minimal sketch of such a get_optimizer is given after this example).
optimizer:
    type: myoptimizer
    mylr: 0.1
    mynorm: 1
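
For illustration, below is a minimal sketch of how a get_optimizer that accepts **cfg.optimizer-style keyword arguments could be implemented. It assumes PyTorch optimizers and a plain dict as the registry of custom optimizers; it is only a sketch, not the exact FederatedScope implementation.

    import torch

    # Hypothetical registry of user-defined optimizers (e.g. 'myoptimizer').
    OPTIMIZER_REGISTRY = {}

    def get_optimizer(model, type='SGD', **kwargs):
        # `type` selects the optimizer class; every remaining key of
        # cfg.optimizer (lr, weight_decay, momentum, or user-defined ones
        # such as mylr/mynorm) is forwarded as a keyword argument.
        if type in OPTIMIZER_REGISTRY:
            opt_cls = OPTIMIZER_REGISTRY[type]
        else:
            opt_cls = getattr(torch.optim, type)  # e.g. 'SGD', 'Adam'
        return opt_cls(model.parameters(), **kwargs)

    # Usage, mirroring the call above:
    #   optimizer = get_optimizer(model=model, **cfg.optimizer)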

joneswong (Collaborator) commented May 20, 2022

I am wondering whether it is necessary to change the exposed parameters. To resolve this issue, why not just change the argument of get_optimizer? For example, let the caller pass cfg.optimizer instead of each of the optional parameters (e.g., weight_decay and momentum), and, in get_optimizer(), pack them into the appropriate kwargs according to the type of the optimizer.

joneswong (Collaborator) commented May 20, 2022

(re: the suggestion above) What is your opinion? @DavdGao @rayrayraykk @yxdyc @xieyxclack

rayrayraykk (Collaborator)

(replying to joneswong's suggestion above) Agreed, we could use a filter to pass the args; the linked helper (not shown here) can be applied to optimizers too.
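
For illustration, such a filter might look like the sketch below, which uses inspect.signature to keep only the keyword arguments accepted by the target optimizer class. The function name filter_kwargs_for is hypothetical and not taken from the repository.

    import inspect

    def filter_kwargs_for(cls, config_dict):
        # Keep only the entries of config_dict that the constructor of `cls`
        # actually accepts; unrelated config keys are silently dropped.
        accepted = inspect.signature(cls.__init__).parameters
        return {k: v for k, v in config_dict.items() if k in accepted}

    # Example: torch.optim.SGD accepts 'momentum' but not 'betas',
    # so 'betas' would be dropped:
    #   sgd_kwargs = filter_kwargs_for(torch.optim.SGD,
    #                                  {'lr': 0.1, 'momentum': 0.9,
    #                                   'betas': (0.9, 0.999)})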

DavdGao (Collaborator, Author) commented May 25, 2022

(replying to joneswong's suggestion above)

The solution is updated accordingly.

DavdGao (Collaborator, Author) commented May 25, 2022

(replying to rayrayraykk's suggestion above)

The solution is updated accordingly.

joneswong (Collaborator)
This configuration mechanism looks cool to me! Could you provide a test case for it?

DavdGao (Collaborator, Author) commented May 31, 2022

(replying to the request for a test case)
Thanks, the unit test has been added.
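
As an illustration of what such a test might check, here is a minimal sketch written against the get_optimizer sketch given earlier in this thread; the actual unit test added in this PR may look different.

    import unittest
    import torch

    class TestGetOptimizer(unittest.TestCase):
        def test_optimizer_specific_params(self):
            model = torch.nn.Linear(4, 2)
            # cfg.optimizer-style dict with an optimizer-specific extra key.
            opt_cfg = {'type': 'Adam', 'lr': 0.01, 'betas': (0.9, 0.99)}
            optimizer = get_optimizer(model=model, **opt_cfg)
            self.assertIsInstance(optimizer, torch.optim.Adam)
            self.assertEqual(optimizer.defaults['lr'], 0.01)
            self.assertEqual(optimizer.defaults['betas'], (0.9, 0.99))

    if __name__ == '__main__':
        unittest.main()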

yxdyc (Collaborator) left a comment
LGTM, please see the inline comments.

Inline comment on:

    dataset_name=self._cfg.data.type,
    fl_local_update_num=self._cfg.federate.local_update_steps,
-   fl_type_optimizer=self._cfg.fedopt.type_optimizer,
+   fl_type_optimizer=self._cfg.fedopt.optimizer.type,
yxdyc (Collaborator) commented:
Should we ensure backward compatibility? That is, both optimizer.grad_clip and grad.grad_clip would be accepted. Otherwise, we may have to exhaustively modify the historical code to ensure that the previous experiments still work correctly.

DavdGao (Collaborator, Author) replied:
  • @yxdyc Thanks, I have checked the historical code and made sure that all unit tests pass.
  • As for gradient clipping, since it is not a common parameter of general optimizers (unlike, e.g., the learning rate), maybe we should consider it an independent operation and separate it from the optimizer (see the sketch below).
  • Since we have set cfg.optimizer = CN(new_allowed=True), our modification also supports optimizer.grad_clip as a parameter for customized optimizers.
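
To illustrate treating gradient clipping as an operation independent of the optimizer, here is a minimal sketch of a local training step in PyTorch; the grad_clip argument is illustrative and not a confirmed config field.

    import torch

    def train_step(model, optimizer, loss_fn, x, y, grad_clip=None):
        # One local update; clipping is applied as its own step, regardless of
        # which optimizer (SGD, Adam, or a registered custom one) is in use.
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        if grad_clip is not None and grad_clip > 0:
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=grad_clip)
        optimizer.step()
        return loss.item()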

Inline comment on:

    fl_lr=self._cfg.optimizer.lr,
    batch_size=100)

-   # self.optimizer = get_optimizer(type=self._cfg.fedopt.type_optimizer, model=self.model,lr=self._cfg.fedopt.lr_server)
+   # self.optimizer = get_optimizer(type=self._cfg.fedopt.type_optimizer, model=self.model,lr=self._cfg.fedopt.optimizer.lr)
yxdyc (Collaborator) commented:
The same problem as "optimizer.grad_clip vs. grad.grad_clip" above.

joneswong (Collaborator) left a comment
LGTM

joneswong added the "bug (Something isn't working)" label on May 31, 2022
yxdyc merged commit cf97ccb into alibaba:master on May 31, 2022
xieyxclack linked an issue on Jun 6, 2022 that may be closed by this pull request
Labels: bug (Something isn't working)
Successfully merging this pull request may close these issues: A problem when using Adam optimizer
4 participants