Motivation
Optimizers are currently passed as a single object to the Trainer. In some cases, we'd like to have multiple different optimizers for a problem (for instance, to update some of the parameters only every N iterations).
Optimizers should be implemented using a hook (for instance, an _optimizer_hook placed between _post_loss_hook and _post_optim_hook that would receive the loss as input) that would call each optimizer; a rough sketch is given below.
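A minimal sketch of what such a hook could look like, just to fix ideas (the `losses_td` argument, the `"loss_*"` key convention and the list-of-optimizers signature are assumptions for illustration, not the current Trainer API):

```python
import torch
from tensordict import TensorDict


def _optimizer_hook(losses_td: TensorDict, optimizers: list[torch.optim.Optimizer]) -> None:
    """Back-propagate the summed loss once, then step every registered optimizer."""
    # Sum every entry whose key follows the assumed "loss_*" naming convention.
    total_loss = sum(
        value.sum() for key, value in losses_td.items() if key.startswith("loss")
    )
    total_loss.backward()
    # Step and reset each optimizer over its own parameter group.
    for optimizer in optimizers:
        optimizer.step()
        optimizer.zero_grad()
```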
Since the loss is a TensorDict of losses, we could even contemplate giving each optim hook its own backward pass: each hook would run its own backward, optimizer step and zero_grad over its own set of parameters.
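A sketch of that variant, assuming each hook owns one optimizer and the single loss key it is responsible for (the class name, loss keys and call signature are hypothetical):

```python
import torch
from tensordict import TensorDict


class PerLossOptimizerHook:
    """Hypothetical hook owning one optimizer and the loss key it is responsible for."""

    def __init__(self, optimizer: torch.optim.Optimizer, loss_key: str):
        self.optimizer = optimizer
        self.loss_key = loss_key

    def __call__(self, losses_td: TensorDict) -> None:
        loss = losses_td.get(self.loss_key)
        # retain_graph=True because other hooks may backpropagate their own
        # loss terms through shared parts of the graph.
        loss.backward(retain_graph=True)
        self.optimizer.step()
        self.optimizer.zero_grad()
```

For example, one such hook could step an optimizer over the actor parameters using "loss_actor" while another handles the value-network parameters using "loss_value".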
We should keep the optimizer among the arguments of Trainer.__init__, as it is convenient for users to pass it there, but optimizer=None should also be supported (in that case the _optimizer_hook would receive nothing to act on and would simply be a no-op).
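A rough sketch of the optimizer=None behaviour (heavily simplified signature, not the real torchrl.trainers.Trainer):

```python
from typing import Optional

import torch


class Trainer:
    """Simplified sketch; the real torchrl.trainers.Trainer takes many more arguments."""

    def __init__(self, *, optimizer: Optional[torch.optim.Optimizer] = None, **kwargs):
        # Keep ``optimizer`` in __init__ for convenience, but allow ``None``.
        self.optimizer = optimizer

    def _optimizer_hook(self, losses_td):
        if self.optimizer is None:
            # optimizer=None: the hook has nothing to act on and is a no-op;
            # users would register their own optimizer hooks instead.
            return losses_td
        total_loss = sum(v.sum() for k, v in losses_td.items() if k.startswith("loss"))
        total_loss.backward()
        self.optimizer.step()
        self.optimizer.zero_grad()
        return losses_td
```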
Doc: https://pytorch.org/rl/reference/trainers.html#torchrl-trainers-package