
How to change optimizer and lr scheduler in the middle of training? #3095

Closed
abecciu opened this issue Aug 21, 2020 · 14 comments
Labels
question Further information is requested

Comments

@abecciu

abecciu commented Aug 21, 2020

What is your question?

I need to train a model with a pre-trained backbone. For the first 10 epochs, I want to keep the backbone completely frozen (i.e., not touched by the optimizer). After epoch 10, I want to start training certain layers of the backbone. In plain PyTorch, I would instantiate a new optimizer that includes the backbone params I want to train, and then swap out both the optimizer and the lr_scheduler.

What's the recommended way to do something like this in PL?
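
For context, a minimal sketch of the plain-PyTorch approach described above (the `model`, its `head`/`backbone` attributes, the `train_one_epoch` helper, and all hyperparameters are illustrative, not from this issue):

    import torch

    # Epochs 0-9: only the head is optimized; the backbone stays frozen.
    optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5)

    for epoch in range(15):
        if epoch == 10:
            # Unfreeze part of the backbone and rebuild optimizer + scheduler.
            for p in model.backbone.parameters():
                p.requires_grad = True
            trainable = [p for p in model.parameters() if p.requires_grad]
            optimizer = torch.optim.Adam(trainable, lr=1e-4)
            scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5)
        train_one_epoch(model, optimizer)  # user-defined training loop
        scheduler.step()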

@abecciu abecciu added the question Further information is requested label Aug 21, 2020
@abecciu abecciu changed the title How to change optimizer and lr scheduler in the middle of training How to change optimizer and lr scheduler in the middle of training? Aug 21, 2020
@github-actions
Contributor

Hi! Thanks for your contribution, great first issue!

@rohitgr7
Contributor

I would suggest calling trainer.fit twice:

class PreFineLightningModule(pl.LightningModule):

    def __init__(self, mode='pre_train'):
        super().__init__()

        self.mode = mode
        # define other params

    def _configure_optim_backbone(self):
        # return optimizers and schedulers for pre-training
        optimizer = ...  # pre-train optimizer
        scheduler = ...  # pre-train scheduler
        return [optimizer], [scheduler]

    def _configure_optim_finetune(self):
        # return optimizers and schedulers for fine-tuning
        optimizer = ...  # fine-tune optimizer
        scheduler = ...  # fine-tune scheduler
        return [optimizer], [scheduler]

    def configure_optimizers(self):
        if self.mode == 'pre_train':
            return self._configure_optim_backbone()
        elif self.mode == 'fine_tune':
            return self._configure_optim_finetune()


total_epochs = 15  # can be anything

# Pre-train
model = PreFineLightningModule(mode='pre_train')
freeze_backbone(model)  # user-defined helper, see the sketch below
trainer = Trainer(max_epochs=10)
trainer.fit(model)

# Fine-tune
model = PreFineLightningModule.load_from_checkpoint(saved_model_checkpoint, mode='fine_tune')
unfreeze_layers(model)  # user-defined helper, see the sketch below
trainer = Trainer(max_epochs=total_epochs - 10)
trainer.fit(model)
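
`freeze_backbone` and `unfreeze_layers` are left to the user above; a possible implementation, assuming the module exposes a `backbone` attribute and that only its last layers should be unfrozen, could look like:

    def freeze_backbone(model):
        # Stop gradients for every backbone parameter during pre-training.
        for p in model.backbone.parameters():
            p.requires_grad = False

    def unfreeze_layers(model, prefixes=("layer3", "layer4")):
        # Re-enable gradients only for the backbone layers we want to fine-tune.
        for name, p in model.backbone.named_parameters():
            if name.startswith(prefixes):
                p.requires_grad = True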

@abecciu
Author

abecciu commented Aug 22, 2020

Thanks so much for your response! I was hoping to avoid that, since it would force me to write a lot more boilerplate code in my project to keep the whole training process automated, configurable, and model-agnostic.

Why doesn't PL provide an API to replace optimizer and scheduler instances? Have you considered that?

@rohitgr7
Contributor

Yeah, you can do that with a callback too, using trainer.init_optimizers(new_optimizers, schedulers), but you also need to take care of 16-bit precision. I guess PL could have a trainer method like replace_optimizers or setup_optimizers that takes any optimizers and schedulers (new or old) and handles all of this, to make the process even smoother. What do you think @awaelchli?

@abecciu
Author

abecciu commented Aug 23, 2020

What would be ideal for me is to be able to replace the optimizer and scheduler from within the LightningModule class that's being used for training. Would it be possible to achieve this from an existing hook like optimizer_step by simply returning new instances of the optimizer and scheduler?

@rohitgr7
Contributor

rohitgr7 commented Aug 23, 2020

You can try the on_train_epoch_start method in a callback and reconfigure the optimizers and schedulers the way you want. Here are some links that can help. To set them:

class SomeCallback(Callback):
    def on_train_epoch_start(self, trainer, pl_module):
        if trainer.current_epoch == 10:
            trainer.optimizers = [new_optimizer]
            trainer.lr_schedulers = trainer.configure_schedulers([new_scheduler])
            trainer.optimizer_frequencies = []  # or optimizer frequencies if you have any

trainer = Trainer(callbacks=[SomeCallback()], ...)

I think this should work.

@abecciu
Author

abecciu commented Aug 23, 2020

Thanks so much! This looks very promising; I couldn't tell from the docs that on_train_epoch_start receives trainer and pl_module as params. Seems like a great solution.

I'll try this out and report back.

@rohitgr7
Contributor

fixed the link

@abecciu
Author

abecciu commented Sep 5, 2020

This solution worked great for me. It's important to note that the on_train_epoch_start callback does not exist in older versions of PL.

Thanks again @rohitgr7 for the help!

@abecciu abecciu closed this as completed Sep 5, 2020
@icedpanda

icedpanda commented Jan 12, 2022

Hi @rohitgr7, how do I change the optimizer and LR scheduler if I am using ReduceLROnPlateau? I am not sure how to set monitor, interval, etc. when using ReduceLROnPlateau.

This is my configure_optimizers at init:

    def configure_optimizers(self):
        optimizer = optim.Adam(
            filter(
                lambda p: p.requires_grad,
                self.net.parameters()),
            lr=self.lr,
            weight_decay=self.weight_decay)
        scheduler = optim.lr_scheduler.ReduceLROnPlateau(
            optimizer,
            mode="min",
            factor=0.1,
            patience=self.scheduler_patience,
            cooldown=3,
        )
        return {
            "optimizer": optimizer,
            "lr_scheduler": {
                "scheduler": scheduler,
                "monitor": "val/loss",
                "interval": "epoch",
                "frequency": 1,
            },
        }

However, how do I set this monitor and interval when swapping the optimizer and scheduler in on_train_start in the middle of training?

    def on_train_start(self, trainer, pl_module) -> None:

        if trainer.current_epoch == self._unfreeze_at_epoch:
            print("unfreeze and add param group...")
            pl_module.net.freeze_backbone(False)
            new_optimizer = optim.Adam(
                filter(
                    lambda p: p.requires_grad,
                    pl_module.net.parameters()),
                lr=pl_module.lr,
                weight_decay=pl_module.weight_decay)
            new_schedulers = optim.lr_scheduler.ReduceLROnPlateau(
                new_optimizer,
                mode="min",
                factor=0.1,
                patience=pl_module.scheduler_patience,
                cooldown=3,
            )
            trainer.optimizers = [new_optimizer]
            trainer.lr_schedulers = [new_schedulers]
            trainer.optimizer_frequencies = []  # must be a list; empty if no frequencies are used

Thanks in advance

@lxxue

lxxue commented Apr 10, 2022

> [quoting @icedpanda's comment above]

I just tried the callback solution and found that you cannot simply set

    trainer.lr_schedulers = [new_schedulers]

but need to do this instead:

    trainer.lr_schedulers = trainer._configure_schedulers(
        schedulers, monitor=None, is_manual_optimization=False)

to properly set the schedulers with the monitor and the other default configurations.
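
Putting the two comments above together, a minimal sketch of such a callback for the ReduceLROnPlateau case might look like this (assuming a PL 1.x Trainer that still exposes trainer.lr_schedulers and trainer._configure_schedulers; the epoch threshold, monitor key, and hyperparameters are illustrative):

    import torch.optim as optim
    from pytorch_lightning import Callback

    class UnfreezeBackboneCallback(Callback):
        def __init__(self, unfreeze_at_epoch=10):
            self.unfreeze_at_epoch = unfreeze_at_epoch

        def on_train_epoch_start(self, trainer, pl_module):
            if trainer.current_epoch != self.unfreeze_at_epoch:
                return
            pl_module.net.freeze_backbone(False)
            new_optimizer = optim.Adam(
                (p for p in pl_module.net.parameters() if p.requires_grad),
                lr=pl_module.lr,
            )
            new_scheduler = {
                "scheduler": optim.lr_scheduler.ReduceLROnPlateau(new_optimizer, mode="min"),
                "monitor": "val/loss",
                "interval": "epoch",
                "frequency": 1,
            }
            trainer.optimizers = [new_optimizer]
            # _configure_schedulers fills in the remaining defaults and keeps the monitor.
            trainer.lr_schedulers = trainer._configure_schedulers(
                [new_scheduler], monitor=None, is_manual_optimization=False
            )
            trainer.optimizer_frequencies = []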

@wangm23456

What should I do in 2.0.2? Why can't we have a function like configure_optimizers() for this?

@wangm23456

Looking at the Strategy source in 2.x:

    def setup_optimizers(self, trainer: "pl.Trainer") -> None:
        """Creates optimizers and schedulers.

        Args:
            trainer: the Trainer, these optimizers should be connected to
        """
        if trainer.state.fn != TrainerFn.FITTING:
            return
        assert self.lightning_module is not None
        self.optimizers, self.lr_scheduler_configs = _init_optimizers_and_lr_schedulers(self.lightning_module)

    def setup(self, trainer: "pl.Trainer") -> None:
        """Setup plugins for the trainer fit and creates optimizers.

        Args:
            trainer: the trainer instance
        """
        assert self.accelerator is not None
        self.accelerator.setup(trainer)
        self.setup_optimizers(trainer)
        self.setup_precision_plugin()
        _optimizers_to_device(self.optimizers, self.root_device)

So, just self.trainer.strategy.setup(self.trainer)?

@BrunoBelucci

You could use only the setup_optimizers method of the strategy, so as not to mess with the other setup steps, so I would say trainer.strategy.setup_optimizers(trainer).
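
Combining this with the earlier mode-switching LightningModule, a sketch for PL 2.x could look like the following (assuming the module branches on a mode attribute in configure_optimizers; setup_optimizers re-runs configure_optimizers and reattaches the result to the trainer, discarding any existing optimizer state):

    from lightning.pytorch.callbacks import Callback

    class SwitchOptimizerCallback(Callback):
        def __init__(self, switch_at_epoch=10):
            self.switch_at_epoch = switch_at_epoch

        def on_train_epoch_start(self, trainer, pl_module):
            if trainer.current_epoch == self.switch_at_epoch:
                # Flip the flag that configure_optimizers branches on, then
                # ask the strategy to rebuild optimizers and schedulers.
                pl_module.mode = 'fine_tune'
                trainer.strategy.setup_optimizers(trainer)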
