Allow extra_epochs flag in Trainer.fit to control finetuning time #13273

🚀 Feature

`Trainer(max_epochs=100).fit(model, train_dl, ckpt_path=ckpt_path, extra_epochs=True)` would finetune for 100 epochs.

Motivation

Finetuning for N epochs currently requires knowing the previous number of epochs M and setting `Trainer(max_epochs=M+N)`. Google did not tell me how to achieve this.

Pitch

The finetuning time, or the number of finetuning epochs, should be configurable.

Alternatives

Setting a large number of epochs and stopping manually.

Additional context

It would be cool to have this for `max_time` too. I hope this is already solved and this issue is unnecessary.

cc @justusschock @kaushikb11 @awaelchli @Borda @rohitgr7
Comments
You accomplish this by doing `trainer.fit_loop.max_epochs += 100` before calling `fit`.
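A minimal sketch of that workaround, assuming `model` and `train_dl` are already defined and a checkpoint exists at the illustrative path below; as the next comment notes, whether this gives the desired behavior may depend on the Lightning version:

```python
import pytorch_lightning as pl

# Bump the epoch budget on the fit loop before resuming from a checkpoint.
trainer = pl.Trainer(max_epochs=100)  # value used for the original run
trainer.fit_loop.max_epochs += 100    # allow 100 additional epochs
trainer.fit(model, train_dl, ckpt_path="path/to/last.ckpt")  # resume training
```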
If this worked, it would be very counterintuitive, because the current number of epochs is known only after calling `fit`. When trying your solution,

```python
import pytorch_lightning as pl

trainer = pl.Trainer(**trainer_params)
trainer.fit_loop.max_epochs += 2  # attempt to extend training by 2 epochs
trainer.fit(model, train_dl, val_dl, ckpt_path=best_ckpt)
```

I don't get the desired behavior.
Hello, any further news or an alternative answer?
@franchesoni If I understand correctly, you are saying this is not an option for you?

```python
import pytorch_lightning as pl

model = Model.load_from_checkpoint("path/to/pretrained/checkpoint.ckpt")
trainer = pl.Trainer(**trainer_params, max_epochs=N)
trainer.fit(model, train_dl, val_dl)
```

I assume that is because you want some parts of the trainer state restored from the checkpoint, e.g. the optimizer state, but not the full loop state. Then I think this is just another version of the request in #5339 to be able to control what gets restored. I think this is something we need to start adding to the roadmap and think hard about.
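To make the trade-off between the two approaches explicit, here is a sketch; `Model`, `train_dl`, `val_dl`, `M` (epochs already trained), and `N` (extra epochs wanted) are placeholders:

```python
import pytorch_lightning as pl

# Variant A: restore weights only. A fresh Trainer counts epochs from zero,
# so max_epochs=N really means "N epochs of finetuning", but the optimizer
# and LR-scheduler state from the original run are discarded.
model = Model.load_from_checkpoint("path/to/pretrained/checkpoint.ckpt")
trainer = pl.Trainer(max_epochs=N)
trainer.fit(model, train_dl, val_dl)

# Variant B: restore the full training state via ckpt_path. Optimizer,
# scheduler, and loop counters resume at epoch M, so max_epochs must be
# set to M + N to obtain N extra epochs.
model = Model()
trainer = pl.Trainer(max_epochs=M + N)
trainer.fit(model, train_dl, val_dl, ckpt_path="path/to/pretrained/checkpoint.ckpt")
```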
There are two potential solutions. One is to read the epoch count from the checkpoint up front:

```python
import torch
from pytorch_lightning import Trainer

ckpt = torch.load(...)  # checkpoint path elided in the original comment
current_epoch = ckpt["epoch"]  # Lightning stores the completed epoch count here
trainer = Trainer(max_epochs=current_epoch + N)
```

An issue with this method is that it loads the full checkpoint just for this change. This relates to #5339 and #12712.
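An end-to-end version of that workaround, as a sketch: `model`, `train_dl`, and `val_dl` are assumed to exist, and the checkpoint path is illustrative. Loading on CPU avoids touching the GPU just to read one integer:

```python
import torch
import pytorch_lightning as pl

ckpt_path = "path/to/last.ckpt"
N = 10  # extra epochs to finetune for

# Peek at the checkpoint for the completed epoch count (the "epoch" key);
# on recent PyTorch you may need weights_only=False for Lightning checkpoints.
current_epoch = torch.load(ckpt_path, map_location="cpu")["epoch"]

# Resume with the full training state and an extended epoch budget.
trainer = pl.Trainer(max_epochs=current_epoch + N)
trainer.fit(model, train_dl, val_dl, ckpt_path=ckpt_path)
```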
Hello, I have an image inpainting project, Paint-by-Example, implemented in pytorch_lightning. I want to finetune the stable diffusion model using LoRA, but I can't find the model definition and don't know how to add the LoRA finetuning process to the project. Can you give me some advice?