Warn when running an infinite epoch and overriding "epoch end" accumulating hooks #11554
Comments
I'd like to work on this, could you assign this issue to me?
Hey @vedpatwardhan, yes, go on!
@carmocca Update: Well, I wasn't correct at all. If one specifies only […]. Then, the 1.5 release note was quite confusing because it's not really an endless epoch with […]
I had a doubt while writing a test for the change being made. In the […]
Here's one suggestion to test: you can follow the pattern for the […], along the lines of the sketch below.
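For example, a test could look something like this. This is only a sketch: the warning text matched here, the callback-based early stop, and all names are assumptions about how the final change might behave, not the test that was actually merged.

import pytest
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.callbacks import Callback

class EpochEndModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def train_dataloader(self):
        return DataLoader(TensorDataset(torch.randn(8, 4)), batch_size=4)

    def training_step(self, batch, batch_idx):
        (x,) = batch
        return self.layer(x).sum()

    def training_epoch_end(self, outputs):
        # the override that should trigger the proposed warning
        pass

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

class StopAfterFirstBatch(Callback):
    # ends the otherwise endless run so the test terminates
    def on_train_batch_end(self, trainer, pl_module, *args, **kwargs):
        trainer.should_stop = True

def test_warns_on_infinite_epoch_with_epoch_end_hook(tmpdir):
    trainer = Trainer(
        default_root_dir=tmpdir,
        max_steps=-1,
        max_epochs=-1,
        callbacks=[StopAfterFirstBatch()],
    )
    with pytest.warns(UserWarning, match="infinite"):
        trainer.fit(EpochEndModel())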
Okay, I'll do that.
@ananthsub could you please help me understand what's going wrong? I think I've made the change, but many of the tests are failing. |
🚀 Feature
When the user configures

Trainer(max_steps=-1, max_epochs=-1)

an endless epoch runs, so overriding training_epoch_end, or validation_epoch_end with val_check_interval set to a float, can be a problem because they will keep outputs in memory indefinitely.

Motivation
Many users are not aware of the impact of overriding these hooks, so infinite epochs open the door to "memory leaks", as in the sketch below.
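For illustration, here is a minimal sketch of the pattern in question; the module and all its names are hypothetical, not taken from the issue. Returning a value from training_step means it is collected for training_epoch_end, and with an endless epoch that collection is never flushed:

import torch
from torch.utils.data import DataLoader, IterableDataset
from pytorch_lightning import LightningModule, Trainer

class RandomStream(IterableDataset):
    # an endless stream of samples, so the epoch itself never finishes
    def __iter__(self):
        while True:
            yield torch.randn(4)

class AccumulatingModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def train_dataloader(self):
        return DataLoader(RandomStream(), batch_size=4)

    def training_step(self, batch, batch_idx):
        # each returned value is kept so it can be handed to `training_epoch_end`
        return self.layer(batch).sum()

    def training_epoch_end(self, outputs):
        # `outputs` holds one entry per step; since the epoch never ends,
        # this list grows for as long as training runs
        pass

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

# nothing ever stops training, so the outputs above accumulate unboundedly
trainer = Trainer(max_steps=-1, max_epochs=-1)
# trainer.fit(AccumulatingModel())  # would run (and accumulate) forever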
Pitch
Raise a warning in this case informing the user of this behaviour, along the lines of the sketch below.
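A rough sketch of what such a check could look like. The helper name, its placement, and the message wording are assumptions rather than the eventual implementation; is_overridden and rank_zero_warn are existing Lightning utilities.

from pytorch_lightning.utilities import rank_zero_warn
from pytorch_lightning.utilities.model_helpers import is_overridden

def _warn_on_infinite_epoch_hooks(trainer, model):
    # hypothetical helper: warn when an endless epoch is combined with
    # hooks that accumulate step outputs in memory
    endless = trainer.max_epochs == -1 and trainer.max_steps == -1
    if not endless:
        return
    for hook in ("training_epoch_end", "validation_epoch_end"):
        if is_overridden(hook, model):
            rank_zero_warn(
                f"You are running an infinite epoch (max_steps=-1, max_epochs=-1)"
                f" and have overridden `{hook}`, which accumulates the outputs of"
                f" every step in memory. This can grow memory unboundedly."
            )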
Additional context
Proposed in #11480 (comment)
If you enjoy Lightning, check out our other projects! ⚡
Metrics: Machine learning metrics for distributed, scalable PyTorch applications.
Lite: enables pure PyTorch users to scale their existing code on any kind of device while retaining full control over their own loops and optimization logic.
Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, fine-tuning, and solving problems with deep learning.
Bolts: Pretrained SOTA Deep Learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.
Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.
cc @Borda @carmocca @awaelchli @ninginthecloud @daniellepintz @rohitgr7 @justusschock @kaushikb11