Trainer should overwrite max_epoch when max_time is specified #8210

@kevinNejad

🚀 Feature

Currently, the Trainer terminates at max_epoch (default 1000) even when max_time is specified. Even if min_epoch is set to a value greater than 1000, the Trainer stops at max_epoch. The Trainer should either overwrite the max_epoch value, or the documentation should state that min_epoch only applies when early stopping is enabled. It could also raise a warning when min_epoch is set while early stopping is disabled.

Motivation

To continue training for a fixed amount of time (e.g. 7 days) when the maximum number of epochs is unknown and no early-stopping criterion is specified.

Pitch

I think overwriting max_epoch with the values of min_epoch or max_time is reasonable when they are larger than max_epoch. I wanted to train a big model for 2 days, but it was stopped after 1 day despite my setting the max_time and min_epoch parameters.
If you are concerned with the duration of training, or you are interested in the asymptotic behavior of your model over an infinite time horizon, it makes sense to stop training when max_time is reached regardless of max_epoch.
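The proposed semantics can be sketched in plain Python (hypothetical names, not Lightning's actual implementation): when a time budget is given, it takes precedence and the default epoch cap is lifted, so the loop ends only when the budget is exhausted.

```python
import time

def train(max_epochs=1000, max_time_seconds=None, step=lambda: None):
    """Run epochs until max_epochs is reached, or, if a time budget is
    given, until max_time_seconds elapses (the budget takes precedence:
    the default epoch cap is ignored when max_time_seconds is set)."""
    if max_time_seconds is not None:
        # Proposed behavior from this issue: an explicit time budget
        # effectively overwrites the default max_epochs limit.
        max_epochs = float("inf")
    start = time.monotonic()
    epoch = 0
    while epoch < max_epochs:
        if max_time_seconds is not None and time.monotonic() - start >= max_time_seconds:
            break
        step()
        epoch += 1
    return epoch
```

With no time budget the loop behaves as today and stops at max_epochs; with a budget of 0 it stops immediately, and with a positive budget it runs as many epochs as fit in the allotted time.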

Alternatives

The alternative is to set a very large max_epoch to ensure the training won't stop before max_time is reached.
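As a configuration sketch of this workaround (assuming a recent PyTorch Lightning version, where `max_epochs=-1` disables the epoch cap and `max_time` accepts a `DD:HH:MM:SS` string or a `datetime.timedelta`):

```python
import pytorch_lightning as pl

# Train for at most 2 days; max_epochs=-1 removes the epoch cap, so only
# the time budget (or an explicit early-stopping callback) ends training.
trainer = pl.Trainer(max_time="02:00:00:00", max_epochs=-1)
# trainer.fit(model)  # model: any LightningModule
```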

Labels

feature (Is an improvement or enhancement), help wanted (Open to be worked on)
