
Does Gluon-ts support early stopping? #555

Closed
kmosby1992 opened this issue Jan 9, 2020 · 10 comments
Labels
enhancement (New feature or request) · good first issue (Good for newcomers) · help wanted (Extra attention is needed) · question (Further information is requested)

Comments

@kmosby1992

Does Gluon-ts support early stopping (for example: stop training after the loss has failed to decrease for 50 epochs)? I see that we can have the learning rate reduced if the loss fails to go down, but I could not find any option for early stopping.

kmosby1992 added the question label Jan 9, 2020
@lostella (Contributor) commented Jan 9, 2020

@kmosby1992 you're right, there is no such option currently. One natural way of modifying the training loop to get that would be to have the learning rate reduction mechanism actually halt training once the minimum learning rate is hit and no progress has been made within the prescribed patience. What do you think?
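Roughly, the idea would look something like the sketch below (illustrative only; none of these names are actual GluonTS internals):

# illustrative sketch of the proposed behaviour, not GluonTS code
max_epochs = 200
patience = 10
learning_rate = 1e-3
decay_factor = 0.5
minimum_learning_rate = 5e-5

def run_one_epoch(lr):
    # hypothetical placeholder for one pass over the training data
    return 1.0

best_loss = float("inf")
epochs_without_improvement = 0

for epoch in range(max_epochs):
    epoch_loss = run_one_epoch(learning_rate)

    if epoch_loss < best_loss:
        best_loss = epoch_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1

    if epochs_without_improvement > patience:
        if learning_rate > minimum_learning_rate:
            # current behaviour: reduce the learning rate and keep going
            learning_rate = max(learning_rate * decay_factor, minimum_learning_rate)
            epochs_without_improvement = 0
        else:
            # proposed addition: halt once the learning rate floor is hit
            # and no progress was made within the patience window
            break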

@jackdent commented Jan 18, 2020

How about adding an optional argument to the Trainer: a function that gets called at the end of each epoch and returns either True or False (whether to continue training)? This function could use a closure to keep track of the historical losses for each epoch, which would also be useful if you wanted to plot the loss curves after training, e.g.

epsilon = 1e-4  # minimum improvement required over the last 50 epochs

train_losses = []
validation_losses = []

def cb(training_loss, validation_loss):
    # record losses so the curves can also be plotted after training
    train_losses.append(training_loss)
    validation_losses.append(validation_loss)

    if len(train_losses) >= 50:
        # keep training only if the loss improved by more than epsilon
        # over the last 50 epochs
        return abs(train_losses[-50] - train_losses[-1]) > epsilon
    return True

trainer = Trainer(..., cb=cb)  # `cb` is the proposed new argument

lostella added the enhancement label Jan 19, 2020
@lostella (Contributor)

@jackdent I think a callback mechanism like that could be useful in general; I agree we could have that option in the trainer. But even before that, maybe the current learning rate scheduling mechanism should be adjusted so that once the patience is exceeded and the minimum step size is reached, the iterations stop.

@MaximilianPavon

I'd love to see a callback mechanism implemented for the Trainer, e.g. for external logging.

@lostella (Contributor)

@kmosby1992 the changes in #701 now allow for early stopping: essentially, the learning rate scheduler stops the training loop once the loss stops decreasing and the learning rate has reached the minimum.
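For reference, configuring this from the user side looks roughly like the sketch below; I'm assuming the Trainer arguments patience, learning_rate_decay_factor and minimum_learning_rate as in the current MX trainer (exact names and defaults may differ between versions):

from gluonts.mx.trainer import Trainer  # import path may differ between versions

trainer = Trainer(
    epochs=200,
    learning_rate=1e-3,
    patience=10,                      # epochs without improvement before the LR is decayed
    learning_rate_decay_factor=0.5,   # LR is halved each time patience is exceeded
    minimum_learning_rate=1e-4,       # once this floor is reached, training stops early
)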

@jackdent @MaximilianProll I'll open a separate issue for the callback mechanism

@MaximilianPavon

Thanks a lot, @lostella, can you link the issue here once it's created?

@lostella (Contributor)

@MaximilianProll see #706

@Arfea commented Jun 1, 2020

I'd also love to see an implementation of callbacks. Could you suggest where to start if I want to implement a callback mechanism?

@kaijennissen (Contributor)

How exactly does the early stopping work? Is there a way to stop training if the validation loss does not decrease for a given number of iterations? As far as I understand the code, the patience argument in Trainer controls the learning_rate_scheduler, not early stopping.

@jaheba (Contributor) commented Jul 30, 2020

There is a mechanism for early stopping in the learning rate scheduler of the trainer:

https://github.com/awslabs/gluon-ts/blob/e52864f7ee5d173dac38e7a984b9ea615397e2f2/src/gluonts/mx/trainer/learning_rate_scheduler.py#L124-L129

Once min_lr is reached, training stops. With the current default settings this is the case after five cycles of learning-rate reductions.
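As a rough back-of-the-envelope check, assuming defaults of learning_rate=1e-3, learning_rate_decay_factor=0.5 and minimum_learning_rate=5e-5 (verify against your installed version):

# assumed defaults, not guaranteed to match every GluonTS version
lr, decay_factor, min_lr = 1e-3, 0.5, 5e-5

reductions = 0
while lr > min_lr:
    lr *= decay_factor
    reductions += 1

print(reductions)  # 5 -> the fifth reduction drops the LR below the floor, ending training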

That said, there are probably more intuitive ways to set this behaviour.
