Another ValidationMonitor with validation (+ early stopping) per epoch #133
From what I understand, the existing ValidationMonitor performs validation every `print_steps` steps and checks the stopping condition every `early_stopping_rounds` steps. I'd like to add another ValidationMonitor that performs validation and checks the stopping condition once every epoch. Is this the recommended practice in machine learning regarding validation and early stopping? I mean, I'd like to add a fit process something like the loop sketched in the first comment below.
Comments
@dansbecker I also noticed the inefficiency mentioned in #102 by @mheilman. I think the problem is in this loop: https://github.com/tensorflow/skflow/blob/master/skflow/trainer.py#L113. Can we consider moving `monitor.update()` to https://github.com/tensorflow/skflow/blob/master/skflow/estimators/base.py#L236?

```python
def fit(self, X, y, monitor=None, logdir=None):
    ...
    for epoch in range(monitor.n_epochs_max_tolerable):
        self._trainer.train()
        monitor.update()
        if monitor.monitor_inducing_stop():
            break
```

This way, the monitor is invoked once per epoch to check for over-fitting (or is it called over-training?) and stops the fit process when it occurs.
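For concreteness, here is a minimal sketch of what the `update()` / `monitor_inducing_stop()` pair above could do; the class, the `loss_fn` callable, and the `patience` parameter are hypothetical illustrations of the pseudocode, not skflow's actual API:

```python
# Hypothetical per-epoch monitor matching the pseudocode above.
# `loss_fn` is an assumed callable returning the current validation loss.
class EpochValidationMonitor:
    def __init__(self, loss_fn, patience=3, n_epochs_max_tolerable=100):
        self.loss_fn = loss_fn
        self.patience = patience
        self.n_epochs_max_tolerable = n_epochs_max_tolerable
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def update(self):
        # Called once per epoch: one validation pass, not one per step.
        val_loss = self.loss_fn()
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1

    def monitor_inducing_stop(self):
        # Over-training heuristic: no validation improvement for
        # `patience` consecutive epochs.
        return self.epochs_without_improvement >= self.patience
```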
Actually, maybe a better option is to have the monitor in a separate thread and just push some information into it from the main thread from time to time.
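A rough sketch of that idea using only the Python standard library; everything here (`validate`, `make_monitor_thread`, the queue protocol) is an assumed illustration rather than anything in skflow:

```python
import queue
import threading

# Hypothetical background monitor: the main loop stays unblocked while
# validation runs in a separate thread.
def make_monitor_thread(validate, early_stopping_rounds):
    updates = queue.Queue()
    stop_training = threading.Event()

    def worker():
        best, bad_rounds = float("inf"), 0
        while True:
            step = updates.get()          # blocks until the trainer pushes
            if step is None:              # sentinel: training finished
                return
            val_loss = validate()
            if val_loss < best:
                best, bad_rounds = val_loss, 0
            else:
                bad_rounds += 1
            if bad_rounds >= early_stopping_rounds:
                stop_training.set()       # ask the main loop to stop

    threading.Thread(target=worker, daemon=True).start()
    return updates, stop_training

# In the training loop (illustrative):
#   updates.put(step)                   # cheap, non-blocking
#   if stop_training.is_set(): break    # may lag if validation is slow
```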
I've struggled with the inefficiency mentioned here as well. My validation set is 25,000 records (30% of my data), and my mini-batch size is 20. When I use the ValidationMonitor, I end up training on 20 records and then calculating the validation error on 25,000 records, which slows my training by 100x or more. Putting the monitor in a separate thread, as @ilblackdragon suggested, is interesting but won't solve the problem in every case. For example, if training a mini-batch takes 1 second and calculating the validation error takes 100 seconds, the monitor thread will fall behind and won't be able to stop the training in time. I solved this locally by modifying ValidationMonitor so that it calculates the validation error only once every `print_steps` steps. To address the original issue of this thread (validation every epoch), the value of `print_steps` can be set to the number of steps in one epoch. If I get a thumbs up on this approach, I can create a PR for it.
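A back-of-envelope calculation with the numbers from this comment makes the overhead concrete:

```python
batch_size = 20
n_val = 25_000

# Validating after every step adds roughly n_val / batch_size
# batch-sized forward passes of extra work per training step:
extra_batches_per_step = n_val / batch_size
print(extra_batches_per_step)  # 1250.0

# Even if a training step (forward + backward) costs a few times more
# than a validation batch (forward only), the slowdown is still in the
# hundreds, consistent with the "100x or more" observed above.
```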
I think the problem you observe can be fixed by adding validation over …
@ilblackdragon That's a good solution. I remember seeing a discussion about supporting more early stopping options, and what you mentioned seems like it belongs as part of that. In the meantime, if someone needs an urgent fix, here are the two lines I changed to fix the performance issue for me: it simply calculates the validation error once every `print_steps` steps rather than on every step.
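The original diff isn't reproduced here, but the change amounts to a guard at the top of the monitor's per-step callback. A minimal sketch, with all names (`ThrottledValidationMonitor`, `compute_validation_error`) hypothetical rather than skflow's actual API:

```python
class ThrottledValidationMonitor:
    """Hypothetical monitor illustrating the two-line fix: validate
    only every `print_steps` steps instead of on every step."""

    def __init__(self, compute_validation_error, print_steps=100):
        self.compute_validation_error = compute_validation_error
        self.print_steps = print_steps

    def update(self, step):
        # The guard added by the fix: skip the expensive validation
        # pass unless this is a print step.
        if step % self.print_steps != 0:
            return
        self.compute_validation_error()
```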
Let me actually add this to master - I think it's an important fix.
I feel like this is addressed in the latest version. Please submit an issue/PR to TensorFlow if it's not. Thanks!