tf.keras.callbacks.ProgbarLogger(count_mode='samples') does not work #38765

Closed
dfangs opened this issue Apr 21, 2020 · 8 comments
dfangs commented Apr 21, 2020

System information
(I'm sorry, this is my first time writing an issue.)

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Google Colab
  • TensorFlow version (use command below): 2.2.0-rc3
  • Python version: 3.6

Describe the current behavior
I noticed from this commit that the default behavior of ProgbarLogger has been changed to always show the number of 'steps' instead of 'samples'. Out of curiosity, I tried to manually pass a ProgbarLogger callback with count_mode='samples' to Model.fit(), but an error showed up.

Describe the expected behavior
I expected it to work as it did in older versions of TensorFlow.

Standalone code to reproduce the issue

# Using the MNIST data set
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

model = Sequential([
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10)
])

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

# count_mode='samples' (as shown in the traceback below) is what triggers the error
model.fit(x_train, y_train,
          callbacks=[tf.keras.callbacks.ProgbarLogger(count_mode='samples')])

Other info / logs
On my local machine (TF 2.1), this is the default behavior:

Epoch 1/5
16500/16500 [==============================] - 3s 207us/sample - loss: 0.4841 - accuracy: 0.8584
Epoch 2/5
16500/16500 [==============================] - 2s 95us/sample - loss: 0.2430 - accuracy: 0.9276
Epoch 3/5
...

On Google Colab (TF 2.2), I got this when I tried my code:

0/Unknown - 1s 0s/sample - loss: 0.3902 - accuracy: 0.8912

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py:595: RuntimeWarning: divide by zero encountered in log10
  numdigits = int(np.log10(self.target)) + 1

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-51-834d420b09ab> in <module>()
      8 model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
      9 
---> 10 model.fit(x_train, y_train, callbacks=[tf.keras.callbacks.ProgbarLogger('samples')])

5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
     64   def _method_wrapper(self, *args, **kwargs):
     65     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
---> 66       return method(self, *args, **kwargs)
     67 
     68     # Running inside `run_distribute_coordinator` already.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    877           epoch_logs.update(val_logs)
    878 
--> 879         callbacks.on_epoch_end(epoch, epoch_logs)
    880         if self.stop_training:
    881           break

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py in on_epoch_end(self, epoch, logs)
    363     logs = self._process_logs(logs)
    364     for callback in self.callbacks:
--> 365       callback.on_epoch_end(epoch, logs)
    366 
    367   def on_train_batch_begin(self, batch, logs=None):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py in on_epoch_end(self, epoch, logs)
    892 
    893   def on_epoch_end(self, epoch, logs=None):
--> 894     self._finalize_progbar(logs)
    895 
    896   def on_test_end(self, logs=None):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py in _finalize_progbar(self, logs)
    933       self.progbar.target = self.seen
    934     logs = logs or {}
--> 935     self.progbar.update(self.seen, list(logs.items()), finalize=True)
    936 
    937 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py in update(self, current, values, finalize)
    593 
    594       if self.target is not None:
--> 595         numdigits = int(np.log10(self.target)) + 1
    596         bar = ('%' + str(numdigits) + 'd/%d [') % (current, self.target)
    597         prog = float(current) / self.target

OverflowError: cannot convert float infinity to integer
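
For reference, the failure mode can be reproduced in isolation. This is a minimal sketch (not Keras code), assuming that in 'samples' mode the callback never accumulates a sample count in TF 2.2, so `self.seen` stays 0 and becomes the progress bar's target in `_finalize_progbar`:

```python
import numpy as np

# Standalone sketch of the crash in Progbar.update: if the bar's target is 0,
# log10(0) evaluates to -inf (with the "divide by zero" RuntimeWarning seen
# above), and converting -inf to int raises the OverflowError.
target = 0
numdigits = int(np.log10(target)) + 1  # RuntimeWarning: divide by zero, then
                                       # OverflowError: cannot convert float infinity to integer
```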
@dfangs dfangs added the type:bug Bug label Apr 21, 2020
@ravikyram ravikyram added the TF 2.2 Issues related to TF 2.2 label Apr 22, 2020
ravikyram (Contributor) commented:

@dfangs

Looks like the code is incomplete. Could you please provide a Colab link or simple standalone code to reproduce the issue reported here? It helps us localize the issue faster. Thanks!

@ravikyram ravikyram added the stat:awaiting response Status - Awaiting response from author label Apr 22, 2020

dfangs commented Apr 22, 2020

@ravikyram

Here is the link to a simple Colab that I made, thank you.

@goldiegadde goldiegadde added the regression issue To spot regression issues in latest version label Apr 22, 2020
goldiegadde (Contributor) commented:

@dfangs, as noted here, can you add verbose to the model.fit() call? This should work.
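
A minimal sketch of that suggestion, assuming the model and MNIST data from the repro above; with verbose=1 the built-in progress bar is used, which counts steps in TF 2.2:

```python
# Suggested workaround (sketch): let fit() drive the built-in progress bar
# instead of passing a ProgbarLogger callback explicitly.
model.fit(x_train, y_train, epochs=5, verbose=1)
```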


dfangs commented Apr 22, 2020

@goldiegadde But that would default to count_mode='steps'. The thing is, I would like the progress bar to display the number of samples (instead of steps), which is no longer the default behavior per the commit I mentioned above. Since there is apparently no way to configure that through Model.fit(), I tried to manually insert tf.keras.callbacks.ProgbarLogger(count_mode='samples') into the callbacks parameter. I wish there were an option to configure this.

I looked at the link you gave me but it seems that the problem wasn't resolved. Hopefully this makes it clear.

@ravikyram ravikyram added the comp:keras Keras related issues label Apr 23, 2020
@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Apr 25, 2020
@geetachavan1 geetachavan1 added this to To do in TensorFlow 2.3.0 Apr 28, 2020
@geetachavan1 geetachavan1 removed this from To do in TensorFlow 2.3.0 Apr 28, 2020
@geetachavan1 geetachavan1 added this to To do in TensorFlow 2.3.0 Apr 28, 2020
@ravikyram ravikyram assigned jvishnuvardhan and unassigned ravikyram May 5, 2020
@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 5, 2020
@goldiegadde goldiegadde removed this from To do in TensorFlow 2.3.0 Sep 11, 2020
@goldiegadde goldiegadde added this to To do in TensorFlow 2.4.0 via automation Sep 11, 2020
ravikyram (Contributor) commented:

@dfangs

I tried in Colab with the TF nightly version (2.4.0-dev20200916) and I am not seeing any issue. Please find the gist here. Please verify once and close the issue. Thanks!
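
A minimal sketch of such a verification, assuming the MNIST repro from the original report:

```python
# Hypothetical check under tf-nightly (e.g. 2.4.0-dev20200916):
#   pip install tf-nightly
import tensorflow as tf
print(tf.__version__)

# Re-running the original repro with count_mode='samples' reportedly
# completes without the OverflowError on the nightly build.
model.fit(x_train, y_train,
          callbacks=[tf.keras.callbacks.ProgbarLogger(count_mode='samples')])
```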

@ravikyram ravikyram added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Sep 17, 2020
google-ml-butler bot commented:

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Sep 25, 2020
google-ml-butler bot commented:

Closing as stale. Please reopen if you'd like to work on this further.

TensorFlow 2.4.0 automation moved this from To do to Done Oct 2, 2020

@ravikyram ravikyram added the Fixed in Nightly Issues that are resolved in nightly version label Oct 5, 2020