
Plotting Gradients to TensorBoard and Console #31542

Closed · SPP3000 opened this issue Aug 12, 2019 · 11 comments

Labels: comp:tensorboard (TensorBoard related issues), stale (to be closed automatically if no activity), stat:awaiting response (awaiting response from author), TF 2.0 (issues relating to TensorFlow 2.0), type:bug

@SPP3000 commented Aug 12, 2019

System information

  • Windows 10 Pro Version 1903
  • TensorFlow installed from: pip in Anaconda
  • TensorFlow version: 2.0.0-beta1 (GPU)
  • Python version: 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]
  • CUDA/cuDNN version: Cuda compilation tools, release 10.0, V10.0.130
  • GPU model and memory: GeForce GTX 980 Ti, major: 5, minor: 2, memoryClockRate(GHz): 1.2785

Describe the current behavior
The program ends with an unclear error while trying to retrieve the bias gradients of the two dense layers in the model.

  • Writing to TensorBoard (console parameter = False)
Train on 60000 samples
Epoch 1/5
2019-08-12 15:24:48.713962: I tensorflow/core/profiler/lib/profiler_session.cc:174] Profiler session started.
2019-08-12 15:24:48.718362: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library cupti64_100.dll
   32/60000 [..............................] - ETA: 12:48 - loss: 2.2374 - accuracy: 0.18752019-08-12 15:24:48.840661: I tensorflow/core/platform/default/device_tracer.cc:641] Collecting 81 kernel records, 14 memcpy records.
59744/60000 [============================>.] - ETA: 0s - loss: 0.2971 - accuracy: 0.9123Traceback (most recent call last):
  File "C:/Users/Harald Schweiger/PycharmProjects/Gradients/gradient_test.py", line 42, in <module>
    model.fit(x_train, y_train, epochs=5, callbacks=[gradient_cb, tensorboard_cb])
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 643, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py", line 664, in fit
    steps_name='steps_per_epoch')
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py", line 439, in model_iteration
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\callbacks.py", line 295, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "C:/Users/Harald Schweiger/PycharmProjects/Gradients/gradient_test.py", line 32, in on_epoch_end
    tf.summary.histogram(t.name, data=t)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorboard\plugins\histogram\summary_v2.py", line 77, in histogram
    tensor = _buckets(data, bucket_count=buckets)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorboard\plugins\histogram\summary_v2.py", line 139, in _buckets
    return tf.cond(is_empty, when_empty, when_nonempty)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1382, in cond_for_tf_v2
    return cond(pred, true_fn=true_fn, false_fn=false_fn, strict=True, name=name)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1177, in cond
    result = false_fn()
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorboard\plugins\histogram\summary_v2.py", line 137, in when_nonempty
    return tf.cond(is_singular, when_singular, when_nonsingular)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1382, in cond_for_tf_v2
    return cond(pred, true_fn=true_fn, false_fn=false_fn, strict=True, name=name)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 1174, in cond
    if pred:
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 698, in __bool__
    raise TypeError("Using a `tf.Tensor` as a Python `bool` is not allowed. "
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the
 value of a tensor.
  • Printing bias gradients to console (console parameter = True)
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 60000 samples
Epoch 1/5
2019-08-12 15:26:01.265400: I tensorflow/core/profiler/lib/profiler_session.cc:174] Profiler session started.
2019-08-12 15:26:01.268877: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library cupti64_100.dll
   32/60000 [..............................] - ETA: 12:53 - loss: 2.4094 - accuracy: 0.09382019-08-12 15:26:01.391432: I tensorflow/core/platform/default/device_tracer.cc:641] Collecting 81 kernel records, 14 memcpy records.
59776/60000 [============================>.] - ETA: 0s - loss: 0.3015 - accuracy: 0.9121Tensor: Adam/gradients_1/dense128/BiasAdd_grad/BiasAddGrad:0
Traceback (most recent call last):
  File "C:/Users/Harald Schweiger/PycharmProjects/Gradients/gradient_test.py", line 42, in <module>
    model.fit(x_train, y_train, epochs=5, callbacks=[gradient_cb, tensorboard_cb])
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 643, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py", line 664, in fit
    steps_name='steps_per_epoch')
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py", line 439, in model_iteration
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\callbacks.py", line 295, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "C:/Users/Harald Schweiger/PycharmProjects/Gradients/gradient_test.py", line 30, in on_epoch_end
    print('{}\n'.format(K.get_value(t)[:10]))
  File "C:\Users\Harald Schweiger\Anaconda3\lib\site-packages\tensorflow\python\keras\backend.py", line 2981, in get_value
    return x.numpy()
AttributeError: 'Tensor' object has no attribute 'numpy'

Describe the expected behavior

  • Writing to TensorBoard (console parameter = False)
    A TensorBoard event file containing the distributions and histograms of the gradients
    derived from the total loss accumulated over the last epoch.

  • Printing to console (console parameter = True)
    The program should print the first 10 bias gradient values of each of the two dense layers
    to the console.

If the exceptions produced here are the expected behavior due to errors in the developer's code, a more meaningful error message would be appreciated.
In that case a corrected version of the code would also be useful, for me and for anyone else who has had to update their code since the write_grads parameter was removed from the TensorBoard callback in version 2.0. (A sketch of a tf.GradientTape alternative follows after the reproduction code below.)

Code to reproduce the issue

import tensorflow as tf
from tensorflow.python.keras import backend as K

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu', name='dense128'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax', name='dense10')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])


class GradientCallback(tf.keras.callbacks.Callback):
    console = True

    def on_epoch_end(self, epoch, logs=None):
        weights = [w for w in self.model.trainable_weights if 'dense' in w.name and 'bias' in w.name]
        loss = self.model.total_loss
        optimizer = self.model.optimizer
        gradients = optimizer.get_gradients(loss, weights)  # symbolic (graph) tensors
        for t in gradients:
            if self.console:
                print('Tensor: {}'.format(t.name))
                print('{}\n'.format(K.get_value(t)[:10]))  # fails: symbolic tensor has no .numpy()
            else:
                tf.summary.histogram(t.name, data=t)  # fails: symbolic tensor used as a Python bool


file_writer = tf.summary.create_file_writer("./metrics")
file_writer.set_as_default()

# write_grads has been removed
tensorboard_cb = tf.keras.callbacks.TensorBoard(histogram_freq=1, write_grads=True)
gradient_cb = GradientCallback()

model.fit(x_train, y_train, epochs=5, callbacks=[gradient_cb, tensorboard_cb])
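
Since write_grads is gone, the TF 2.x route (which the workarounds further down this thread also take) is to recompute the gradients eagerly with tf.GradientTape and write them with tf.summary.histogram. A minimal sketch against the reproduction code above; GradientHistogramCallback and its constructor parameters are illustrative names, not TensorFlow API:

class GradientHistogramCallback(tf.keras.callbacks.Callback):
    """Logs a histogram of each trainable weight's gradient once per epoch."""

    def __init__(self, writer, loss_fn, x_batch, y_batch):
        super().__init__()
        self.writer = writer      # a tf.summary file writer
        self.loss_fn = loss_fn    # e.g. tf.keras.losses.SparseCategoricalCrossentropy()
        self.x_batch = x_batch    # small batch used only for gradient logging
        self.y_batch = y_batch

    def on_epoch_end(self, epoch, logs=None):
        # Recompute the gradients eagerly; trainable weights are watched
        # by the tape automatically, so no tape.watch() call is needed.
        with tf.GradientTape() as tape:
            y_pred = self.model(self.x_batch, training=False)
            loss = self.loss_fn(self.y_batch, y_pred)
        grads = tape.gradient(loss, self.model.trainable_weights)
        with self.writer.as_default():
            for weight, grad in zip(self.model.trainable_weights, grads):
                # Eager gradient tensors carry no name, so reuse the weight's.
                tf.summary.histogram(
                    weight.name.replace(':', '_') + '_grads',
                    data=grad, step=epoch)
        self.writer.flush()

gradient_cb = GradientHistogramCallback(
    file_writer, tf.keras.losses.SparseCategoricalCrossentropy(),
    x_train[:100], y_train[:100])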
@oanush commented Aug 13, 2019

Issue replicated for TF version 2.0 beta; please find the gist of the Colab. Thanks!

@richardwth commented

I also find it challenging to plot gradients to TensorBoard in TF 2.2.0-rc3 on Colab. My case is different from @SPP3000's in that instead of

if self.console:
    print('Tensor: {}'.format(t.name))
    print('{}\n'.format(K.get_value(t)[:10]))
else:
    tf.summary.histogram(t.name, data=t)

I simply have tf.summary.histogram(t.name, data=t). The error I came across is:

AttributeError: 'Sequential' object has no attribute 'total_loss'

Using tf.keras.Model, the error became 'Model' object has no attribute 'total_loss'.

@teodor440 commented

I'm also affected by this issue.

@richardwth commented

Here is a workaround where the gradients are explicitly calculated. It avoids the total_loss error I mentioned above.

class ExtendedTensorBoard(tf.keras.callbacks.TensorBoard):
  def _log_gradients(self, epoch):
    step = tf.cast(tf.math.floor((epoch+1)*num_instance/batch_size), dtype=tf.int64)
    writer = self._get_writer(self._train_run_name)

    with writer.as_default(), tf.GradientTape() as g:
      # here we use test data to calculate the gradients
      _x_batch = x_te[:100]
      _y_batch = y_te[:100]

      g.watch(_x_batch)
      _y_pred = self.model(_x_batch)  # forward-propagation
      loss = self.model.loss(y_true=_y_batch, y_pred=_y_pred)  # calculate loss
      gradients = g.gradient(loss, self.model.trainable_weights)  # back-propagation

      # In eager mode, grads does not have name, so we get names from model.trainable_weights
      for weights, grads in zip(self.model.trainable_weights, gradients):
        tf.summary.histogram(
            weights.name.replace(':', '_')+'_grads', data=grads, step=step)
    
    writer.flush()

  def on_epoch_end(self, epoch, logs=None):
    # This function overrides on_epoch_end in tf.keras.callbacks.TensorBoard,
    # but we still need to run the original on_epoch_end, so we call the super method.
    super(ExtendedTensorBoard, self).on_epoch_end(epoch, logs=logs)

    if self.histogram_freq and epoch % self.histogram_freq == 0:
      self._log_gradients(epoch)

ExtendedTensorBoard can then be used in place of tf.keras.callbacks.TensorBoard.
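
A usage sketch for the callback above: num_instance, batch_size, x_te, and y_te are free variables inside _log_gradients, so they must already exist in the enclosing scope. Assuming the MNIST data from the original report (note that self.model.loss is only callable if the model was compiled with a loss object rather than a string name, and that g.watch may need tf.convert_to_tensor around raw NumPy input, as the next workaround does):

num_instance = len(x_train)              # number of training examples
batch_size = 32                          # Keras' default batch size for fit()
x_te, y_te = x_test[:100], y_test[:100]  # held-out batch for gradient logging

tensorboard_cb = ExtendedTensorBoard(log_dir='./logs', histogram_freq=1)
model.fit(x_train, y_train, batch_size=batch_size, epochs=5,
          callbacks=[tensorboard_cb])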

@mmehedin commented Nov 12, 2020

Thanks for posting the code. I am getting an error saying:
NameError: name 'num_instance' is not defined
The same goes for batch_size, x_te, and y_te. How can I fix this? Thanks in advance.

@Alwaysproblem commented

Thanks for posting the code. I am getting an error saying:
NameError: name 'num_instance' is not defined
The same goes for batch_size, x_te, and y_te. How can I fix this? Thanks in advance.

You can try this code:

import tensorflow as tf
from tensorflow.python.keras import backend as K

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu', name='l_1st'),
  tf.keras.layers.Dense(128, activation='relu', name='l_2nd'),
  tf.keras.layers.Dense(128, activation='relu', name='l_3rd'),
  tf.keras.layers.Dense(128, activation='relu', name='l_4th'),
  tf.keras.layers.Dense(128, activation='relu', name='l_5th'),
  tf.keras.layers.Dense(128, activation='relu', name='l_6th'),
  tf.keras.layers.Dense(128, activation='relu', name='l_7th'),
  tf.keras.layers.Dense(128, activation='relu', name='l_8th'),
  tf.keras.layers.Dense(128, activation='relu', name='l_9th'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax', name='dense10')
])

l = tf.keras.losses.SparseCategoricalCrossentropy()
opt = tf.keras.optimizers.Adam(0.001)

model.compile(optimizer=opt, loss=l, metrics=['accuracy'])

class ExtendedTensorBoard(tf.keras.callbacks.TensorBoard):

  def _log_gradients(self, epoch):
    step = tf.cast(epoch, dtype=tf.int64)
    writer = self._train_writer
    # writer = self._get_writer(self._train_run_name)

    with writer.as_default(), tf.GradientTape() as g:
      # here we use test data to calculate the gradients
      _x_batch = x_train[:100]
      _y_batch = y_train[:100]

      g.watch(tf.convert_to_tensor(_x_batch))
      _y_pred = self.model(_x_batch)  # forward-propagation
      loss = self.model.loss(y_true=_y_batch, y_pred=_y_pred)  # calculate loss
      gradients = g.gradient(loss, self.model.trainable_weights)  # back-propagation

      # In eager mode, grads does not have name, so we get names from model.trainable_weights
      for weights, grads in zip(self.model.trainable_weights, gradients):
        tf.summary.histogram(
            weights.name.replace(':', '_')+'_grads', data=grads, step=step)

    writer.flush()

  def on_epoch_end(self, epoch, logs=None):
  # def on_train_batch_end(self, batch, logs=None):
    # This function overrides on_epoch_end in tf.keras.callbacks.TensorBoard,
    # but we still need to run the original on_epoch_end, so we call the super method.
    super(ExtendedTensorBoard, self).on_epoch_end(epoch, logs=logs)
    # super(ExtendedTensorBoard, self).on_train_batch_end(batch, logs=logs)
    if self.histogram_freq and epoch % self.histogram_freq == 0:
      self._log_gradients(epoch)

ee = ExtendedTensorBoard(histogram_freq=1, write_images=True, update_freq='batch')
model.fit(x_train, y_train, epochs=10, callbacks=[ee], validation_data=(x_test, y_test), )
# model.fit(x_train, y_train, epochs=5, callbacks=[gradient_cb, tensorboard_cb])
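
With this version the gradient histograms appear under TensorBoard's Histograms tab. Since no log_dir is passed, tf.keras.callbacks.TensorBoard writes to its default logs directory, so tensorboard --logdir logs should pick them up.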

@sushreebarsa (Contributor) commented May 27, 2021

I tried to run the code on Colab using TF v2.5 and faced an attribute error; please find the gist here. Thanks!

@sachinprasadhs (Contributor) commented

Now I'm able to get a specific error message in the recent TensorFlow version; please find the gist here and confirm the same. Thanks!

@google-ml-butler commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler commented

Closing as stale. Please reopen if you'd like to work on this further.

