
Messages "Fallback to op-by-op mode because memset node breaks graph update" since Keras 3, don't occur in Keras 2 #19081

Open
SteffenBauer opened this issue Jan 22, 2024 · 4 comments

Comments


SteffenBauer commented Jan 22, 2024

Since Keras 3, messages "Fallback to op-by-op mode because memset node breaks graph update" start to clutter the log output.

Example code, plain vanilla MNIST network:

#!/usr/bin/env python3

import os
os.environ["KERAS_BACKEND"] = "tensorflow"
import keras
import numpy as np

(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype(keras.backend.floatx()) / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype(keras.backend.floatx()) / 255

train_labels = keras.utils.to_categorical(train_labels)
test_labels = keras.utils.to_categorical(test_labels)

def network_basic():
    inp = keras.layers.Input(shape=(28, 28, 1), name='Input')
    x = keras.layers.Conv2D(20, (3, 3), activation='relu', name='Conv_1')(inp)
    x = keras.layers.MaxPooling2D((2, 2), name='Pool_1')(x)
    x = keras.layers.Conv2D(50, (3, 3), activation='relu', name='Conv_2')(x)
    x = keras.layers.MaxPooling2D((2, 2), name='Pool_2')(x)
    x = keras.layers.Flatten()(x)
    x = keras.layers.Dense(500, activation='relu')(x)
    out = keras.layers.Dense(10, activation='softmax', name='predictions')(x)
    network = keras.models.Model(inputs=inp, outputs=out)
    return network

network = network_basic()
network.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
                loss='categorical_crossentropy',
                metrics=['accuracy'])
network.summary()
history = network.fit(train_images, train_labels, epochs=5, batch_size=64, validation_data=(test_images, test_labels))
test_loss, test_acc = network.evaluate(test_images, test_labels)
print()
print("Test loss", test_loss)
print("Test accuracy", test_acc)

Output with Keras 3.0.4:

Model: "functional_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Layer (type)                       ┃ Output Shape                  ┃     Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Input (InputLayer)                 │ (None, 28, 28, 1)             │           0 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ Conv_1 (Conv2D)                    │ (None, 26, 26, 20)            │         200 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ Pool_1 (MaxPooling2D)              │ (None, 13, 13, 20)            │           0 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ Conv_2 (Conv2D)                    │ (None, 11, 11, 50)            │       9,050 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ Pool_2 (MaxPooling2D)              │ (None, 5, 5, 50)              │           0 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ flatten (Flatten)                  │ (None, 1250)                  │           0 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ dense (Dense)                      │ (None, 500)                   │     625,500 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ predictions (Dense)                │ (None, 10)                    │       5,010 │
└────────────────────────────────────┴───────────────────────────────┴─────────────┘
 Total params: 639,760 (2.44 MB)
 Trainable params: 639,760 (2.44 MB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/5
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1705909017.454655    1642 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
W0000 00:00:1705909017.473550    1642 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
932/938 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.8086 - loss: 0.6093W0000 00:00:1705909028.941317    1643 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
938/938 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.8093 - loss: 0.6070W0000 00:00:1705909030.596956    1641 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
938/938 ━━━━━━━━━━━━━━━━━━━━ 23s 16ms/step - accuracy: 0.8094 - loss: 0.6066 - val_accuracy: 0.9776 - val_loss: 0.0684
Epoch 2/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9793 - loss: 0.0674 - val_accuracy: 0.9865 - val_loss: 0.0406
Epoch 3/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9872 - loss: 0.0411 - val_accuracy: 0.9874 - val_loss: 0.0377
Epoch 4/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9907 - loss: 0.0305 - val_accuracy: 0.9870 - val_loss: 0.0379
Epoch 5/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9915 - loss: 0.0273 - val_accuracy: 0.9896 - val_loss: 0.0320
W0000 00:00:1705909064.679903    1644 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.9856 - loss: 0.0412

Test loss 0.032079923897981644
Test accuracy 0.9896000027656555

Output of the same code with Keras 2.15.0:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 Input (InputLayer)          [(None, 28, 28, 1)]       0

 Conv_1 (Conv2D)             (None, 26, 26, 20)        200

 Pool_1 (MaxPooling2D)       (None, 13, 13, 20)        0

 Conv_2 (Conv2D)             (None, 11, 11, 50)        9050

 Pool_2 (MaxPooling2D)       (None, 5, 5, 50)          0

 flatten (Flatten)           (None, 1250)              0

 dense (Dense)               (None, 500)               625500

 predictions (Dense)         (None, 10)                5010

=================================================================
Total params: 639760 (2.44 MB)
Trainable params: 639760 (2.44 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/5
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1705909693.429130    2404 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
938/938 [==============================] - 31s 17ms/step - loss: 0.2407 - accuracy: 0.9257 - val_loss: 0.0857 - val_accuracy: 0.9730
Epoch 2/5
938/938 [==============================] - 14s 15ms/step - loss: 0.0645 - accuracy: 0.9800 - val_loss: 0.0516 - val_accuracy: 0.9840
Epoch 3/5
938/938 [==============================] - 14s 15ms/step - loss: 0.0435 - accuracy: 0.9863 - val_loss: 0.0369 - val_accuracy: 0.9872
Epoch 4/5
938/938 [==============================] - 14s 15ms/step - loss: 0.0318 - accuracy: 0.9899 - val_loss: 0.0404 - val_accuracy: 0.9861
Epoch 5/5
938/938 [==============================] - 14s 15ms/step - loss: 0.0262 - accuracy: 0.9918 - val_loss: 0.0307 - val_accuracy: 0.9890
313/313 [==============================] - 2s 8ms/step - loss: 0.0307 - accuracy: 0.9890

Test loss 0.030733594670891762
Test accuracy 0.9890000224113464

These messages also appear in some of the examples on the official Keras webpage, for example here:
https://keras.io/examples/generative/vae/

So it does not seem to be a big deal, but I wanted to report it anyway: it caught my attention, the messages clutter the log output, and it may be worth investigating what is going on here and whether it indicates a deeper issue.

Environment:

  • Ubuntu 22.04, ARM64 Jetson Orin Dev kit
  • Tensorflow 2.15.0
  • Keras 3.0.4 / 2.15.0
@sachinprasadhs
Collaborator

Hi, this warning is not generated or handled by Keras; it comes from the TensorFlow backend you are using.
When TensorFlow uses XLA, it falls back to op-by-op mode for various reasons; one such reason is here: https://github.com/tensorflow/tensorflow/blob/master/third_party/xla/xla/service/gpu/runtime/graph_launch.cc#L637-L639.
You can find several more in the same file, and each is reported as a warning in the output.

@SteffenBauer
Author

I see, I already suspected it was something in the backend. I am just puzzled why the message only appears with Keras 3, not with Keras 2. Both Keras versions log that XLA was used, but only Keras 3 shows the warning.

@qlzh727
Member

qlzh727 commented Jan 25, 2024

Triage notes: one reason might be that Keras 3 uses jit compilation by default, while Keras 2 does not (the user needs to enable jit explicitly), and the warning message probably comes from jit/XLA.
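If that hypothesis is right, a possible workaround (untested on the reporter's Jetson setup) is to opt out of XLA jit compilation via the `jit_compile` argument of `Model.compile()`, restoring the Keras 2 default; optionally, `TF_CPP_MIN_LOG_LEVEL` can hide TensorFlow's C++-side warnings altogether. A minimal sketch, using a toy model in place of `network_basic()`:

```python
import os

# Optional: silence TensorFlow's C++-side INFO/WARNING logs entirely.
# Must be set before tensorflow/keras is imported; "2" keeps ERROR logs.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras

# Tiny stand-in model; substitute network_basic() from the report.
inp = keras.layers.Input(shape=(4,))
out = keras.layers.Dense(2)(inp)
model = keras.models.Model(inputs=inp, outputs=out)

# jit_compile=False disables XLA just-in-time compilation for this model,
# matching the Keras 2 default, so the XLA graph-launch fallback warning
# should no longer be triggered during fit()/evaluate().
model.compile(optimizer="sgd", loss="mse", jit_compile=False)
```

Whether avoiding the fallback warning this way is worth giving up XLA's speedups would need to be measured on the model in question.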

@qlzh727 qlzh727 removed the keras-team-review-pending Pending review by a Keras team member. label Jan 25, 2024
@juneedpk

juneedpk commented Apr 1, 2024

How can we try to avoid this?
