
tensorflow.keras behaves wrongly in autoencoder setup compared to keras #39432

Closed
fedxa opened this issue May 11, 2020 · 13 comments

Labels: comp:keras (Keras related issues), stale (marks the issue/PR stale, to be closed automatically if no activity), stat:awaiting response (awaiting response from author), type:feature (feature requests)

fedxa commented May 11, 2020

An attempt to recreate the simplest autoencoder works properly if keras (2.3.1) is used, but fails to converge if tensorflow.keras (2.3.0-tf) is used instead.

The basic implementation can be found at https://gist.github.com/fedxa/45eb1a412964ddf19820fff347c5b2de

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
# from keras.layers import Input, Dense
# from keras.models import Model

# 784 = 28 * 28 flattened MNIST pixels, compressed to a 32-dimensional code
input_img = Input(shape=(784,))
encoded = Dense(32, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

# Load MNIST, scale to [0, 1], and flatten each image to a 784-vector
from tensorflow.keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

autoencoder.fit(x_train, x_train,
                epochs=20,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

If standalone keras 2.3.1 is used, the example converges quickly; if tensorflow.keras 2.3.0-tf is used, no convergence is observed (the loss function stays at ~0.6 throughout and the autoencoder encodes only noise).

The problem is present in Google Colab, and on Linux and macOS, with TensorFlow versions 2.0, 2.1, and 2.2.

iobtl commented May 12, 2020

It seems like the problem may lie with optimizer='adadelta'.
Running the code example provided above with the following modification seems to run fine:

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

Try using this instead for now.

@Saduf2019 (Contributor)

@fedxa
Please update as per the comment above.

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label May 12, 2020
fedxa (Author) commented May 12, 2020

@Saduf2019 Oh, yes! The difference is in the default learning rate for the Adadelta optimizer in keras and tensorflow.keras.

The keras version has the defaults:

class Adadelta(Optimizer):
    def __init__(self, learning_rate=1.0, rho=0.95, **kwargs):

while the tensorflow.keras version has

class Adadelta(optimizer_v2.OptimizerV2):
    def __init__(self,
                 learning_rate=0.001,
                 rho=0.95,
                 epsilon=1e-7,
                 name='Adadelta',
                 **kwargs):

Initialising the optimiser explicitly with the same learning rate makes the example behave in the same way for both keras versions.
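
For example, a minimal sketch of the explicit initialisation (assuming the tf.keras 2.x optimizer API shown above):

from tensorflow.keras.optimizers import Adadelta

# Match standalone keras's default of learning_rate=1.0 explicitly,
# instead of relying on tf.keras's default of 0.001
autoencoder.compile(optimizer=Adadelta(learning_rate=1.0),
                    loss='binary_crossentropy')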

Seems it is a feature, not a bug.

@Saduf2019 Saduf2019 added type:feature Feature requests and removed stat:awaiting response Status - Awaiting response from author type:bug Bug labels May 12, 2020
@Saduf2019 Saduf2019 assigned gowthamkpr and unassigned Saduf2019 May 12, 2020
@gowthamkpr

@fedxa This has been clearly mentioned in the docs here:

"learning_rate: A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule. The learning rate. To match the exact form in the original paper, use 1.0."

Moreover, the learning rate of all the optimizers has been set to 0.001, so this is expected behavior.

@gowthamkpr gowthamkpr added the stat:awaiting response Status - Awaiting response from author label May 12, 2020
fedxa (Author) commented May 12, 2020

@gowthamkpr Yes, for sure, missed this bit!
By the way, for the documentation writers -- it may be sensible to change, on the mentioned page,

x_t := x_{t-1} + \Delta x_t

to

x_t := x_{t-1} + \lambda \Delta x_t

with \lambda being the learning rate.

iobtl commented May 12, 2020

@fedxa The algorithm is correct. The learning rate is contained within the term \Delta x_t, since it employs an adaptive learning rate. Look at the Adadelta paper for more information.

fedxa (Author) commented May 13, 2020

@iobtl Hmm, it seems the actual code in tensorflow/python/keras/optimizers.py#L448 does not really correspond to Algorithm 1 in the paper, but is additionally multiplied by the learning rate. Cf. also the line-by-line comparison in https://stackoverflow.com/questions/56730888/what-is-the-learning-rate-parameter-in-adadelta-optimiser-for-in-keras
So it seems there is an additional learning rate introduced in the code, on top of the "automatic" learning rate RMS[Δx]_{t-1}/RMS[g]_t computed by the algorithm as given in the paper and in the documentation.
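
To make the difference concrete, here is a minimal NumPy sketch of a single Adadelta step (an illustration of the update form, not the actual TensorFlow code). The paper's Algorithm 1 applies x_t := x_{t-1} + \Delta x_t directly (effectively lr = 1.0), while the Keras implementations scale the update by an extra learning-rate factor:

import numpy as np

def adadelta_step(x, grad, acc_grad, acc_delta,
                  lr=1.0, rho=0.95, eps=1e-7):
    # Accumulate the squared gradient: E[g^2]_t
    acc_grad = rho * acc_grad + (1 - rho) * grad ** 2
    # "Automatic" step size from the paper: RMS[Δx]_{t-1} / RMS[g]_t
    delta = -np.sqrt(acc_delta + eps) / np.sqrt(acc_grad + eps) * grad
    # Accumulate the squared updates: E[Δx^2]_t
    acc_delta = rho * acc_delta + (1 - rho) * delta ** 2
    # Paper: x = x + delta; the Keras code additionally multiplies by lr,
    # so lr = 1.0 matches the paper exactly
    x = x + lr * delta
    return x, acc_grad, acc_delta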

@gowthamkpr gowthamkpr assigned omalleyt12 and unassigned gowthamkpr May 14, 2020
@gowthamkpr gowthamkpr added stat:awaiting tensorflower Status - Awaiting response from tensorflower and removed stat:awaiting response Status - Awaiting response from author labels May 14, 2020
@short-circuitt

Similar observation here. I believe it is related to the tf.keras.losses binary_crossentropy implementation.

I have run 3 versions of the Keras VAE tutorial in Colab:
https://keras.io/examples/variational_autoencoder/

Version 1 using Keras
Version 2 using tf.keras
Version 3 using tf.keras + a non-standard implementation of binary_crossentropy

Versions 1 and 3 converge; version 2 does not.

Results:
Version 1:
loss: 156.5520 - val_loss: 156.6273

Version 2:
loss: 460.1004 - val_loss: 459.4628

Version 3:
loss: 157.9199 - val_loss: 158.2236

Colab notebook:
https://gist.github.com/short-circuitt/3f2a004f6726d03f06785b9d2accfa23
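
(For reference, a hypothetical sketch of the kind of reduction difference that can produce loss-scale gaps like the above; this is an assumption for illustration, not the code from the linked notebook. tf.keras.losses.binary_crossentropy averages over the last axis, whereas a VAE reconstruction term is typically summed over all 784 pixels:)

import tensorflow as tf

def summed_binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Hypothetical "non-standard" variant: per-pixel binary crossentropy
    # summed over the 784 pixels instead of averaged over them
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    bce = -(y_true * tf.math.log(y_pred)
            + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    return tf.reduce_sum(bce, axis=-1)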

fedxa (Author) commented May 29, 2020

Looks like a separate problem to me... Mine was clearly the initial learning rate of the Adadelta optimizer; yours looks like differing binary_crossentropy implementations.

@short-circuitt

Hi, thanks for your message. It does look like a separate problem, so I will open a different issue for it.

@jvishnuvardhan jvishnuvardhan self-assigned this Mar 1, 2022
@jvishnuvardhan (Contributor)

@fedxa This is a stale issue. The Keras code has since moved to the separate repo keras-team/keras, and the code is the same whether it is imported from keras or tf.keras.

Can we close this issue? Thanks!

@jvishnuvardhan jvishnuvardhan added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Mar 1, 2022
@google-ml-butler

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Mar 8, 2022
@google-ml-butler

Closing as stale. Please reopen if you'd like to work on this further.
