
[tf2.0.0 keras2.3.0] keras.layers.GRU incorrect output of model.fit_generator trying to run Francois Chollet's notebook #32987 #13391

Closed
dbonner opened this issue Oct 3, 2019 · 1 comment

Comments

dbonner commented Oct 3, 2019

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • TensorFlow installed from (source or binary): built from source
  • TensorFlow version (use command below): 2.0.0 (i.e. the release)
  • Keras version: 2.3.0
  • Python version: 3.7 conda
  • Bazel version (if compiling Tensorflow from source): 0.26.1
  • GCC/Compiler version (if compiling Tensorflow from source): 7.4.0
  • CUDA/cuDNN version: 10 / 7.6.4
  • GPU model and memory: RTX 2080 Ti and Tesla V100 (tried on both; the error occurs on both)

Describe the current behavior
I am working through Francois Chollet's book "Deep Learning with Python" and running the code from his Jupyter notebooks with TensorFlow 2.0.0 as the backend for Keras 2.3.0. Notebook 6.3 (under the heading "1.6 Using recurrent dropout to fight overfitting") builds a model containing a tensorflow.keras.layers.GRU(32, dropout=0.2, recurrent_dropout=0.2, input_shape=(None, float_data.shape[-1])) layer. The data is read earlier in the notebook from jena_climate_2009_2016.csv. I get a loss of 699013271268870062080.0000 after the first epoch and similarly huge figures after subsequent epochs. This figure is simply wrong (see below). The original notebook (from Francois Chollet) is here: link to github and includes the correct output.

Describe the expected behavior
The loss after 1 or 2 epochs should be around 0.3.

Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
Download the data as follows:

```shell
cd ~
mkdir Datasets
cd ~/Datasets
mkdir jena_climate
cd jena_climate
wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip
unzip jena_climate_2009_2016.csv.zip
```

Run jupyter notebook and load the notebook in a Python 3.7 environment with TensorFlow 2.0.0 as the backend and Keras 2.3.0.
Run each cell from the beginning of the notebook so that the data is loaded and the generators are created before you reach the example under heading 1.6. Then run that example. You will find that the loss is wildly wrong.
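For reference, the section 1.6 model looks roughly like this. This is a sketch rather than the exact notebook cell: the Jena dataset has 14 feature columns, so `float_data.shape[-1]` is assumed to be 14 here, and random data stands in for the real generators.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch of the model from notebook section 1.6. The notebook passes
# input_shape=(None, float_data.shape[-1]) to the GRU layer directly;
# an explicit Input layer with 14 features is used here instead.
model = models.Sequential([
    layers.Input(shape=(None, 14)),
    layers.GRU(32, dropout=0.2, recurrent_dropout=0.2),
    layers.Dense(1),
])
model.compile(optimizer='rmsprop', loss='mae')

# Sanity check on random data: dropout is inactive at evaluation time,
# so the MAE against N(0, 1) targets should be a small finite number,
# nothing like the astronomical loss reported above.
x = np.random.normal(size=(8, 120, 14)).astype('float32')
y = np.random.normal(size=(8, 1)).astype('float32')
loss = model.evaluate(x, y, verbose=0)
print(loss)
```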

The code under heading "1.7 Stacking recurrent layers" also runs incorrectly: the loss is "nan" and val_loss is "nan" (both should be around 0.3). I suspect it is the same problem with layers.GRU.
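The section 1.7 model, for comparison, stacks two GRU layers roughly as follows. Again a sketch: the layer sizes and dropout rates follow the book's example, and 14 input features are assumed.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch of the stacked-GRU model from notebook section 1.7.
# return_sequences=True on the first layer so that the second GRU
# receives the full sequence of hidden states, not just the last one.
model = models.Sequential([
    layers.Input(shape=(None, 14)),
    layers.GRU(32, dropout=0.1, recurrent_dropout=0.5,
               return_sequences=True),
    layers.GRU(64, activation='relu',
               dropout=0.1, recurrent_dropout=0.5),
    layers.Dense(1),
])
model.compile(optimizer='rmsprop', loss='mae')

# Dropout is inactive at inference time, so predictions should be
# ordinary finite numbers -- not the NaNs seen during training here.
preds = model.predict(
    np.random.normal(size=(4, 60, 14)).astype('float32'), verbose=0)
print(preds.shape)
```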

I have reproduced this problem with tensorflow.keras in TensorFlow 2.0.0.
The problem also occurs with Keras 2.3.0 on a TensorFlow 1.14 backend.
The problem does not occur with tensorflow.keras in TensorFlow 1.14.

tctco commented Feb 19, 2020

Have you figured out how to fix this problem? I have the same issue running the same code.
