
Exception upon loading back a model in Keras 1.0.0 #2281

Closed
karoly-zsolnai-feher opened this issue Apr 12, 2016 · 31 comments

Comments

@karoly-zsolnai-feher

Hello,

I have freshly upgraded to Keras 1.0.0 from 0.3.x and an error message was raised upon loading back a previously saved network. Using theano backend, tried with theano 0.8.0 and 0.8.1 with the same results.

The error message is the following:

Using gpu device 0: GeForce GTX TITAN X (CNMeM is enabled with initial size: 90.0% of memory, CuDNN 4007)
  File "nn.py", line 185, in <module>
    model.load_weights('my_model_weights_best.h5')
  File "/usr/lib/python2.7/site-packages/keras/engine/topology.py", line 2286, in load_weights
    str(len(flattened_layers)) + ' layers.')
Exception: You are trying to load a weight file containing 17 layers into a model with 16 layers.
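For context, the guard that raises this exception simply compares the number of layers recorded in the weight file against the model's flattened layer list. A schematic sketch reconstructed from the error message (not the actual topology.py source; the function name here is hypothetical):

```python
def check_layer_counts(file_layer_names, model_layers):
    """Schematic of the load_weights guard: raise when the weight file
    and the in-memory model disagree on how many layers carry weights."""
    if len(file_layer_names) != len(model_layers):
        raise Exception('You are trying to load a weight file containing ' +
                        str(len(file_layer_names)) + ' layers into a model with ' +
                        str(len(model_layers)) + ' layers.')
```

So any code path that records one extra layer at save time (or builds one fewer at load time) trips this check.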

The network is defined as follows:

    model = Sequential()
    model.add(Convolution2D(32, 5, 5,
                            border_mode='valid',
                            input_shape=(loadedImage_rolled.shape)))
    model.add(Activation('relu'))
    model.add(Convolution2D(32, 5, 5))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Convolution2D(64, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(Convolution2D(64, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(fc_neurons))
    model.add(Activation('relu'))
    model.add(Dropout(dropout))

    model.add(Dense(num_variables))
    model.add(Activation('linear'))

Please note that upon starting the training, the following warning is raised since Keras 1.0.0 (didn't appear with the previous version):

/usr/lib/python2.7/site-packages/keras/backend/theano_backend.py:484: UserWarning: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 3 is not part of the computational graph needed to compute the outputs: keras_learning_phase.
To make this warning into an error, you can pass the parameter on_unused_input='raise' to theano.function. To disable it completely, use on_unused_input='ignore'.

After a few epochs, both the model topology and weights were saved and later loaded back with the appropriate functions shown in the Keras documentation [1]. The same piece of code worked without a hiccup with the previous version of Keras (0.3.x).

Let me know if you need more information and thanks for your time in advance!

[1] http://keras.io/faq/#how-can-i-save-a-keras-model

@fchollet
Member

I will look into it. Can you upload your old save file somewhere for me to download it?

@fchollet
Member

Please note that upon starting the training, the following warning is raised since Keras 1.0.0 (didn't appear with the previous version):

Your model has a Dropout layer, therefore it uses the learning phase. But apparently it isn't part of the graph; maybe you passed an invalid dropout value to your Dropout layer?

Note: I just fixed it, passing invalid dropout values no longer makes the model use the learning phase.

@karoly-zsolnai-feher
Author

Thanks for the swift answer! I am passing 0.0 for the dropout and 256 for fc_neurons. I usually start training a network without dropout (i.e., the dropout line with 0.0) until it stops improving, then reload it with a non-zero dropout to improve it from there. If the dropout layer is missing from the model, I may have difficulty loading the topology back from the JSON file, so I usually leave it there with 0.0 at first. I can imagine that this might cause some unexpected behavior, but it didn't with the earlier version of Keras.

I will be able to upload the network topology and weights as soon as I get access to that computer.

@karoly-zsolnai-feher
Author

Excellent, thanks for the fix, I'll try it as soon as possible!

@isaacgerg

Does this have to do with keras.engine.topology.InputLayer being added to the model? I am finding with Keras 1.0 that all my models get this layer added to them. FWIW, I am saving my model in a derived Callback class.

@isaacgerg

Attached is a model and json that do not line up.

keras model json issue

GitHub would not take .zip files despite saying it does, so I renamed it to a .jpg. Please rename it back.

@isaacgerg

I just posted the files to the keras google groups area.

@karoly-zsolnai-feher
Author

I don't see anything wildly different in the json file I get. However, even if I remove the dropout layer, the problem persists. The topology and the model weights are now available here; they are saved with a dropout layer with a probability of 0.0 and 8 neurons in the FC layer, to be as minimal as possible. Please note that at the moment I cannot try the mentioned fix, but since the problem persists even without dropout, maybe something is still going on.

@isaacgerg

Can you tell us what model.layers says? I suspect it will add the topological layer, which is the issue.


@isaacgerg

Just verified: if I remove the topological layer from the model, the model can save and load fine via JSON and h5.

        modelCopy = self.model
        modelCopy.layers = modelCopy.layers[1:]
        modelCopy.save_weights(fn, overwrite=True)
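One caveat with the snippet above (my note, not from the thread): `modelCopy = self.model` binds a second name to the same object rather than copying it, so slicing off the first layer also mutates the live model. A minimal pure-Python illustration of the aliasing:

```python
class Model:
    """Stand-in for a Keras model holding a list of layers."""
    def __init__(self, layers):
        self.layers = layers

model = Model(['InputLayer', 'Convolution2D', 'Dense'])
model_copy = model                        # aliases the object, does not copy it
model_copy.layers = model_copy.layers[1:]

# The change is visible through the original name too:
print(model.layers)                       # ['Convolution2D', 'Dense']
```

A `copy.copy(self.model)` would at least give the "copy" its own attribute bindings, so rebinding `layers` on it would leave the live model's layer list intact.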


@rcmalli

rcmalli commented Apr 12, 2016

I got exactly the same error while loading a trained model created with the new functional API. I do not think the problem is about the dropout layer. I checked the source code and found that there are two layer names for the first conv2d layer, "conv2d_input_1" and "conv2d_1", while there is only one layer in the weights file.

@isaacgerg

You have to remove the topological layer from your model when you save and then it will work.

@fchollet
Member

Can anybody provide a script I can run to reproduce such an issue? Completely unable to reproduce this right now. I can load any model saved with Keras 0.3.3 into Keras 1.0.

Steps:

  • Checkout to Keras 0.3.3, install
  • run the code posted above (OP), save the model
  • checkout to Keras 1.0, install
  • load the weights previously saved

No issue.

@isaacgerg

see the attached files I sent in a previous message.


@fchollet
Member

see the attached files I sent in a previous message.

As I said, I cannot reproduce your issue. Anyone else?

@isaacgerg

Have you looked at my files? Can you verify that you see the *_input_1 layer in the h5 file? That should not be there.


@isaacgerg

For me, the problem only happens when I save the model in a keras.callbacks.Callback-derived class.

You should verify that when you create your model and print out model.layers, you see a keras.engine.topology.InputLayer layer added to your model. You will only see this after the model is compiled, I believe.


@karoly-zsolnai-feher
Author

I have tried to put together a minimal example that yields the mentioned error on my side. A trivial python script and the model resources are available here:
https://users.cg.tuwien.ac.at/~zsolnai/francois/

Here is the log of the output as I ran it locally:

zsolnai@world test % python2.7 nn2.py
Using Theano backend.
Using gpu device 0: GeForce GTX TITAN X (CNMeM is enabled with initial size: 90.0% of memory, CuDNN 4007)
Traceback (most recent call last):
  File "nn2.py", line 12, in <module>
    model.load_weights('my_model_weights_best.h5')
  File "/usr/lib/python2.7/site-packages/keras/engine/topology.py", line 2286, in load_weights
    str(len(flattened_layers)) + ' layers.')
Exception: You are trying to load a weight file containing 16 layers into a model with 15 layers.

zsolnai@world test % pip2.7 show keras
---
Metadata-Version: 1.1
Name: Keras
Version: 1.0.0
Summary: Deep Learning for Python
Home-page: https://github.com/fchollet/keras
Author: Francois Chollet
Author-email: francois.chollet@gmail.com
License: MIT
Location: /usr/lib/python2.7/site-packages
Requires: theano, pyyaml, six
Classifiers:

@isaacgerg

My model shows:

[<keras.engine.topology.InputLayer object at 0x0000000011089080>,
<keras.layers.convolutional.Convolution2D object at 0x000000001136C630>,
<keras.layers.convolutional.AveragePooling2D object at 0x00000000114B0F98>,
<keras.layers.core.Dropout object at 0x00000000114E6128>,
<keras.layers.convolutional.Convolution2D object at 0x0000000005EAB9E8>,
<keras.layers.convolutional.MaxPooling2D object at 0x000000001157EBE0>,
<keras.layers.core.Dropout object at 0x000000001157EDD8>,
<keras.layers.convolutional.Convolution2D object at 0x000000001157E198>,
<keras.layers.convolutional.MaxPooling2D object at 0x00000000115A1940>,
<keras.layers.core.Dropout object at 0x00000000115A1B38>,
<keras.layers.convolutional.Convolution2D object at 0x00000000115A15F8>,
<keras.layers.convolutional.MaxPooling2D object at 0x00000000115BF5C0>,
<keras.layers.core.Dropout object at 0x00000000115BF7B8>,
<keras.layers.convolutional.Convolution2D object at 0x00000000115BFC18>,
<keras.layers.core.Flatten object at 0x000000001163D4E0>,
<keras.layers.core.MaxoutDense object at 0x00000000117070B8>,
<keras.layers.core.Dense object at 0x0000000011713D30>]

That first layer is what messes up the output to HDF5, adding an extra layer. If I manually remove that layer, things seem to work.


@fchollet
Member

model = model_from_json(open('my_model_architecture.json').read())
model.load_weights('my_model_weights_best.h5')

All this script tells me is that there is a mismatch between the JSON file and the weights file you uploaded. What I would like to see is a script where the same model can generate a mismatched JSON file and weight file, or where the same model can generate a weight file that cannot be loaded with the new version.

The following works perfectly with both Keras 0.3 and 1.0, including when the weight file is saved with Keras 0.3 then loaded with Keras 1.0:

from keras.models import Sequential, model_from_json
from keras.layers import *

model = Sequential()
model.add(Convolution2D(32, 5, 5,
                        border_mode='valid',
                        input_shape=(3, 216, 384)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 5, 5))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(8))
model.add(Activation('relu'))
model.add(Dropout(0.))

model.add(Dense(9))
model.add(Activation('linear'))

# this line can be commented out when running
# with Keras 1.0, so that the save file will be
# from Keras 0.3
model.save_weights('test.h5')

config = model.to_json()
model = model_from_json(config)
model.load_weights('test.h5')

@rcmalli

rcmalli commented Apr 12, 2016

I did a clean installation of Keras 1.0 and tried to run the code sample from this gist, and it produces the error we are talking about.

@isaacgerg

Are you pulling master or the keras 1.0 branch? I am pulling master.


@fchollet
Member

@Refikcanmalli: thanks, I can repro this. Will look into it.

@karoly-zsolnai-feher
Author

Excellent, thanks for the help! The minimal example Francois provided runs without a problem, but my own full script does raise the error message. The only difference is that there is actual training going on with the full program. The gist is probably the easiest way to go for now, but if you need me to pass you my full script, just let me know, I'll be able to do it tomorrow!

@fchollet
Member

Figured it out, fix pending. It's unfortunate that you guys (isaacgerg?) seemed bent on providing misinformation and indirection instead of reproduction scripts (thanks to the person who did provide one).

  • This was not a compatibility bug with 0.3; this was purely a 1.0 bug.
  • This was completely unrelated to JSON serialization.
  • This was not a bug with weight saving; this was a bug with the ModelCheckpointer callback.
  • Of course this had nothing to do with the input layer.

The bug, in short, was that ModelCheckpointer (and in general, all Sequential callbacks) were calling the inner Model rather than the parent Sequential model. Easy to fix.
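The failure mode can be sketched schematically. This is an illustration of the delegation bug described above, not actual Keras source; all class and function names here are stand-ins:

```python
class InnerModel:
    """Stand-in for the inner functional Model: its graph includes the
    auto-inserted InputLayer."""
    def __init__(self, user_layers):
        self.layers = ['InputLayer'] + user_layers

class SequentialModel:
    """Stand-in for Sequential: exposes only the user-added layers, but
    wraps an InnerModel once built."""
    def __init__(self, user_layers):
        self.layers = user_layers
        self.model = InnerModel(user_layers)   # the inner Model

def save_weight_names(model):
    """Stand-in for save_weights: records one entry per layer."""
    return list(model.layers)

seq = SequentialModel(['Convolution2D', 'Activation', 'Dense'])
buggy = save_weight_names(seq.model)   # callback called the inner Model: 4 entries
fixed = save_weight_names(seq)         # calling the parent Sequential: 3 entries
```

A file written via the inner object records one extra (InputLayer) entry, so loading it back into a freshly constructed Sequential trips the layer-count check, which matches the 17-vs-16 mismatch in the original report.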


@isaacgerg

Unfortunately, I am providing what I can given how busy I am at the moment. I could easily fire back and say that 1.0 should have been 0.4, and when it works and was tested by the community, it should become 1.0, but I won't.


@fchollet
Member

The issue is fixed, you can pull master and reinstall.

I could easily fire back and say that 1.0 should have been 0.4 and
when it works and was tested by community, it should become 1.0, but I
won't.

1.0 has very extensive unit test coverage, and was beta tested by over twenty people for a couple of weeks before release. Some bugs will always slip through the cracks, though.

@isaacgerg

So the bug turned out to be in the callbacks section. I mentioned that earlier.


@isaacgerg

How could it be a couple of weeks when 0.3 was still getting changes pumped into its GitHub repo?


@karoly-zsolnai-feher
Author

The fix indeed works, everything is going fine after pulling the new version. Thanks so much for the prompt answer! Closing.

@isaacgerg

Just pulled. Looks like the fix worked. Thank you!

