loading weights of a sequential model into a graph model #1728

Closed
yaoliUoA opened this issue Feb 15, 2016 · 25 comments

Comments

@yaoliUoA

Hi guys,

I am trying to load the saved weights of several layers of a sequential model, stored in HDF5 format, to initialize the weights of several layers in a graph model. It is easy to do this if both the source and target models are sequential:
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()

However, if the model is defined as a graph model, I have no idea how to do this, as the model does not have the attribute model.layers. Can someone share some hints on this issue? Thanks.

@wadhwasahil

Having the same problem. I hope someone replies here.

@tboquet
Contributor

tboquet commented Feb 16, 2016

You could use:

model.nodes

Note that you will have to be careful about the order and names of the layers.
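
One way to avoid depending on layer order is to copy weights by layer name. The sketch below is only an illustration in plain Python: `FakeLayer` is a hypothetical stand-in for any object exposing `get_weights()` and `set_weights()`, and `copy_weights_by_name` is a helper name invented here, not part of Keras. It also assumes that the Graph's `nodes` attribute behaves like a dict mapping node names to layers.

```python
class FakeLayer:
    """Hypothetical stand-in for a layer with get_weights()/set_weights()."""

    def __init__(self, name, weights=None):
        self.name = name
        self._weights = list(weights or [])

    def get_weights(self):
        return list(self._weights)

    def set_weights(self, weights):
        self._weights = list(weights)


def copy_weights_by_name(source_layers, target_nodes):
    """Copy weights from source layers into same-named target nodes.

    source_layers: iterable of layers (e.g. a Sequential model's `layers`).
    target_nodes:  dict mapping node name -> layer (assumed shape of `nodes`).
    Returns the names of the layers whose weights were copied.
    """
    copied = []
    for layer in source_layers:
        target = target_nodes.get(layer.name)
        if target is None:
            continue  # no node with this name in the target model
        target.set_weights(layer.get_weights())
        copied.append(layer.name)
    return copied


# Usage with stand-ins: only 'conv1_2' exists in both models.
source = [FakeLayer('conv1_1', [1.0, 2.0]), FakeLayer('conv1_2', [3.0])]
target = {'conv1_2': FakeLayer('conv1_2'), 'dense1': FakeLayer('dense1')}
copied = copy_weights_by_name(source, target)
```

Matching by name only works if the layers were given stable, unique names in both models, which is why the naming matters.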

Another option is to define a Sequential model and add it to a Graph model if you want to add more layers or modify it at some point.

partialy_loaded_model = Sequential()

# part where you define the structure of the partially loaded model and load your weights
...

# your new model
new_model = Graph()
new_model.add_input(...)
# add your partially loaded model
new_model.add_node(partialy_loaded_model)
# add more nodes
new_model.add_node(another_layer)
...

@wadhwasahil

I am getting an error like this: -
KeyError: "Unable to open object (Object 'graph' doesn't exist)"

@wadhwasahil

Hi. I am just getting the same error.

@tboquet
Contributor

tboquet commented Feb 16, 2016

I don't think I explained it carefully enough.

Here is what you could do, using the VGG16 example from the Keras blog:

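from keras.models import Sequential, Graph
from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D
# Flatten, Dense, Dropout and SGD are used further down when building the Graph
from keras.layers.core import Dense, Dropout, Flatten
from keras.optimizers import SGD
import keras.backend as K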

img_width, img_height = 128, 128

# build the VGG16 network with our input_img as input
first_layer = ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height))

model = Sequential()
model.add(first_layer)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])

# load the weights

import h5py

weights_path = 'vgg16_weights.h5'

f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')

# Here is what you want:

graph_m = Graph()
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height))
graph_m.add_node(model, name='your_model', input='my_inp')
graph_m.add_node(Flatten(), name='Flatten', input='your_model')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense1', input='Flatten')
graph_m.add_node(Dropout(0.5), name='Dropout1', input='Dense1')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense2', input='Dropout1')
graph_m.add_node(Dropout(0.5), name='Dropout2', input='Dense2')
graph_m.add_node(Dense(1000, activation='softmax'), name='Final', input='Dropout2')
graph_m.add_output(name='out1', input='Final')
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
graph_m.compile(optimizer=sgd, loss={'out1': 'categorical_crossentropy'})

So basically here you load only the weights for the feature extraction part and add the nonlinear structure of the classifier and fine tune the model. You could also fix the weights of your feature extraction layers using the trainable attribute of the layers.

@yaoliUoA
Author

@tboquet Thanks for your reply. Your code is clear and easy to understand. But I do not quite understand the last sentence, "You could also fix the weights of your feature extraction layers using the trainable attribute of the layers". When we do backprop using graph_m.fit after graph_m.compile, will the weights in the layers of the VGG part (which is now a node) be updated in the new network?

How do I use the trainable attribute of the layers? Would you please share some code for both cases (i.e., fixing the weights of the VGG part, or updating them when training the new network)? Thanks.

@tboquet
Contributor

tboquet commented Feb 16, 2016

According to the docs, you just have to add trainable=False to freeze a layer during training.
Example, frozen:

...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=False))
...

Example, trainable:

...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=True))
...

trainable is True by default, so layers are still trained if you don't know about the feature.
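
To make the effect of the flag concrete, here is a toy sketch in plain Python (not Keras code). A hypothetical `sgd_step` helper applies one gradient update and skips every layer whose `trainable` flag is false. The dict layout, the helper name, and the numbers are all assumptions chosen for illustration.

```python
def sgd_step(layers, grads, lr=0.5):
    """Apply one plain SGD update, skipping layers marked as frozen.

    layers: list of dicts with keys 'name', 'trainable', 'weights' (list of floats).
    grads:  dict mapping layer name -> list of gradients, one per weight.
    """
    for layer in layers:
        if not layer['trainable']:
            continue  # frozen: weights stay exactly as loaded
        g = grads[layer['name']]
        layer['weights'] = [w - lr * gi for w, gi in zip(layer['weights'], g)]
    return layers


layers = [
    {'name': 'conv5_1', 'trainable': False, 'weights': [1.0, 2.0]},  # pretrained, frozen
    {'name': 'Dense1', 'trainable': True, 'weights': [1.0, 2.0]},    # new, trainable
]
grads = {'conv5_1': [2.0, 2.0], 'Dense1': [2.0, 2.0]}
sgd_step(layers, grads, lr=0.5)
```

Both layers receive the same gradients, but only `Dense1` changes. With the default `trainable=True` everywhere, the pretrained layers would be updated during fine-tuning as well.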

@wadhwasahil

Thanks for the code @tboquet .
However, when I try to predict the class of an image using the code
im = cv2.resize(cv2.imread('cat.jpg'), (img_width, img_height)).astype(np.float32)
im[:,:,0] -= 103.939
im[:,:,1] -= 116.779
im[:,:,2] -= 123.68
im = im.transpose((2,0,1))
im = np.expand_dims(im, axis=0)
out = graph_m.predict(im)
print np.argmax(out)

The error I get is shown below:
Traceback (most recent call last):
  File "main.py", line 94, in <module>
    out = graph_m.predict(im)
  File "/home/darkfantasy/anaconda2/lib/python2.7/site-packages/keras/models.py", line 1249, in predict
    ins = [data[name] for name in self.input_order]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

@tboquet
Contributor

tboquet commented Feb 17, 2016

You should take a look at the Graph model API documentation. With a Graph model, you now have to provide a dict of inputs, keyed by input name:

out = graph_m.predict({'my_inp': im})
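
The traceback earlier in the thread fails on the lookup `data[name]`, which indexes the supplied data with each input name. The plain-Python sketch below mimics just that lookup to show why a dict is needed. `gather_inputs` is a hypothetical helper name, and a nested list stands in for the image array, so the failure here is a `TypeError` rather than NumPy's `IndexError`.

```python
def gather_inputs(data, input_order):
    """Mimic the lookup from the traceback: one entry per named input."""
    return [data[name] for name in input_order]


input_order = ['my_inp']
image_batch = [[0.0, 1.0, 2.0]]  # stand-in for the preprocessed image array

# Passing a dict keyed by input name works.
ins = gather_inputs({'my_inp': image_batch}, input_order)

# Passing the bare array fails, because it gets indexed with the string 'my_inp'.
try:
    gather_inputs(image_batch, input_order)
    failed = False
except (TypeError, IndexError):
    failed = True
```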

@wadhwasahil

I tried this method.
The class I was getting from the original sequential model was 281.
Now, using this graph model, the class comes out as 0, which is not correct.
What could be the reason?

@tboquet
Contributor

tboquet commented Feb 17, 2016

That is totally normal: the weights of the last layers are not trained, so they still have the values assigned by the default initialization method. What are you trying to achieve? Please give a detailed description of your goal.
From the post where I define the model:

So basically here you load only the weights for the feature extraction part and add the nonlinear structure of the classifier and fine tune the model.

So you need a dataset (ImageNet or another one) to train the new part of the model. It was just an example of how you could define a Graph structure having saved weights from a Sequential model.

@wadhwasahil

I want to find the class of an image using the pretrained VGG weights. But, due to the random initialization of the weights, the output is wrong.
Can you tell me how to load the weights of the last fully connected layers in the graph model?

@tboquet
Contributor

tboquet commented Feb 17, 2016

Why don't you use the same Sequential structure?

@wadhwasahil

Because I have to add extra LSTM layers that will require inputs from more than one layer. So I have to use the graph model.

@tboquet
Contributor

tboquet commented Feb 17, 2016

Ok so just define the full Sequential model, use the load_weights method of the Sequential model and when the weights are loaded add this Sequential model to your Graph model using add_node. It should be easy to infer based on the previous discussion.

@wadhwasahil

thanks.
Can I use set_weights in the same way for the graph model as well?

@tboquet
Contributor

tboquet commented Feb 17, 2016

I'm not sure the Graph model will help you. Consider posting this to the Google group with an exhaustive description of what you are trying to achieve: a description of the model, the inputs, and the outputs, and a link to a related paper if there is one. Issues should be about bugs, feature requests, etc. The Google group is the place to post these kinds of questions.

@tboquet
Contributor

tboquet commented Feb 17, 2016

The set_weights method is available for every layer with trainable parameters.

@wadhwasahil

Thanks for your time and effort.
(y)

@wadhwasahil

After removing the two lines which were not there in the original network,
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height)) and
graph_m.add_output(name='out1', input='Final'), I get the following error:
AttributeError: 'float' object has no attribute 'type'

@tboquet
Contributor

tboquet commented Feb 17, 2016

I suggest you read the documentation carefully and spend time on the examples so that you can see what the purpose of these lines is. You could take a look at this example.

@ouceduxzk

A follow-up question: what if the model is not VGG but a non-Sequential model, say AlexNet, where we define model = Model(input=inputs, output=outputs) and the first layer of the model is Input((3,227,227))? If I use the graph

graph_m = Graph()
graph_m.add_input(name = 'main_input', input_shape=(3, 227, 227))
graph_m.add_node( model, name = 'alex', input = 'main_input')
graph_m.add_node(Dense(10, activation='softmax'), name='Final', input='')
graph_m.add_output(name='out1', input='Final')

I got the following error, which indicates that the model and the graph input are not connected.
  File "/users/zaikun/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 1734, in __init__
    str(layers_with_complete_input))
Exception: Graph disconnected: cannot obtain value for tensor main_input at layer "alex". The following previous layers were accessed without issue: []

Any idea how to fix this?

@Neltherion

Neltherion commented Oct 10, 2016

@ouceduxzk @tboquet I'm having the same problem (the Graph being disconnected). Did you manage to solve it somehow?

@sedghi

sedghi commented May 17, 2017

For anyone who is getting an error with @tboquet's code:

f = h5py.File(weights_path4)
f = f['model_weights']
layer_names = [n.decode('utf8') for n in f.attrs['layer_names']]
g = f[layer_names[0]]
weight_names = [n.decode('utf8') for n in g.attrs['weight_names']]
weight_values = [g[weight_name] for weight_name in weight_names]

This gives you weight_values for the layer layer_names[0] (index 0 here), and you can assign them to your layer's weights with model.layers[k].set_weights(weight_values).
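
The same lookup can be repeated for every layer rather than only index 0. The sketch below assumes the file layout implied by the snippet above (a `layer_names` attribute on the root group and a `weight_names` attribute on each layer group). `read_weights_by_layer` is a helper name invented here, and `FakeGroup` is a hypothetical stand-in for an h5py group so that the example runs without a weights file.

```python
def read_weights_by_layer(root):
    """Collect {layer_name: [weight, ...]} from an h5py-like group.

    `root` is assumed to expose `.attrs['layer_names']` and, for each layer
    group, `.attrs['weight_names']`. Names may be stored as bytes.
    """
    def as_text(name):
        return name.decode('utf8') if isinstance(name, bytes) else name

    result = {}
    for layer_name in map(as_text, root.attrs['layer_names']):
        group = root[layer_name]
        weight_names = [as_text(n) for n in group.attrs['weight_names']]
        result[layer_name] = [group[w] for w in weight_names]
    return result


class FakeGroup(dict):
    """Hypothetical stand-in for an h5py group: a dict plus an `attrs` dict."""

    def __init__(self, items=(), attrs=None):
        super().__init__(items)
        self.attrs = attrs or {}


conv = FakeGroup({'conv1_W': [1.0], 'conv1_b': [0.5]},
                 attrs={'weight_names': [b'conv1_W', b'conv1_b']})
pool = FakeGroup(attrs={'weight_names': []})  # layer without parameters
root = FakeGroup({'conv1': conv, 'pool1': pool},
                 attrs={'layer_names': [b'conv1', b'pool1']})
weights = read_weights_by_layer(root)
```

With a real file, the argument would be the `f['model_weights']` group from the snippet above. Layers without parameters come back with an empty list and can be skipped before calling set_weights.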

@stale stale bot added the stale label Aug 15, 2017
@stale

stale bot commented Aug 15, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot closed this as completed Sep 14, 2017