loading weights of a sequential model into a graph model #1728
Having the same problem. I hope someone replies here. |
You could use model.nodes. Note that you will have to be careful about the order and names of the layers. Another option is to define a partially loaded Sequential model and add it to the graph as a node:
partially_loaded_model = Sequential()
# part where you define the structure of the partially loaded model and load your weights
...
# your new model
new_model = Graph()
new_model.add_input(...)
# add your partially loaded model
new_model.add_node(partially_loaded_model)
# add more nodes
new_model.add_node(another_layer)
... |
I am getting an error like this: - |
Hi. I am just getting the same error. |
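I think I didn't explain it carefully enough. Here is what you could do, using the VGG16 example from the Keras blog:
from keras.models import Sequential, Graph
from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D
# the classifier layers and the optimizer used further down also need importing
from keras.layers.core import Flatten, Dense, Dropout
from keras.optimizers import SGD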
import keras.backend as K
img_width, img_height = 128, 128
# build the VGG16 network with our input_img as input
first_layer = ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height))
model = Sequential()
model.add(first_layer)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])
# load the weights
import h5py
weights_path = 'vgg16_weights.h5'
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')
# Here is what you want:
graph_m = Graph()
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height))
graph_m.add_node(model, name='your_model', input='my_inp')
graph_m.add_node(Flatten(), name='Flatten', input='your_model')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense1', input='Flatten')
graph_m.add_node(Dropout(0.5), name='Dropout1', input='Dense1')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense2', input='Dropout1')
graph_m.add_node(Dropout(0.5), name='Dropout2', input='Dense2')
graph_m.add_node(Dense(1000, activation='softmax'), name='Final', input='Dropout2')
graph_m.add_output(name='out1', input='Final')
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
graph_m.compile(optimizer=sgd, loss={'out1': 'categorical_crossentropy'})
So basically, here you load only the weights for the feature extraction part, add the nonlinear structure of the classifier on top, and fine-tune the model. You could also fix the weights of your feature extraction layers using the trainable attribute of the layers. |
@tboquet Thanks for your reply. Your code is clear and easy to understand. But I do not quite understand the last sentence "You could also fix the weight of your feature extraction layer using the trainable attribute of the layers". When we do backprop using How to use the |
From the docs, you just have to add trainable=False when you create the layer:
...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=False))
...
Example of a trainable layer:
...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=True))
...
|
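To make the effect of the trainable flag more concrete, here is a small conceptual sketch in plain Python. It is not Keras code and not how Keras implements it internally: StubLayer and sgd_step are hypothetical names invented for this illustration, and the weights and gradients are toy numbers. The only idea it shows is that an update step skips layers whose trainable flag is False, so their pretrained weights stay fixed while the new layers keep learning.

```python
class StubLayer(object):
    """Hypothetical stand-in for a layer: a name, flat weights, a trainable flag."""

    def __init__(self, name, weights, trainable=True):
        self.name = name
        self.weights = list(weights)
        self.trainable = trainable


def sgd_step(layers, grads, lr=0.5):
    """Apply one plain gradient-descent update, skipping frozen layers."""
    for layer in layers:
        if not layer.trainable:
            # frozen layer: keep the pretrained weights untouched
            continue
        layer.weights = [w - lr * dw
                         for w, dw in zip(layer.weights, grads[layer.name])]


# One frozen feature-extraction layer and one trainable classifier layer.
conv = StubLayer('conv5_1', [1.0, 2.0], trainable=False)
dense = StubLayer('Dense1', [1.0, 2.0], trainable=True)

# Toy gradients; both layers receive one, but only the trainable layer moves.
grads = {'conv5_1': [1.0, 1.0], 'Dense1': [1.0, 1.0]}
sgd_step([conv, dense], grads)

print(conv.weights)   # unchanged
print(dense.weights)  # updated
```

In this toy run the frozen layer keeps [1.0, 2.0] while the trainable one moves to [0.5, 1.5].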
Thanks for the code @tboquet . The error which I get is shown below: - |
You should take a look at the out = graph_m.predict({'my_inp': im}) |
I tried this method. |
That is totally normal: the weights of the last layers are not trained, so they still hold the values from the default initialization method. What are you trying to achieve? Please put a detailed description of your goal.
So you need a dataset (ImageNet or another one) to train the new part of the model. It was just an example of how you could define a |
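A quick way to see which parts of the graph are still at their initial values is to list the nodes that own parameters but never received pretrained weights. The sketch below is plain Python with a hypothetical helper, untrained_nodes; the node names follow the graph defined earlier in this thread, and the parameter counts (two arrays per Dense or convolution layer, weight plus bias) are assumptions made for the illustration.

```python
def untrained_nodes(param_counts, loaded_names):
    """Return nodes that own parameters but received no pretrained weights."""
    loaded = set(loaded_names)
    return [name for name, n_params in param_counts
            if n_params > 0 and name not in loaded]


# (node name, number of weight arrays); counts are assumed, not read from Keras.
param_counts = [
    ('your_model', 26),  # 13 convolution layers, weight + bias each
    ('Flatten', 0),
    ('Dense1', 2),
    ('Dropout1', 0),
    ('Dense2', 2),
    ('Dropout2', 0),
    ('Final', 2),
]

# Only the nested Sequential model got weights from the savefile.
fresh = untrained_nodes(param_counts, ['your_model'])
print(fresh)
```

Every node in that list needs either training data or its own pretrained weights before the predictions mean anything.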
I want to find the class of an image using the predefined weights of the VGG. But, due to random initialization of weights, the output is wrong. |
Why don't you use the same |
because I have to add extra layers of LSTM network which will require inputs from more than 1 layer. So, I have to use the graphical model. |
Ok so just define the full |
thanks. |
I'm not sure the |
The |
Thanks for your time and effort. |
After removing the two lines which were not there in the original network |
I suggest you read the documentation carefully and spend time on the examples so that you can see what the purpose of these lines is. You could take a look at this example. |
A follow-up question: what if the model is not VGG but a non-Sequential model, say AlexNet, where we define the model = Model(input = inputs, output = outputs) and the first layer of the model is Input((3,227,227))? If I use the graph graph_m = Graph(), I get the following error, which indicates that the model and the graph input are not connected. Any idea how to fix this? |
@ouceduxzk @tboquet I'm having the same problem (about Graph being disconnected), did you manage to solve it somehow? |
For anyone who is getting an error with @tboquet's code:
give you the |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed. |
Hi guys,
I am trying to load the saved weights of several layers in a sequential model in HDF5 format to initialize weights of several layers in a graph model. It is easy to do this if both the source and target models are sequential:
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
However, if the model is defined as a graph model, I have no idea how to do this, as the model does not have the attribute model.layers. Can someone share some hints on this issue? Thanks.
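One possible direction, sketched here under stated assumptions, is to replace the positional model.layers[k] lookup with a lookup by node name, since a reply in this thread points at model.nodes. The sketch is plain Python so it runs without Keras or h5py: StubLayer, load_by_name and index_to_name are hypothetical names made up for the illustration, the savefile is a nested dict that only mimics the layer_k / param_p layout, and the counts are stored as ordinary keys rather than in .attrs as in the real file.

```python
class StubLayer(object):
    """Hypothetical stand-in for a Keras layer (only set_weights/get_weights)."""

    def __init__(self):
        self._weights = []

    def set_weights(self, weights):
        self._weights = list(weights)

    def get_weights(self):
        return list(self._weights)


def load_by_name(savefile, nodes, index_to_name):
    """Copy weights from a sequential-style savefile into named graph nodes.

    savefile      -- nested dict mimicking the layer_k / param_p layout
    nodes         -- dict of node name -> layer, standing in for the graph's nodes
    index_to_name -- which savefile layer index feeds which node name
    """
    loaded = []
    for k in range(savefile['nb_layers']):
        name = index_to_name.get(k)
        if name is None:
            # no matching node in the graph: skip this savefile layer
            continue
        g = savefile['layer_{}'.format(k)]
        weights = [g['param_{}'.format(p)] for p in range(g['nb_params'])]
        nodes[name].set_weights(weights)
        loaded.append(name)
    return loaded


# Toy savefile: layer 0 has no parameters (e.g. padding), layers 1-2 are convs.
savefile = {
    'nb_layers': 3,
    'layer_0': {'nb_params': 0},
    'layer_1': {'nb_params': 2, 'param_0': [[0.1, 0.2]], 'param_1': [0.0]},
    'layer_2': {'nb_params': 2, 'param_0': [[0.3, 0.4]], 'param_1': [0.5]},
}
nodes = {'conv1_1': StubLayer(), 'conv1_2': StubLayer()}
loaded = load_by_name(savefile, nodes, {1: 'conv1_1', 2: 'conv1_2'})
print(loaded)
```

The explicit index-to-name mapping is the part that needs care: the order and names of the layers have to match what the savefile was written from, otherwise weights end up in the wrong node or fail on a shape mismatch.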