
using pre trained VGG16 for another classification task #4465

Closed
vabatista opened this issue Nov 22, 2016 · 38 comments

Comments

@vabatista
How can I use the new keras.applications.VGG16 class to start my training from the weights in the H5 file, but for a new task with only 8 classes?

I didn't figure out how to pop the softmax layer and put another one with only 8 perceptrons in its place.

@JGuillaumin

JGuillaumin commented Nov 22, 2016

One way to do this is to not include the fully-connected layers at the top of the network, then add new fully-connected layers with random initialization and the correct number of outputs.

The convolutional layers will be initialized with weights from training on the ImageNet dataset. Generally, we can say that the convolutional layers work as feature extractors.

You then train the whole network for your 8 classes: this trains the new fully-connected layers from scratch and fine-tunes the convolutional layers. (You can also freeze the convolutional layers to keep the same feature extractors; see the sketch after the code below.)

from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense
from keras.models import Model
import numpy as np

#Get back the convolutional part of a VGG network trained on ImageNet
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()

#Create your own input format (here 3x200x200)
input = Input(shape=(3,200,200),name = 'image_input')

#Use the generated model 
output_vgg16_conv = model_vgg16_conv(input)

#Add the fully-connected layers 
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(8, activation='softmax', name='predictions')(x)

#Create your own model 
my_model = Model(input=input, output=x)

#In the summary, the layers from the VGG part appear as a single model layer, but their weights will still be updated during training
my_model.summary()


#Then training with your data ! 
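
For the freezing variant mentioned above, a minimal sketch (X_train / y_train are placeholders for your own data, and depending on your Keras version the argument may be nb_epoch instead of epochs):

# Freeze the convolutional part so only the new fully-connected layers are trained
for layer in model_vgg16_conv.layers:
    layer.trainable = False

# Changes to `trainable` take effect at compile time, so compile after freezing
my_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
my_model.fit(X_train, y_train, batch_size=32, epochs=10, validation_split=0.1)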

@JGuillaumin

JGuillaumin commented Nov 22, 2016

If you want to change only the last layer:

# Generate a model with all layers (with top)
vgg16 = VGG16(weights=None, include_top=True)

#Add a layer where input is the output of the  second last layer 
x = Dense(8, activation='softmax', name='predictions')(vgg16.layers[-2].output)

#Then create the corresponding model 
my_model = Model(input=vgg16.input, output=x)
my_model.summary()

@vabatista
Author

Thank you very much!

@UkiDLucas

Thank you JGuillaumin!

@parikshit95

For training do you just use the model.compile() and model.fit(data,labels) commands?

@howardya

howardya commented Feb 26, 2017

@JGuillaumin In your reply, when we only want to change the last layer, did you mean

vgg16 = VGG16(weights='imagenet', include_top=True)

instead of

vgg16 = VGG16(weights=None, include_top=True)

@JordanPeltier

@howardya, actually no.
He wants to use the VGG16 model pretrained on ImageNet, but just remove the last (softmax) layer and use another one for a different classification task.
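
A minimal sketch of that (assuming the pretrained ImageNet weights are what's wanted here, which is exactly the point under discussion):

from keras.applications.vgg16 import VGG16
from keras.layers import Dense
from keras.models import Model

# Full VGG16 with ImageNet weights; swap only the final 1000-way softmax for an 8-way one
vgg16 = VGG16(weights='imagenet', include_top=True)
x = Dense(8, activation='softmax', name='predictions')(vgg16.layers[-2].output)
my_model = Model(inputs=vgg16.input, outputs=x)
my_model.summary()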

@abdulsalama

Just a small correction: it should be shape=(200, 200, 3) in
input = Input(shape=(3,200,200),name = 'image_input')

Otherwise, you will get this error:
ValueError: number of input channels does not match corresponding dimension of filter, 100 != 3
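
A quick way to check which ordering your backend expects (a small sketch; in older Keras versions the call was K.image_dim_ordering() instead):

from keras import backend as K

# 'channels_last' (TensorFlow default) expects (200, 200, 3);
# 'channels_first' (old Theano default) expects (3, 200, 200)
print(K.image_data_format())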

@LindaSt

LindaSt commented Apr 8, 2017

@JGuillaumin
Thank you very much for your example! I have been trying to apply it to the MNIST dataset. However, the accuracy just keeps fluctuating around 10% during training and validation. I really do not understand what I did wrong. I added my code below.

Thank you very much!

import sys
import numpy as np
import cv2
import sklearn.metrics as sklm

from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense
from keras.models import Model
from keras.datasets import mnist

from keras import backend as K
img_dim_ordering = 'tf'
K.set_image_dim_ordering(img_dim_ordering)

# the model
def pretrained_model(img_shape, num_classes, layer_type):
    model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
    #model_vgg16_conv.summary()
    
    #Create your own input format
    keras_input = Input(shape=img_shape, name = 'image_input')
    
    #Use the generated model 
    output_vgg16_conv = model_vgg16_conv(keras_input)
    
    #Add the fully-connected layers 
    x = Flatten(name='flatten')(output_vgg16_conv)
    x = Dense(4096, activation=layer_type, name='fc1')(x)
    x = Dense(4096, activation=layer_type, name='fc2')(x)
    x = Dense(num_classes, activation='softmax', name='predictions')(x)
    
    #Create your own model 
    pretrained_model = Model(inputs=keras_input, outputs=x)
    pretrained_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    return pretrained_model

# loading the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# converting it to RGB
x_train = [cv2.cvtColor(cv2.resize(i, (32,32)), cv2.COLOR_GRAY2BGR) for i in x_train]
x_train = np.concatenate([arr[np.newaxis] for arr in x_train]).astype('float32')

x_test = [cv2.cvtColor(cv2.resize(i, (32,32)), cv2.COLOR_GRAY2BGR) for i in x_test]
x_test = np.concatenate([arr[np.newaxis] for arr in x_test]).astype('float32')

# training the model
model = pretrained_model(x_train.shape[1:], len(set(y_train)), 'relu')
hist = model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test), verbose=1)

@KamalOthman

Hi @JGuillaumin and everyone,
I used the pre-trained VGG16 model without its top (fully-connected) layers and trained my own FC layers on the train & validation datasets, following the example in https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
Everything is working well. Now I have the VGG16 model with its weights and my FC layers with their weights.
How can I merge both models in order to predict on new test data?

I really appreciate your help, guys.

Kamal

@chauhan-utk

@KamalOthman I am also trying to do the same thing. Update if you find any solution.

@KamalOthman

Hi @chauhan-utk
Yes, I solved it. Please find it here #6408

Cheers

@jinkos

jinkos commented May 23, 2017

Should the input images be normalized in some specific way to make the best use of the pretrained features?
I am normalizing each image by subtracting the mean and dividing by the standard deviation. It seems to work, but is there a better way?

@DanqingZ

@LindaSt Is your problem solved?

@LindaSt

LindaSt commented Jun 26, 2017

@DanqingZ Yes, I solved my problem by freezing the layers in the base model.

for layer in base_model.layers:
    layer.trainable = False

(Note: changes to layer.trainable only take effect the next time the model is compiled, so freeze the layers before calling model.compile().)

@aniket03

aniket03 commented Jul 4, 2017

@LindaSt @DanqingZ I am using the conv layers of VGG16 for feature extraction. I tried freezing the layers of VGG16, but accuracy on train and val still remains at 10%. Can you please help?

from __future__ import division

from keras.applications.vgg16 import VGG16
from keras.layers import Input, Flatten, Dense, Dropout
from keras.models import Model, Sequential
from keras.optimizers import SGD
from keras.utils import np_utils

from pandas import read_csv
import pickle

from sklearn.preprocessing import LabelEncoder

# CONSTANTS
DATAPOINTS_TO_CONSIDER = 10000
TRAINING_DATA_PERCENTAGE = 0.8

# load data
data_file = open("cifar10.pkl")
X = pickle.load(data_file)

X = X[:DATAPOINTS_TO_CONSIDER]
X = X.transpose(0, 3, 1, 2)  # move channels first; transpose (not reshape) preserves the pixel layout

# Split images into train - validation sets
n_points = X.shape[0]
tr_points = int(TRAINING_DATA_PERCENTAGE * n_points)
X_train = X[0:tr_points]
X_val = X[tr_points:]
del X
data_file.close()

# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype('float32')
X_val = X_val.astype('float32')
X_train = X_train/255.0
X_val = X_val/255.0

# Load labels
dataframe = read_csv('trainLabels.csv')
labels = dataframe.values
Y = labels[:, 1]
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
categorical_Y = np_utils.to_categorical(encoded_Y)

# Split labels into train and validation sets
y_train = categorical_Y[:tr_points]
y_val = categorical_Y[tr_points:DATAPOINTS_TO_CONSIDER]
del labels, Y

num_classes = 10  # Hard coded now will change

# Get back the convolutional part of a VGG network trained on ImageNet
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()

# Make vgg16 model layers as non trainable
for layer in model_vgg16_conv.layers:
    layer.trainable = False

# Create your own input format (here 3x32x32)
input = Input(shape=(3,32,32), name='image_input')

# Use the generated model
output_vgg16_conv = model_vgg16_conv(input)

# Add the flatten layer
x = Flatten(name='flatten')(output_vgg16_conv)

# Create your own model
my_model = Model(input=input, output=x)
my_model.summary()

# Collect features obtained from VGG
features_for_train_data = my_model.predict(X_train)
features_for_test_data = my_model.predict(X_val)

model = Sequential()
model.add(Dense(256, input_shape=features_for_train_data.shape[1:], activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd,
              loss='mse',
              metrics=['accuracy'])

model.fit(features_for_train_data, y_train,
          nb_epoch=80,
          batch_size=32,
          validation_data=(features_for_test_data, y_val))

@LindaSt @DanqingZ ^ sorted now.

@liketheflower

@LindaSt @DanqingZ @aniket03
Here are more hints.
I upsampled the CIFAR-10 dataset from 32*32 to 197*197 and the accuracy is always 0.10.
When I changed the size to 64*64, it ran well at the beginning and the accuracy reached 84.7%, but from epoch 46 on, the accuracy dropped back to 0.10.
Something must be wrong. Hope this gives you guys more hints; let's fix this problem together.

BTW, the applications ResNet50 model works well on the CIFAR-10 dataset when the images are upsampled to 197*197.

@liketheflower

liketheflower commented Jul 6, 2017

here is the output:
Epoch 44/200
1562/1562 [==============================] - 358s - loss: 0.4885 - acc: 0.8489 - val_loss: 0.5697 - val_acc: 0.8346
Epoch 45/200
1562/1562 [==============================] - 357s - loss: 0.4232 - acc: 0.8672 - val_loss: 1.2105 - val_acc: 0.6898
Epoch 46/200
1562/1562 [==============================] - 354s - loss: 12.1607 - acc: 0.2163 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 47/200
1562/1562 [==============================] - 353s - loss: 14.5035 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 48/200
1562/1562 [==============================] - 354s - loss: 14.5048 - acc: 0.1001 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 49/200
1562/1562 [==============================] - 355s - loss: 14.5029 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 50/200
1562/1562 [==============================] - 352s - loss: 14.5025 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 51/200
1562/1562 [==============================] - 353s - loss: 14.5154 - acc: 0.0994 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 52/200
1562/1562 [==============================] - 354s - loss: 14.5035 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000

here is the code:

"""
Adapted from keras example cifar10_cnn.py
Train ResNet-18 on the CIFAR10 small images dataset.

GPU run command with Theano backend (with TensorFlow, the GPU is automatically used):
    THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10.py
"""
from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import np_utils
from keras.callbacks import ReduceLROnPlateau, CSVLogger, EarlyStopping
from scipy.misc import imresize
import numpy as np
#import resnet
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint


from keras import backend as K
#K.set_image_dim_ordering('th')
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)


lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1), cooldown=0, patience=5, min_lr=0.5e-6)
early_stopper = EarlyStopping(min_delta=0.001, patience=20)
csv_logger = CSVLogger('./results/vgg16imagenetpretrained_upsampleimage_cifar10_data_augmentation.csv')

batch_size = 32
nb_classes = 10
nb_epoch = 200
data_augmentation = True

# input image dimensions
img_rows, img_cols = 197, 197
I_R = 64
# The CIFAR10 images are RGB.
img_channels = 3

# The data, shuffled and split between train and test sets:
(X_train_original, y_train), (X_test_original, y_test) = cifar10.load_data()

# Convert class vectors to binary class matrices.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

X_train_original = X_train_original.astype('float32')
X_test_original = X_test_original.astype('float32')


# upsample it to size 64X64X3


X_train = np.zeros((X_train_original.shape[0],I_R,I_R,3))
for i in range(X_train_original.shape[0]):
    X_train[i] = imresize(X_train_original[i], (I_R,I_R,3), interp='bilinear', mode=None)


X_test = np.zeros((X_test_original.shape[0],I_R,I_R,3))
for i in range(X_test_original.shape[0]):
    X_test[i] = imresize(X_test_original[i], (I_R,I_R,3), interp='bilinear', mode=None)


# subtract mean and normalize
mean_image = np.mean(X_train, axis=0)
X_train -= mean_image
X_test -= mean_image
X_train /= 128.
X_test /= 128.


print(X_train.shape)




#model = resnet.ResnetBuilder.build_resnet_18((img_channels, img_rows, img_cols), nb_classes)
#model =get_vgg_pretrained_model()


# Get back the convolutional part of a VGG network trained on ImageNet
# (the original code passed pooling=max, i.e. the Python builtin, by mistake; None keeps the 4D output for Flatten)
model_vgg16_conv = VGG16(input_shape=(I_R,I_R,3), weights='imagenet', include_top=False, pooling=None)
model_vgg16_conv.summary()

# Create your own input format (here 64x64x3)
input = Input(shape=(I_R,I_R,3), name='image_input')

# Use the generated model
output_vgg16_conv = model_vgg16_conv(input)

# Add the fully-connected layers
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(512, activation='relu', name='fc1')(x)
x = Dense(128, activation='relu', name='fc2')(x)
x = Dense(10, activation='softmax', name='predictions')(x)

# Create your own model
my_model = Model(input=input, output=x)

# In the summary, the VGG part appears as a single layer, but its weights will still be updated during training
my_model.summary()



my_model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# serialize model to JSON
model_json = my_model.to_json()
with open("./results/model_data_argumentation.json", "w") as json_file:
    json_file.write(model_json)


print(my_model.summary())
if not data_augmentation:
    print('Not using data augmentation.')
    # checkpoint
   # filepath="./results/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
   # checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
   # callbacks_list = [checkpoint]
    # Fit the model
    #model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, callbacks=callbacks_list, verbose=0)

    my_model.fit(X_train, Y_train,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(X_test, Y_test),
              shuffle=True,
              callbacks=[lr_reducer, early_stopper, csv_logger])
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(X_train)

    # Fit the model on the batches generated by datagen.flow().
    my_model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                        steps_per_epoch=X_train.shape[0] // batch_size,
                        validation_data=(X_test, Y_test),
                        epochs=nb_epoch, verbose=1, max_q_size=100,
                        callbacks=[lr_reducer, early_stopper, csv_logger])
my_model.save_weights("./results/vgg16_pretrained_upsample_model_data_augmentation.h5")
print("Saved model to disk")

@BogoK

BogoK commented Nov 23, 2017

How do you get all the known classes of the pretrained VGG16 model? Thanks

@nabsabraham

Hi everyone,
I am using the same model posted here by @JGuillaumin, however the model does not seem to be learning. The model summary shows that all the weights are trainable, so I don't know what I am doing wrong. Can someone point me in the right direction?

This is the output I am seeing (training-log screenshot not reproduced here).

Model Summary:

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

This is the code I am using:

from keras.applications.vgg16 import VGG16
from keras.layers import Input, Flatten, Dense, Dropout
from keras.models import Model
from keras.optimizers import SGD

batch = 8
epochs = 20
#Get back the convolutional part of a VGG network trained on ImageNet
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()

#Create your own input format (here 256x256x3)
input = Input(shape=(256,256,3), name='image_input')

#Use the generated model
output_vgg16_conv = model_vgg16_conv(input)

#Add the fully-connected layers
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1, activation='softmax', name='predictions')(x)

#Create your own model
my_model = Model(inputs=input, outputs=x)

#In the summary, weights and layers from the VGG part will be hidden, but they will be updated during training
my_model.summary()
my_model.compile(loss="binary_crossentropy",
                 optimizer=SGD(lr=1e-5, momentum=0.9),
                 metrics=["accuracy"])

# train_shuffled / labels_shuffled: your own data arrays
history = my_model.fit(train_shuffled, labels_shuffled, batch_size=batch, epochs=epochs,
                       verbose=1, validation_split=0.2, shuffle=True)

@JGuillaumin

JGuillaumin commented Jul 23, 2018

Hi,
Did you try a higher learning rate (1e-4, 1e-3, 1e-2)? Even with a pre-trained neural network, 1e-5 is very small.
You could also test with 2 output neurons and 'categorical_crossentropy' as the loss.

Your final layers are very big!
I recommend starting from a smaller network like this:

from keras.layers import GlobalAveragePooling2D

model_vgg16_conv = VGG16(weights='imagenet', include_top=False)

input = Input(shape=(256,256,3), name='image_input')

output_vgg16_conv = model_vgg16_conv(input)
# shape [?, 8, 8, 512]

x = GlobalAveragePooling2D()(output_vgg16_conv)
# shape [?, 512]

x = Dense(1, activation='softmax', name='predictions')(x)
# shape [?, 1]

@JGuillaumin

I think it does not train because you use softmax as the final activation function!
For binary cross-entropy you have to use sigmoid.
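
A minimal sketch of the corrected head, reusing x, input, and the SGD import from the code above:

# Single binary output: sigmoid activation paired with binary cross-entropy
x = Dense(1, activation='sigmoid', name='predictions')(x)

my_model = Model(inputs=input, outputs=x)
my_model.compile(loss='binary_crossentropy',
                 optimizer=SGD(lr=1e-4, momentum=0.9),  # higher LR, as suggested above
                 metrics=['accuracy'])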

@nabsabraham

nabsabraham commented Jul 24, 2018

Hi Julien,
The learning rates did not change anything, but you are right: changing it to sigmoid made a difference. It converges really quickly (2-3 epochs), but this might be a data problem, as I'm using biomedical images.
Do you know why it only works with sigmoid and not softmax? Any literature you can provide would be great!

@JGuillaumin

Because softmax is not an element-wise activation function. Here is the formula for a vector x of dimension K (sorry, there is no MathJax rendering in GitHub issues..):

$$\mathrm{softmax}(x)_i = \frac{\exp(x_i)}{\sum_{j=1}^{K} \exp(x_j)}$$

When you have K=1 (your case), whatever your network outputs, the softmax will be 1!

That's why your loss is constant!
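
You can verify this in a few lines of plain NumPy:

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([-5.0])))  # [1.] -- a one-element softmax is always 1
print(softmax(np.array([42.0])))  # [1.] -- regardless of what the network outputs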

@nabsabraham

awesome thank you for this! 👍

@dgtlmoon

dgtlmoon commented Nov 5, 2018

Sorry for the bump, but would retraining those layers as mentioned herein improve the results for my domain-specific feature comparison (aka similar-image finding)? I'm using the cosine distance between features (the 4096-element activations of the last fully-connected layer fc2 in VGG16) to compare image "similarity" across an image set.

I'm comparing a lot of images which are mostly related, but I suspect I could get better results by training on my own huge dataset (250,000 images, easily divided into 6 categories).

i.e.

feat_extractor = Model(inputs=model.input, outputs=model.get_layer("fc2").output)
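
For context, a minimal sketch of that fc2-based similarity setup (img1 and img2 are hypothetical batches of shape (1, 224, 224, 3), already run through preprocess_input):

import numpy as np
from keras.applications.vgg16 import VGG16
from keras.models import Model

model = VGG16(weights='imagenet', include_top=True)
feat_extractor = Model(inputs=model.input, outputs=model.get_layer("fc2").output)

def cosine_similarity(a, b):
    # cosine of the angle between two feature vectors; 1.0 means identical direction
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

f1 = feat_extractor.predict(img1)[0]  # img1, img2: hypothetical preprocessed inputs
f2 = feat_extractor.predict(img2)[0]
print(cosine_similarity(f1, f2))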

@gledsonmelotti

@JGuillaumin How do I train a model whose input images have 5 channels, using the pre-trained VGG16?

@gledsonmelotti

Just a small correction: it should be shape=(200, 200, 3) in
input = Input(shape=(3,200,200),name = 'image_input')

Otherwise, you will get this error:
ValueError: number of input channels does not match corresponding dimension of filter, 100 != 3

How do I train a model whose input images have 5 channels, using the pre-trained VGG16?

@xuzhengxiao

Just a small correction: it should be shape=(200, 200, 3) in
input = Input(shape=(3,200,200),name = 'image_input')
Otherwise, you will get this error:
ValueError: number of input channels does not match corresponding dimension of filter, 100 != 3

How do I train a model whose input images have 5 channels, using the pre-trained VGG16?

The key lies in the by_name parameter of the Keras load_weights API. If by_name is True, weights are loaded into layers only if they share the same name. This is useful for fine-tuning or transfer learning, where some of the layers have changed.

This is my model with 4 channels; you can do something like this.

import h5py
import numpy as np
from keras.applications.densenet import DenseNet121
from keras.layers import Input, GlobalAveragePooling2D, BatchNormalization, Dense
from keras.models import Model

input_shape = (256,256,4)
input_tensor = Input(input_shape)
base_model_i4 = DenseNet121(include_top=False, weights=None, input_tensor=input_tensor)
######### this trick !
# rename the first conv layer so load_weights(by_name=True) skips it
base_model_i4.layers[2].name = 'new_conv1/conv'
weights_path = 'densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5'
base_model_i4.load_weights(weights_path, by_name=True)

####### manually get the first conv layer weights (for 3 channels), and compute the 4th-channel weights
f = h5py.File(weights_path, 'r')
weight = f['conv1']['conv']['conv1']['conv']['kernel:0'][()]
weights = []
# initialize the 4th channel as the average of the 1st and 3rd channels
weights.append(np.concatenate([weight, 0.5 * (weight[:, :, :1, :] + weight[:, :, 2:, :])], axis=2))
base_model_i4.layers[2].set_weights(weights)

x = base_model_i4.output
x = GlobalAveragePooling2D()(x)
x = BatchNormalization(axis=-1)(x)
x = Dense(1024, activation='relu')(x)
x = Dense(28)(x)
model = Model(inputs=base_model_i4.input, outputs=x)

@gledsonmelotti

gledsonmelotti commented Dec 27, 2018

(quoting @xuzhengxiao's 4-channel DenseNet121 code above)

Hi @xuzhengxiao, how are you? I am new to the area of neural networks and Python. I'm studying alone, so I have some doubts. Your idea seems very interesting, but I did not understand some commands. Why did you use base_model_i4.layers[2].name = 'new_conv1/conv'? What does 'new_conv1/conv' mean? How did you know it was layers[2] whose name should be replaced? In the command f['conv1']['conv']['conv1']['conv']['kernel:0'].value, how did you discover this conv1/conv sequence? And what does ['kernel:0'] mean?

I thank you for your attention,
Gledson.

@xuzhengxiao

@gledsonmelotti

1. 'new_conv1/conv' is just a new layer name; you can also use other names. As I mentioned before, in Keras you can change a layer's name to decide which layer's weights are not loaded.

2. Which layer should be changed depends on your needs and the pre-trained model. In my case, I wanted to substitute the input layer with a 4-channel one. Use model.summary() to check the pre-trained model structure (see this link: https://user-images.githubusercontent.com/9840937/50503586-e84f4680-0aa2-11e9-8648-a54cce1406e2.png). In the pre-trained model the input layer has 3 channels, so the in-channel count of the first conv layer is 3, which doesn't meet my custom need (4 in-channels). So model.layers[2] has to be changed.

3. For f['conv1']['conv']['conv1']['conv']['kernel:0'], you actually need to inspect the h5 file to find the weights you need, as sketched below.
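
For point 3, a small sketch of how to inspect the h5 file (assuming the weights file has been downloaded locally):

import h5py

with h5py.File('densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5', 'r') as f:
    # walk every group/dataset in the file, printing its path (and shape for datasets)
    f.visititems(lambda name, obj: print(name, getattr(obj, 'shape', '')))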

@pappuyadav

@nabsabraham
Try this modified section of code and see if it helps.

#Use the generated model (changed: take the layer output directly instead of calling the model on an input)
x = model_vgg16_conv.get_layer('block5_conv3').output

#Add the fully-connected layers
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1, activation='sigmoid', name='predictions')(x)  # sigmoid for a single binary output, per the discussion above

@sumaira-hussain

Hi, I am also trying to create my model from the pretrained VGG16 model, but I am getting the following error:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[1., 0., 0., ..., 0., 0., 0.],

the code files are attached for reference
VGG.docx
generator.docx

@rushabhpatil

Just a small correction: it should be shape=(200, 200, 3) in
input = Input(shape=(3,200,200),name = 'image_input')

Otherwise, you will get this error:
ValueError: number of input channels does not match corresponding dimension of filter, 100 != 3

If my input uses "channels_first", what should the changes be?

@gledsonmelotti

(quoting @xuzhengxiao's explanation above)

Thank you very much, @xuzhengxiao. I really liked your explanation.

@felipe-chamas

felipe-chamas commented Jan 29, 2020

(quoting @xuzhengxiao's 4-channel DenseNet121 code above)

Worked very well. I used the same model as @xuzhengxiao and had the same need of adding a 4th channel. But if anyone tries this, remember to set the layer name back afterwards, so you can save/load the model without any problems.

model.layers[2]._name='conv1/conv'

@MEhtisham8217

@DanqingZ help me out please.
Use the same dataset (LFW) ==> http://vis-www.cs.umass.edu/lfw/lfw-funneled.tgz
Apply a pretrained convolutional neural network (VGG16) and replace the fully-connected layers with your own. Freeze the weights of the convolutional layers and only train the new FC layers.
Sample code for using pre-trained VGG16 for another classification task is available from:
#4465

@H-Ayoobi

(quoting @liketheflower's training log and CIFAR-10 VGG16 code from above)

I had the same issue with VGG16; it can be solved by adding BatchNormalization layers between the dense layers.
With a ResNet backbone I did not have the same issue, but you can still use the BatchNormalization layers.
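
A minimal sketch of that fix applied to the head from the quoted code (output_vgg16_conv comes from that code; where exactly to place BatchNormalization is one common choice, not the only one):

from keras.layers import Flatten, Dense, BatchNormalization

x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(512, activation='relu', name='fc1')(x)
x = BatchNormalization()(x)  # stabilizes the dense-layer activations
x = Dense(128, activation='relu', name='fc2')(x)
x = BatchNormalization()(x)
x = Dense(10, activation='softmax', name='predictions')(x)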
