using pre-trained VGG16 for another classification task #4465

Closed
vabatista opened this Issue Nov 22, 2016 · 25 comments

vabatista commented Nov 22, 2016

How can I use the new keras.applications.VGG16 class to start my training from the weights in the H5 file, but for a new task with only 8 classes?

I can't figure out how to pop the softmax layer and replace it with a new one with only 8 units.

JGuillaumin commented Nov 22, 2016

One way to do this is to not include the fully-connected layers at the top of the network, and then add new fully-connected layers with random initialization and the correct number of outputs.

The convolutional layers will be initialized with the weights from training on the ImageNet dataset. Generally, we can say that the convolutional layers work as feature extractors.

Then you have to train the whole network for your 8 classes. This training will fit the new fully-connected layers and fine-tune the convolutional layers (you can also freeze the convolutional layers, to keep the same feature extractors).

from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense
from keras.models import Model
import numpy as np

#Get back the convolutional part of a VGG network trained on ImageNet
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()

#Create your own input format (here 3x200x200)
input = Input(shape=(3,200,200),name = 'image_input')

#Use the generated model 
output_vgg16_conv = model_vgg16_conv(input)

#Add the fully-connected layers 
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(8, activation='softmax', name='predictions')(x)

#Create your own model 
my_model = Model(input=input, output=x)

#In the summary, the VGG16 layers are collapsed into one entry, but their weights are still trained along with the rest
my_model.summary()


#Then training with your data ! 
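To freeze the convolutional layers as suggested above, set each layer's trainable flag before compiling; a minimal sketch reusing the names from the snippet (the optimizer and loss here are placeholders):

#Freeze the pretrained convolutional base so only the new fully-connected layers are updated
for layer in model_vgg16_conv.layers:
    layer.trainable = False

#Freezing must happen before compile to take effect
my_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])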

JGuillaumin commented Nov 22, 2016

If you want to change only the last layer:

# Generate a model with all layers (with top)
vgg16 = VGG16(weights=None, include_top=True)

#Add a layer whose input is the output of the second-to-last layer
x = Dense(8, activation='softmax', name='predictions')(vgg16.layers[-2].output)

#Then create the corresponding model 
my_model = Model(input=vgg16.input, output=x)
my_model.summary()

vabatista commented Nov 22, 2016

Thank you very much!


vabatista closed this Nov 22, 2016

UkiDLucas commented Feb 6, 2017

Thank you JGuillaumin!


parikshit95 commented Feb 26, 2017

For training do you just use the model.compile() and model.fit(data,labels) commands?
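For reference, a minimal compile-and-fit sketch for the model built above (the optimizer, batch size, and the x_train/y_train names are placeholders):

#Assumes `my_model` from the snippet above, 8 classes, one-hot labels
my_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

#x_train: images matching the model's input shape, y_train: (N, 8) one-hot labels
my_model.fit(x_train, y_train, batch_size=32, epochs=10, validation_split=0.1)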


howardya commented Feb 26, 2017

@JGuillaumin In your reply, when we only want to change the last layer, did you mean

vgg16 = VGG16(weights='imagenet', include_top=True)

instead of

vgg16 = VGG16(weights=None, include_top=True)


JordanPeltier commented Mar 9, 2017

@howardya, actually no.
He wants to use the VGG16 model pretrained on ImageNet, but just remove the last (softmax) layer and use another one for a different classification task.


abdulsalama commented Mar 16, 2017

Just a small correction: it should be shape=(200, 200, 3) in

input = Input(shape=(3,200,200),name = 'image_input')

Otherwise, you will get this error:
ValueError: number of input channels does not match corresponding dimension of filter, 100 != 3
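(Which ordering is expected depends on the backend image dimension ordering; a quick way to check, using the same Keras backend API that appears later in this thread:)

from keras import backend as K

print(K.image_dim_ordering())
#'tf' -> channels last:  use shape=(200, 200, 3)
#'th' -> channels first: use shape=(3, 200, 200)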


LindaSt commented Apr 8, 2017

@JGuillaumin
Thank you very much for your example! I have been trying to implement it on the MNIST dataset. However, the accuracy just keeps fluctuating around 10% during training and validation. I really do not understand what I did wrong. I added my code below.

Thank you very much!

import sys
import numpy as np
import cv2
import sklearn.metrics as sklm

from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense
from keras.models import Model
from keras.datasets import mnist

from keras import backend as K
img_dim_ordering = 'tf'
K.set_image_dim_ordering(img_dim_ordering)

# the model
def pretrained_model(img_shape, num_classes, layer_type):
    model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
    #model_vgg16_conv.summary()
    
    #Create your own input format
    keras_input = Input(shape=img_shape, name = 'image_input')
    
    #Use the generated model 
    output_vgg16_conv = model_vgg16_conv(keras_input)
    
    #Add the fully-connected layers 
    x = Flatten(name='flatten')(output_vgg16_conv)
    x = Dense(4096, activation=layer_type, name='fc1')(x)
    x = Dense(4096, activation=layer_type, name='fc2')(x)
    x = Dense(num_classes, activation='softmax', name='predictions')(x)
    
    #Create your own model 
    pretrained_model = Model(inputs=keras_input, outputs=x)
    pretrained_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    return pretrained_model

# loading the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# converting it to RGB
x_train = [cv2.cvtColor(cv2.resize(i, (32,32)), cv2.COLOR_GRAY2BGR) for i in x_train]
x_train = np.concatenate([arr[np.newaxis] for arr in x_train]).astype('float32')

x_test = [cv2.cvtColor(cv2.resize(i, (32,32)), cv2.COLOR_GRAY2BGR) for i in x_test]
x_test = np.concatenate([arr[np.newaxis] for arr in x_test]).astype('float32')

# training the model
model = pretrained_model(x_train.shape[1:], len(set(y_train)), 'relu')
hist = model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test), verbose=1)

KamalOthman commented Apr 24, 2017

Hi @JGuillaumin and everyone,
I used the pre-trained VGG16 model without its top layers, i.e. the fully-connected layers, and trained my own FC layers on the train & validation datasets, following the example in https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
Everything is working well. Now I have the VGG16 model with its weights and my FC layers with their weights.
For new test data, how can I merge both models in order to predict?

I really appreciate your help, guys

Kamal
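(One common pattern for this, sketched under the assumption that the FC top was trained on the VGG16 bottleneck features as in that blog post; the 150x150 input size, the 256-unit top, and the top_model_weights.h5 path are placeholders:)

from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense, Dropout
from keras.models import Model, Sequential

#Pretrained convolutional base with a fixed input size
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))

#Rebuild the small FC top with the same architecture it was trained with, then load its weights
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
top_model.load_weights('top_model_weights.h5')

#Stack both parts into a single model for end-to-end prediction
full_model = Model(input=base_model.input, output=top_model(base_model.output))
predictions = full_model.predict(new_images)  #new_images: a preprocessed image batch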


chauhan-utk commented May 13, 2017

@KamalOthman I am also trying to do the same thing. Update if you find any solution.


KamalOthman commented May 14, 2017

Hi @chauhan-utk
Yes, I solved it. Please find it here #6408

Cheers


jinkos commented May 23, 2017

Should the input images be normalised in some specific way in order to make best use of the features?
I am normalizing each image by subtracting the mean and dividing by the SD. Seems to work. But is there a better way?
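(For the keras.applications weights, the matching normalization is the preprocess_input function that ships with each model; for VGG16 it converts RGB to BGR and subtracts the ImageNet per-channel means rather than a per-image mean/SD. A minimal sketch, with a placeholder image path:)

import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input

img = image.load_img('some_image.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  #add the batch dimension
x = preprocess_input(x)        #RGB -> BGR, subtract ImageNet channel means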


DanqingZ commented Jun 25, 2017

@LindaSt Is your problem solved?


LindaSt commented Jun 26, 2017

@DanqingZ Yes, I solved my problem by freezing the layers in the base model.

for layer in base_model.layers:
    layer.trainable = False

aniket03 commented Jul 4, 2017

@LindaSt @DanqingZ I am using the conv layers of VGG16 for feature extraction. I tried freezing the layers of VGG16 but accuracy on train and val still remains at 10%. Can you please help?

from __future__ import division

from keras.applications.vgg16 import VGG16
from keras.layers import Input, Flatten, Dense, Dropout
from keras.models import Model, Sequential
from keras.optimizers import SGD
from keras.utils import np_utils

from pandas import read_csv
import pickle

from sklearn.preprocessing import LabelEncoder

# CONSTANTS
DATAPOINTS_TO_CONSIDER = 10000
TRAINING_DATA_PERCENTAGE = 0.8

# load data
data_file = open("cifar10.pkl", "rb")  # pickle files should be opened in binary mode
X = pickle.load(data_file)

X = X[:DATAPOINTS_TO_CONSIDER]
X = X.transpose(0, 3, 1, 2)  # transpose (not reshape) to move the channel axis without scrambling pixels

# Split images into train - validation sets
n_points = X.shape[0]
tr_points = int(TRAINING_DATA_PERCENTAGE * n_points)
X_train = X[0:tr_points]
X_val = X[tr_points:]
del X
data_file.close()

# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype('float32')
X_val = X_val.astype('float32')
X_train = X_train/255.0
X_val = X_val/255.0

# Load labels
dataframe = read_csv('trainLabels.csv')
labels = dataframe.values
Y = labels[:, 1]
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
categorical_Y = np_utils.to_categorical(encoded_Y)

# Split labels into train and validation sets
y_train = categorical_Y[:tr_points]
y_val = categorical_Y[tr_points:DATAPOINTS_TO_CONSIDER]
del labels, Y

num_classes = 10  # Hard coded now will change

# Get back the convolutional part of a VGG network trained on ImageNet
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()

# Make vgg16 model layers as non trainable
for layer in model_vgg16_conv.layers:
    layer.trainable = False

# Create your own input format (here 3x32x32)
input = Input(shape=(3,32,32), name='image_input')

# Use the generated model
output_vgg16_conv = model_vgg16_conv(input)

# Add the flatten layer
x = Flatten(name='flatten')(output_vgg16_conv)

# Create your own model
my_model = Model(input=input, output=x)
my_model.summary()

# Collect features obtained from VGG
features_for_train_data = my_model.predict(X_train)
features_for_test_data = my_model.predict(X_val)

model = Sequential()
model.add(Dense(256, input_shape=features_for_train_data.shape[1:], activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd,
              loss='mse',
              metrics=['accuracy'])

model.fit(features_for_train_data, y_train,
          nb_epoch=80,
          batch_size=32,
          validation_data=(features_for_test_data, y_val))

@LindaSt @DanqingZ ^ sorted now.


liketheflower commented Jul 6, 2017

@LindaSt @DanqingZ @aniket03
Here are more hints.
I upsampled the cifar10 data set from 32*32 to 197*197 and the accuracy is always 0.10.
When I change the size to 64*64, it runs well at the beginning and the accuracy reaches 84.7%, but from epoch 46 on, the result drops back to 0.10.
Something must be wrong. Hope this gives you guys more hints, and let's fix this problem together.

BTW, the applications resnet50 model works well on the cifar10 data set when upsampling the images to 197*197.


liketheflower commented Jul 6, 2017

here is the output:
Epoch 44/200
1562/1562 [==============================] - 358s - loss: 0.4885 - acc: 0.8489 - val_loss: 0.5697 - val_acc: 0.8346
Epoch 45/200
1562/1562 [==============================] - 357s - loss: 0.4232 - acc: 0.8672 - val_loss: 1.2105 - val_acc: 0.6898
Epoch 46/200
1562/1562 [==============================] - 354s - loss: 12.1607 - acc: 0.2163 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 47/200
1562/1562 [==============================] - 353s - loss: 14.5035 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 48/200
1562/1562 [==============================] - 354s - loss: 14.5048 - acc: 0.1001 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 49/200
1562/1562 [==============================] - 355s - loss: 14.5029 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 50/200
1562/1562 [==============================] - 352s - loss: 14.5025 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 51/200
1562/1562 [==============================] - 353s - loss: 14.5154 - acc: 0.0994 - val_loss: 14.5063 - val_acc: 0.1000
Epoch 52/200
1562/1562 [==============================] - 354s - loss: 14.5035 - acc: 0.1002 - val_loss: 14.5063 - val_acc: 0.1000

here is the code:

"""
Adapted from keras example cifar10_cnn.py
Train ResNet-18 on the CIFAR10 small images dataset.

GPU run command with Theano backend (with TensorFlow, the GPU is automatically used):
    THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10.py
"""
from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import np_utils
from keras.callbacks import ReduceLROnPlateau, CSVLogger, EarlyStopping
from scipy.misc import toimage, imresize
import numpy as np
#import resnet
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense
from keras.models import Model
import numpy as np
from keras.callbacks import ModelCheckpoint


from keras import backend as K
#K.set_image_dim_ordering('th')
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)


lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1), cooldown=0, patience=5, min_lr=0.5e-6)
early_stopper = EarlyStopping(min_delta=0.001, patience=20)
csv_logger = CSVLogger('./results/vgg16imagenetpretrained_upsampleimage_cifar10_data_argumentation.csv')

batch_size = 32
nb_classes = 10
nb_epoch = 200
data_augmentation = True

# input image dimensions
img_rows, img_cols = 197, 197
I_R = 64
# The CIFAR10 images are RGB.
img_channels = 3

# The data, shuffled and split between train and test sets:
(X_train_original, y_train), (X_test_original, y_test) = cifar10.load_data()

# Convert class vectors to binary class matrices.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

X_train_original = X_train_original.astype('float32')
X_test_original = X_test_original.astype('float32')


# upsample it to size 64X64X3


X_train = np.zeros((X_train_original.shape[0],I_R,I_R,3))
for i in range(X_train_original.shape[0]):
    X_train[i] = imresize(X_train_original[i], (I_R,I_R,3), interp='bilinear', mode=None)


X_test = np.zeros((X_test_original.shape[0],I_R,I_R,3))
for i in range(X_test_original.shape[0]):
    X_test[i] = imresize(X_test_original[i], (I_R,I_R,3), interp='bilinear', mode=None)


# subtract mean and normalize
mean_image = np.mean(X_train, axis=0)
X_train -= mean_image
X_test -= mean_image
X_train /= 128.
X_test /= 128.


print(X_train.shape)




#model = resnet.ResnetBuilder.build_resnet_18((img_channels, img_rows, img_cols), nb_classes)
#model =get_vgg_pretrained_model()


# Get back the convolutional part of a VGG network trained on ImageNet
# Note: pooling must be the string 'max'; passing the builtin max (as originally written) silently disables pooling
model_vgg16_conv = VGG16(input_shape=(I_R,I_R,3), weights='imagenet', include_top=False, pooling='max')
model_vgg16_conv.summary()

# Create your own input format (here I_R x I_R x 3)
input = Input(shape=(I_R,I_R,3), name='image_input')

# Use the generated model
output_vgg16_conv = model_vgg16_conv(input)

# Add the fully-connected layers
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(512, activation='relu', name='fc1')(x)
x = Dense(128, activation='relu', name='fc2')(x)
x = Dense(10, activation='softmax', name='predictions')(x)

# Create your own model
my_model = Model(input=input, output=x)

# In the summary, the VGG16 layers are collapsed into one entry, but their weights are still trained
my_model.summary()



my_model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# serialize model to JSON
model_json = my_model.to_json()
with open("./results/model_data_argumentation.json", "w") as json_file:
    json_file.write(model_json)


print(my_model.summary())
if not data_augmentation:
    print('Not using data augmentation.')
    # checkpoint
   # filepath="./results/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
   # checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
   # callbacks_list = [checkpoint]
    # Fit the model
    #model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, callbacks=callbacks_list, verbose=0)

    my_model.fit(X_train, Y_train,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(X_test, Y_test),
              shuffle=True,
              callbacks=[lr_reducer, early_stopper, csv_logger])
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(X_train)

    # Fit the model on the batches generated by datagen.flow().
    my_model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                        steps_per_epoch=X_train.shape[0] // batch_size,
                        validation_data=(X_test, Y_test),
                        epochs=nb_epoch, verbose=1, max_q_size=100,
                        callbacks=[lr_reducer, early_stopper, csv_logger])
my_model.save_weights("./results/vgg16_pretrained_upsample_model_data_argumentation.h5")
print("Saved model to disk")



BogoK commented Nov 23, 2017

How do you get all the known classes of the pretrained VGG16 model? Thanks
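(The 'imagenet' weights cover the standard 1000 ImageNet classes, and keras.applications ships a decode_predictions helper that maps output indices to human-readable labels; a minimal sketch, with a zero image as a placeholder input:)

import numpy as np
from keras.applications.vgg16 import VGG16, decode_predictions

model = VGG16(weights='imagenet')   #full model with the 1000-way softmax
dummy = np.zeros((1, 224, 224, 3))  #placeholder input, channels-last
preds = model.predict(dummy)
#decode_predictions maps class indices to (wordnet_id, label, score) tuples
print(decode_predictions(preds, top=5))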


nabsabraham commented Jul 22, 2018

Hi everyone,
I am using the same model posted here by @JGuillaumin; however, the model does not seem to be learning. The model summary shows that all the weights are trainable, so I don't know what I am doing wrong. Can someone point me in the right direction?

This is the output I am seeing:
(training-log screenshot omitted)

Model Summary:

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

This is the code I am using:

batch = 8
epochs = 20
#Get back the convolutional part of a VGG network trained on ImageNet
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()

#Create your own input format (here 256x256x3)
input = Input(shape=(256,256,3),name = 'image_input')

#Use the generated model 
output_vgg16_conv = model_vgg16_conv(input)

#Add the fully-connected layers 
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1, activation='softmax', name='predictions')(x)

#Create your own model 
my_model = Model(inputs=input, outputs=x)

#In the summary, the VGG16 layers are collapsed into one entry, but their weights are still trained
my_model.summary()
my_model.compile(loss = "binary_crossentropy", 
                    optimizer = SGD(lr=1e-5, momentum=0.9), 
                    metrics=["accuracy"])

history = my_model.fit(train_shuffled, labels_shuffled, batch_size=batch, epochs=epochs,
                    verbose=1, validation_split=0.2, shuffle=True)

JGuillaumin commented Jul 23, 2018

Hi,
Did you try higher learning rates (1e-4, 1e-3, 1e-2)? Even with a pre-trained neural network, 1e-5 is very small.
You can test with 2 output neurons and 'categorical_crossentropy' as the loss.

Your final layers are very big!
I recommend starting from a smaller network like this:

model_vgg16_conv = VGG16(weights='imagenet', include_top=False)

input = Input(shape=(256,256,3), name='image_input')

output_vgg16_conv = model_vgg16_conv(input)
# shape [?, 8, 8, 512]

x = GlobalAveragePooling2D()(output_vgg16_conv)  # instantiate the layer, then call it on the tensor
# shape [?, 512]

x = Dense(1, activation='softmax', name='predictions')(x)
# shape [?, 1]

JGuillaumin commented Jul 23, 2018

I think it does not train because you use softmax as the final activation function!
For binary cross-entropy you have to use sigmoid.


nabsabraham commented Jul 24, 2018

Hi Julien,
The learning rates did not change anything, but you are right: changing it to sigmoid made a difference. It converges really quickly (2-3 epochs), but that might be a data problem, as I'm using biomedical images.
Do you know why it only works with sigmoid and not softmax? Any literature you can provide would be great!


JGuillaumin commented Jul 24, 2018

Because softmax is not an element-wise activation function. Here is the formula for x, a vector of dimension K (sorry, there is no MathJax rendering in GitHub issues..):

$$\mathrm{softmax}(x)_i = \frac{\exp(x_i)}{\sum_{j=1}^{K} \exp(x_j)}$$

When you have K=1 (your case), whatever your network outputs, the softmax will be 1!

That's why your loss is constant!
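(A quick numeric check: with a single output, the numerator and denominator are the same term, so the result is always 1.)

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  #shift for numerical stability
    return e / e.sum()

print(softmax(np.array([3.7])))    #[1.] -- a 1-unit softmax is always 1
print(softmax(np.array([-50.0])))  #[1.] -- regardless of the logit value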


nabsabraham commented Jul 24, 2018

awesome thank you for this! 👍

