how to use fit_generator with multiple image inputs #8130
https://github.com/fchollet/keras/issues/3386
Thanks. Here is how I solved it, following some ideas from issue 3386; maybe someone will find it useful:

```python
from keras.preprocessing.image import ImageDataGenerator

input_imgen = ImageDataGenerator(rescale=1. / 255,
                                 shear_range=0.2,
                                 zoom_range=0.2,
                                 rotation_range=5.,
                                 horizontal_flip=True)

test_imgen = ImageDataGenerator(rescale=1. / 255)


def generate_generator_multiple(generator, dir1, dir2, batch_size, img_height, img_width):
    genX1 = generator.flow_from_directory(dir1,
                                          target_size=(img_height, img_width),
                                          class_mode='categorical',
                                          batch_size=batch_size,
                                          shuffle=False,
                                          seed=7)
    genX2 = generator.flow_from_directory(dir2,
                                          target_size=(img_height, img_width),
                                          class_mode='categorical',
                                          batch_size=batch_size,
                                          shuffle=False,
                                          seed=7)
    while True:
        X1i = next(genX1)
        X2i = next(genX2)
        yield [X1i[0], X2i[0]], X2i[1]  # yield both images and their mutual label


inputgenerator = generate_generator_multiple(generator=input_imgen,
                                             dir1=train_dir_1,
                                             dir2=train_dir_2,
                                             batch_size=batch_size,
                                             img_height=img_height,
                                             img_width=img_width)

testgenerator = generate_generator_multiple(test_imgen,
                                            dir1=train_dir_1,
                                            dir2=train_dir_2,
                                            batch_size=batch_size,
                                            img_height=img_height,
                                            img_width=img_width)

history = model.fit_generator(inputgenerator,
                              steps_per_epoch=trainsetsize // batch_size,
                              epochs=epochs,
                              validation_data=testgenerator,
                              validation_steps=testsetsize // batch_size,
                              use_multiprocessing=True,
                              shuffle=False)
```
That was great. How would you find the class_indices for the inputgenerator and testgenerator? I'm not able to use the syntax `inputgenerator.class_indices()` here, since the generator now works with multiple inputs.
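One way around this (sketched below; `make_multi_generator` is a hypothetical helper name, not part of Keras) is to keep handles on the inner iterators returned by `flow_from_directory`, which carry `class_indices` as a plain dict attribute, not a method:

```python
def make_multi_generator(genX1, genX2):
    """Pair two directory iterators and surface their class_indices.

    genX1/genX2 are objects returned by flow_from_directory; each has a
    .class_indices attribute mapping class name -> label index.
    """
    def multi():
        while True:
            X1, y1 = next(genX1)
            X2, y2 = next(genX2)
            yield [X1, X2], y1  # both images share the same labels
    return multi(), genX1.class_indices  # attribute access, no call
```

Usage would look like `inputgenerator, class_indices = make_multi_generator(genX1, genX2)`, so the mapping stays reachable even though the wrapping generator itself does not expose it.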
@aendrs I would like to ask how to give the two inputs the same preprocessing every time. In generate_generator_multiple, input_imgen is called twice, and each time randomly selected preprocessing steps are applied to the input.
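One way to guarantee identical augmentation for both inputs is to draw the random transform parameters once and apply them to both images. Recent Keras versions expose `get_random_transform` and `apply_transform` on `ImageDataGenerator` for exactly this; the helper below is only a sketch of that idea (the function name is mine), and works with any object providing those two methods:

```python
def paired_random_transform(datagen, img_a, img_b):
    """Apply one randomly drawn augmentation to both images of a pair."""
    params = datagen.get_random_transform(img_a.shape)  # draw parameters once
    return (datagen.apply_transform(img_a, params),
            datagen.apply_transform(img_b, params))    # reuse them for both
```

Because the parameter dict is drawn a single time, both images receive exactly the same rotation, shear, zoom, and flips.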
@aendrs @raaju-shiv @wangqianwen0418 @laukun @bmabey Could any of you kindly help me solve problem #10499? I tried implementing the same generator as in this post, but I don't seem to figure out where my mistake is. Any help is very much appreciated.
@laukun I would suggest writing your own data generator, as in this example:

```python
import numpy as np
import skimage.io
import skimage.transform
import keras


class DataGenerator(keras.utils.Sequence):
    """Generates data for Keras."""

    def __init__(self, img_files, clinical_info, labels, ave=None, std=None,
                 batch_size=32, dim=(300, 300), n_channels=3,
                 n_classes=2, shuffle=True):
        """Initialization.

        Args:
            img_files: A list of paths to image files.
            clinical_info: A dictionary of corresponding clinical variables.
            labels: A dictionary of corresponding labels.
        """
        self.img_files = img_files
        self.clinical_info = clinical_info
        self.labels = labels
        self.batch_size = batch_size
        self.dim = dim
        # Default to identity normalization (mean 0, std 1) per channel.
        self.ave = np.zeros(n_channels) if ave is None else ave
        self.std = np.ones(n_channels) if std is None else std
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        """Denotes the number of batches per epoch."""
        return int(np.floor(len(self.img_files) / self.batch_size))

    def __getitem__(self, index):
        """Generate one batch of data."""
        # Generate the indexes of the batch
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]
        # Find the corresponding image files
        img_files_temp = [self.img_files[k] for k in indexes]
        # Generate data
        X, y = self.__data_generation(img_files_temp)
        return X, y

    def on_epoch_end(self):
        """Updates indexes after each epoch."""
        self.indexes = np.arange(len(self.img_files))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, img_files_temp):
        """Generates data containing batch_size samples."""
        X_img = []        # (n_samples, *dim, n_channels)
        X_clinical = []   # (n_samples, n_clinical_variables)
        y = np.empty((self.batch_size,), dtype=int)
        for i, img_file in enumerate(img_files_temp):
            # Read and resize the image
            img = skimage.io.imread(img_file)
            img = skimage.transform.resize(img, output_shape=self.dim,
                                           mode='constant', preserve_range=True)
            # Per-channel normalization
            for ch in range(self.n_channels):
                img[:, :, ch] = (img[:, :, ch] - self.ave[ch]) / self.std[ch]
            if self.shuffle:
                ##### You can put your augmentation/preprocessing code here. #####
                pass
            X_img.append(img)
            X_clinical.append(self.clinical_info[img_file])
            y[i] = self.labels[img_file]
        X = [np.array(X_img), np.array(X_clinical)]
        return X, keras.utils.to_categorical(y, num_classes=self.n_classes)
```

And to call the generator:
The different thing I'm doing here is using two different kinds of input, an image and a NumPy array. I saved the image paths as a list and created two dictionaries whose keys are the image paths. When you want to do the preprocessing, you can easily apply it right after reading the image from its path. This way, though, the speed will definitely be slower than loading data directly from .npy files.
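The call itself was omitted above; it would presumably look something like `model.fit_generator(DataGenerator(train_files, clinical_info, labels), ...)`. To make the batching mechanics concrete without needing Keras installed, here is a minimal stand-in with the same `__len__`/`__getitem__` protocol that `fit_generator` drives (all class names and shapes here are illustrative):

```python
import numpy as np

class PairSequence:  # would subclass keras.utils.Sequence in real code
    """Yields ([image_batch, clinical_batch], label_batch) per index."""
    def __init__(self, n_samples, batch_size):
        self.n, self.bs = n_samples, batch_size
    def __len__(self):
        return self.n // self.bs  # number of batches per epoch
    def __getitem__(self, idx):
        X_img = np.zeros((self.bs, 300, 300, 3))   # image input
        X_clin = np.zeros((self.bs, 5))            # clinical-variable input
        y = np.zeros((self.bs, 2))                 # one-hot labels
        return [X_img, X_clin], y

seq = PairSequence(n_samples=100, batch_size=32)
# Each epoch, Keras consumes the Sequence roughly like this:
batches = [seq[i] for i in range(len(seq))]
# The real call would be something like:
# model.fit_generator(seq, epochs=10, use_multiprocessing=True, workers=4)
```

Note that a `Sequence` can be pickled for `use_multiprocessing=True`, unlike a plain Python generator.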
I'm trying to train a two-stream CNN architecture, but unfortunately I got an unexpected result, which is almost the same as one of the two streams alone (I have already trained each one separately).
@scstu I would suggest merging the input data first, which might be a better idea.
@scstu That can be done in different ways, but keep in mind that the number of learned parameters differs depending on how you merge the input images. For 2xRGB images you can merge in two ways, as far as I know: in a multispectral-like shape, HxWx6, presuming you're concatenating the images along the channel axis, or HxWx3 if you're computing a transformation of the two RGB images such as multiplication, addition, subtraction, etc. In either case, be careful with the input size of your model, and remember that the number of learned parameters will differ. To know what's better you can only try and compare the results; the choice also affects the features extracted by the convolutions differently. I worked on a similar problem and used a multi-branch CNN with two input images, where each image passes through its own branch and the features extracted by the two branches are merged in a merging layer before being passed to the top layers.
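The two channel-wise options described above can be sketched in NumPy; for a pair of HxWx3 images, concatenation gives an HxWx6 input while an elementwise combination keeps HxWx3:

```python
import numpy as np

h, w = 64, 64
img_a = np.random.rand(h, w, 3)
img_b = np.random.rand(h, w, 3)

stacked = np.concatenate([img_a, img_b], axis=-1)  # multispectral-like: HxWx6
combined = img_a + img_b                           # elementwise transform: HxWx3

print(stacked.shape, combined.shape)  # (64, 64, 6) (64, 64, 3)
```

The parameter-count difference mentioned above then follows: a first conv layer with k filters of size 3x3 has 3*3*6*k + k weights on the stacked input versus 3*3*3*k + k on the combined one.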
@aendrs Thank you very much for your algorithm. It is fantastic.
@aendrs It was really good work to generate multi-stream inputs. Thanks.
@aendrs Could you please share your portion of the code for concatenation and input manipulation?
I have a doubt. I will train input sets on the same network: for example, model1 receives input X1 (three folders containing classes, each class with training, validation, and test data) and model2 receives input X2 (likewise, three classes, each with training, validation, and test data). Then I will concatenate a convolution of model X1 with one of model X2, so I have two input sets, two validation sets, and two test sets. My question is about the command steps_per_epoch = nb_train_samples // batch_size. Is nb_train_samples the sum of only training_class1_X1 + training_class2_X1 + training_class3_X1, or is it (training_class1_X1 + training_class2_X1 + training_class3_X1) + (training_class1_X2 + training_class2_X2 + training_class3_X2)?
@MjdMahasneh I believe your comment is interesting. Could you help me with the same doubt, namely whether nb_train_samples in steps_per_epoch = nb_train_samples // batch_size should count the training samples of one input stream or of both?
Could you help me with a doubt? I have the same question about nb_train_samples as asked above.
Hello, did you find an answer or solution to your question? I have the same problem/doubt.
I think the training size is the latter: the sum of all training samples. The question is equivalent to this one: a model has three sets of parameters A, B, and C; we use X1 data to train A & C with B fixed, and X2 data to train B & C with A fixed. I think the answer is now clear.
I am trying similar things, but I am getting stuck with the input of the model.
I first trained 3 different models:
I would like to combine these 3 models into a final model with one output covering 9 labels (8, 9, 10, 30, 31, 80, 81, 82, 83). The final model needs to take 1 input image instead of 3, but I keep getting stuck. I am building a kind of hierarchy here to improve the accuracy.
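One way to wire three trained models behind a single input is the functional API: feed one `Input` into all three models as branches and concatenate their outputs before the final 9-way classifier. The sketch below (written with the `tensorflow.keras` namespace) uses tiny stand-in models since the real three are not shown; you would load your trained models in their place:

```python
from tensorflow.keras.layers import Input, Dense, Flatten, concatenate
from tensorflow.keras.models import Model

def small_model(n_out, name):
    """Tiny stand-in for one of the three pre-trained models."""
    inp = Input(shape=(32, 32, 3))
    out = Dense(n_out, activation='softmax')(Flatten()(inp))
    return Model(inp, out, name=name)

m1, m2, m3 = small_model(3, 'm1'), small_model(3, 'm2'), small_model(3, 'm3')

# Optionally freeze the pre-trained branches:
for m in (m1, m2, m3):
    m.trainable = False

# One shared input image feeds all three branches.
inp = Input(shape=(32, 32, 3))
merged = concatenate([m1(inp), m2(inp), m3(inp)])
final = Model(inp, Dense(9, activation='softmax')(merged))
print(final.output_shape)  # (None, 9)
```

The input shape, layer sizes, and whether to merge the softmax outputs or an earlier feature layer are all choices you would adapt to your hierarchy.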
How is it that the test generator and the train generator (input generator) have the same directories?
In the method shown,

```python
def generate_generator_multiple(generator, dir1, dir2, batch_size, img_height, img_width):
```
Hi, I am using the train generator for two inputs as @aendrs wrote. |
To reply to my own comment: I made my own generator class (a `keras.utils.Sequence` subclass, `class DataGenerator(keras.utils.Sequence)`) that takes 2 images as input and allows multiple labels. This way the order of the two batches of data is the same.
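The reason a `Sequence` keeps the two batches aligned is that both inputs are sliced with the same index array. A minimal illustration of that idea in plain Python (class and variable names are hypothetical; in real code it would subclass `keras.utils.Sequence`):

```python
import numpy as np

class PairedBatches:  # would subclass keras.utils.Sequence in practice
    def __init__(self, imgs_a, imgs_b, labels, batch_size, shuffle=True):
        self.a, self.b, self.y, self.bs = imgs_a, imgs_b, labels, batch_size
        self.idx = np.arange(len(labels))
        if shuffle:
            np.random.shuffle(self.idx)  # one permutation, shared by both inputs

    def __len__(self):
        return len(self.idx) // self.bs

    def __getitem__(self, i):
        sel = self.idx[i * self.bs:(i + 1) * self.bs]
        # Both image arrays and the labels are indexed with the same `sel`,
        # so the pairing can never drift out of order.
        return [self.a[sel], self.b[sel]], self.y[sel]
```

Contrast this with two independent `flow_from_directory` iterators, where alignment relies on both using the same seed and `shuffle=False`.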
Thanks for your guide on multiple inputs, very much appreciated. I have a similar issue in which I used your shared code; however, while executing, I get `TypeError: can't pickle generator objects`. Kindly provide guidance to resolve this issue. Thank you.
I experienced a similar problem. I found that the best solution is to adapt the `TimeseriesGenerator` functionality from `keras.preprocessing.sequence` (itself a `keras.utils.Sequence`) for your own purpose. For my problem, `[x1, x2], y`, this is a working generator:

```python
import numpy as np

class TimeseriesGenerator(keras.utils.Sequence):
    ...

def timeseries_generator_from_json(json_string):
    ...
```
This code is almost perfect, but for me it just runs forever. If I replace the `while True` loop, I can get it to work as I expected.
In fit_generator, what are trainsetsize and testsetsize? Where are they defined?
https://keras.io/api/models/model_training_apis/ |
I am using a similar generator for prediction purposes, but it runs indefinitely. How can I resolve this?
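A plain `while True` generator never raises `StopIteration`, so the number of batches has to be bounded by the caller; `predict_generator` takes a `steps` argument for exactly that. The bounding itself is just slicing a fixed number of batches off the infinite stream, as this self-contained sketch shows (the generator here yields dummy zero batches for illustration):

```python
import itertools
import numpy as np

def infinite_batches(batch_size=8, n_features=4):
    while True:  # intentionally endless, like the generators above
        yield np.zeros((batch_size, n_features))

steps = 5  # e.g. n_samples // batch_size
taken = list(itertools.islice(infinite_batches(), steps))
preds_shape = np.vstack(taken).shape
print(preds_shape)  # (40, 4)
# model.predict_generator(gen, steps=steps) bounds the loop the same way.
```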
I have 2 inputs, each with 5 classes of 30 images. I use this code, but the result I get is not what I think it should be. How can I resolve this?
Hello, I'm trying to use a model with paired input images (each set in its own, similar directory tree), augmented through ImageDataGenerator using flow_from_directory (so the method infers the labels from the folder structure). I'm getting an error because Keras can't handle it this way.
How can I combine the generators (using flow_from_directory) so they are accepted by fit_generator?
Here is a sample code
Model definition ***
The error I get is the following:
TypeError: Error when checking model input: data should be a Numpy array, or list/dict of Numpy arrays. Found: <keras.preprocessing.image.DirectoryIterator object at 0x7f824c5080f0>...