
ImageGenerator for multiple inputs #3386

Closed
jagiella opened this issue Aug 3, 2016 · 22 comments

@jagiella commented Aug 3, 2016

I have built a model which consists of two branches which are then merged into a single one. For training I would like to use the ImageDataGenerator to augment the image data, but I don't know how to make it work for the mixed input type. Does anybody have an idea how to deal with this in Keras?
Any help would be highly appreciated!

Best,
Nick

MODEL
The first branch takes images as input:

img_model = Sequential()
img_model.add(Convolution2D( 4, 9,9, border_mode='valid', input_shape=(1, 120, 160)))
img_model.add(Activation('relu'))
img_model.add(MaxPooling2D(pool_size=(2, 2)))
img_model.add(Dropout(0.5))
img_model.add(Flatten()) 

The second branch takes auxiliary data as input:

aux_model = Sequential()
aux_model.add(Dense(3, input_dim=3))

Then those get merged into the final model:

model = Sequential()
model.add(Merge([img_model, aux_model], mode='concat'))
model.add(Dropout(0.5))
model.add(Dense(5))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])

TRAINING / PROBLEM:
I tried to do the following, which obviously failed:

datagen = ImageDataGenerator(
            featurewise_center=False,  # set input mean to 0 over the dataset
            samplewise_center=False,  # set each sample mean to 0
            featurewise_std_normalization=False,  # divide inputs by std of the dataset
            samplewise_std_normalization=False,  # divide each input by its std
            zca_whitening=False,  # apply ZCA whitening
            rotation_range=10, #180,  # randomly rotate images in the range (degrees, 0 to 180)
            width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
            height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
            horizontal_flip=False,  # randomly flip images
            vertical_flip=False)  # randomly flip images

model.fit_generator( datagen.flow( [X,I], Y, batch_size=64),
               samples_per_epoch=X.shape[0],
               nb_epoch=20,
               validation_data=([Xval, Ival], Yval))

This produces the following error message:

Traceback (most recent call last):
  File "importdata.py", line 139, in <module>
    model.fit_generator( datagen.flow( [X,I], Y, batch_size=64),
  File "/usr/local/lib/python3.5/dist-packages/keras/preprocessing/image.py", line 261, in flow
    save_to_dir=save_to_dir, save_prefix=save_prefix, save_format=save_format)
  File "/usr/local/lib/python3.5/dist-packages/keras/preprocessing/image.py", line 454, in __init__
    'Found: X.shape = %s, y.shape = %s' % (np.asarray(X).shape, np.asarray(y).shape))
  File "/usr/local/lib/python3.5/dist-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not broadcast input array from shape (42700,1,120,160) into shape (42700)

@fchollet (Member) commented Aug 3, 2016

You need a generator that yields something of the form ([x1, x2], y). So you need to write your own generator, for which you can reuse the original ImageDataGenerator for one or more of the inputs.
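
For illustration, a minimal sketch of such a generator (the names X_img, X_aux, Y, datagen and the batch size are assumptions, not from this issue; jagiella posts a fuller version below):

def multi_input_generator(X_img, X_aux, Y, datagen, batch_size=64):
    # shuffle=False keeps the batch order aligned with the auxiliary array
    flow = datagen.flow(X_img, Y, batch_size=batch_size, shuffle=False)
    i = 0
    while True:
        x_batch, y_batch = next(flow)
        n = x_batch.shape[0]
        # slice the auxiliary input to match the current image batch
        yield [x_batch, X_aux[i:i + n]], y_batch
        i = (i + n) % X_img.shape[0]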

fchollet closed this as completed Aug 3, 2016

@jagiella (Author) commented Aug 4, 2016

That is what I would like to do, but I don't really know how to create one which will give proper results. One issue I see, for example, is related to shuffling: if I use the original ImageDataGenerator with shuffling, I would somehow need to know each image's corresponding index in the original image stack.

@jagiella (Author) commented Aug 4, 2016

OK, I made it work! For anybody asking themselves the same question, here is my example solution:

def createGenerator( X, I, Y):

    while True:
        # shuffled indices
        idx = np.random.permutation( X.shape[0])
        # create image generator
        datagen = ImageDataGenerator(
                featurewise_center=False,  # set input mean to 0 over the dataset
                samplewise_center=False,  # set each sample mean to 0
                featurewise_std_normalization=False,  # divide inputs by std of the dataset
                samplewise_std_normalization=False,  # divide each input by its std
                zca_whitening=False,  # apply ZCA whitening
                rotation_range=10, #180,  # randomly rotate images in the range (degrees, 0 to 180)
                width_shift_range=0.1, #0.1,  # randomly shift images horizontally (fraction of total width)
                height_shift_range=0.1, #0.1,  # randomly shift images vertically (fraction of total height)
                horizontal_flip=False,  # randomly flip images
                vertical_flip=False)  # randomly flip images

        batches = datagen.flow( X[idx], Y[idx], batch_size=64, shuffle=False)
        idx0 = 0
        for batch in batches:
            idx1 = idx0 + batch[0].shape[0]

            yield [batch[0], I[ idx[ idx0:idx1 ] ]], batch[1]

            idx0 = idx1
            if idx1 >= X.shape[0]:
                break
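
A usage sketch, reusing the values from the original fit_generator call above (this call is not part of jagiella's post):

model.fit_generator( createGenerator( X, I, Y),
               samples_per_epoch=X.shape[0],
               nb_epoch=20,
               validation_data=([Xval, Ival], Yval))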

@jockes60

Here's a piece of code that formats the outputs of two generators. It can be extended to any number of generators. Assuming the output of both generators is of the form (x, y) and the desired output is of the form ([x1, x2], y1):

def format_gen_outputs(gen1,gen2):
    x1 = gen1[0]
    x2 = gen2[0]
    y1 = gen1[1]
    return [x1, x2], y1

combo_gen = map(format_gen_outputs, gen1, gen2)
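
One possible way to build gen1 and gen2 and use the combined generator (the flow() calls, seed value and fit_generator arguments here are illustrative assumptions, not from jockes60's comment). Using the same seed keeps the two flows shuffling in the same order, so paired samples stay aligned:

datagen = ImageDataGenerator(rotation_range=10)
gen1 = datagen.flow(X1, Y, batch_size=64, seed=1)
gen2 = datagen.flow(X2, Y, batch_size=64, seed=1)

combo_gen = map(format_gen_outputs, gen1, gen2)  # map is lazy in Python 3
model.fit_generator(combo_gen,
                    steps_per_epoch=len(X1) // 64,
                    epochs=20)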

@ahmedhosny

@jagiella I have a similar structure but instead of one datagen.flow, I have three from three different sources. My problem is I want to make sure the same set of augmentations is applied to arrays of the same index across all three batches. Any ideas? I think the seed argument in datagen.flow is for shuffling only.

@drorhilman commented Jun 15, 2017

I am using a slightly different variation...

generator = ImageDataGenerator(rotation_range=90,
                               width_shift_range=0.05,
                               height_shift_range=0.05,
                               zoom_range=0.1)

def generate_data_generator_for_two_images(X1, X2, Y):
    # the same seed keeps the two flows shuffling in the same order
    genX1 = generator.flow(X1, Y, seed=7)
    genX2 = generator.flow(X2, seed=7)  # no labels passed, so it yields image batches only
    while True:
        X1i = genX1.next()
        X2i = genX2.next()
        yield [X1i[0], X2i], X1i[1]

@zyavrik commented Oct 15, 2017

I get the following error when using the function below:

File "data_utils.py", line 569, in data_generator_task
generator_output = next(self._generator)
TypeError: 'function' object is not an iterator

trainDataGenerator = ImageDataGenerator(...)
trainGeneratorBasic = trainDataGenerator.flow(input, inputLabels)

def trainGenerator():
    while True:
        xy = trainGeneratorBasic.next()
        yield [xy[0], xy[0], xy[0]], xy[1]

UPDATE

Fixed with the following code (fit_generator needs a generator object, so the generator function has to be called first):

def trainGeneratorFunc():
    while True:
        xy = trainGeneratorBasic.next()
        yield [xy[0], xy[0], xy[0]], xy[1]

trainGenerator = trainGeneratorFunc()

@tenbabagu

I have a similar question: I want to use the triplet loss, so I need three images: two different ones from the same class, and one from another class. Has anyone done similar work?

@tlatlbtle commented Dec 29, 2017

@jagiella I used this piece of code; however, it shows the error message below:

~/anaconda3/lib/python3.6/site-packages/keras/utils/data_utils.py in data_generator_task()
633 try:
634 if self._use_multiprocessing or self.queue.qsize() < max_queue_size:
--> 635 generator_output = next(self._generator)
636 self.queue.put((True, generator_output))
637 else:

ValueError: generator already executing

Can you tell what's wrong with it?

@DNXie commented Jan 22, 2018

@drorhilman
I'm using your method like this:
parallel_model.fit_generator(
    generate_data_generator_for_two_images(xs_train, xl_train, y_train),
    epochs=epochs,
    steps_per_epoch=int(np.ceil(xs_train.shape[0] / float(batch_size))),
    validation_data=([xs_test, xl_test], y_test),
    class_weight='auto',
    workers=4)

and got this error:
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.4/dist-packages/keras/utils/data_utils.py", line 579, in data_generator_task
    generator_output = next(self._generator)
ValueError: generator already executing

There are a lot of similar methods like yours; I tried most of them and got similar error messages.
Why does this happen, and how can I fix it?
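
This error typically means that several worker threads are calling next() on the same plain Python generator at once; such a generator is not thread-safe. A common workaround (not posted in this thread; the class name and usage below are illustrative) is to guard next() with a lock:

import threading

class ThreadSafeIterator:
    def __init__(self, it):
        self.it = it
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # only one thread may advance the wrapped generator at a time
        with self.lock:
            return next(self.it)

safe_gen = ThreadSafeIterator(
    generate_data_generator_for_two_images(xs_train, xl_train, y_train))
parallel_model.fit_generator(safe_gen, epochs=epochs,
                             steps_per_epoch=xs_train.shape[0] // batch_size,
                             validation_data=([xs_test, xl_test], y_test),
                             workers=4)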

@DNXie commented Jan 22, 2018

@jockes60
Oh my god! This works! Thanks a lot!!!

@FrancisYizhang

@fchollet
I have multiple outputs and it runs successfully when workers=1; however, it breaks when workers is larger than 1, and the error is shown below:
Exception in thread Thread-7:

Traceback (most recent call last):
  File "C:\Users\Francis_161014\AppData\Local\conda\conda\envs\tensorflow\lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "C:\Users\Francis_161014\AppData\Local\conda\conda\envs\tensorflow\lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Francis_161014\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\keras\utils\data_utils.py", line 560, in data_generator_task
    generator_output = next(self._generator)
ValueError: generator already executing
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2\helpers\pydev_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "", line 12, in
  File "C:\Users\Francis_161014\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\keras\legacy\interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Francis_161014\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1809, in fit_generator
    generator_output = next(output_generator)
StopIteration

The following is my code:
generator_train = ImageDataGenerator(
    featurewise_center=False,
    featurewise_std_normalization=False,
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1)
generator_train.fit(x_train)

model.fit_generator(
    generate_data_generator(generator=generator_train,
                            X=x_train,
                            Y=y_train,
                            batch_size=batch_size,
                            num_classes=num_classes),
    steps_per_epoch=len(x_train) // batch_size,
    epochs=epochs,
    validation_data=(x_test, y_test),
    callbacks=callbacks,
    workers=4,
    verbose=2)

@FrancisYizhang

@DNXie
Does your program work when workers is larger than 1?

@FrancisYizhang

My system is Windows 7.

@DNXie commented Jan 27, 2018

@FrancisYizhang My code above has workers=4

@FrancisYizhang

@DNXie
Do you use multiprocessing, i.e. use_multiprocessing=True?

@MjdMahasneh

Could anyone kindly help me solve this problem: #10499? I tried implementing the same generator as in this post, but I can't figure out where my mistake is. Any help is very much appreciated.

@ad12 commented Jul 25, 2018

@ahmedhosny did you ever find a solution for applying the same transform to the images at the same index in the two different arrays?

@gledsonmelotti

Hello guys, how are you? I have a question. I will train two input sets on the same network: model1 receives input X1 (three folders of classes, each class with training, validation, and test data) and model2 receives input X2 (likewise three classes, each with training, validation, and test data). Then I concatenate a convolution of model X1 with one of model X2, so I have two training sets, two validation sets, and two test sets. My question is about the following command: steps_per_epoch = nb_train_samples // batchsize. Should nb_train_samples be only training_class1_X1 + training_class2_X1 + training_class3_X1, or the sum of (training_class1_X1 + training_class2_X1 + training_class3_X1) + (training_class1_X2 + training_class2_X2 + training_class3_X2)?

@TheStoneMX commented Oct 7, 2019

Found it on the internet, don't remember where...

def two_image_generator(generator,
                        df,
                        directory,
                        batch_size,
                        x_col = 'filename',
                        y_col = None,
                        model = None,
                        shuffle = False,
                        img_size1 = (224, 224),
                        img_size2 = (299, 299)):

    gen1 = generator.flow_from_dataframe(
        df,
        directory,
        x_col = x_col,
        y_col = y_col,
        target_size = img_size1,
        class_mode = model,
        batch_size = batch_size,
        shuffle = shuffle,
        seed = 1)

    gen2 = generator.flow_from_dataframe(
        df,
        directory,
        x_col = x_col,
        y_col = y_col,
        target_size = img_size2,
        class_mode = model,
        batch_size = batch_size,
        shuffle = shuffle,
        seed = 1)

    while True:
        X1i = gen1.next()
        X2i = gen2.next()
        if y_col:
            yield [X1i[0], X2i[0]], X1i[1]  # X1i[1] is the label
        else:
            yield [X1i, X2i]
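
A hypothetical call (train_df, train_dir and the 'label' column are placeholders, not from the original comment):

train_gen = two_image_generator(ImageDataGenerator(rescale=1. / 255),
                                train_df, train_dir,
                                batch_size=32,
                                y_col='label',
                                model='categorical',  # passed through as class_mode
                                shuffle=True)

model.fit_generator(train_gen,
                    steps_per_epoch=len(train_df) // 32,
                    epochs=10)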

@gledsonmelotti

@TheStoneMX good idea. Thank you very much.

@lamba92 commented Mar 3, 2020

But what if your ImageDataGenerator adds augmented data and you need to match the right features to each image? How do I know which image belongs to which row of my dataframe?
