Combining 2 models with batch normalization #5221

Closed · stepjam opened this issue Jan 29, 2017 · 9 comments

@stepjam commented Jan 29, 2017

Hi,

There seems to be a problem when you combine two models (sub-models) that use batch normalization into another model (master model) and then try to train one of the sub-models on its own. When the batch normalization is removed, training works as expected.

Below is a code snippet to reproduce the issue.
When you run the example, you should see:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'combined_input' with dtype float

Thank you in advance.

from keras.models import Model
from keras.layers import Dense, Input
from keras.layers.core import Activation
from keras.layers.normalization import BatchNormalization
from keras.optimizers import Adam
import numpy as np

LR = 0.0002
BATCH_SIZE = 128


def generator_model():
    # Sub-model 1: 100-dim noise -> 10-dim output, with batch normalization.
    inputs = x = Input(shape=(100,), name='generator_input')
    x = Dense(40)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dense(10)(x)
    x = Activation('tanh')(x)
    return Model(inputs, x, name='generator')


def discriminator_model():
    # Sub-model 2: 10-dim input -> scalar probability, with batch normalization.
    inputs = x = Input(shape=(10,), name='discriminator_input')
    x = Dense(60)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dense(1, activation='sigmoid')(x)
    return Model(inputs, x, name='discriminator')


def combined_model(generator, discriminator):
    # Master model: chains the two sub-models end to end.
    inputs = Input(shape=(100,), name='combined_input')
    x = generator(inputs)
    dcgan_output = discriminator(x)
    return Model(input=inputs, output=dcgan_output)


adam = Adam(lr=LR, beta_1=0.5)

generator = generator_model()
discriminator = discriminator_model()
generator.summary()
discriminator.summary()
generator.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
discriminator.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])

dcgan = combined_model(generator, discriminator)
dcgan.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])

train_data = np.zeros((BATCH_SIZE, 10), dtype=np.float32)
train_labels = np.ones(BATCH_SIZE)

# HERE is where the error occurs
discriminator_loss = discriminator.train_on_batch(x=train_data, y=train_labels)

stepjam changed the title from "Combining 2 models" to "Combining 2 models with batch normalization" on Jan 29, 2017

@bstriner (Contributor)

Do you have a minimal script to reproduce? Can't tell based on your description.

Cheers

@stepjam (Author) commented Jan 29, 2017

Of course. Start with this, and I will see if I can make a smaller example. Thanks

EDIT: I have updated the description with a code example

@bstriner (Contributor)

@stepjam Your code block works for me in Theano. I can try it out in TensorFlow instead. What is your backend, and what version of everything are you using?

On the topic of GANs, I put together this module for building a combined GAN model. It lets you train/test both the generator and the discriminator in a single fit/train_on_batch/evaluate/etc. call. Please let me know if it works for you, and if it doesn't, please let me know what is missing.

https://github.com/bstriner/keras-adversarial

@bstriner (Contributor)

OK, replicated on the current stable TensorFlow release. I got a similar error on my other computer using Theano, but it went away when I updated Theano. Not sure if it is still happening in TensorFlow bleeding edge.

@bstriner (Contributor)

So the issue is that calling a layer (or model) appends update ops to self.updates, and making the training function later collects everything in self.updates. BN in mode=0 stores a running average of the mean and std, so once the discriminator's BN layers are reused inside the combined model, the discriminator's own training function also depends on the combined model's input placeholder.

If you call discriminator._make_train_function() before you build the combined model, the error goes away. You could then set layer.updates = [] before you use that layer in a different model.

The other fix would probably be to just use mode=1 or mode=2 instead of the default mode=0.

The solution in keras-adversarial is to just run everything and average any updates.
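
Roughly, the first workaround applied to the repro script above would look like this (untested sketch; _make_train_function is a private Keras method, so this leans on internals and may break across versions):

# Untested sketch, assuming Keras 1.2.x with the TF backend and the
# repro script above (generator_model, discriminator_model, combined_model,
# adam, train_data, train_labels as defined there).
generator = generator_model()
discriminator = discriminator_model()
discriminator.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])

# Build the discriminator's train function while its BN updates still
# refer only to its own input placeholder.
discriminator._make_train_function()

# Then clear per-layer updates before reusing the layers in another model,
# so the combined model does not pick them up again.
for layer in discriminator.layers:
    layer.updates = []

dcgan = combined_model(generator, discriminator)
dcgan.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])

# Training the discriminator alone no longer asks for 'combined_input'.
discriminator.train_on_batch(x=train_data, y=train_labels)

# Alternative (per the comment above): construct the BN layers with
# BatchNormalization(mode=2), which normalizes with per-batch statistics
# and adds no running-average update ops.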

@stepjam (Author) commented Jan 30, 2017

@bstriner - Thanks for looking into this!
Just so you know, I am using TensorFlow as the backend (tensorflow_gpu-0.12.1) and Keras version 1.2.1.

OK great, thanks for finding the problem. I'll also have a look at your keras-adversarial work.

@bstriner (Contributor)

No problem. If you come up with anything cool in keras-adversarial let me know and I can add it as an example.

@adamcavendish (Contributor)

Also, there is a similar issue with the BatchNormalization layer in this question, which uses the Sequential API:

http://stackoverflow.com/questions/42422646/keras-train-partial-model-issue-about-gan-model
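
For reference, the Sequential-API version of the same failure mode looks roughly like this (untested sketch; the layer sizes are illustrative and not taken from the linked question):

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers.normalization import BatchNormalization
import numpy as np

# Untested sketch, assuming Keras 1.2.x with the TF backend.
generator = Sequential()
generator.add(Dense(40, input_dim=100))
generator.add(BatchNormalization())
generator.add(Activation('relu'))
generator.add(Dense(10, activation='tanh'))

discriminator = Sequential()
discriminator.add(Dense(60, input_dim=10))
discriminator.add(BatchNormalization())
discriminator.add(Activation('relu'))
discriminator.add(Dense(1, activation='sigmoid'))
discriminator.compile(loss='binary_crossentropy', optimizer='adam')

# Nesting the sub-models in a combined Sequential model should trigger the
# same missing-placeholder error when training the discriminator alone.
combined = Sequential()
combined.add(generator)
combined.add(discriminator)
combined.compile(loss='binary_crossentropy', optimizer='adam')

discriminator.train_on_batch(np.zeros((128, 10), dtype=np.float32), np.ones(128))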

stale bot commented Jun 5, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

stale bot closed this as completed on Jul 5, 2017