
Connecting Functional API models together #4205

Closed
Neltherion opened this issue Oct 26, 2016 · 12 comments


Neltherion commented Oct 26, 2016

@fchollet
I'm trying to connect two functional API models together. Here's the summary of the two models:

The first "input" model (it works just fine as a standalone model):

[summary/plot of model1]

The second model, which is supposed to be connected to the first:

[summary/plot of model2]

I'm trying to connect them together like this:

model = Model(input=generator.input, output=[discriminator.output[0], discriminator.output[1]])

But I get this error:

Graph disconnected: cannot obtain value for tensor discriminator_input at layer "discriminator_input". The following previous layers were accessed without issue: []

I tried to make a model out of them like this:
Model(input=[generator.input, discriminator.input], output=[discriminator.output[0], discriminator.output[1]])

But this code just resulted in the second model (and not the two of them together), or at least that's what I think after getting a summary of the model and plotting its structure.

Can we do this in Keras (connect functional API models together), or is there another way?
Thanks

dieuwkehupkes (Contributor) commented:

You should connect the output layer of the first network to the input layer of the second network. Something like:

Model(input=generator.input, output=discriminator(generator.output))
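
To make the pattern concrete, here is a minimal sketch with hypothetical toy models (using the Keras 1.x-style input=/output= keywords that appear throughout this thread; newer Keras spells them inputs=/outputs=):

from keras.layers import Input, Dense
from keras.models import Model

# hypothetical "generator": 10 -> 10
g_in = Input(shape=(10,))
generator = Model(input=g_in, output=Dense(10)(g_in))

# hypothetical "discriminator": 10 -> 1
d_in = Input(shape=(10,))
discriminator = Model(input=d_in, output=Dense(1, activation='sigmoid')(d_in))

# Calling the discriminator on generator.output wires the two graphs together,
# giving a single connected graph from g_in to the discriminator's output.
combined = Model(input=generator.input, output=discriminator(generator.output))

Passing discriminator.input as a model input instead leaves the two graphs disconnected, which is exactly what the "Graph disconnected" error above is complaining about.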


Neltherion commented Oct 31, 2016

@dieuwkehupkes
Thank you so much... this was turning into a real bother for me...
It works now...

But (there's always a BUT!) when I use discriminator.train_on_batch(X, Y), it returns 3 losses instead of 2... the discriminator was supposed to return 2 losses, and here I am getting 3...
did I do something wrong?

Thanks again

dieuwkehupkes (Contributor) commented:

Not sure; what did you do exactly? Wasn't discriminator your second model? What did you call your stacked model? Maybe you passed the wrong loss functions when you compiled it?


Neltherion commented Nov 1, 2016

@fchollet @dieuwkehupkes
We have 2 Functional API models which are connected together:

Generator => Input : 1 x 128 x 128 | Output : 3 x 128 x 128
Discriminator => Input : 3 x 128 x 128 | Output : [3 x 128 x 128, 1]

I want one of the discriminator's outputs to be trained with an MAE loss that regresses to an RGB image, and the other output to be the normal discriminator output (0 or 1) that discriminates whether the input is fake or not... the diagrams are shown in a previous comment.

The usual way (without the regressor) is to use a binary cross-entropy loss for the generator, the discriminator, and the model that stacks them both...

But right now, I don't know how to choose the losses for the models... what kind of loss should I give the generator to account for both the discriminator's MAE and BCE losses?

Here's what I did:

discriminator = discriminator_model()
generator = generator_model()
# stack the discriminator on top of the generator
discriminator_on_generator = Model(input=generator.input, output=discriminator(generator.output))

# single output -> single loss
generator.compile(loss=['binary_crossentropy'], optimizer='adadelta')
# two outputs -> one loss per output, combined with loss_weights
discriminator_on_generator.compile(loss=['mae', 'binary_crossentropy'], loss_weights=[0.5, 1.0], optimizer='adadelta')
discriminator.compile(loss=['mae', 'binary_crossentropy'], loss_weights=[0.5, 1.0], optimizer='adadelta')

These settings give me 3 losses whenever I call discriminator.train_on_batch() or discriminator_on_generator.train_on_batch(), and I don't know where the third loss comes from...

dieuwkehupkes (Contributor) commented:

The way you set up the losses (a different one for each of the two outputs) seems fine. Maybe the third 'loss' you see during training is actually a default metric?

I have trouble reproducing the issue; could you maybe paste your code in a gist and send the link?


Neltherion commented Nov 2, 2016

Sure... Here's the gist.

Thanks!

@dieuwkehupkes
Copy link
Contributor

I get an error when running your code because there is still a reference to a folder higher up in your directory tree. When I inspect discriminator_on_generator.loss I just get the two losses you specified.

I then ran predict_on_batch on discriminator_on_generator with some random arrays as input and output, and then I finally understood what you mean. The three losses you get are: the overall loss of the network, the loss of the first output layer, and the loss of the second output layer. If you are ever in doubt about the numbers in some training output, you can print model.metrics_names to see what they correspond to. In your case we get:

>>> discriminator_on_generator.metrics_names
['loss', 'model_1_loss', 'model_1_loss']

Does that solve your problem?
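
For what it's worth, with multiple outputs the first entry ('loss') is the weighted sum of the per-output losses, using the loss_weights passed to compile. A minimal sketch of that bookkeeping, with hypothetical numbers:

# Hypothetical per-output losses from one train_on_batch call:
mae_loss = 0.20  # first output head, loss_weight 0.5
bce_loss = 0.70  # second output head, loss_weight 1.0

# The first number Keras reports ('loss') is the weighted sum
# (plus regularization penalties, if the model has any):
total = 0.5 * mae_loss + 1.0 * bce_loss  # = 0.80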


Neltherion commented Nov 5, 2016

Thanks... I learned so much!

It finally started working even though I think the architecture is useless...
When I was thinking up this architecture, I wanted a GAN that could learn to colorize grayscale images while avoiding the averaging problem that most such networks end up with... one part a discriminator and the other an MAE regressor...

But it seems that no matter what parameters I set, the discriminator does its own thing while the generator pulls the other way!

The discriminator reports 3 losses, but the overall loss isn't exactly the plain sum of the 2 individual losses (it's the weighted sum given the loss_weights), and with 2 losses I'm not sure how the update is propagated to the generator... from the outputs I checked, it seems that after a while they start overriding each other's changes and become destructive instead of useful...

I should think up an architecture where the Discriminator helps the Regressor and vice versa...
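
For reference, the classic Keras DCGAN examples from this era keep the two updates from fighting each other by alternating them and freezing the discriminator while the generator trains. A minimal sketch, assuming the generator, discriminator, and discriminator_on_generator objects compiled above and hypothetical random 128x128 data (note that in newer Keras versions the trainable flag only takes effect if it is set before compiling the stacked model):

import numpy as np

# Hypothetical batch: grayscale inputs and their ground-truth RGB targets
# (channels-first, matching the 1 x 128 x 128 / 3 x 128 x 128 shapes above).
gray_batch = np.random.rand(8, 1, 128, 128).astype('float32')
rgb_batch = np.random.rand(8, 3, 128, 128).astype('float32')

# 1) Train the discriminator on real and generated images.
generated = generator.predict(gray_batch)
d_x = np.concatenate([rgb_batch, generated])
d_y_img = np.concatenate([rgb_batch, rgb_batch])               # MAE head: regress toward ground truth
d_y_lbl = np.concatenate([np.ones((8, 1)), np.zeros((8, 1))])  # BCE head: 1 = real, 0 = fake
discriminator.trainable = True
discriminator.train_on_batch(d_x, [d_y_img, d_y_lbl])

# 2) Freeze the discriminator and update only the generator through the stack,
#    asking the discriminator to call the fakes "real".
discriminator.trainable = False
discriminator_on_generator.train_on_batch(gray_batch, [rgb_batch, np.ones((8, 1))])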

Nevertheless, Thanks for the help and your time... You really helped me out!

dieuwkehupkes (Contributor) commented:

Glad to hear that, happy to help! If you have some more questions just lmk :)


roveneliah commented Sep 26, 2018

@dieuwkehupkes I'm having a similar issue, but using a model as a set of intermediate layers.

import keras

# GeneratorCell is a user-defined model (not shown here)

def Generator():
    # inputs
    vq = keras.layers.Input(shape=(16, 16, 256), name="vq")
    r = keras.layers.Input(shape=(16, 16, 256), name="r")

    h0 = keras.layers.Input(shape=(16, 16, 256), name="h0")
    c0 = keras.layers.Input(shape=(16, 16, 256), name="c0")
    u0 = keras.layers.Input(shape=(64, 64, 256), name="u0")

    # run the cell once and take its third output
    cell0 = GeneratorCell()(inputs=[vq, r, h0, c0, u0])

    something = keras.layers.Conv2D(
        filters=3,
        kernel_size=(1, 1),
        strides=(1, 1),
    )(cell0[2])

    model = keras.Model(inputs=[vq, r, h0, c0, u0], outputs=something)
    return model

Keras tells me:

ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

The strange thing is that I've tested the GeneratorCell model independently and it works, and the Generator model becomes differentiable if I remove the GeneratorCell.

I assume I must be using the functional API incorrectly, but I'm not sure why this wouldn't be differentiable.
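
One way to narrow this down (a debugging sketch, assuming the Generator() above and a TensorFlow backend; the scalar "loss" is purely illustrative) is to ask the backend for gradients of a loss with respect to each trainable weight and see which ones come back as None:

import keras.backend as K

model = Generator()

# Dummy scalar "loss" built from the model output, just for the check.
loss = K.mean(model.output)
grads = K.gradients(loss, model.trainable_weights)

# Any weight whose gradient is None sits behind a non-differentiable op.
for weight, grad in zip(model.trainable_weights, grads):
    if grad is None:
        print('No gradient for weight:', weight.name)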

dieuwkehupkes (Contributor) commented:

Hi @elihanover,

It's been quite a while since I used Keras, so I'm not sure if I'll be able to help you. One question I have about your post: where is this GeneratorCell coming from; did you define it yourself? Perhaps you are calling it with the wrong arguments? What do you mean when you say it works when you remove the GeneratorCell? In that case, what would be the input to your convolutional layer?

krishhtof commented:

You should connect the output layer of the first network to the input layer of the second network. Something like:

Model(input=generator.input, output=discriminator(generator.output))

Hi @dieuwkehupkes,

I know it's an old comment of yours, but it is still helpful. I'm using the same code you mentioned above to combine 2 separate Keras models, and it works correctly, EXCEPT for one small problem when I want to print out the summary of the newly combined model. When I print the summary, I get the following output:

[screenshot of the combined model's summary, ending with a single model_1 (Model) layer]

The main thing to look out for above is the last layer, called model_1 (Model). This is the model that I connected my first model to; the first model's layers appear correctly above the model_1 (Model) layer. What I want is for Keras's summary method to print the entire model with all its layers, instead of collapsing the second model into the single model_1 (Model) entry shown above.

Do you or does anyone else know how one can achieve this in Keras?
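
A possible workaround (a sketch, assuming the combined model is bound to a variable named combined): a nested model is itself a layer of the outer model, so you can walk the layers and print each nested model's own summary. Newer Keras versions also accept model.summary(expand_nested=True).

from keras.models import Model

combined.summary()  # the nested model shows up as a single model_1 (Model) row

# Expand nested models by printing their own summaries:
for layer in combined.layers:
    if isinstance(layer, Model):
        layer.summary()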
