
add autoencoder and denoising autoencoder #180

Merged: 5 commits, merged into keras-team:master on Jun 4, 2015

Conversation

jramapuram
Contributor

This implementation creates a base autoencoder class that all other autoencoder implementations should inherit from. Since we have added a get_hidden() method, it can be used later when we start stacking autoencoders, i.e. we can do something along the lines of:

    if isinstance(my_new_stacked_autoencoder_layer, AutoEncoder):
        return previous_layer.get_hidden() # instead of .get_output()

A denoising autoencoder has also been added to demo how to inherit from AutoEncoder.
I have tried to be consistent with your coding style. Let me know if there are any issues.

@jramapuram mentioned this pull request on Jun 1, 2015
@phreeza
Contributor

phreeza commented Jun 1, 2015

This is certainly a way of doing it that fits more nicely with the existing structures. However, I think it limits the usefulness in the long term.

Some drawbacks I see with this, as opposed to the version in #155:

  • Not using existing layers internally means every new style of autoencoder has to implement its own get_output function, as you did in the DenoisingAutoEncoder. In #155 you can assemble them from existing layers: denoising is just a matter of adding a dropout layer to the encoding stage (see test_autoencoder.py in that PR, and the sketch after this list).
  • The line you mention for stacking autoencoders would have to be added to every possible subsequent layer, which goes against modularity. In #155 you just continue with the encoder and use it like any other model.
  • You have to stick the Layer into a Model and then call fit(X_train, X_train), which seems a bit unintuitive.
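
For comparison, here is a minimal sketch of that assembly style (my illustration, assuming the Keras 0.x Dense/Dropout signatures; this is not code from either PR):

    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout

    # Encoder assembled from existing layers: denoising falls out of
    # putting a Dropout on the encoding stage, instead of writing a new
    # get_output for every autoencoder variant.
    encoder = Sequential()
    encoder.add(Dropout(0.3))    # corrupt the input -> denoising behaviour
    encoder.add(Dense(784, 392, activation='sigmoid'))

    decoder = Sequential()
    decoder.add(Dense(392, 784, activation='sigmoid'))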

@jramapuram
Contributor Author

The benefits I see here are that:

  • stacking layers is actually easy: you can create a StackedDenoisingAutoEncoder by inheriting from DenoisingAutoEncoder and passing in a list of layer sizes.

  • the isinstance() check is ONLY needed when merging autoencoder pre-training layers with classical layers. Nowhere else is it needed, and the overhead is minimal. In my mind it can even be scrapped and you can just do something like this (I am adding this test already):

    ##########################
    # autoencoder model test #
    ##########################

    # Imports assumed for this snippet (Keras 0.x module paths);
    # X_train/X_test, Y_train/Y_test, batch_size and nb_epoch come from
    # the surrounding test script.
    import numpy as np
    from keras.models import Sequential
    from keras.layers.core import Dense, Activation
    from keras.layers.core import DenoisingAutoEncoder  # added in this PR

    print("Training DenoisingAutoEncoder")
    autoencoder = Sequential()
    autoencoder.add(DenoisingAutoEncoder(784, 392))
    autoencoder.compile(loss='mean_squared_error', optimizer='adam')
    autoencoder.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epoch,
                    show_accuracy=True, verbose=1, validation_data=(X_test, X_test))

    # Do an inference pass by projecting onto the learned encoder weights.
    # weights = [W, encoder_bias, decoder_bias]
    autoencoder.predict(X_train, verbose=0)
    train_inference = autoencoder.get_weights()
    print("Weights: ", train_inference[0].shape)
    print("encoder_bias: ", train_inference[1].shape)
    print("decoder_bias: ", train_inference[2].shape)
    prefilter_train = np.dot(X_train, train_inference[0]) + train_inference[1]
    autoencoder.predict(X_test, verbose=0)
    test_inference = autoencoder.get_weights()
    prefilter_test = np.dot(X_test, test_inference[0]) + test_inference[1]

    print("Building classical fully connected layer for classification")
    model = Sequential()
    model.add(Dense(392, 10))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(prefilter_train, Y_train, batch_size=batch_size, nb_epoch=50,
              show_accuracy=True, verbose=0, validation_split=0.3)
    score = model.evaluate(prefilter_test, Y_test, verbose=0)
    print('\nscore:', score)

    (I understand the above is not quite optimal for inference. We should consider an Unsupervised model that implements some abstract methods for inference, so that a one-liner does this for us; see the sketch after this list.)

  • While slightly verbose, this gives clarity about what an autoencoder is actually doing. It also allows the end user to modify and tweak the input or output targets (e.g. for target propagation). Having it fixed is too rigid for a library.
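
As a hypothetical sketch of that abstraction (the class and method names here are invented for illustration, not part of this PR):

    # Invented illustration of an "unsupervised model" abstraction.
    class UnsupervisedModel(object):
        def fit_unsupervised(self, X, **kwargs):
            # Train with the inputs as their own targets.
            return self.fit(X, X, **kwargs)

        def transform(self, X):
            # One-liner inference: map inputs to hidden representations.
            raise NotImplementedError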

@fchollet
Member

fchollet commented Jun 1, 2015

I will take some time to review both approaches asap.

The way I see it, returning the hidden representation or the reconstruction should not depend on outside rules; it should be self-contained. We could set a flag on the AutoEncoder (e.g. a constructor argument, or a parameter that you can set at a later time) to get it to return either the hidden representation or the reconstruction in .get_output.

In the layer API, you should only ever call .get_output.
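
In sketch form, that flag could look like this (a hypothetical illustration, close to what was eventually merged as the output_reconstruction argument):

    from keras.layers.core import Layer  # module path as of Keras 0.x

    class AutoEncoder(Layer):
        def __init__(self, encoder, decoder, output_reconstruction=True):
            super(AutoEncoder, self).__init__()
            self.encoder = encoder
            self.decoder = decoder
            self.output_reconstruction = output_reconstruction

        def get_output(self, train=False):
            # Assumes encoder and decoder are already wired together, so
            # the decoder's output reconstructs the encoder's input.
            if self.output_reconstruction:
                return self.decoder.get_output(train)
            return self.encoder.get_output(train)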

@jramapuram
Contributor Author

This sounds reasonable. I can drop these changes in early tomorrow.

@phreeza
Contributor

phreeza commented Jun 1, 2015

A flag to switch the output sounds like a good solution. If we can figure
out a way to assemble encoder and decoder from other existing layers, that
would be perfect.

@jramapuram
Contributor Author

@phreeza: So I really want to get something in that allows the layers to be customizable. This is proving a little tricky, though, as the graph connectivity needs to be handled. I am working on it.

@jramapuram
Contributor Author

I have updated my implementation to allow any type of layer to be fed into the autoencoder.
I have also added the output_reconstruction flag. Training is still done with the entire autoencoder, but with output_reconstruction=False the output is only the encoded portion. I have also added a test for this implementation.

eg:

    autoencoder.add(AutoEncoder(encoder=Dense(32, 16, activation='tanh'),
                                decoder=Dense(16, 32, activation='tanh'),
                                output_reconstruction=False, tie_weights=True))
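
Downstream usage then looks like any other Sequential model. A minimal sketch, assuming the Keras 0.x API and a 32-dimensional X_train:

    from keras.models import Sequential
    from keras.layers.core import Dense, AutoEncoder  # paths as of Keras 0.x

    autoencoder = Sequential()
    autoencoder.add(AutoEncoder(encoder=Dense(32, 16, activation='tanh'),
                                decoder=Dense(16, 32, activation='tanh'),
                                output_reconstruction=False, tie_weights=True))
    autoencoder.compile(loss='mean_squared_error', optimizer='adam')

    # Training still reconstructs the input (inputs are their own targets)...
    autoencoder.fit(X_train, X_train, nb_epoch=10, batch_size=64)

    # ...but with output_reconstruction=False, predict() yields the
    # 16-dimensional hidden codes rather than 32-dimensional reconstructions.
    codes = autoencoder.predict(X_train)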

@jramapuram
Contributor Author

One question regarding extending this to deep autoencoders: do I have to do anything special for greedy layer-wise training? I have implemented this feature as well, simply by passing in a list of encoders/decoders.
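
For illustration, the list-based construction described above might look like this (a hypothetical sketch; the exact merged signature may differ):

    from keras.models import Sequential
    from keras.layers.core import Dense, AutoEncoder  # paths as of Keras 0.x

    # Hypothetical: a deep autoencoder assembled from lists of
    # encoders and decoders, per the description above.
    deep_ae = Sequential()
    deep_ae.add(AutoEncoder(encoder=[Dense(784, 392), Dense(392, 196)],
                            decoder=[Dense(196, 392), Dense(392, 784)],
                            output_reconstruction=False))
    deep_ae.compile(loss='mean_squared_error', optimizer='adam')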

@fchollet
Member

fchollet commented Jun 4, 2015

This is great. The implementation is neat; this is definitely a good solution for autoencoders within the Keras framework.

Merging this now! : )

Next, documentation will be needed now that this is part of Keras...

@fchollet
Member

fchollet commented Jun 4, 2015

> One question regarding extending this to deep autoencoders: do I have to do anything special for greedy layer-wise training? I have implemented this feature as well, simply by passing in a list of encoders/decoders.

I think that's fine for now. But we can try new things in the future...

@fchollet merged commit b4e2dd8 into keras-team:master on Jun 4, 2015
@corywalker

I think there are some issues with the test_autoencoder script. It seems to confuse an error (higher is worse) with a score (higher is better). Practically, this means the "percent improvement" it reports is actually the percent decline in performance. I don't have much time to look into this or fix it right now, but I believe it should be addressed.
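
Concretely, for an error metric the comparison has to be flipped (hypothetical numbers):

    baseline_error, new_error = 0.25, 0.30  # hypothetical loss values

    # Wrong for an error metric: treats a larger value as better.
    bogus_improvement = (new_error - baseline_error) / baseline_error  # +20%

    # Right: for a loss, improvement means a decrease.
    improvement = (baseline_error - new_error) / baseline_error  # -20%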

@fchollet
Member

fchollet commented Jun 7, 2015

You are right. It's not clear to me what the logical expected behavior would be here: should we really expect autoencoding to make the learning problem easier in this case?

In any case, the tests need to make clear that performance is in fact declining with this setup. Having an example where autoencoding does help with a task would be neat.

@jramapuram
Contributor Author

Yup, I misinterpreted that one. I will mull it over and see whether the example is erroneous or it's just a case of not having enough data or training. I'll run something by tomorrow.

@joetigger

Why is weight tying removed in AutoEncoder? What's the alternative?
