
Autoencoder model #155

Closed
phreeza wants to merge 2 commits from the autoencoder branch

Conversation

phreeza (Contributor) commented May 25, 2015

I added a relatively generic autoencoder model. It allows the encoding stage to be reused as a layer in subsequent networks, and it can be frozen using a trick I got from @kenterao in #56. To keep this concise, I had to modify the Merge layer to allow 'merging' a single model, essentially turning it into a container layer. I'm also happy to do this in a separate Container layer instead, but I think that would introduce some duplication.

One thing that is typically done in autoencoders, but for now has to be done by hand, is tying the weights. I couldn't come up with a neat way of doing this while still allowing general encoders and decoders, so for now I left it up to the end user (see the test for how it is done).
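
The test is not reproduced here, but as a minimal NumPy sketch of what tying the weights by hand amounts to (names and shapes are illustrative, not this PR's API): the decoder simply reuses the encoder's weight matrix transposed, so only one matrix is learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder/decoder pair with manually tied weights: both stages
# refer to the same matrix W, so any update to W on the encoder side is
# automatically reflected in the decoder.
W = rng.normal(scale=0.01, size=(784, 32))   # shared weight matrix
b_enc = np.zeros(32)                          # encoder bias
b_dec = np.zeros(784)                         # decoder bias

def encode(x):
    return np.tanh(x @ W + b_enc)             # hidden code

def decode(h):
    return h @ W.T + b_dec                    # reconstruction reuses W transposed

x = rng.random((8, 784))
x_hat = decode(encode(x))
mse = np.mean((x - x_hat) ** 2)               # reconstruction error to minimise
```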

Let me know what you think, and I am happy to write documentation for this.

fchollet (Member)

It would make much more sense to me to implement an autoencoder as a layer, rather than a model.

A layer that takes an input of arbitrary size and has a single weight matrix as its parameter. It would take the dot product with the weight matrix (projection), then the dot product with the transposed matrix (reconstruction), and return that.

For one, it solves the problem of weight symmetry/reuse. It also makes it easy to stack autoencoders.

I haven't added an Autoencoder layer so far because I've never had to use autoencoders. If you're interested, feel free to add one.
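
As a rough illustration of that proposal (a minimal NumPy sketch, not Keras code): one parameter matrix, a dot product for the projection and the transposed matrix for the reconstruction.

```python
import numpy as np

class TiedLinearAutoencoderLayer:
    """Illustrative layer with a single weight matrix: project with W,
    reconstruct with W transposed (weights are tied by construction)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(input_dim, hidden_dim))

    def forward(self, x):
        h = x @ self.W            # projection (encoding)
        return h @ self.W.T       # reconstruction (decoding)

layer = TiedLinearAutoencoderLayer(input_dim=784, hidden_dim=32)
x = np.random.default_rng(1).random((4, 784))
x_hat = layer.forward(x)          # same shape as x
```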

fchollet closed this May 25, 2015
phreeza (Contributor, Author) commented May 25, 2015

The way I understand it, stacking autoencoders works by training a first autoencoder, then training a second autoencoder using the hidden units of the first as its inputs. This is one reason why I think separating the encoder and the decoder makes sense. [1]
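
Concretely, the greedy layer-wise scheme from [1] looks roughly like this (a hypothetical sketch assuming a train_autoencoder(X, n_hidden) helper that fits one autoencoder and returns its encoding function; none of these names come from this PR):

```python
def train_stack(X, hidden_sizes, train_autoencoder):
    """Greedy layer-wise pretraining: each autoencoder is trained on the
    hidden codes produced by the previous one."""
    encoders = []
    inputs = X
    for n_hidden in hidden_sizes:
        encode = train_autoencoder(inputs, n_hidden)  # fit one autoencoder
        encoders.append(encode)
        inputs = encode(inputs)   # hidden units become the next stage's input
    return encoders
```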

The other is that this allows for richer autoencoders. A purely linear autoencoder like the one you described will just learn the principal components, but there are more options, such as denoising autoencoders, as implemented in the test. Really, the encoders and decoders can be any combination of layers you like; convolutional autoencoders can also be implemented in this framework.
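
For example, a denoising autoencoder only changes what the model is fed during training, not the encoder/decoder architecture (illustrative sketch):

```python
import numpy as np

def corrupt(X, drop_prob=0.3, seed=0):
    """Randomly zero out a fraction of the inputs (masking noise)."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) >= drop_prob
    return X * mask

# A denoising autoencoder is then trained on (corrupt(X), X) pairs: the
# corrupted input goes in, the clean input is the reconstruction target.
```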

Another reason why I think this should be a model in its own right is that it requires its own compilation and fitting stages before being reused in a bigger network.

[1] http://ufldl.stanford.edu/wiki/index.php/Stacked_Autoencoders

fchollet reopened this May 25, 2015
phreeza (Contributor, Author) commented Jun 1, 2015

Any more thoughts on this? I could write documentation, but I could also change things in the code if desired.

For what it's worth, autoencoders in Torch are built in a very similar way, but the autoencoder model lives in a separate package for unsupervised methods.

jramapuram (Contributor)

A better solution, given the structure of Keras, is to implement a base Autoencoder class that inherits from Layer. DAE and the like can then inherit from this, and we can add a get_hidden(train) call to get the hidden layer of the autoencoder. I am almost done writing something up.
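
Roughly, the class hierarchy being suggested (an illustrative sketch; get_hidden follows the comment above, everything else is a stand-in rather than the code in #180):

```python
class Layer:
    """Stand-in for the Keras Layer base class."""
    def get_output(self, train=False):
        raise NotImplementedError

class Autoencoder(Layer):
    """Base autoencoder layer: get_output returns the reconstruction,
    get_hidden exposes the learned code."""
    def get_hidden(self, train=False):
        raise NotImplementedError

class DenoisingAutoencoder(Autoencoder):
    """A DAE would only change how the input is corrupted before encoding."""
    pass
```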

jramapuram (Contributor)

See #180

fchollet (Member) commented Jun 1, 2015

One question: I don't understand your use of Merge. Could you clarify?

phreeza (Contributor, Author) commented Jun 1, 2015

In order to connect the encoder and the decoder, I had to construct a Sequential of the two. But because both of those are models here, I needed to wrap them as Layers, and that is what I used Merge for.

Another option would be to implement a separate Layer that only wraps a model, i.e. a container layer.
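
Roughly, such a container layer might look like this (an illustrative sketch of the idea, not the actual Keras internals or this PR's code):

```python
class ContainerLayer:
    """Wraps a whole model so it can be used as a single layer inside
    another Sequential (in Keras this would subclass Layer)."""

    def __init__(self, model):
        self.model = model
        # expose the wrapped model's parameters so the outer model can train them
        self.params = getattr(model, "params", [])

    def get_output(self, train=False):
        # delegate the forward pass to the wrapped model
        return self.model.get_output(train)
```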

fchollet closed this Jun 4, 2015
phreeza deleted the autoencoder branch on July 17, 2015