
Support for autoencoders in greedy layer-wise pre-training #50

Closed
wants to merge 7 commits into aigamedev:master from leconteur:master

Conversation

leconteur

This pull request presents a prototype to include autoencoder support in scikit-neuralnetwork. It does not include tests, but it does include a working example.

@coveralls

Coverage Status

Coverage decreased (-11.96%) to 88.04% when pulling 345a28d on leconteur:master into 61dca5c on aigamedev:master.

@alexjc
Member

alexjc commented May 15, 2015

OK, this is looking promising! I got it to work with minor changes after merging the latest code.

I'd suggest we iterate on the external API first. How does this look to you:

    # Setup as normal, with some additional parameters.
    nn = mlp.Classifier(
        layers=[
            L("Rectifier", units=32, pretrain_type='denoising', pretrain_corruption=0.5)],
        n_iter=100,
    )

    # Creates unsupervised trainer automatically, copies over weights when done.
    nn.pretrain(X_digits, layers=1)

    # Works as normal, no changes.
    nn.fit(X_digits, y_digits)

Things like tied_weights should be on by default, no? Also, act_enc and act_dec should always be the same, and match the activation used in the original layer, no?

Thoughts welcome!

@ssamot
Contributor

ssamot commented May 16, 2015

My two cents:

Autoencoders can be used to compress/denoise stuff, a nice "transform" operation in sklearn terms. The transformed output (top layer inputs * weights) can then be thrown to any classifier/regressor to learn on top of it, which should be pretty cool. So I think it has to be a separate module/class (or whatever), and if you want to mix with the MLPs you can do as in the example; if not, you can use them, say, in a pipeline with any classifier/regressor on top, which should be nice. I am not sure the MLP class should have any knowledge of this.
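
Roughly what I mean, as a self-contained sketch (the EncoderTransform class here is purely illustrative, not sknn or pylearn2 code):

    # Minimal sketch (illustrative only): an encoder exposing the sklearn
    # transformer interface so it can be stacked under any classifier in a
    # Pipeline. A real autoencoder would learn self.weights_ by minimising
    # reconstruction error; fit() below only illustrates the contract.
    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.pipeline import Pipeline
    from sklearn.linear_model import LogisticRegression

    class EncoderTransform(BaseEstimator, TransformerMixin):
        def __init__(self, n_components=32):
            self.n_components = n_components

        def fit(self, X, y=None):
            # Placeholder for unsupervised training; just random weights here.
            rng = np.random.RandomState(0)
            self.weights_ = rng.normal(size=(X.shape[1], self.n_components))
            return self

        def transform(self, X):
            # "Top layer inputs * weights", squashed through the activation.
            return np.tanh(np.dot(X, self.weights_))

    pipeline = Pipeline([('encode', EncoderTransform(n_components=16)),
                         ('clf', LogisticRegression())])
    # pipeline.fit(X_digits, y_digits) then works like any other estimator.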

@alexjc
Member

alexjc commented May 17, 2015

The problem I noticed is that autoencoders only seem to support sigmoid and tanh activations, so other activations may benefit much less from pre-training, and could even give worse results?

@alexjc
Member

alexjc commented May 17, 2015

Based on @ssamot's comment, I started an autoencoder branch to add support for AEs first; then we can figure out the pre-training.

alexjc added a commit that referenced this pull request May 17, 2015
@leconteur
Author

I agree with @ssamot that the autoencoder should implement the transform interface of sklearn. However, I also think it is a good idea for the mlp class to accept a PretrainedLayer in its constructor.

The transform could probably be implemented by using the "encode" method of the pylearn2 autoencoder class that forms the last layer of the network.
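
Roughly like this (just a sketch; it assumes pylearn2's symbolic encode() method and compiles it into a callable with Theano):

    # Sketch: wrap pylearn2's symbolic encode() into a numeric transform().
    # Assumes `autoencoder` is a trained pylearn2 autoencoder instance.
    import theano
    import theano.tensor as T

    def make_transform(autoencoder):
        X_sym = T.matrix('X')
        encode_fn = theano.function([X_sym], autoencoder.encode(X_sym))

        def transform(X):
            # Encode inputs using the autoencoder's learned weights.
            return encode_fn(X)
        return transform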

@alexjc
Member

alexjc commented May 18, 2015

I'm considering it now... One problem I see currently with PretrainedLayers is that we'd have to support serialization separately from the way we do now, since they are not "regular" layers that had their weights copied from elsewhere.

@leconteur
Author

I don't think we need serialization at first. However, all that is needed for a pretrained layer is to pass it the pylearn2 layer in its constructor.

The reason I think this is important is that it facilitates fine-tuning a layer that could have been trained in a variety of ways. Pretrained layers are also already implemented in pylearn2.

If you do want to implement them, I think it should be done in a way similar to this: the autoencoder should have a method that returns a list of its autoencoder layers wrapped in an sknn layer with a type of 'pretrained'. The mlp should then have a condition in its create_layer method that calls the right pylearn2 constructor.
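
Something along these lines (a hypothetical sketch; the 'pretrained' layer type and the pylearn2_layer attribute are made up for illustration, only pylearn2's PretrainedLayer is an existing class):

    # Hypothetical sketch of the dispatch described above. PretrainedLayer is
    # pylearn2's wrapper for reusing an already-trained model as an MLP layer;
    # the 'pretrained' type and `pylearn2_layer` attribute are illustrative.
    from pylearn2.models.mlp import PretrainedLayer

    def create_layer(name, layer):
        if layer.type == 'pretrained':
            # Reuse the already-trained pylearn2 autoencoder for fine-tuning.
            return PretrainedLayer(layer_name=name,
                                   layer_content=layer.pylearn2_layer)
        # ...otherwise construct a fresh pylearn2 layer as before.
        raise NotImplementedError("only the 'pretrained' case is sketched")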

I do understand, however, that the use case where I need this feature does not map cleanly onto the scikit-learn API and its usual usage. The main problem I have is that this kind of algorithm is hard to fit into the scikit-learn pipeline paradigm.

@coveralls

Coverage Status

Coverage decreased (-13.06%) to 86.94% when pulling 8eb245c on leconteur:master into d221c57 on aigamedev:master.

@alexjc
Member

alexjc commented May 19, 2015

About the features in the auto-encoder, do your final neural networks also use sigmoid and tanh? ReLU doesn't seem to be supported out of the box, but could be added... otherwise it seems like you'd be better off training a full MLP in an unsupervised style.

@leconteur
Author

I don't think I'll need ReLU activation. My use case is very similar to the example I pushed, except that the pretraining is done on another dataset.

Sorry about the other changes; I seem to have misunderstood some details about pull requests.

@alexjc
Member

alexjc commented May 19, 2015

No problem about the Pull Request. All changes in that branch are automatically posted.

We won't merge this PR since it contains your IDE files too :-)

@alexjc
Member

alexjc commented May 22, 2015

Closing this since there are lots of secondary files that we don't want merged. Continuing discussion in #35; also see recent commit [8a9701a].

alexjc closed this May 22, 2015
