
mlp with pretraining #3281

Open

IssamLaradji wants to merge 7 commits into scikit-learn:master from IssamLaradji:mlp-with-pretraining

4 participants

@IssamLaradji

This extends the generic multi-layer perceptron (MLP) #3204 with a pre-training capability.

An example file, mlp_with_pretraining.py, is added to show how pre-training with an RBM is done and how it can improve the MLP's performance on the digits dataset.

I got the following results:
1) Testing accuracy of MLP without pre-training: 0.964
2) Testing accuracy of MLP with pre-training: 0.978

@coveralls

Coverage Status

Coverage increased (+0.07%) when pulling e10d2d6 on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@IssamLaradji referenced this pull request:

[MRG] Generic multi layer perceptron #3204 (Open, 3 of 4 tasks complete)
@ogrisel
Owner

You can build a pipeline with make_pipeline instead of calling fit_transform on the first RBM and passing the result to fit on the second.

dbn = make_pipeline(
    BernoulliRBM(n_components=40, random_state=random_state,
                 learning_rate=0.01, n_iter=100),
    BernoulliRBM(n_components=10, random_state=random_state,
                 learning_rate=0.01, n_iter=100),
).fit(X_train)

Also, true pre-training is done only on X_train rather than on the full X.
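
For reference, a minimal sketch of that setup, assuming the digits data as in the example file (train_test_split is imported from sklearn.model_selection here; in the 2014 code base it lived in sklearn.cross_validation):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import make_pipeline

random_state = 0
digits = load_digits()
X = digits.data / 16.0  # scale pixel values to [0, 1] for the Bernoulli RBMs
y = digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=random_state)

# pre-train on the training split only, not on the full X
dbn = make_pipeline(
    BernoulliRBM(n_components=40, random_state=random_state,
                 learning_rate=0.01, n_iter=100),
    BernoulliRBM(n_components=10, random_state=random_state,
                 learning_rate=0.01, n_iter=100),
).fit(X_train)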

@ogrisel
Owner

Also, please put a global random_state=0 at the beginning and use it as the argument everywhere. You will see that pre-training does not always help when you try several times with random_state=None. I think this might be caused by the fact that we have very few samples to train the RBMs. It should be explicitly stressed in the docstring that pre-training does not seem to help in this case.
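
For reference, a minimal sketch of that convention (MultilayerPerceptronClassifier and its n_hidden parameter come from the parent PR #3204, not from a released scikit-learn; n_hidden = 40 below is just an illustrative value):

random_state = 0  # single global seed, passed to every estimator below;
                  # set it to None and re-run to see that pre-training does not always win
n_hidden = 40     # illustrative; the example file defines its own n_hidden

rbm = BernoulliRBM(n_components=n_hidden, random_state=random_state,
                   learning_rate=0.01, n_iter=100)
mlp = MultilayerPerceptronClassifier(n_hidden=n_hidden,
                                     random_state=random_state)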

@ogrisel
Owner

@IssamLaradji could you work on the newly created #3289 independently? I think it should be pretty quick to merge and could be useful in all the MLP + digits examples.

@IssamLaradji

Hi @ogrisel, I updated the code to train the RBMs using make_pipeline. Is this the right, concise way to initialize the MLP? It looks quite generic, but one thing is hard-coded: the RBMs are manually initialized. Maybe I could dynamically create the RBMs based on the number of layers and their number of neurons. What do you think?

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state)

# Set warm start to true
mlp.warm_start = True

# Train RBMs
dbn = make_pipeline(
    BernoulliRBM(n_components=n_hidden, random_state=random_state,
                 learning_rate=0.001, n_iter=50),
    BernoulliRBM(n_components=10, random_state=random_state,
                 learning_rate=0.001, n_iter=50),
).fit(X_train)

coefficients = [param[1].components_ for param in dbn.steps]
intercepts = [param[1].intercept_hidden_ for param in dbn.steps]

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = coefficients
mlp.layers_intercept_ = intercepts

Thanks.

@coveralls

Coverage Status

Coverage increased (+0.08%) when pulling 70d8f8c on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@ogrisel
Owner
coefficients = [param[1].components_ for param in dbn.steps]
intercepts = [param[1].intercept_hidden_ for param in dbn.steps]

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = coefficients
mlp.layers_intercept_ = intercepts

could be directly written:

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = [param[1].components_ for param in dbn.steps]
mlp.layers_intercept_ = [param[1].intercept_hidden_ for param in dbn.steps]
@ogrisel
Owner

And:

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state)

# Set warm start to true
mlp.warm_start = True

could be written:

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state, warm_start=True)
@IssamLaradji

Fixed, thanks :)

@coveralls

Coverage Status

Coverage increased (+0.08%) when pulling 408e0af on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@IssamLaradji

Here is a dynamic, generic way of constructing and training the RBMs. I am using Pipeline because it lets me construct the estimators as a list; this might be better than make_pipeline, unless make_pipeline supports a list of estimators as input. What do you think?

# Cross-validate multi-layer perceptron with RBM pre-training
mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state, warm_start=True)

# Construct RBMs
layer_sizes = n_hidden + [n_output]
estimators = [('rbm' + str(i),
               BernoulliRBM(n_components=layer_sizes[i],
                            random_state=random_state,
                            learning_rate=0.001, n_iter=50))
              for i in range(len(layer_sizes))]

# Train RBMs
dbn = Pipeline(estimators).fit(X_train)

# Assign initial coefficients and intercepts with the RBMs' parameters
mlp.layers_coef_ = [param[1].components_ for param in dbn.steps]
mlp.layers_intercept_ = [param[1].intercept_hidden_ for param in dbn.steps]
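
For reference, make_pipeline does accept a variable number of estimators as positional arguments, so the same list can be unpacked with * and the step names are generated automatically; a minimal sketch under the names used above:

rbms = [BernoulliRBM(n_components=size, random_state=random_state,
                     learning_rate=0.001, n_iter=50)
        for size in layer_sizes]
dbn = make_pipeline(*rbms).fit(X_train)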
@IssamLaradji

In addition, I could add a list, parameter_list, whose i-th entry defines the parameters of the i-th RBM. The user could fill the list with parameter dicts that are expanded with the keyword-argument syntax, **kwargs.
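
A hypothetical sketch of that idea (parameter_list and the dict contents are illustrative only, not part of the PR):

parameter_list = [
    {'learning_rate': 0.001, 'n_iter': 50},
    {'learning_rate': 0.001, 'n_iter': 50},
]
estimators = [('rbm' + str(i),
               BernoulliRBM(n_components=size, random_state=random_state,
                            **params))
              for i, (size, params) in enumerate(zip(layer_sizes, parameter_list))]
dbn = Pipeline(estimators).fit(X_train)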

@IssamLaradji

I think the parameter_list is overkill. I believe the user should only need to set the estimators list manually, which can be given as input to the warm_start parameter. The rest should be automated.

examples/neural_network/mlp_with_pretraining.py
@@ -0,0 +1,68 @@
+"""
+=======================================================================
+Pre-training Multi-layer Perceptron using Restricted Boltzmann Machines
+=======================================================================
+
+This compares the performance of a multi-layer perceptron (MLP) with and without
+pre-training using Restricted Boltzmann Machines. Without pre-training, the MLP
+initializes its coefficient and intercept parameters using a scaled random
+distribution. With pre-training, an RBM is trained on the dataset and the
+resulting parameters are given to the MLP as initial coefficient and intercept
+parameters. This example supports the hypothesis that pre-training allows the
+MLP to converge to a better local minimum.
@ogrisel Owner
ogrisel added a note

Have you tried changing the random_state to other values? Is pre-training always beneficial on this dataset?

I would add a remark that pre-training can be beneficial when the training set is small, which is the case here, but it is generally considered useless when the (labeled) training set grows large.

@ogrisel, in 7 out of 10 random states, MLP with pre-training performed better, so it is not always beneficial.

I changed the wording to state that the larger the labeled training set grows, the less beneficial pre-training becomes. Moreover, it can even result in worse performance than if the weights were randomly initialized.

@GaelVaroquaux
@IssamLaradji

@GaelVaroquaux you are absolutely right. I realized that a while after I suggested **kwargs; it is too low level for the user. As you said, passing an object is best, since the user wouldn't have to dig too deep to know the syntax for initializing the pre-training parameters.

@IssamLaradji

So, for pre-training, I allowed warm_start to accept a list of unsupervised learning objects, such as RBMs (please see the example file, mlp_with_pretraining.py). I also added a _pretraining method to the MLP that trains the RBM list, which initializes the corresponding layer weights. This might be a clean implementation (a rough usage sketch follows below).

Thanks.
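
For reference, a rough sketch of the proposed usage (MultilayerPerceptronClassifier, the list-valued warm_start and the _pretraining step come from this PR and its parent #3204, so the exact behaviour is defined by the example file, not by released scikit-learn):

# unsupervised models to pre-train, one per layer
rbms = [BernoulliRBM(n_components=size, random_state=random_state,
                     learning_rate=0.001, n_iter=50)
        for size in layer_sizes]

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state, warm_start=rbms)
# fit() would call _pretraining on the RBM list and use the learned weights
# to initialize the corresponding MLP layers before supervised training
mlp.fit(X_train, y_train)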

@coveralls

Coverage Status

Coverage increased (+0.07%) when pulling 7cd49c4 on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@coveralls

Coverage Status

Coverage increased (+0.06%) when pulling 46ae255 on IssamLaradji:mlp-with-pretraining into 6d8ccbc on scikit-learn:master.

@IssamLaradji IssamLaradji reopened this

@coveralls

Coverage Status

Coverage increased (+0.06%) when pulling 4261df1 on IssamLaradji:mlp-with-pretraining into 6d8ccbc on scikit-learn:master.

@ogrisel
Owner

This PR creates the examples/neural_network folder while an examples/neural_networks folder already exists. Please reuse the existing folder instead.

@ogrisel
Owner

> This PR creates the examples/neural_network folder while an examples/neural_networks folder already exists. Please reuse the existing folder instead.

Actually this should be fixed in the parent #3204 PR.

Commits on Aug 16, 2014
  1. @IssamLaradji

    (WIP) Added Multi-layer perceptron (MLP)

    IssamLaradji authored
    Seeking to finalize MLP
  2. @IssamLaradji

    Fixed some typos

    IssamLaradji authored
  3. @IssamLaradji

    squashed

    IssamLaradji authored
  4. @IssamLaradji

    updates

    IssamLaradji authored
  5. @IssamLaradji

    updates

    IssamLaradji authored
  6. @IssamLaradji

    bug fix

    IssamLaradji authored
Commits on Aug 17, 2014
  1. @IssamLaradji

    doc update

    IssamLaradji authored