mlp with pretraining #3281

Open · IssamLaradji wants to merge 7 commits into scikit-learn:master from IssamLaradji:mlp-with-pretraining

@IssamLaradji (Contributor) commented Jun 15, 2014

This extends the generic multi-layer perceptron (MLP) #3204 with a pre-training capability.

An example file, mlp_with_pretraining.py, is added to show how pre-training with RBMs is done and how it can improve the MLP's performance on the digits dataset.

I got the following results:

  1. Testing accuracy of the MLP without pretraining: 0.964
  2. Testing accuracy of the MLP with pretraining: 0.978
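
For readers of the example, here is a minimal sketch of the comparison it runs, assuming the MultilayerPerceptronClassifier and warm-start attributes from the parent #3204 branch (not a released scikit-learn API) and the 2014-era sklearn.cross_validation import, and following the pre-training pattern discussed later in this thread; exact scores depend on the random seed and the split.

from sklearn.datasets import load_digits
from sklearn.cross_validation import train_test_split
from sklearn.neural_network import BernoulliRBM, MultilayerPerceptronClassifier
from sklearn.pipeline import make_pipeline

digits = load_digits()
X, y = digits.data / 16.0, digits.target  # scale pixel values to [0, 1] for the Bernoulli RBM
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_hidden = [40]  # one hidden layer of 40 units

# Baseline: MLP trained from a random initialization.
mlp = MultilayerPerceptronClassifier(n_hidden=n_hidden, random_state=0)
mlp.fit(X_train, y_train)
print("without pretraining: %.3f" % mlp.score(X_test, y_test))

# Pre-training: greedily fit a stack of RBMs and reuse their weights as the MLP's
# initial parameters (same assignment as in the comments below).
dbn = make_pipeline(
    BernoulliRBM(n_components=40, learning_rate=0.01, n_iter=100, random_state=0),
    BernoulliRBM(n_components=10, learning_rate=0.01, n_iter=100, random_state=0),
).fit(X_train)

mlp_pre = MultilayerPerceptronClassifier(n_hidden=n_hidden, random_state=0, warm_start=True)
mlp_pre.layers_coef_ = [rbm.components_ for _, rbm in dbn.steps]
mlp_pre.layers_intercept_ = [rbm.intercept_hidden_ for _, rbm in dbn.steps]
mlp_pre.fit(X_train, y_train)
print("with pretraining: %.3f" % mlp_pre.score(X_test, y_test))
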

@coveralls commented Jun 15, 2014

Coverage increased (+0.07%) when pulling e10d2d6 on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@IssamLaradji referenced this pull request on Jun 16, 2014: [MRG] Generic multi layer perceptron #3204 (Closed)

@ogrisel (Member) commented Jun 18, 2014

You can make a pipeline with make_pipeline instead of calling fit_transform on the first RBM and passing the result to fit on the second:

dbn = make_pipeline(
    BernoulliRBM(n_components=40, random_state=random_state,
                 learning_rate=0.01, n_iter=100),
    BernoulliRBM(n_components=10, random_state=random_state,
                 learning_rate=0.01, n_iter=100),
).fit(X_train)

Also, true pre-training is done only on X_train rather than on the full X.
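
For reference, a manual equivalent of what the pipeline call replaces (a sketch; variable names follow the example):

# Greedy layer-wise training done by hand: fit_transform the first RBM, then
# fit the second RBM on the first one's hidden representation.
rbm1 = BernoulliRBM(n_components=40, random_state=random_state,
                    learning_rate=0.01, n_iter=100)
H = rbm1.fit_transform(X_train)
rbm2 = BernoulliRBM(n_components=10, random_state=random_state,
                    learning_rate=0.01, n_iter=100).fit(H)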

@ogrisel (Member) commented Jun 18, 2014

Also, please put a global random_state=0 at the beginning and use it as the argument everywhere. You will see that pre-training does not always help when you try several times with random_state=None. I think this might be caused by the fact that we have very few samples to train the RBMs. It should be explicitly stressed in the docstring that pre-training does not seem to help in this case.
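
A sketch of the suggested convention (estimator names follow the example in this PR):

random_state = 0  # single global seed, reused by every estimator in the example

rbm = BernoulliRBM(n_components=40, random_state=random_state,
                   learning_rate=0.01, n_iter=100)
mlp = MultilayerPerceptronClassifier(n_hidden=n_hidden, random_state=random_state)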

@ogrisel (Member) commented Jun 18, 2014

@IssamLaradji could you work on the newly created #3289 independently? I think it should be pretty quick to merge and it can be useful in all the MLP + digits examples.

@IssamLaradji (Contributor) commented Jun 24, 2014

Hi @ogrisel, I updated the code to train the RBMs using make_pipeline. Is this the right, concise way to initialize the MLP? It looks quite generic, but one thing is hard-coded: the RBMs are initialized manually. Maybe I could create the RBMs dynamically based on the number of layers and their numbers of neurons. What do you think?

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state)

# Set warm start to true
mlp.warm_start = True

# Train RBMs
dbn = make_pipeline(
    BernoulliRBM(n_components=n_hidden, random_state=random_state,
                 learning_rate=0.001, n_iter=50),
    BernoulliRBM(n_components=10, random_state=random_state,
                 learning_rate=0.001, n_iter=50),
).fit(X_train)

coefficients = [param[1].components_ for param in dbn.steps]
intercepts = [param[1].intercept_hidden_ for param in dbn.steps]

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = coefficients
mlp.layers_intercept_ = intercepts

Thanks.

@coveralls commented Jun 24, 2014

Coverage increased (+0.08%) when pulling 70d8f8c on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@ogrisel (Member) commented Jun 24, 2014

coefficients = [param[1].components_ for param in dbn.steps]
intercepts = [param[1].intercept_hidden_ for param in dbn.steps]

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = coefficients
mlp.layers_intercept_ = intercepts

could be directly written:

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = [param[1].components_ for param in dbn.steps]
mlp.layers_intercept_ = [param[1].intercept_hidden_ for param in dbn.steps]

@ogrisel (Member) commented Jun 24, 2014

And:

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state)

# Set warm start to true
mlp.warm_start = True

could be written:

mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state, warm_start=True)

@IssamLaradji (Contributor) commented Jun 24, 2014

Fixed, thanks :)

@coveralls commented Jun 24, 2014

Coverage increased (+0.08%) when pulling 408e0af on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@IssamLaradji (Contributor) commented Jun 25, 2014

Here is a dynamic, generic way of constructing and training the RBMs. I am using Pipeline because it lets me construct the estimators as a list; this might be preferable to make_pipeline unless make_pipeline accepts a list of estimators as input. What do you think?

# Cross-validate multi-layer perceptron with RBM pre-training
mlp = MultilayerPerceptronClassifier(
    n_hidden=n_hidden, random_state=random_state, warm_start=True)

# Construct RBMs
layer_sizes = n_hidden + [n_output]
estimators = [('rbm' + str(i), BernoulliRBM(n_components=layer_sizes[i],
                random_state=random_state, learning_rate=0.001, n_iter=50))
              for i in range(len(layer_sizes))]

# Train RBMs
dbn = Pipeline(estimators).fit(X_train)

# Assign initial coefficients and intercepts with rbm's parameters
mlp.layers_coef_ = [param[1].components_ for param in dbn.steps]
mlp.layers_intercept_ = [param[1].intercept_hidden_ for param in dbn.steps]
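
A note on the make_pipeline question above: make_pipeline takes the estimators as positional arguments and names the steps automatically, so a dynamically built list can simply be unpacked into it, e.g. (a sketch using the same variables):

# Equivalent construction with make_pipeline(*steps): no manual step names needed.
rbms = [BernoulliRBM(n_components=size, random_state=random_state,
                     learning_rate=0.001, n_iter=50)
        for size in n_hidden + [n_output]]
dbn = make_pipeline(*rbms).fit(X_train)
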

@IssamLaradji (Contributor) commented Jun 25, 2014

In addition, I could add a list parameter_list whose ith entry defines the parameters of the ith RBM. The user could fill the list with parameters using the keyword-argument syntax, **kwargs.

@IssamLaradji (Contributor) commented Jun 26, 2014

I think parameter_list is overkill. I believe the user should only need to set the estimators list manually, which can be given as input to the warm_start parameter; the rest should be automated.

@GaelVaroquaux (Member) commented Jun 28, 2014

The user can fill the list with the parameters using the keyword argument syntax, **kwargs.

I personally strive to avoid such function signatures. They make code that is not self-documenting (the function signature is not helpful for understanding the function) and has weak error-catching behavior (a typo in an argument name will not raise an error message in the right part of the codebase, and in the worst case will be silently ignored). Finally, they are not great for further extension of the API, as it often happens in the evolution of an API that the **kwargs get routed to several sub-routines.

If you really need a fairly generic recipient for arguments to initialize the objects, I'd rather have a dictionary. But in general I favor passing an object, rather than a dictionary of parameters used to initialize the corresponding object.
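
A minimal sketch of the contrast (the helper names here are hypothetical, not part of the PR):

from sklearn.base import clone
from sklearn.neural_network import BernoulliRBM

# **kwargs-style: the signature documents nothing, and a typo such as learning_rte=0.01
# only fails (if at all) deep inside whatever finally consumes the keyword arguments.
def pretraining_layer(**rbm_params):
    return BernoulliRBM(**rbm_params)

# Object-style: the caller passes a configured estimator whose parameters are explicit,
# validated by the estimator itself, and discoverable via get_params().
def pretraining_layer_from(estimator):
    return clone(estimator)

layer = pretraining_layer_from(
    BernoulliRBM(n_components=40, learning_rate=0.01, n_iter=100))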

@IssamLaradji (Contributor) commented Jun 29, 2014

@GaelVaroquaux you are absolutely right; I realized that a while after I suggested **kwargs. It is too low-level for the user. Like you said, passing an object is best, since the user wouldn't have to dig too deep to learn the syntax for initializing the pre-training parameters.

@IssamLaradji (Contributor) commented Jul 12, 2014

So, for pre-training, I allowed warm_start to accept a list of unsupervised learning objects, such as RBMs (please see the example file, mlp_with_pretraining.py). I also added a method _pretraining to the MLP that trains the list of RBMs and initializes the corresponding layer weights. This might be a clean implementation.

Thanks.
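
For illustration, a hypothetical usage sketch of that interface, assuming warm_start accepts the list of RBMs as described above (class and attribute names follow this branch and are not a released scikit-learn API):

# Each RBM pre-trains one layer; the _pretraining step fits them greedily on X_train
# and copies their weights and hidden biases into layers_coef_ / layers_intercept_.
rbms = [BernoulliRBM(n_components=size, random_state=random_state,
                     learning_rate=0.001, n_iter=50)
        for size in n_hidden + [n_output]]

mlp = MultilayerPerceptronClassifier(n_hidden=n_hidden, random_state=random_state,
                                     warm_start=rbms)
mlp.fit(X_train, y_train)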

@coveralls commented Jul 12, 2014

Coverage increased (+0.07%) when pulling 7cd49c4 on IssamLaradji:mlp-with-pretraining into 31f2e07 on scikit-learn:master.

@coveralls commented Aug 16, 2014

Coverage increased (+0.06%) when pulling 46ae255 on IssamLaradji:mlp-with-pretraining into 6d8ccbc on scikit-learn:master.

@coveralls commented Aug 17, 2014

Coverage increased (+0.06%) when pulling 4261df1 on IssamLaradji:mlp-with-pretraining into 6d8ccbc on scikit-learn:master.

@ogrisel (Member) commented Aug 21, 2014

This PR creates the examples/neural_network folder while an examples/neural_networks folder already exists. Please reuse the existing folder instead.

@ogrisel (Member) commented Aug 21, 2014

Actually, this should be fixed in the parent PR, #3204.
