
Learning rate multipliers for convolutional and dense layers #3004

Closed

Conversation

@jrhupc commented Jun 17, 2016

I have updated the pull request #1991 to the latest master branch. The pull request adds functionality to provide learning rate multipliers for convolutional and dense layers (see issue #414).

@tetmin commented Jul 27, 2016

Has this been implemented?

@fchollet (Member)

Looking at this now. Two things come to mind:

  • multipliers should be handled on a per-layer basis, not on a per-weight basis, and should be abstracted into the Layer class. That will minimize the amount of changes to the codebase (you only need to add support in one place, not in every layer).
  • at the optimizer level, it should be handled like we handle constraints. That is to say, multipliers should be a dictionary mapping weights to coefficients.

Also, avoid unnecessary abbreviations that make code harder to read, such as "lr_mult".
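
To illustrate the dictionary-based design suggested above, here is a minimal, hypothetical sketch (Keras 1.x-era optimizer signature; the multipliers argument and its wiring are assumptions for illustration, not code from this PR):

# Hypothetical sketch: `multipliers` maps weight tensors to float coefficients,
# gathered from the layers at compile time, analogous to the constraints dict.
def get_updates(self, params, constraints, loss, multipliers=None):
    multipliers = multipliers or {}
    grads = self.get_gradients(loss, params)
    self.updates = []
    for p, g in zip(params, grads):
        # scale the base learning rate per weight; default multiplier is 1.0
        effective_lr = self.lr * multipliers.get(p, 1.0)
        new_p = p - effective_lr * g
        # apply the weight's constraint, if any (Keras 1.x convention)
        if p in constraints:
            new_p = constraints[p](new_p)
        self.updates.append((p, new_p))
    return self.updates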

@jrhupc (Author) commented Jul 30, 2016

I have updated the pull request to follow the suggestions. Now learning rate multipliers are a dictionary mapping weights to coefficients similar to constraints. It sure makes more sense and leaves the optimizer code cleaner.

Some functionality has been moved to the Layer base class, but some changes in each layer class are still needed (following the constraints implementation, the dictionary of weights to coefficients has to be created in the child class). Is there a better way to move more code into the base Layer class?

Furthermore, I have temporarily removed TensorFlow from some of the test code, as setting a manual random seed is needed and TensorFlow ignores np.random.seed(). Is there a random seed setter for the TensorFlow backend?
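
For reference (not part of this PR), the TensorFlow backend of that era kept its own graph-level seed separate from NumPy's, so tests typically had to seed both; a minimal sketch using the TF 1.x API:

import numpy as np
import tensorflow as tf

np.random.seed(1337)        # seeds NumPy, used by Keras weight initializers
tf.set_random_seed(1337)    # graph-level seed for the TensorFlow backend (TF 1.x API)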

@FlorianImagia commented Aug 23, 2016

This feature seems to be ready (there are now conflicts, as it was done a month ago).
Is there still something blocking it?

@@ -178,7 +197,9 @@ def get_config(self):
                  'b_constraint': self.b_constraint.get_config() if self.b_constraint else None,
                  'bias': self.bias,
                  'input_dim': self.input_dim,
-                 'input_length': self.input_length}
+                 'input_length': self.input_length,
+                 'W_learning_rate_multiplier': self.W_learning_rate_multiplier if self.W_learning_rate_multiplier else None,


Hi Javier, Avanti here. I'm merging your adaptation of our code back into the kundajelab fork - thanks again for doing this. I had a minor thought: is the ifelse in this line really necessary, since self.W_learning_rate_multiplier is either None or a value (i.e. wouldn't it be equivalent to "'W_learning_rate_multiplier': self.W_learning_rate_multiplier")?

@jrhupc (Author)

Hi Avanti,

I wanted to follow the same structure as constraints, and that's why I added the if-else. However, you might be right; it does seem unnecessary.

v = self.momentum * m - lr * g  # velocity
# Apply learning rate multipliers if needed
if p in multipliers:
    lrm = K.variable(multipliers[p])


Avanti here again (author of the learning rate multipliers implementation on the kundajelab branch that this was based on). Is the K.variable wrapping really necessary? K.variable returns a shared variable, which I understand is only necessary for trainable parameters.

@jrhupc (Author) commented Sep 16, 2016

As lr is a K.variable, I thought it was safer to multiply lrm by lr with both being K.variables... I wasn't sure whether Keras was really happy with a plain number times a K.variable.
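
As a small illustration of the point being discussed (assuming a Keras 1.x/2.x backend), tensors returned by the backend broadcast against plain Python floats, so the extra K.variable wrapping should not be needed:

from keras import backend as K

lr = K.variable(0.01)
g = K.ones((3,))
step = 0.1 * lr * g           # a plain float times a backend variable works fine
print(K.eval(step))           # -> [0.001 0.001 0.001]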

@igorbb commented Oct 12, 2016

Hi.

Sorry to ask/disturb, but is there any idea when this pull request will be reviewed for merging?

Haven't had time to look at it yet, sorry. But I will.

@dnola commented Oct 17, 2016

I too have been using this code with success; it is pretty important for reproducing a lot of the other reference models out there. I think it would be a great addition to the main branch.

@DingKe (Contributor) commented Nov 21, 2016

Any update?

@albertbuchard

+1 :)

4 similar comments
@nymph332088

+1 :)

@aurora95

+1 :)

@jmhessel (Contributor) commented Feb 4, 2017

+1 :)

@meetps commented Feb 5, 2017

+1 :)

@delchiaro commented Feb 9, 2017

I really need this feature, so I merged this pull-request branch with the latest Keras commit to date (commits of Feb 7, 2017) and resolved the conflicts manually.

I made a fork of Keras with my changes in the branch keras-lrmult-implementation.
I could open a new pull request, but it is probably more correct to push the changes to this pull request (I only resolved some conflicts; the real work was done by the author of this pull request).

Running the tests locally, I got some failures, but I got the same with the untouched Keras master branch.

@yushuinanrong

+1:)

@Tutufa commented Feb 15, 2017

up

@fchollet (Member)

Closing outdated PR. If you still care about the content of the PR, please submit a new PR to master, updated for the Keras 2.0 API.

@fchollet closed this Mar 15, 2017
kencoken added a commit to kencoken/keras that referenced this pull request Apr 30, 2017
@yinghuang

+1:)

@farzaa commented Jun 28, 2017

Any update on this, or is there perhaps another way (using the current version of Keras) to do the same thing?

@gsabran commented Jul 17, 2017

I've shared a design review doc before making a new PR for the 2.0 API: https://docs.google.com/document/d/1l4k811Mxz1fIIzyw7-nOVMLkBN6a7bRcmW6EuIW0cc8/edit#. Stay tuned :)

@gsabran commented Jul 17, 2017

If you have comments or references on why this has been proven to be useful, please comment on the Google doc!

@sachinruk (Contributor)

Has this PR been accepted, or has something similar been merged instead?

@gsabran commented Sep 22, 2017

Last time I checked (see the response to the proposal by @fchollet), there was no significant research showing this is beneficial. @meetshah1995 pointed to some work using LR multipliers, and the conversation has not moved on from there.

@hellojialee

Has this not been accepted in Keras 2.x?
I can't find it.

@HuangBo-Terraloupe

I also cannot find it. Can someone explain it, or maybe give an example of how to use it, if layer-wise learning rates have been merged?

@brunoklein99

What is pending exactly for this to get merged?

@jmhessel (Contributor)

I don't think there are any plans to have this merged, because there is no PR compatible with Keras 2.x at the moment.

@andreapi87 commented Jun 5, 2018

Hi! Any news?
If I wanted to use it, how could I do so? I have the latest version of Keras.

@brunoklein99

@envytails I did my own version, which was enough for the implementation I was trying to achieve.

https://github.com/brunoklein99/srcnn/blob/5e874eb161d4d27cfdb6ac9b2196b3ad154fc672/LRMultiplierSGD.py#L46
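
For readers landing on this thread later, here is a minimal sketch of the same idea against the Keras 2.x optimizer API. It is illustrative only, not this PR's implementation or the linked file; the class name LRMultiplierSGD, the name-substring matching, and the omission of decay/Nesterov handling are all simplifying assumptions:

from keras import backend as K
from keras.optimizers import SGD

class LRMultiplierSGD(SGD):
    """SGD with per-weight learning rate multipliers (illustrative sketch).

    `multipliers` maps a substring of a weight's name (e.g. a layer name)
    to a float; weights matching no key use the base learning rate.
    """

    def __init__(self, multipliers=None, **kwargs):
        super(LRMultiplierSGD, self).__init__(**kwargs)
        self.multipliers = multipliers or {}

    def _multiplier_for(self, param):
        for key, mult in self.multipliers.items():
            if key in param.name:
                return mult
        return 1.0

    def get_updates(self, loss, params):
        # Plain momentum SGD; decay and Nesterov are omitted for brevity.
        grads = self.get_gradients(loss, params)
        self.updates = [K.update_add(self.iterations, 1)]
        moments = [K.zeros(K.int_shape(p), dtype=K.dtype(p)) for p in params]
        self.weights = [self.iterations] + moments
        for p, g, m in zip(params, grads, moments):
            scaled_lr = self.lr * self._multiplier_for(p)  # per-weight learning rate
            v = self.momentum * m - scaled_lr * g          # velocity
            self.updates.append(K.update(m, v))
            new_p = p + v
            if getattr(p, 'constraint', None) is not None:
                new_p = p.constraint(new_p)
            self.updates.append(K.update(p, new_p))
        return self.updates

# usage: model.compile(optimizer=LRMultiplierSGD(lr=0.01, momentum=0.9,
#                                                multipliers={'dense_1': 0.1}),
#                      loss='mse')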
