Currently, the L1/L2 regularization coefficients are fixed values.

From gitter, @stolsvik:

What I see is that when using a schedule for the lr, and the lr becomes very low and stays there, the L2 term seems to "overwhelm" the parameters (another observation is that the parameters:updates ratio goes up, and that the network "collapses"). I've speculated that the result is basically that all of the weights get zeroed out.
Though fixed L1/L2 is most common in practice, that's a reasonable observation, and I can see how that could occur in some cases. Adding L1/L2 schedules would provide extra flexibility for situations like this.
(Side note: it would be nice to report in the UI the separate contributions of the loss function and the L2 term.)
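For illustration, such a schedule could follow the same pattern as a learning-rate schedule: the regularization strength becomes a function of the training iteration rather than a fixed double. The interface and class names below are hypothetical, a minimal sketch of the idea rather than the actual DL4J API.

```java
// Hypothetical sketch of a scheduled L2 coefficient -- NOT the actual DL4J API.
// The idea: instead of a fixed double, the regularization strength is a
// function of the training iteration (mirroring how lr schedules work).
interface CoefficientSchedule {
    double valueAt(int iteration);
}

// Example: exponential decay of the L2 coefficient, so that it shrinks
// along with a decaying learning rate instead of overwhelming it.
class ExponentialDecaySchedule implements CoefficientSchedule {
    private final double initialValue;
    private final double decayRate;

    ExponentialDecaySchedule(double initialValue, double decayRate) {
        this.initialValue = initialValue;
        this.decayRate = decayRate;
    }

    @Override
    public double valueAt(int iteration) {
        return initialValue * Math.pow(decayRate, iteration);
    }
}

class ScheduleDemo {
    public static void main(String[] args) {
        CoefficientSchedule l2Schedule = new ExponentialDecaySchedule(1e-4, 0.999);
        for (int i = 0; i <= 5000; i += 1000) {
            System.out.printf("iteration %5d: l2 = %.3e%n", i, l2Schedule.valueAt(i));
        }
    }
}
```

With something along these lines, the L2 coefficient could be decayed on the same schedule as the lr, keeping the two terms in proportion.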
Nice. But I also have a question of whether DL4J actually does this wrong. As far as I understand, the more common setup is that L2 is affected by the learning rate: the L2 decay term is added "inside the parentheses", i.e. to the gradient, before the whole thing is multiplied by the lr, so it is scaled proportionally with the lr. But since you apply the L2 correction as a separate step, the L2 effect is fixed regardless of the lr. (It just hit me: if the lr goes below the L2 coefficient, you'd actually negate the gradient?!)
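To make the distinction concrete, here is a minimal numeric sketch (plain Java, not DL4J code) of the two update rules being discussed: a "coupled" form where the L2 gradient is added to the loss gradient before the learning rate is applied, and a "decoupled" form where the decay is a separate step that the lr does not scale. The constants are made up purely for illustration.

```java
// Minimal numeric sketch of the two L2/weight-decay variants discussed above.
// This is NOT DL4J's actual implementation -- just plain SGD on a single weight,
// to show how the two forms behave when the learning rate becomes very small.
class WeightDecayComparison {

    // "Coupled" L2: the decay gradient is added to the loss gradient
    // *before* multiplying by the learning rate, so its effect shrinks
    // proportionally as the lr is scheduled down.
    static double coupledStep(double w, double lossGrad, double lr, double l2) {
        return w - lr * (lossGrad + l2 * w);
    }

    // "Decoupled" decay: the loss gradient is scaled by the lr, but the
    // decay is applied as a separate, fixed-size step that the lr does not touch.
    static double decoupledStep(double w, double lossGrad, double lr, double l2) {
        return w - lr * lossGrad - l2 * w;
    }

    public static void main(String[] args) {
        double w = 1.0;
        double lossGrad = -0.5;   // loss gradient pushing the weight upwards
        double l2 = 1e-3;

        for (double lr : new double[]{1e-1, 1e-2, 1e-3, 1e-4, 1e-5}) {
            double coupled = coupledStep(w, lossGrad, lr, l2) - w;
            double decoupled = decoupledStep(w, lossGrad, lr, l2) - w;
            System.out.printf("lr=%.0e  coupled update=%+.2e  decoupled update=%+.2e%n",
                    lr, coupled, decoupled);
        }
        // Once lr * |lossGrad| drops below l2 * w (here around lr = 2e-3), the
        // decoupled step flips sign: the fixed decay term outweighs what the
        // loss gradient asks for, while the coupled step keeps its direction.
    }
}
```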
Here's a comment from another issue, which raises the same question: #5843 (comment)
It might be interesting to read a couple of comments upstream of that one as well.
Aha! Link: https://skymindai.aha.io/features/ND4J-37