DL4J (/SameDiff): Add L2/L1 regularization schedules #7076
Currently, the L1/L2 regularization coefficients are fixed values.
From gitter, @stolsvik
Though fixed L1/L2 is most common in practice, that's a reasonable observation, and I can see how that could occur in some cases. Adding L1/L2 schedules would provide extra flexibility for situations like this.
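To make the request concrete, here is a minimal sketch of what a regularization schedule could look like: the L2 coefficient becomes a function of the training iteration instead of a constant. The function name `l2_step_schedule` and its parameters are hypothetical and are not part of any existing DL4J or SameDiff API; a step decay is just one possible shape.

```python
def l2_step_schedule(initial, decay_rate, step_size, iteration):
    """Hypothetical step schedule for an L2 coefficient.

    The coefficient is multiplied by decay_rate once every step_size
    iterations. All names here are illustrative assumptions.
    """
    return initial * decay_rate ** (iteration // step_size)


if __name__ == "__main__":
    # Print the coefficient at a few iterations to see the decay steps.
    for it in (0, 99, 100, 250):
        print(it, l2_step_schedule(0.01, 0.5, 100, it))
```

A fixed coefficient is then the special case `decay_rate = 1.0`, so a schedule of this kind would not change behavior for existing configurations.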
Aha! Link: https://skymindai.aha.io/features/ND4J-37
Nice. But I also wonder whether dl4j actually does this wrong. The more common setup for l2, AFAIU, is that the l2-decay term sits "within the parentheses" and is multiplied by lr together with the gradient, so it is affected proportionally by lr. But since you do the l2-correction as a separate step, you have fixed the l2-effect. (It just hit me: If the lr goes below the l2, you'd actually negate the gradient?!)
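The two variants described in this comment can be compared with a small scalar sketch. It follows the commenter's description only and is not taken from the DL4J source; the function names and the numbers are illustrative assumptions.

```python
def step_coupled(w, grad, lr, l2):
    """L2 term "within the parentheses": scaled by lr along with the gradient."""
    return w - lr * (grad + l2 * w)


def step_separate(w, grad, lr, l2):
    """L2 correction applied as its own step, not scaled by lr."""
    return w - lr * grad - l2 * w


if __name__ == "__main__":
    # Illustrative values: the gradient alone would push w upward.
    w, grad, l2 = 1.0, -0.5, 0.01
    for lr in (0.1, 0.001):
        print(lr, step_coupled(w, grad, lr, l2), step_separate(w, grad, lr, l2))
```

With these values the gradient alone would increase `w`. In the coupled form `w` still increases at both learning rates. In the separate form, once `lr * |grad|` drops below `l2 * w`, the fixed decay outweighs the gradient step and `w` decreases instead. Whether that happens depends on the weight and gradient magnitudes as well as on lr and l2, not on lr and l2 alone.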
Here's a comment from another issue, which raises the same question: