Support for custom weight init? #6813
Comments
OK, I just realized that extending/implementing Distributions doesn't work either, as they come from Distributions.createDistribution. Is this an old sin, or is there some reason for the extra effort of not allowing custom distributions?
Yeah, the only reason we don't have fully custom initializations (via an interface) is that we haven't needed it internally and there hasn't been much user demand; specifying a distribution has been fine for most people who want custom inits. I'd be in favor of adding a properly extensible/custom weight initialization interface. There are no real issues there, other than that it must be JSON serializable and any changes must be backward compatible for old serialized nets. Depending on the design you go with, that would be trivial or a little challenging, but I can point you in the right direction in a PR review or whatever.
I'll give it a shot then! Just so I don't waste a lot of time for nothing: do I need to build libnd4j for this? I was hoping I could avoid it, as I won't be touching anything but Java code (or will I?), but after spending some time with the cloned monorepo I'm not so sure anymore. Assuming I manage to get going, I will look at how it was done for activation functions and try to copy that design (I assume they are also backwards compatible).
@DrChainsaw if you're just modifying/building DL4J (using Maven), ND4J snapshots should be pulled in automatically from Sonatype, including the native libraries (i.e., libnd4j).
Implemented and merged some time ago here: #6820
I was looking for a way to set an identity mapping weight init for convolution layers, but from what I can tell the only ways to do this are either to set in the shape from the outside before init, or the pseudo-hack of extending OrthogonalDistribution, as other distributions don't get to see the shape.
Would there be any issues in supporting an IWeightInit which just gets to see the correctly shaped paramView (or paramView and shape)? I could take a stab at a PR if there aren't any reasons you don't want this.
I guess the implementations should be serializable in the same way as other network configs, but other than that it looks pretty straightforward.
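To make the proposal concrete, here is a minimal sketch of what a shape-aware initializer could look like. Everything in it is an assumption for illustration: `WeightInitSketch` and `ConvIdentity` are hypothetical names, not DL4J's API, a plain `double[]` stands in for the flattened paramView, and the weights are assumed to be laid out as [nOut, nIn, kH, kW] in row-major order.

```java
/**
 * Hypothetical stand-in for the proposed interface (not DL4J's actual API).
 * The initializer sees both the parameter shape and a flat view of the
 * parameters, which it fills in place.
 */
interface WeightInitSketch {
    void init(long[] shape, double[] paramView);

    /**
     * Identity mapping for convolution weights: channel c of the output
     * copies channel c of the input. Assumes shape [nOut, nIn, kH, kW],
     * row-major order, and odd kernel sizes so a center element exists.
     */
    final class ConvIdentity implements WeightInitSketch {
        @Override
        public void init(long[] shape, double[] paramView) {
            if (shape.length != 4) {
                throw new IllegalArgumentException("Expected a 4D shape [nOut, nIn, kH, kW]");
            }
            int nOut = (int) shape[0];
            int nIn = (int) shape[1];
            int kH = (int) shape[2];
            int kW = (int) shape[3];
            if (kH % 2 == 0 || kW % 2 == 0) {
                throw new IllegalArgumentException("Kernel sizes must be odd");
            }
            if (paramView.length != nOut * nIn * kH * kW) {
                throw new IllegalArgumentException("paramView length does not match shape");
            }

            // Zero everything, then put a single 1 at the kernel center
            // of each (c, c) output/input channel pair.
            java.util.Arrays.fill(paramView, 0.0);
            int channels = Math.min(nOut, nIn);
            for (int c = 0; c < channels; c++) {
                int index = ((c * nIn + c) * kH + kH / 2) * kW + kW / 2;
                paramView[index] = 1.0;
            }
        }
    }
}
```

A call such as `new WeightInitSketch.ConvIdentity().init(new long[] {2, 2, 3, 3}, view)` would fill a 36-element view in place. The sketch uses a named class rather than a lambda because a named class is easier to serialize alongside the rest of the network config.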