Support for custom weight init? #6813
I was looking for a way to set an identity-mapping weight init for convolution layers, but from what I can tell the only ways to do this are either to set the shape from the outside before init, or the pseudo-hack of extending OrthogonalDistribution, as other distributions don't get to see the shape.
Would there be any issues in supporting an IWeightInit which just gets to see the correctly shaped paramView (or paramView and shape)? I could take a stab at a PR if there aren't any reasons you don't want this.
I guess the implementations should be serializable in the same way as other network configs, but other than that it looks pretty straightforward.
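To make the proposal concrete, here is a rough sketch of what such an interface could look like. Everything in it is an assumption for illustration: the `IWeightInit` method signature, the `ConvIdentityInit` class, the `[outChannels, inChannels, kH, kW]` weight layout, and the use of a plain row-major `double[]` in place of the real paramView are all hypothetical and not existing API.

```java
import java.util.Arrays;

/** Hypothetical interface: the initializer sees both the shape and the parameter view. */
interface IWeightInit {
    /** Fills paramView (flattened, row-major) in place according to the given shape. */
    void init(long[] shape, double[] paramView);
}

/**
 * Hypothetical identity-mapping init for convolution weights.
 * Assumes weights are laid out as [outChannels, inChannels, kH, kW].
 */
final class ConvIdentityInit implements IWeightInit {
    @Override
    public void init(long[] shape, double[] paramView) {
        if (shape.length != 4) {
            throw new IllegalArgumentException("Expected a rank 4 shape, got rank " + shape.length);
        }
        int outC = (int) shape[0];
        int inC = (int) shape[1];
        int kH = (int) shape[2];
        int kW = (int) shape[3];
        if (outC != inC || kH % 2 == 0 || kW % 2 == 0) {
            throw new IllegalArgumentException("Identity init needs outC == inC and odd kernel sizes");
        }
        if (paramView.length != outC * inC * kH * kW) {
            throw new IllegalArgumentException("paramView length does not match shape");
        }

        // Zero everything, then put a 1 at the kernel center wherever
        // the output channel index equals the input channel index.
        Arrays.fill(paramView, 0.0);
        int centerH = kH / 2;
        int centerW = kW / 2;
        for (int c = 0; c < outC; c++) {
            int idx = ((c * inC + c) * kH + centerH) * kW + centerW;
            paramView[idx] = 1.0;
        }
    }
}
```

With "same" padding and stride 1, a layer initialized like this would pass its input through unchanged, which is the behaviour a distribution-based init cannot express without knowing the shape.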
Yeah, the only reason we don't have fully custom initializations (via an interface) is we haven't needed it internally, and there hasn't been much user demand... and specifying a distribution has been fine for most people who want custom inits.
I'd be in favor of adding a properly extensible/custom weight initialization interface - no real issues there other than it must be JSON serializable and any changes must be backward compatible for old serialized nets (depending on the design you go with, that would be trivial or a little challenging, but I can point you in the right direction in a PR review or whatever).
I'll give it a shot then!
Just so I don't waste a lot of time for nothing: do I need to build libnd4j for this?
I was hoping I could avoid this as I won't be touching anything but Java code (or will I?), but after spending some time with the cloned monorepo I'm not so sure anymore.
Assuming I manage to get going, I will look at how it was done for activation functions and try to copy that design (I assume they are also backwards compatible).
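A minimal sketch of the kind of design I have in mind for backward compatibility, assuming old configs store the init as an enum name: the legacy enum stays, and each constant maps to an implementation of the new interface. All names here (`IWeightInit`, `ZeroInit`, `OnesInit`, `LegacyWeightInit`, `fromLegacyName`) are hypothetical, and JSON handling is left out entirely.

```java
import java.util.Arrays;

/** Hypothetical new-style interface (same shape as proposed above in this issue). */
interface IWeightInit {
    void init(long[] shape, double[] paramView);
}

/** Hypothetical implementation: all weights zero. */
final class ZeroInit implements IWeightInit {
    @Override
    public void init(long[] shape, double[] paramView) {
        Arrays.fill(paramView, 0.0);
    }
}

/** Hypothetical implementation: all weights one. */
final class OnesInit implements IWeightInit {
    @Override
    public void init(long[] shape, double[] paramView) {
        Arrays.fill(paramView, 1.0);
    }
}

/**
 * Hypothetical legacy enum kept for old serialized configs.
 * Each constant maps to an implementation of the new interface.
 */
enum LegacyWeightInit {
    ZERO {
        @Override
        IWeightInit toFunction() {
            return new ZeroInit();
        }
    },
    ONES {
        @Override
        IWeightInit toFunction() {
            return new OnesInit();
        }
    };

    abstract IWeightInit toFunction();

    /** Resolves a name found in an old config to a new-style initializer. */
    static IWeightInit fromLegacyName(String name) {
        return valueOf(name).toFunction();
    }
}
```

The idea is that a deserializer encountering the old enum field would call something like `fromLegacyName`, while new configs would serialize the `IWeightInit` implementation directly.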