Update DepthwiseConvolution2DLayer.java #6885
Please explain this change.
This follows the standard in other conv layers. Running it without this change throws an error claiming to want input depth equal to whatever you set as kernel height.
@sorech I appreciate the effort, but then you should go all the way. The weights come in a certain order, see here: Line 202 in 6bef4d5
That order was chosen because it reflects our C++ impl:
Now, unless you want to change everything down to that level (including tests, because what you did right now will break this layer), I'd suggest not doing it. The weights aren't magically in the order you want them to be :D.
That all sounds great, but it does not work with the shape of the weight matrix I get using nn.conf.DepthwiseConvolution2D, which in turn calls nn.params.DepthwiseConvolutionParamInitializer. The conf is: Should the fix rather go in DepthwiseConvolutionParamInitializer?
@sorech Maybe start with a test that reproduces the issue?
Sorry for the delay, been busy. That test uses a different code path (initializeParams = true), in lines 192 to 211 in 6bef4d5.
My code uses initializeParams = false, so the shapes are given by line 209 instead of line 202, which is the one the test exercises. As you can see, line 209 uses {depthMultiplier, layerConf.getNIn(), kernel[0], kernel[1]}.
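To make the mismatch described above concrete, here is a minimal plain-Java sketch with no DL4J dependency. It assumes, as the reported error suggests but the thread does not state outright, that the layer reads the input depth from index 2 of the weight shape, i.e. that it expects a [kernelHeight, kernelWidth, nIn, depthMultiplier] layout. The class and method names are hypothetical and exist only for illustration.

```java
public class DepthwiseShapeSketch {

    // ASSUMPTION: layout the layer expects, [kH, kW, nIn, depthMultiplier].
    public static long[] assumedLayerLayout(long kH, long kW, long nIn, long depthMultiplier) {
        return new long[] {kH, kW, nIn, depthMultiplier};
    }

    // Layout quoted in the thread for the initializeParams = false path:
    // {depthMultiplier, nIn, kernel[0], kernel[1]}.
    public static long[] reshapePathLayout(long kH, long kW, long nIn, long depthMultiplier) {
        return new long[] {depthMultiplier, nIn, kH, kW};
    }

    // ASSUMPTION: the layer takes the input depth from dimension 2 of the weights.
    public static long inputDepthSeenByLayer(long[] weightShape) {
        return weightShape[2];
    }

    public static void main(String[] args) {
        long kH = 5, kW = 3, nIn = 16, depthMultiplier = 2;

        long fromAssumed = inputDepthSeenByLayer(assumedLayerLayout(kH, kW, nIn, depthMultiplier));
        long fromReshape = inputDepthSeenByLayer(reshapePathLayout(kH, kW, nIn, depthMultiplier));

        System.out.println("input depth read from assumed layout: " + fromAssumed);
        System.out.println("input depth read from reshape layout: " + fromReshape);
    }
}
```

With a kernel height of 5 and nIn of 16, the second layout makes the layer see an input depth of 5, which matches the symptom of the error asking for an input depth equal to the kernel height.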
Ah, so that's what should be fixed then :)
@raver119 Will you take care of it?
@sorech maybe file an issue for this and I start a new PR, ok?
@maxpumperla Hi, do you have any DepthwiseConvolution2D examples? I am looking for how to use DepthwiseConvolution2D to build a MobileNet (see open issue #8423). Is it possible to build a MobileNet using DepthwiseConvolution2D? Any examples would be great. Thanks.
@zhangy10 unfortunately we don't have any advanced examples for this layer, but all gradient tests pass and so does Keras model import. The layer can be used pretty much the same way as many other conv 2d layers. If your goal is to replicate a MobileNet architecture, that shouldn't cause you more problems than other components of this network. Is there anything in particular that you're struggling with? What have you tried so far?
@AlexDBlack Hi Alex, do you have any examples showing how to use DepthwiseConvolution2D in the latest release version? And is it suitable for building a simple MobileNet? Many thanks.
@zhangy10 No examples, because it should be possible to use a depthwise convolution layer anywhere you use a "normal" convolution layer. We have unit tests though: As for whether you can use it in a MobileNet architecture, I've never tried it, but I don't see why not. Whether it's better than standard conv2d layers is something you would have to test.
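As a rough aid for that comparison, the sketch below counts weights for a standard convolution versus a depthwise convolution followed by a 1x1 pointwise convolution, the pairing MobileNet-style networks use. It is plain Java with no DL4J dependency, it ignores biases, and the class and method names are hypothetical. It says nothing about accuracy or runtime, which still need to be measured.

```java
public class ConvParamCountSketch {

    // Standard conv: one kH x kW filter per (input channel, output channel) pair.
    public static long standardConvWeights(long kH, long kW, long nIn, long nOut) {
        return kH * kW * nIn * nOut;
    }

    // Depthwise conv: depthMultiplier filters of size kH x kW per input channel.
    public static long depthwiseWeights(long kH, long kW, long nIn, long depthMultiplier) {
        return kH * kW * nIn * depthMultiplier;
    }

    // Depthwise conv followed by a 1x1 pointwise conv that mixes the
    // nIn * depthMultiplier intermediate channels into nOut channels.
    public static long depthwiseSeparableWeights(long kH, long kW, long nIn,
                                                 long depthMultiplier, long nOut) {
        long pointwise = nIn * depthMultiplier * nOut;
        return depthwiseWeights(kH, kW, nIn, depthMultiplier) + pointwise;
    }

    public static void main(String[] args) {
        long kH = 3, kW = 3, nIn = 32, nOut = 64, depthMultiplier = 1;
        System.out.println("standard conv weights:       "
                + standardConvWeights(kH, kW, nIn, nOut));
        System.out.println("depthwise separable weights: "
                + depthwiseSeparableWeights(kH, kW, nIn, depthMultiplier, nOut));
    }
}
```

For a 3x3 kernel with 32 input and 64 output channels and a depth multiplier of 1, this gives 18432 weights for the standard layer and 2336 for the depthwise plus pointwise pair.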
@AlexDBlack Thanks for that, I will have a look at the tests.
interpret weight size dimensions correctly (like other conv layers)