
Update DepthwiseConvolution2DLayer.java #6885

Closed
wants to merge 0 commits into from

Conversation

@sorech commented Dec 17, 2018

Interpret weight size dimensions correctly (like other conv layers).

@raver119 (Contributor)

Please explain this change.

@sorech (Author) commented Dec 17, 2018

The standard in other conv layers is:
int outDepth = (int) weights.size(0);
int inDepth = (int) weights.size(1);
int kH = (int) weights.size(2);
int kW = (int) weights.size(3);
Here, depthMultiplier replaces outDepth.
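
(For illustration, a hedged sketch of what that convention would mean for the depthwise layer; the names mirror the snippet above and are not necessarily the actual patch:)

int depthMultiplier = (int) weights.size(0); // replaces outDepth
int inDepth = (int) weights.size(1);
int kH = (int) weights.size(2);
int kW = (int) weights.size(3);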

Running it without this change, it throws an error claiming to want an input depth equal to whatever you set as the kernel height.

@maxpumperla (Contributor)

@sorech I appreciate the effort, but then you should go all the way. The weights come in a certain order, see here:

val weightsShape = new long[] {kernel[0], kernel[1], inputDepth, depthMultiplier};

which was chosen like this since it reflects our C++ impl:

auto weights = INPUT_VARIABLE(1); // [kH, kW, iC, mC] always

Now, unless you want to change everything down to that level (including tests, because what you did right now will break this layer), I'd suggest not doing it. The weights aren't magically in the order you want them to be :D
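
(A hedged illustration of the mismatch: if weights did arrive in [depthMultiplier, inputDepth, kH, kW] order, an ND4J permute along these lines would map them to the order the C++ op expects; a sketch, not code from the patch:)

import org.nd4j.linalg.api.ndarray.INDArray;

// hypothetical: reorder [mC, iC, kH, kW] -> [kH, kW, iC, mC]
INDArray permuted = weights.permute(2, 3, 1, 0);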

@sorech (Author) commented Dec 17, 2018

That all sounds great, but it does not work with the shape of the weight matrix I get using nn.conf.DepthwiseConvolution2D, which in turn calls nn.params.DepthwiseConvolutionParamInitializer.

The conf is:
depthMultiplier = 5
kernelSize = [4, 2]
nIn = 7
and the resulting weight tensor has shapeInformation = [4,5,7,4,2,56,8,2,1,0,1,99], i.e. rank 4 with shape [5, 7, 4, 2].
That maps poorly to the expected {kernel[0], kernel[1], inputDepth, depthMultiplier} order, which here would be [4, 2, 7, 5].
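
(For reference, a minimal sketch of that configuration, assuming the standard DL4J builder API; exact method names are from memory and may differ slightly:)

import org.deeplearning4j.nn.conf.layers.DepthwiseConvolution2D;

DepthwiseConvolution2D layer = new DepthwiseConvolution2D.Builder()
        .kernelSize(4, 2)        // kH = 4, kW = 2
        .nIn(7)                  // input depth
        .depthMultiplier(5)
        .build();
// resulting weight shape: [5, 7, 4, 2] = [depthMultiplier, nIn, kH, kW]
// expected by the native op: [4, 2, 7, 5] = [kH, kW, nIn, depthMultiplier]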

Should the fix rather go in DepthwiseConvolutionParamInitializer?

@AlexDBlack (Contributor)

@sorech Maybe start with a test that reproduces the issue?
Given we already have tests for this layer (that are passing), the problem you are running into might not be exactly what it seems...
https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/gradientcheck/CNNGradientCheckTest.java#L1184-L1250
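
(A hedged sketch of a minimal reproduction along those lines; it assumes MultiLayerNetwork.init(INDArray, boolean), which re-initializes from existing flattened params and thereby exercises the initializeParams = false path discussed below:)

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.DepthwiseConvolution2D;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .list()
        .layer(new DepthwiseConvolution2D.Builder()
                .kernelSize(4, 2).nIn(7).depthMultiplier(5)
                .build())
        .setInputType(InputType.convolutional(8, 8, 7))
        .build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();                        // initializeParams = true path

MultiLayerNetwork net2 = new MultiLayerNetwork(conf.clone());
net2.init(net.params(), true);     // initializeParams = false (reshape) path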

@sorech (Author) commented Dec 20, 2018

Sorry for the delay, been busy. That test uses a different code path (initializeParams = true) in DepthwiseConvolutionParamInitializer:

if (initializeParams) {
    // path used by the gradient-check test: weights freshly initialized
    // in [kH, kW, inputDepth, depthMultiplier] order
    Distribution dist = Distributions.createDistribution(layerConf.getDist());
    int[] kernel = layerConf.getKernelSize();
    int[] stride = layerConf.getStride();
    val inputDepth = layerConf.getNIn();
    double fanIn = inputDepth * kernel[0] * kernel[1];
    double fanOut = depthMultiplier * kernel[0] * kernel[1] / ((double) stride[0] * stride[1]);
    val weightsShape = new long[] {kernel[0], kernel[1], inputDepth, depthMultiplier};
    return WeightInitUtil.initWeights(fanIn, fanOut, weightsShape, layerConf.getWeightInit(), dist, 'c',
            weightView);
} else {
    // path hit when params are supplied externally: note the different
    // [depthMultiplier, nIn, kH, kW] reshape order
    int[] kernel = layerConf.getKernelSize();
    return WeightInitUtil.reshapeWeights(
            new long[] {depthMultiplier, layerConf.getNIn(), kernel[0], kernel[1]}, weightView, 'c');
}

My code uses initializeParams = false, so the shape is given by line 209 (the else branch) instead of line 202, which is what the test exercises.

As you can see, line 209 uses {depthMultiplier, layerConf.getNIn(), kernel[0], kernel[1]}
instead of {kernel[0], kernel[1], inputDepth, depthMultiplier}.
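
(A hedged sketch of the fix that follows from this, making the reshape branch use the same order as the init branch; an illustration, not the merged change:)

} else {
    int[] kernel = layerConf.getKernelSize();
    // hypothetical fix: match the [kH, kW, nIn, depthMultiplier] order
    // used by the initializeParams = true branch
    return WeightInitUtil.reshapeWeights(
            new long[] {kernel[0], kernel[1], layerConf.getNIn(), depthMultiplier}, weightView, 'c');
}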

@raver119 (Contributor)

Ah, so that's what should be fixed then :)

@sorech (Author) commented Dec 20, 2018

@raver119 Will you take care of it?

@maxpumperla (Contributor)

@sorech @raver119 I can take care of it. Cheers.

@maxpumperla (Contributor)

@sorech maybe file an issue for this and I'll start a new PR, ok?

@zhangy10

@maxpumperla Hi, do you have any DepthwiseConvolution2D examples? I am looking for how to use DepthwiseConvolution2D to build a MobileNet (see open issue #8423).

So, is it possible to build up a MobileNet using DepthwiseConvolution2D? Any examples would be great. Thanks.

@maxpumperla (Contributor)

@zhangy10 unfortunately we don't have any advanced examples for this layer, but all gradient tests pass and so does Keras model import. The layer can be used pretty much the same way as many other conv 2d layers. If your goal is to replicate a MobileNet architecture, that shouldn't cause you more problems than other components of this network.
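
(To illustrate, a minimal, untested sketch of one depthwise-separable block in the MobileNet style, assuming the standard DL4J layer builders; channel counts are placeholders:)

import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DepthwiseConvolution2D;
import org.nd4j.linalg.activations.Activation;

// depthwise 3x3 conv followed by a pointwise 1x1 conv
NeuralNetConfiguration.ListBuilder block = new NeuralNetConfiguration.Builder()
        .list()
        .layer(new DepthwiseConvolution2D.Builder()
                .kernelSize(3, 3).stride(1, 1)
                .nIn(32)                    // placeholder input depth
                .depthMultiplier(1)
                .activation(Activation.RELU)
                .build())
        .layer(new ConvolutionLayer.Builder()
                .kernelSize(1, 1).stride(1, 1)
                .nOut(64)                   // placeholder output channels
                .activation(Activation.RELU)
                .build());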

Is there anything in particular that you're struggling with? What have you tried so far?

@zhangy10

@AlexDBlack Hi Alex, do you have any examples showing how to use DepthwiseConvolution2D in the latest release version? And is it good for building a simple MobileNet?

Many thanks.

@AlexDBlack
Copy link
Contributor

@zhangy10 No examples, because it should be possible to use a depthwise convolution layer anywhere you use a "normal" convolution layer. We have unit tests though:
https://github.com/eclipse/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/gradientcheck/CNNGradientCheckTest.java#L1178-L1246

As for whether you can use it in a MobileNet architecture - I've never tried it, but I don't see why not. Whether it's better than standard conv2d layers is something you would have to test.

@zhangy10
Copy link

@AlexDBlack Thanks for that, I will have a look at the tests.
