This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

2d atrous convolution and atrous depthwise convolution #794

Merged
merged 3 commits from the atrous-convolution-implementation branch
Mar 20, 2018

Conversation

oveddan
Contributor

oveddan commented Feb 28, 2018

This implements 2D atrous convolution and 2D atrous depthwise convolution, as described in Rethinking Atrous Convolution for Semantic Image Segmentation. Refer to the documentation for tf.nn.atrous_conv2d for an explanation of how atrous convolution works.

Atrous convolution is already available in TensorFlow through several methods, each taking a dilation parameter under a different name (for example, the rate argument of tf.nn.atrous_conv2d and tf.nn.depthwise_conv2d, and the dilation_rate argument of tf.nn.convolution).

This PR modifies the WebGL and CPU ops depthwiseConv2d and conv2d to take a dilation-rate parameter and perform atrous convolution whenever the rate's width or height is greater than 1. In its current state it is potentially a breaking change to the API for conv1d and conv2d, since it adds the positional argument dilations.
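As a rough usage sketch (illustrative only, not code from this PR; the argument ordering is still under discussion below and changes before merge):

    import * as dl from 'deeplearn';

    const x = dl.ones([1, 5, 5, 1]) as dl.Tensor4D;  // NHWC input
    const w = dl.ones([3, 3, 1, 1]) as dl.Tensor4D;  // 3x3 filter, 1 input / 1 output channel

    // Ordering as of this revision: strides, dilations, pad. With a dilation
    // rate of 2 the 3x3 filter covers an effective 5x5 window without adding
    // any extra weights.
    const y = dl.conv2d(x, w, 1, [2, 2], 'same');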



@oveddan
Contributor Author

oveddan commented Feb 28, 2018

This is not ready to merge yet; it has just been opened to get some feedback/review. Particular things I would like feedback on:

  • Both backend_cpu's conv2d and depthwiseConv2d were written inconsistently with their corresponding GPU methods, and I found it challenging to figure out how to apply the dilation parameters in the CPU code. I refactored those CPU methods to follow the same logic as their GPU counterparts, which made applying dilation much easier. I'm not sure whether the CPU methods differed by design for performance reasons.
  • For testing atrous convolution, a standard filter is converted into its dilated version by padding it with 0s between the values (see the sketch after this list). Atrous convolution is then performed with the standard filter, standard convolution is performed with the dilated filter, and the two results are compared for closeness. The advantage is that these tests demonstrate the effect of atrous convolution; the downside is that they are slightly inconsistent with the other tests.
  • The change to the APIs of conv1d and conv2d would be a breaking change, as it adds a positional argument, dilations, in the middle of the argument list. It may be better to add that parameter at the end, or to create a separate atrousConv2d method.
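A tiny sketch of the equivalence the tests rely on (my illustration of the 1-D case, not code from the PR):

    // A 3-tap filter [a, b, c] applied with dilation rate 2 behaves like the
    // zero-padded filter [a, 0, b, 0, c] applied with rate 1.
    const fSize = 3;
    const rate = 2;
    const filter = [0.5, 1.0, 1.5];
    const dilatedFilter = [filter[0], 0, filter[1], 0, filter[2]];
    const fSizeDilated = fSize + (fSize - 1) * (rate - 1);  // 5 === dilatedFilter.length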

Besides the requested feedback above, what still needs to be done in this PR:

  • Raise errors when doing backprop on convolution layers where dilation rates are greater than 1 (a possible shape for this check is sketched after this list).
  • If we stick with the breaking change, update all the models and examples in the repo to account for the new parameter.
  • Add more tests for depthwiseConv2d where dilation is greater than 1.
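A possible shape for the backprop guard mentioned in the first item (a hypothetical sketch; the name assertDilationsAreOne is my own, not from this PR):

    // Hypothetical helper: reject gradients through dilated convolutions for now.
    function assertDilationsAreOne(
        dilations: [number, number]|number, opName: string): void {
      const [dilationHeight, dilationWidth] =
          typeof dilations === 'number' ? [dilations, dilations] : dilations;
      if (dilationHeight > 1 || dilationWidth > 1) {
        throw new Error(
            `Error in gradient of ${opName}: dilation rates greater than 1 ` +
            `are not yet supported.`);
      }
    }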

@oveddan force-pushed the atrous-convolution-implementation branch from 67c169f to 8ace2ee on March 4, 2018 21:51
@dsmilkov
Contributor

dsmilkov commented Mar 6, 2018

Hi Dan,
This is an amazing PR. Replies below:

  • Don't worry much about the backend_cpu implementations. We mostly use them for sanity-checking our WebGL implementations. Refactoring is fine.
  • It's OK to test atrous convolution in a slightly different way.
  • Regarding the API, I left a comment below. We are OK with this being a breaking change, but have a request about the order of params and about adding dataFormat now.

Thanks again for the awesome PR.


Review status: 0 of 12 files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/conv.ts, line 129 at r2 (raw file):

  static conv2d<T extends Tensor3D|Tensor4D>(
      x: T, filter: Tensor4D, strides: [number, number]|number,
      dilations: [number, number]|number = [1, 1], pad: 'valid'|'same'|number,

Let's add dilations after padding, but before dimRoundingMode. Also can you add dataFormat = 'NHWC' before the dilations, so we don't break users again when we add data format later.

Basically the signature should be x, filter, strides, pad, data_format='NHWC', dilations=[1, 1].

Throw an error if dataFormat is not "NHWC" for now. We can add the functionality later.

Do the same for conv1d.
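As a call-site sketch of that ordering (illustrative only, assuming the signature lands as described above):

    const x = dl.ones([1, 5, 5, 1]) as dl.Tensor4D;
    const filter = dl.ones([3, 3, 1, 1]) as dl.Tensor4D;

    // Defaults: dataFormat = 'NHWC', dilations = [1, 1].
    const y1 = dl.conv2d(x, filter, 1, 'same');

    // Atrous convolution with rate 2, data format given explicitly.
    const y2 = dl.conv2d(x, filter, 1, 'same', 'NHWC', [2, 2]);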



@oveddan force-pushed the atrous-convolution-implementation branch from 8ace2ee to 321481d on March 7, 2018 15:10
@oveddan
Contributor Author

oveddan commented Mar 7, 2018

Regarding testing atrous convolution, I struggled to properly set up a test for depthwiseConv2d when dilation, inDepth, and chMul are all greater than 1.

What I'm trying to do is populate the weight tensor with 0s to represent dilation, first column-wise and then row-wise (the first two dimensions), but I can't figure out the proper layout.

Here is the test code:

    const fSize = 2;
    const pad = 'same';
    const stride = 1;
    const dilation = 2;
    const chMul = 2;
    const inDepth = 2;

    const x = dl.tensor4d(
        [
          0.675707, 0.758567, 0.413529, 0.963967, 0.217291, 0.101335, 0.804231,
          0.329673, 0.924503, 0.728742, 0.180217, 0.210459, 0.133869, 0.650827,
          0.047613, 0.554795, 0.653365, 0.442196
        ],
        [1, 3, 3, inDepth]);
    const w = dl.tensor4d(
        [
          0.347154, 0.386692, 0.327191, 0.483784, 0.591807, 0.24263, 0.95182,
          0.174353, 0.592136, 0.623469, 0.988244, 0.660731, 0.946534, 0.0801365,
          0.864889, 0.874602
        ],
        [fSize, fSize, inDepth, chMul]);
    // adding a dilation rate is equivalent to using a filter
    // with 0s for the dilation rate
    const fSizeDilated = fSize + (fSize - 1) * (dilation - 1);
    const wDilated = dl.tensor4d(
        [
          0.347154, 0, 0.386692,  0, 0, 0, 0.327191, 0, 0.483784,
          0.591807, 0, 0.24263,   0, 0, 0, 0.95182,  0, 0.174353,
          0.592136, 0, 0.623469,  0, 0, 0, 0.988244, 0, 0.660731,
          0.946534, 0, 0.0801365, 0, 0, 0, 0.864889, 0, 0.874602
        ],
        [fSizeDilated, fSizeDilated, inDepth, chMul],
    );

    const result = dl.depthwiseConv2d(x, w, stride, pad, dilation);

    const expectedResult = dl.depthwiseConv2d(x, wDilated, stride, pad);

    expect(result.shape).toEqual(expectedResult.shape);
    expectArraysClose(result, expectedResult);

This test fails because the filter does not get populated the right way; I'm not putting the 0s in the proper positions. How should I write something that populates it correctly? I've also tried something like:

const wDilated = dl.tensor4d(
        [
          [[
            [0.347154, 0, 0.386692, 0, 0, 0, 0.327191, 0, 0.483784],
            [0.591807, 0, 0.24263, 0, 0, 0, 0.95182, 0, 0.174353]
          ]],
          [[
            [0.592136, 0, 0.623469, 0, 0, 0, 0.988244, 0, 0.660731],
            [0.946534, 0, 0.0801365, 0, 0, 0, 0.864889, 0, 0.874602]
          ]]
        ],
        [fSizeDilated, fSizeDilated, inDepth, chMul],
    );

but I get the error: Error: Error creating a new Tensor. Inferred shape (2,1,2,9) does not match the provided shape (3,3,2,2). Shapes 3,3,2,2 and 2,1,2,9 must match


Review status: 0 of 10 files reviewed at latest revision, 1 unresolved discussion.


src/ops/conv.ts, line 129 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Let's add dilations after padding, but before dimRoundingMode. Also can you add dataFormat = 'NHWC' before the dilations, so we don't break users again when we add data format later.

Basically the signature should be x, filter, strides, pad, data_format='NHWC', dilations=[1, 1].

Throw an error if dataFormat is not "NHWC" for now. We can add the functionality later.

Do the same for conv1d.

Fixed in 25a532b.

The ordering here is slightly inconsistent, in order to match the ordering in TensorFlow:

  • In conv1d, as in tf.nn.conv1d, it's ordered stride, pad, dataFormat, dilation
  • In conv2d, as in tf.nn.conv2d, it's ordered strides, pad, dataFormat, dilations
  • In depthwiseConv2d, as in tf.nn.depthwise_conv2d, it's ordered strides, pad, dilations, dataFormat

Let me know if this is ok with you, or if you think it would be better to keep the ordering consistent within deeplearn.js.

For conv1d, it accepts NWC or NCW but throws an error if it's not NWC.

Also, TF doesn't have a dilation or rate parameter for tf.nn.conv1d, so I'm not sure whether we should get rid of the parameter this PR adds to that method.



@oveddan force-pushed the atrous-convolution-implementation branch from b715eb2 to a8b7a52 on March 7, 2018 22:35
@dsmilkov
Contributor

Sorry for the long delay. This is amazing work and I would love to get it checked in.

Regarding the test, it might be easier to populate the weights using a tensor buffer and buffer.set(value, ...coords):

const buf = dl.buffer<Rank.R4>([fSizeDilated, fSizeDilated, inDepth, chMul]);
buf.set(0.347154, 0, 0, 0, 0); // [0,0] in filter, 0 inDepth, 0 chMul
buf.set(0, 0, 1, 0, 0); // [0,1] in filter, 0 inDepth, 0 chMul
buf.set(0.386692, 0, 2, 0, 0); // [0, 2] in filter, 0 inDepth, 0 chMul
.....
const wDilated = buf.toTensor();
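To fill the whole dilated filter without writing each set call by hand, something like this could work (a sketch, not code from this PR; it assumes w is the original [fSize, fSize, inDepth, chMul] filter and dilation is the rate):

const dilatedBuf = dl.buffer<Rank.R4>([fSizeDilated, fSizeDilated, inDepth, chMul]);
const wValues = w.dataSync();
for (let i = 0; i < fSize; i++) {
  for (let j = 0; j < fSize; j++) {
    for (let d = 0; d < inDepth; d++) {
      for (let q = 0; q < chMul; q++) {
        // The original tap (i, j) lands at (i * dilation, j * dilation);
        // every other position keeps the buffer's default value of 0.
        const value = wValues[((i * fSize + j) * inDepth + d) * chMul + q];
        dilatedBuf.set(value, i * dilation, j * dilation, d, q);
      }
    }
  }
}
const wDilated = dilatedBuf.toTensor();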

Left 1 comment about the API. Thanks for summarizing the differences so nicely.


Reviewed 2 of 12 files at r1, 1 of 3 files at r2, 3 of 5 files at r4, 3 of 4 files at r5.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/conv.ts, line 129 at r2 (raw file):

Previously, oveddan (Dan Oved) wrote…

Fixed in 25a532b.

The ordering here is slightly inconsistent, in order to match the ordering in TensorFlow:

  • In conv1d, as in tf.nn.conv1d, it's ordered stride, pad, dataFormat, dilation
  • In conv2d, as in tf.nn.conv2d, it's ordered strides, pad, dataFormat, dilations
  • In depthwiseConv2d, as in tf.nn.depthwise_conv2d, it's ordered strides, pad, dilations, dataFormat

Let me know if this is ok with you, or if you think it would be better to keep the ordering consistent within deeplearn.js.

For conv1d, it accepts NWC or NCW but throws an error if it's not NWC.

Also, TF doesn't have a dilation or rate parameter for tf.nn.conv1d, so I'm not sure whether we should get rid of the parameter this PR adds to that method.

Thanks for the detailed list. Let's go with being consistent within deeplearn.js,
so have depthwiseConv2d take strides, pad, dataFormat, dilations.

conv1d accepting NWC or NCW sounds great.
Keep the dilation/rate parameter in conv1d in deeplearn.js to be consistent across the board.



* Updated backend gpu and cpu to do atrous convolution when there is a dilation rate.
* Refactored backend_cpu.conv2d to be consistent with backend_gpu.conv2d, and to make the way it performs dilated convolution consistent with the GPU implementation.
* Added tests for 1d and 2d convolution with dilation rates that show the effect on the filter when dilation rates are set.
* Updated computeDefaultPad to account for a dilation rate.

Implemented atrous convolution for depthwiseConv2D

* Modified cpu depthwiseConv2D logic to be similar to that on the GPU,
so that dilation can be easily applied.

Still need to add more tests for depthwise atrous convolution

Per feedback, changed the order of parameters to match the
TensorFlow API. Added a dataFormat parameter to conv1d, conv2d, and
depthwiseConv2d, but did not yet implement that parameter; it defaults
to a single value and cannot be set to a different value until the
functionality is implemented.

Raise an error when a gradient is computed through atrous convolution
@oveddan force-pushed the atrous-convolution-implementation branch from a8b7a52 to 281335f on March 19, 2018 02:34
@oveddan
Contributor Author

oveddan commented Mar 19, 2018

Review status: 5 of 9 files reviewed at latest revision, 1 unresolved discussion.


src/ops/conv.ts, line 129 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Thanks for the detailed list. Let's go with being consistent within deeplearn.js,
so have depthwiseConv2d take strides, pad, dataFormat, dilations.

conv1d accepting NWC or NCW sounds great.
Keep the dilation/rate parameter in conv1d in deeplearn.js to be consistent across the board.

Fixed in 281335f.



@dsmilkov
Contributor

This looks great. Let me know when the depthwise tests are added, and we'll be good to go. Thanks for the amazing work, Dan!


Reviewed 4 of 4 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved.



@oveddan
Contributor Author

oveddan commented Mar 20, 2018

So I added a couple of depthwiseConv2d tests for atrous convolution, but in a different way than you suggested. It would have been challenging to build up a 3x3x2x2 filter and then set each value one by one.

It was easier to build individual 2D filters with the desired values and then stack them:

  const w =
        dl.stack(
              [
                dl.tensor2d(
                    [0.614293, 0.0648011, 0.101113, 0.452887], [fSize, fSize]),
                dl.tensor2d(
                    [0.0582746, 0.426481, 0.872743, 0.765767], [fSize, fSize])
              ],
              2)
            .expandDims(3) as dl.Tensor4D;

    // adding a dilation rate is equivalent to using a filter
    // with 0s for the dilation rate
    const fSizeDilated = fSize + (fSize - 1) * (dilation - 1);
    const wDilated =
        dl.stack(
              [
                dl.tensor2d(
                    [0.614293, 0, 0.0648011, 0, 0, 0, 0.101113, 0, 0.452887],
                    [fSizeDilated, fSizeDilated]),
                dl.tensor2d(
                    [0.0582746, 0, 0.426481, 0, 0, 0, 0.872743, 0, 0.765767],
                    [fSizeDilated, fSizeDilated])
              ],
              2)
            .expandDims(3) as dl.Tensor4D;

I added a couple of tests that do this: one with a dilation of 2 and chMul of 1, and another with a dilation of 2 and chMul of 2 (resulting in a 3x3x2x2 dilated filter).
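For the chMul = 2 case, the construction could look something like this (a sketch with made-up values, not the exact test code):

    const fSize = 2;
    // One [fSize, fSize, inDepth] filter per output channel...
    const channel0 = dl.stack(
        [
          dl.tensor2d([1, 2, 3, 4], [fSize, fSize]),
          dl.tensor2d([5, 6, 7, 8], [fSize, fSize])
        ],
        2);
    const channel1 = dl.stack(
        [
          dl.tensor2d([9, 10, 11, 12], [fSize, fSize]),
          dl.tensor2d([13, 14, 15, 16], [fSize, fSize])
        ],
        2);
    // ...stacked on a new last axis to get [fSize, fSize, inDepth, chMul].
    const wChMul2 = dl.stack([channel0, channel1], 3) as dl.Tensor4D;  // shape [2, 2, 2, 2]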


Review status: 8 of 9 files reviewed at latest revision, all discussions resolved.



@dsmilkov
Contributor

Strong LGTM! That's really smart. This looks ready to submit. Thanks again for the amazing work!


Reviewed 1 of 1 files at r7.
Review status: all files reviewed at latest revision, all discussions resolved.


