This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

2d atrous convolution and atrous depthwise convolution #794

Merged
merged 3 commits from the atrous-convolution-implementation branch
Mar 20, 2018

Conversation

oveddan
Contributor

oveddan commented Feb 28, 2018

This implements 2D atrous convolution and 2D atrous depthwise convolution, as described in Rethinking Atrous Convolution for Semantic Image Segmentation. Refer to the documentation for tf.nn.atrous_conv2d for an explanation of how atrous convolution works.

Atrous convolution is already available in TensorFlow through several methods, each taking a dilation parameter under a different name (for example, the rate argument of tf.nn.atrous_conv2d and tf.nn.depthwise_conv2d, and the dilation_rate argument of tf.nn.convolution).

This PR modifies the WebGL and CPU ops depthwiseConv2d and conv2d to take a dilation-rate parameter and perform atrous convolution whenever the rate's width or height is greater than 1. In its current state it is potentially a breaking change to the API for conv1d and conv2d, since it adds the positional argument dilations.
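As a rough usage sketch (illustrative only, not code from this PR; the argument ordering is still under discussion below and changes before merge):

    import * as dl from 'deeplearn';

    const x = dl.ones([1, 5, 5, 1]) as dl.Tensor4D;  // NHWC input
    const w = dl.ones([3, 3, 1, 1]) as dl.Tensor4D;  // 3x3 filter, 1 input / 1 output channel

    // Ordering as of this revision: strides, dilations, pad. With a dilation
    // rate of 2 the 3x3 filter covers an effective 5x5 window without adding
    // any extra weights.
    const y = dl.conv2d(x, w, 1, [2, 2], 'same');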



@oveddan
Contributor Author

oveddan commented Feb 28, 2018

This is not ready to merge yet; it has just been opened to get some feedback/review. Particular things I would like feedback on:

  • Both backend_cpu's conv2d and depthwiseConv2d were written inconsistently with their corresponding GPU methods, and I found it challenging to figure out how to apply the dilation parameters in the CPU code. I refactored those CPU methods to follow the same logic as their GPU counterparts, which made applying dilation much easier. I'm not sure whether the CPU methods differed by design for performance reasons.
  • For testing atrous convolution, a standard filter is converted into its dilated version by padding it with 0s between the values (see the sketch after this list). Atrous convolution is then performed with the standard filter, standard convolution is performed with the dilated filter, and the two results are compared for closeness. The advantage is that these tests demonstrate the effect of atrous convolution; the downside is that they are slightly inconsistent with the other tests.
  • The change to the APIs of conv1d and conv2d would be a breaking change, as it adds a positional argument, dilations, in the middle of the argument list. It may be better to add that parameter at the end, or to create a separate atrousConv2d method.
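A tiny sketch of the equivalence the tests rely on (my illustration of the 1-D case, not code from the PR):

    // A 3-tap filter [a, b, c] applied with dilation rate 2 behaves like the
    // zero-padded filter [a, 0, b, 0, c] applied with rate 1.
    const fSize = 3;
    const rate = 2;
    const filter = [0.5, 1.0, 1.5];
    const dilatedFilter = [filter[0], 0, filter[1], 0, filter[2]];
    const fSizeDilated = fSize + (fSize - 1) * (rate - 1);  // 5 === dilatedFilter.length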

Besides the requested feedback above, what still needs to be done in this PR:

  • Raise errors when doing backprop on convolution layers where dilation rates are greater than 1 (a possible shape for this check is sketched after this list).
  • If we stick with the breaking change, update all the models and examples in the repo to account for the new parameter.
  • Add more tests for depthwiseConv2d where dilation is greater than 1.
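A possible shape for the backprop guard mentioned in the first item (a hypothetical sketch; the name assertDilationsAreOne is my own, not from this PR):

    // Hypothetical helper: reject gradients through dilated convolutions for now.
    function assertDilationsAreOne(
        dilations: [number, number]|number, opName: string): void {
      const [dilationHeight, dilationWidth] =
          typeof dilations === 'number' ? [dilations, dilations] : dilations;
      if (dilationHeight > 1 || dilationWidth > 1) {
        throw new Error(
            `Error in gradient of ${opName}: dilation rates greater than 1 ` +
            `are not yet supported.`);
      }
    }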

@oveddan force-pushed the atrous-convolution-implementation branch from 67c169f to 8ace2ee on March 4, 2018 21:51
@dsmilkov
Contributor

dsmilkov commented Mar 6, 2018

Hi Dan,
This is an amazing PR. Replies below:

  • Don't worry much about the backend_cpu implementations. We mostly use them for sanity-checking our WebGL implementations. Refactoring is fine.
  • It's OK to test atrous convolution in a slightly different way.
  • Regarding the API, I left a comment below. We are OK with this being a breaking change, but have a request about the order of params and about adding dataFormat now.

Thanks again for the awesome PR.


Review status: 0 of 12 files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/conv.ts, line 129 at r2 (raw file):

  static conv2d<T extends Tensor3D|Tensor4D>(
      x: T, filter: Tensor4D, strides: [number, number]|number,
      dilations: [number, number]|number = [1, 1], pad: 'valid'|'same'|number,

Let's add dilations after padding, but before dimRoundingMode. Also can you add dataFormat = 'NHWC' before the dilations, so we don't break users again when we add data format later.

Basically the signature should be x, filter, strides, pad, data_format='NHWC', dilations=[1, 1].

Throw an error if dataFormat is not "NHWC" for now. We can add the functionality later.

Do the same for conv1d.
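As a call-site sketch of that ordering (illustrative only, assuming the signature lands as described above):

    const x = dl.ones([1, 5, 5, 1]) as dl.Tensor4D;
    const filter = dl.ones([3, 3, 1, 1]) as dl.Tensor4D;

    // Defaults: dataFormat = 'NHWC', dilations = [1, 1].
    const y1 = dl.conv2d(x, filter, 1, 'same');

    // Atrous convolution with rate 2, data format given explicitly.
    const y2 = dl.conv2d(x, filter, 1, 'same', 'NHWC', [2, 2]);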



@oveddan force-pushed the atrous-convolution-implementation branch from 8ace2ee to 321481d on March 7, 2018 15:10
@oveddan
Contributor Author

oveddan commented Mar 7, 2018

Regarding testing atrous convolution, I struggled to properly set up a test for depthwiseConv2d when dilation, inDepth, and chMul are all greater than 1.

What I'm trying to do is populate the weight tensor with 0s to represent dilation, first column-wise and then row-wise (the first two dimensions), but I can't figure out the proper layout.

Here is the test code:

    const fSize = 2;
    const pad = 'same';
    const stride = 1;
    const dilation = 2;
    const chMul = 2;
    const inDepth = 2;

    const x = dl.tensor4d(
        [
          0.675707, 0.758567, 0.413529, 0.963967, 0.217291, 0.101335, 0.804231,
          0.329673, 0.924503, 0.728742, 0.180217, 0.210459, 0.133869, 0.650827,
          0.047613, 0.554795, 0.653365, 0.442196
        ],
        [1, 3, 3, inDepth]);
    const w = dl.tensor4d(
        [
          0.347154, 0.386692, 0.327191, 0.483784, 0.591807, 0.24263, 0.95182,
          0.174353, 0.592136, 0.623469, 0.988244, 0.660731, 0.946534, 0.0801365,
          0.864889, 0.874602
        ],
        [fSize, fSize, inDepth, chMul]);
    // adding a dilation rate is equivalent to using a filter
    // with 0s for the dilation rate
    const fSizeDilated = fSize + (fSize - 1) * (dilation - 1);
    const wDilated = dl.tensor4d(
        [
          0.347154, 0, 0.386692,  0, 0, 0, 0.327191, 0, 0.483784,
          0.591807, 0, 0.24263,   0, 0, 0, 0.95182,  0, 0.174353,
          0.592136, 0, 0.623469,  0, 0, 0, 0.988244, 0, 0.660731,
          0.946534, 0, 0.0801365, 0, 0, 0, 0.864889, 0, 0.874602
        ],
        [fSizeDilated, fSizeDilated, inDepth, chMul],
    );

    const result = dl.depthwiseConv2d(x, w, stride, pad, dilation);

    const expectedResult = dl.depthwiseConv2d(x, wDilated, stride, pad);

    expect(result.shape).toEqual(expectedResult.shape);
    expectArraysClose(result, expectedResult);

This test fails because the filter does not get populated the right way; I'm not putting the 0s in the proper positions. How should I write something that populates it correctly? I've also tried something like:

const wDilated = dl.tensor4d(
        [
          [[
            [0.347154, 0, 0.386692, 0, 0, 0, 0.327191, 0, 0.483784],
            [0.591807, 0, 0.24263, 0, 0, 0, 0.95182, 0, 0.174353]
          ]],
          [[
            [0.592136, 0, 0.623469, 0, 0, 0, 0.988244, 0, 0.660731],
            [0.946534, 0, 0.0801365, 0, 0, 0, 0.864889, 0, 0.874602]
          ]]
        ],
        [fSizeDilated, fSizeDilated, inDepth, chMul],
    );

but I get the error: Error: Error creating a new Tensor. Inferred shape (2,1,2,9) does not match the provided shape (3,3,2,2). Shapes 3,3,2,2 and 2,1,2,9 must match


Review status: 0 of 10 files reviewed at latest revision, 1 unresolved discussion.


src/ops/conv.ts, line 129 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Let's add dilations after padding, but before dimRoundingMode. Also can you add dataFormat = 'NHWC' before the dilations, so we don't break users again when we add data format later.

Basically the signature should be x, filter, strides, pad, data_format='NHWC', dilations=[1, 1].

Throw an error if dataFormat is not "NHWC" for now. We can add the functionality later.

Do the same for conv1d.

Fixed in 25a532b.

The ordering here is slightly inconsistent, in order to match the ordering in TensorFlow:

  • In conv1d, as in tf.nn.conv1d, it's ordered stride, pad, dataFormat, dilation
  • In conv2d, as in tf.nn.conv2d, it's ordered strides, pad, dataFormat, dilations
  • In depthwiseConv2d, as in tf.nn.depthwise_conv2d, it's ordered strides, pad, dilations, dataFormat

Let me know if this is ok with you, or if you think it would be better to keep the ordering consistent within deeplearn.js.

For conv1d, it accepts NWC or NCW but throws an error if it's not NWC.

Also, TF doesn't have a dilation or rate parameter for tf.nn.conv1d, so I'm not sure whether we should get rid of the parameter this PR adds to that method.



@oveddan force-pushed the atrous-convolution-implementation branch from b715eb2 to a8b7a52 on March 7, 2018 22:35
@dsmilkov
Contributor

Sorry for the long delay. This is amazing work and I would love to get it checked in.

Regarding the test, it might be easier to populate the weights using a tensor buffer and buffer.set(value, ...coords):

const buf = dl.buffer<Rank.R4>([fSizeDilated, fSizeDilated, inDepth, chMul]);
buf.set(0.347154, 0, 0, 0, 0); // [0,0] in filter, 0 inDepth, 0 chMul
buf.set(0, 0, 1, 0, 0); // [0,1] in filter, 0 inDepth, 0 chMul
buf.set(0.386692, 0, 2, 0, 0); // [0, 2] in filter, 0 inDepth, 0 chMul
.....
const wDilated = buf.toTensor();
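To fill the whole dilated filter without writing each set call by hand, something like this could work (a sketch, not code from this PR; it assumes w is the original [fSize, fSize, inDepth, chMul] filter and dilation is the rate):

const dilatedBuf = dl.buffer<Rank.R4>([fSizeDilated, fSizeDilated, inDepth, chMul]);
const wValues = w.dataSync();
for (let i = 0; i < fSize; i++) {
  for (let j = 0; j < fSize; j++) {
    for (let d = 0; d < inDepth; d++) {
      for (let q = 0; q < chMul; q++) {
        // The original tap (i, j) lands at (i * dilation, j * dilation);
        // every other position keeps the buffer's default value of 0.
        const value = wValues[((i * fSize + j) * inDepth + d) * chMul + q];
        dilatedBuf.set(value, i * dilation, j * dilation, d, q);
      }
    }
  }
}
const wDilated = dilatedBuf.toTensor();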

Left 1 comment about the API. Thanks for summarizing the differences so nicely.


Reviewed 2 of 12 files at r1, 1 of 3 files at r2, 3 of 5 files at r4, 3 of 4 files at r5.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/conv.ts, line 129 at r2 (raw file):

Previously, oveddan (Dan Oved) wrote…

Fixed in 25a532b.

The ordering here is slightly inconsistent, in order to match the ordering in TensorFlow:

  • In conv1d, as in tf.nn.conv1d, it's ordered stride, pad, dataFormat, dilation
  • In conv2d, as in tf.nn.conv2d, it's ordered strides, pad, dataFormat, dilations
  • In depthwiseConv2d, as in tf.nn.depthwise_conv2d, it's ordered strides, pad, dilations, dataFormat

Let me know if this is ok with you, or if you think it would be better to keep the ordering consistent within deeplearn.js.

For conv1d, it accepts NWC or NCW but throws an error if it's not NWC.

Also, TF doesn't have a dilation or rate parameter for tf.nn.conv1d, so I'm not sure whether we should get rid of the parameter this PR adds to that method.

Thanks for the detailed list. Let's go with being consistent within deeplearn.js,
so have depthwiseConv2d take strides, pad, dataFormat, dilations.

conv1d accepting NWC or NCW sounds great.
Keep the dilation/rate parameter in conv1d in deeplearn.js to be consistent across the board.



* Updated backend gpu and cpu to do atrous convolution when there is a dilation rate.
* Refactored backend_cpu.conv2d to be consistent with backend_gpu.conv2d, and to make the way it performs dilated convolution consistent with the GPU implementation.
* Added tests for 1d and 2d convolution with dilation rates that show the effect on the filter when dilation rates are set.
* Updated computeDefaultPad to account for a dilation rate.

Implemented atrous convolution for depthwiseConv2D

* Modified cpu depthwiseConv2D logic to be similar to that on the GPU,
so that dilation can be easily applied.

Still need to add more tests for depthwise atrous convolution

Per feedback, changed the order of parameters to match the
TensorFlow API. Added a dataFormat parameter to conv1d, conv2d, and
depthwiseConv2d, but did not yet implement that parameter; it defaults
to a single value and cannot be set to a different value until the
functionality is implemented.

Raise an error when a gradient is computed through atrous convolution
@oveddan force-pushed the atrous-convolution-implementation branch from a8b7a52 to 281335f on March 19, 2018 02:34
@oveddan
Contributor Author

oveddan commented Mar 19, 2018

Review status: 5 of 9 files reviewed at latest revision, 1 unresolved discussion.


src/ops/conv.ts, line 129 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Thanks for the detailed list. Let's go with being consistent within deeplearn.js,
so have depthwiseConv2d take strides, pad, dataFormat, dilations.

conv1d accepting NWC or NCW sounds great.
Keep the dilation/rate parameter in conv1d in deeplearn.js to be consistent across the board.

Fixed in 281335f.



@dsmilkov
Contributor

This looks great. Let me know when the depthwise tests are added, and we'll be good to go. Thanks for the amazing work, Dan!


Reviewed 4 of 4 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved.



@oveddan
Contributor Author

oveddan commented Mar 20, 2018

So I added a couple of depthwiseConv2d tests for atrous convolution, but in a different way than you suggested. It would have been challenging to build up a 3x3x2x2 filter and then set each value one by one.

It was easier to build individual 2D filters with the desired values and then stack them:

  const w =
        dl.stack(
              [
                dl.tensor2d(
                    [0.614293, 0.0648011, 0.101113, 0.452887], [fSize, fSize]),
                dl.tensor2d(
                    [0.0582746, 0.426481, 0.872743, 0.765767], [fSize, fSize])
              ],
              2)
            .expandDims(3) as dl.Tensor4D;

    // adding a dilation rate is equivalent to using a filter
    // with 0s for the dilation rate
    const fSizeDilated = fSize + (fSize - 1) * (dilation - 1);
    const wDilated =
        dl.stack(
              [
                dl.tensor2d(
                    [0.614293, 0, 0.0648011, 0, 0, 0, 0.101113, 0, 0.452887],
                    [fSizeDilated, fSizeDilated]),
                dl.tensor2d(
                    [0.0582746, 0, 0.426481, 0, 0, 0, 0.872743, 0, 0.765767],
                    [fSizeDilated, fSizeDilated])
              ],
              2)
            .expandDims(3) as dl.Tensor4D;

I added a couple of tests that do this: one with a dilation of 2 and chMul of 1, and another with a dilation of 2 and chMul of 2 (resulting in a 3x3x2x2 dilated filter).
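For the chMul = 2 case, the construction could look something like this (a sketch with made-up values, not the exact test code):

    const fSize = 2;
    // One [fSize, fSize, inDepth] filter per output channel...
    const channel0 = dl.stack(
        [
          dl.tensor2d([1, 2, 3, 4], [fSize, fSize]),
          dl.tensor2d([5, 6, 7, 8], [fSize, fSize])
        ],
        2);
    const channel1 = dl.stack(
        [
          dl.tensor2d([9, 10, 11, 12], [fSize, fSize]),
          dl.tensor2d([13, 14, 15, 16], [fSize, fSize])
        ],
        2);
    // ...stacked on a new last axis to get [fSize, fSize, inDepth, chMul].
    const wChMul2 = dl.stack([channel0, channel1], 3) as dl.Tensor4D;  // shape [2, 2, 2, 2]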


Review status: 8 of 9 files reviewed at latest revision, all discussions resolved.



@dsmilkov
Contributor

Strong LGTM! That's really smart. This looks ready to submit. Thanks again for the amazing work!


Reviewed 1 of 1 files at r7.
Review status: all files reviewed at latest revision, all discussions resolved.


