2d atrous convolution and atrous depthwise convolution #794
This is not ready to merge yet, it's just been opened to get some feedback/review. Particular things that I would like feedback on:
Besides the feedback requested above, the following still needs to be done in this PR:
Hi Dan,
Thanks again for the awesome PR.
src/ops/conv.ts, line 129 at r2:
Let's add dilations after padding, but before dimRoundingMode. Also, can you add dataFormat = 'NHWC' before the dilations, so we don't break users again when we add data format later? Basically the signature should be x, filter, strides, pad, dataFormat='NHWC', dilations=[1, 1]. Throw an error if dataFormat is not "NHWC" for now; we can add the functionality later. Do the same for conv1d.
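The argument order and the temporary dataFormat guard requested above could be sketched roughly as follows. This is only an illustration: normalizeConvArgs and ConvOptions are hypothetical names, not deeplearn.js functions, and the tensor arguments (x, filter) are left out so the snippet stays self-contained.

```typescript
// Hypothetical helper, for illustration only; not the deeplearn.js source.
type DataFormat = 'NHWC' | 'NCHW';

interface ConvOptions {
  strides: [number, number];
  pad: 'same' | 'valid' | number;
  dataFormat: DataFormat;
  dilations: [number, number];
}

// Argument order follows the review comment:
// strides, pad, dataFormat = 'NHWC', dilations = [1, 1].
function normalizeConvArgs(
    strides: number | [number, number],
    pad: 'same' | 'valid' | number,
    dataFormat: DataFormat = 'NHWC',
    dilations: number | [number, number] = [1, 1]): ConvOptions {
  // Accept the parameter now, but reject anything not yet implemented.
  if (dataFormat !== 'NHWC') {
    throw new Error(
        `dataFormat '${dataFormat}' is not supported yet; only 'NHWC' is.`);
  }
  // Expand a scalar into a [height, width] pair.
  const toPair = (v: number | [number, number]): [number, number] =>
      typeof v === 'number' ? [v, v] : v;
  const d = toPair(dilations);
  if (d[0] < 1 || d[1] < 1) {
    throw new Error('dilations must be >= 1 in both dimensions.');
  }
  return {strides: toPair(strides), pad, dataFormat, dilations: d};
}
```

Because dataFormat and dilations both have defaults, existing calls that pass only strides and pad keep working, which is the point of reserving the dataFormat slot early.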
Regarding testing atrous convolution, I struggled to set up a proper test. What I'm trying to do is populate the weight tensor with 0s to represent dilation, first columnwise then rowwise (the first two dimensions), but I can't figure out the proper way. Here is the test code:
const fSize = 2;
const pad = 'same';
const stride = 1;
const dilation = 2;
const chMul = 2;
const inDepth = 2;
const x = dl.tensor4d(
[
0.675707, 0.758567, 0.413529, 0.963967, 0.217291, 0.101335, 0.804231,
0.329673, 0.924503, 0.728742, 0.180217, 0.210459, 0.133869, 0.650827,
0.047613, 0.554795, 0.653365, 0.442196
],
[1, 3, 3, inDepth]);
const w = dl.tensor4d(
[
0.347154, 0.386692, 0.327191, 0.483784, 0.591807, 0.24263, 0.95182,
0.174353, 0.592136, 0.623469, 0.988244, 0.660731, 0.946534, 0.0801365,
0.864889, 0.874602
],
[fSize, fSize, inDepth, chMul]);
// adding a dilation rate is equivalent to using a filter
// with 0s for the dilation rate
const fSizeDilated = fSize + (fSize - 1) * (dilation - 1);
const wDilated = dl.tensor4d(
[
0.347154, 0, 0.386692, 0, 0, 0, 0.327191, 0, 0.483784,
0.591807, 0, 0.24263, 0, 0, 0, 0.95182, 0, 0.174353,
0.592136, 0, 0.623469, 0, 0, 0, 0.988244, 0, 0.660731,
0.946534, 0, 0.0801365, 0, 0, 0, 0.864889, 0, 0.874602
],
[fSizeDilated, fSizeDilated, inDepth, chMul],
);
const result = dl.depthwiseConv2d(x, w, stride, pad, dilation);
const expectedResult = dl.depthwiseConv2d(x, wDilated, stride, pad);
expect(result.shape).toEqual(expectedResult.shape);
expectArraysClose(result, expectedResult);
This test fails because the matrix does not get populated the right way: I am not putting the 0s in the proper positions. How should I write something that populates it the right way? I've also tried something like:
const wDilated = dl.tensor4d(
[
[[
[0.347154, 0, 0.386692, 0, 0, 0, 0.327191, 0, 0.483784],
[0.591807, 0, 0.24263, 0, 0, 0, 0.95182, 0, 0.174353]
]],
[[
[0.592136, 0, 0.623469, 0, 0, 0, 0.988244, 0, 0.660731],
[0.946534, 0, 0.0801365, 0, 0, 0, 0.864889, 0, 0.874602]
]]
],
[fSizeDilated, fSizeDilated, inDepth, chMul],
);
but get the error:
src/ops/conv.ts, line 129 at r2: Previously, dsmilkov (Daniel Smilkov) wrote…
Fixed in 25a532b. The ordering here is slightly inconsistent, in order to match the ordering in TensorFlow:
Let me know if this is ok with you, or if you think it would be better to keep the ordering consistent within deeplearn.js. For Also, TF doesn't have a
Sorry for the long delay. This is amazing work and I would love to check this in. Regarding the test, it may be easier to populate the weights using a tensor buffer:
const buf = dl.buffer<Rank.R4>([fSizeDilated, fSizeDilated, inDepth, chMul]);
buf.set(0.347154, 0, 0, 0, 0); // [0,0] in filter, 0 inDepth, 0 chMul
buf.set(0, 0, 1, 0, 0); // [0,1] in filter, 0 inDepth, 0 chMul
buf.set(0.386692, 0, 2, 0, 0); // [0, 2] in filter, 0 inDepth, 0 chMul
.....
wDilated = buf.toTensor();
Left 1 comment about the API. Thanks for summarizing the differences so nicely.
src/ops/conv.ts, line 129 at r2: Previously, oveddan (Dan Oved) wrote…
Thanks for the detailed list. Let's go with being consistent within deeplearn.js. conv1d accepting NWC or NCW sounds great.
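For the test question above, the zero placement can also be computed instead of written by hand. In a row-major [height, width, inDepth, chMul] layout, each spatial position owns a contiguous block of inDepth * chMul values, so the zeros have to be inserted as whole blocks between spatial positions, not between individual numbers. A minimal sketch on plain arrays, where dilateFilter is a hypothetical helper and not a deeplearn.js function:

```typescript
// Hypothetical helper, for illustration only; works on flat row-major data.
type Shape4D = [number, number, number, number];

function dilateFilter(
    values: number[], shape: Shape4D,
    dilation: number): {values: number[], shape: Shape4D} {
  const [fh, fw, inDepth, chMul] = shape;
  if (values.length !== fh * fw * inDepth * chMul) {
    throw new Error('values length does not match shape.');
  }
  // Same formula as fSizeDilated in the test code.
  const fhD = fh + (fh - 1) * (dilation - 1);
  const fwD = fw + (fw - 1) * (dilation - 1);
  const out: number[] = [];
  for (let i = 0; i < fhD * fwD * inDepth * chMul; i++) {
    out.push(0);
  }
  for (let r = 0; r < fh; r++) {
    for (let c = 0; c < fw; c++) {
      for (let d = 0; d < inDepth; d++) {
        for (let m = 0; m < chMul; m++) {
          // Row-major flat index in the original filter.
          const src = ((r * fw + c) * inDepth + d) * chMul + m;
          // Same channel indices, spatial position scaled by the dilation.
          const dst =
              ((r * dilation * fwD + c * dilation) * inDepth + d) * chMul + m;
          out[dst] = values[src];
        }
      }
    }
  }
  return {values: out, shape: [fhD, fwD, inDepth, chMul]};
}
```

The returned values and shape could then be passed to dl.tensor4d to build wDilated; that last step is omitted here so the sketch does not depend on the library.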
* Updated backend gpu and cpu to do atrous convolution when there is a dilation rate.
* Refactored backend_cpu.conv2d to be consistent with backend_gpu.conv2d, and also to make the way it does dilated convolution consistent.
* Added tests for 1d and 2d convolution with dilation rates that show the effect on the filter when dilation rates are set.
* Updated computeDefaultPad to account for a dilation rate.
Implemented atrous convolution for depthwiseConv2D
* Modified cpu depthwiseConv2D logic to be similar to that on the GPU, so that dilation can be easily applied.
Still need to add more tests for depthwise atrous convolution
Per feedback, changed order of parameters to match Tensorflow api.
Added dataFormat parameter to conv1d, conv2d, and depthwiseConv2d, but did not yet implement that parameter; it defaults to a value and cannot be a different value until the functionality is implemented
raising error when gradient is done with atrous convolution
…w dilation parameters
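To illustrate why padding has to account for the dilation rate, here is a rough sketch of the arithmetic. It uses the effective filter size formula from the test code (fSizeDilated) together with the usual TensorFlow-style 'same' padding rule. effectiveFilterSize and samePadding are hypothetical names, and this is not the computeDefaultPad implementation from the PR.

```typescript
// Hypothetical helpers, for illustration only.

// Spatial extent covered by a filter once gaps of (dilation - 1) are
// inserted between its taps.
function effectiveFilterSize(filterSize: number, dilation: number): number {
  return filterSize + (filterSize - 1) * (dilation - 1);
}

// 'same' padding along one dimension: the output size is
// ceil(inSize / stride), and the total padding is whatever is needed for
// the dilated filter to cover the input that many times.
function samePadding(
    inSize: number, filterSize: number, stride: number,
    dilation: number): {before: number, after: number, outSize: number} {
  const eff = effectiveFilterSize(filterSize, dilation);
  const outSize = Math.ceil(inSize / stride);
  const total = Math.max((outSize - 1) * stride + eff - inSize, 0);
  const before = Math.floor(total / 2);
  return {before, after: total - before, outSize};
}
```

For the 3x3 input and 2x2 filter used in the tests, a dilation of 2 gives an effective filter size of 3, so 'same' padding needs one row or column on each side rather than the single extra row or column an undilated 2x2 filter would need.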
src/ops/conv.ts, line 129 at r2: Previously, dsmilkov (Daniel Smilkov) wrote…
Fixed in 281335f.
This looks great. Let me know when the depthwise tests are added, and we'll be good to go. Thanks for the amazing work Dan!
So I added a couple of depthwise conv2d tests for atrous convolution, but in a different way than you suggested. It would have been challenging to build up a 3x3x2x2 filter and then set each value one by one. It was easier to build individual 2d filters with the desired values, then combine them:
const w =
dl.stack(
[
dl.tensor2d(
[0.614293, 0.0648011, 0.101113, 0.452887], [fSize, fSize]),
dl.tensor2d(
[0.0582746, 0.426481, 0.872743, 0.765767], [fSize, fSize])
],
2)
.expandDims(3) as dl.Tensor4D;
// adding a dilation rate is equivalent to using a filter
// with 0s for the dilation rate
const fSizeDilated = fSize + (fSize - 1) * (dilation - 1);
const wDilated =
dl.stack(
[
dl.tensor2d(
[0.614293, 0, 0.0648011, 0, 0, 0, 0.101113, 0, 0.452887],
[fSizeDilated, fSizeDilated]),
dl.tensor2d(
[0.0582746, 0, 0.426481, 0, 0, 0, 0.872743, 0, 0.765767],
[fSizeDilated, fSizeDilated])
],
2)
.expandDims(3) as dl.Tensor4D;
I added a couple of tests that do this, one with dilation of 2 and chMul of 1, and another with dilation of 2 and chMul of 2 (resulting in a 3x3x2x2 dilated filter).
That's really smart. This looks ready to submit. Thanks again for the amazing work!
This implements 2d atrous convolution and 2d atrous depthwise convolution, as described in Rethinking Atrous Convolution for Semantic Image Segmentation. Refer to the documentation for tf.nn.atrous_conv2d for an explanation of how atrous convolution works.
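As a rough illustration of the operation itself, the sketch below computes a single-channel atrous convolution on plain nested arrays, with stride 1 and 'valid' padding. It is a reference-style sketch under those assumptions, not the webgl or cpu kernel from this PR; atrousConv2d is a hypothetical name. Like TensorFlow's conv ops, it does not flip the filter.

```typescript
// Hypothetical reference function, for illustration only.
// x: [height][width] input, filter: [fh][fw] weights, dilation >= 1.
function atrousConv2d(
    x: number[][], filter: number[][], dilation: number): number[][] {
  const fh = filter.length;
  const fw = filter[0].length;
  // Extent of the filter once the taps are spread apart.
  const effH = fh + (fh - 1) * (dilation - 1);
  const effW = fw + (fw - 1) * (dilation - 1);
  const outH = x.length - effH + 1;
  const outW = x[0].length - effW + 1;
  const out: number[][] = [];
  for (let i = 0; i < outH; i++) {
    const row: number[] = [];
    for (let j = 0; j < outW; j++) {
      let sum = 0;
      for (let r = 0; r < fh; r++) {
        for (let c = 0; c < fw; c++) {
          // The only change from a plain convolution: input taps are
          // sampled `dilation` pixels apart.
          sum += filter[r][c] * x[i + r * dilation][j + c * dilation];
        }
      }
      row.push(sum);
    }
    out.push(row);
  }
  return out;
}
```

Calling it with a 2x2 filter and dilation 2 gives the same result as calling it with dilation 1 and the corresponding 3x3 filter that has zeros between the original taps, which is the equivalence the tests in this PR rely on.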
Atrous convolution is currently implemented in TensorFlow in the following methods (the parameter name is listed next to each method):
dilations
dilations
dilations
dilations
dilations
dilation_rate
rate
rate
rate
dilations
dilations
dilations
rates
dilations
This PR modifies the webgl and cpu ops depthwiseConv2d and conv2d to take the parameter dilationRate and perform atrous convolution if the rate width or height is greater than 1. In its current state, it is potentially a breaking change in the API for conv1d and conv2d, as it adds the positional argument dilations.