cuDNN v6 dilated convolution #2858

Merged (16 commits, Oct 28, 2017)

Conversation

@anaruse (Contributor) commented on Jun 14, 2017

This PR is related to dilated convolution (#2693); it enables Chainer to use the dilated convolution implementation in cuDNN v6. Note that you need to use this branch of CuPy (cupy/cupy#133).

@ir5 (Contributor) left a comment

Thank you for the PR! I have put in some comments; could you check them?

@@ -33,7 +33,7 @@ def _pair(x):

 class Convolution2DFunction(function.Function):

-    def __init__(self, stride=1, pad=0, cover_all=False, **kwargs):
+    def __init__(self, stride=1, pad=0, cover_all=False, dilate=1, **kwargs):
ir5 (Contributor) commented:

It seems you put the dilate option in Convolution2DFunction, but it is not supported in CPU mode. This is misleading for users. Further, if Convolution2DFunction fully supports the dilation feature, then DilatedConvolution2DFunction becomes a subset of it, which means we would have some overlapping code. I think there are two ways to fix this:

  1. Do not modify Convolution2DFunction (i.e., do not add the dilate option there) and modify only the dilated function.
  2. Fully implement the dilate argument (i.e., implement dilation in CPU mode as well) and make the current dilated function simply call Convolution2DFunction internally.

What do you think about this?
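[Editor's note] For context on what option 2 would involve, here is a minimal NumPy sketch of a CPU-mode dilated convolution (stride 1, no padding, NCHW layout). This is only an illustration of the idea, not the PR's implementation, and the function name is made up.

import numpy as np

def naive_dilated_conv2d(x, W, dilate=1):
    # x: (batch, in_ch, h, w), W: (out_ch, in_ch, kh, kw); stride 1, no padding.
    n, c, h, w = x.shape
    out_c, _, kh, kw = W.shape
    # A k-tap kernel with dilation d spans d * (k - 1) + 1 input pixels.
    kh_eff = dilate * (kh - 1) + 1
    kw_eff = dilate * (kw - 1) + 1
    out_h, out_w = h - kh_eff + 1, w - kw_eff + 1
    y = np.zeros((n, out_c, out_h, out_w), dtype=x.dtype)
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap reads the input at an offset of dilate * (i, j).
            patch = x[:, :, i * dilate:i * dilate + out_h,
                      j * dilate:j * dilate + out_w]
            y += np.einsum('nchw,oc->nohw', patch, W[:, :, i, j])
    return y

With dilate=1 this reduces to an ordinary convolution, which is why the two functions could share one code path.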

anaruse (Contributor, Author) replied:

(Please look at my comment below)

assert out_w > 0, 'Width in the output should be positive.'

y = cuda.cupy.empty((n, out_c, out_h, out_w), dtype=x.dtype)
# print('# conv_2d.py:120, y.shape: {}'.format(y.shape)) # debug
ir5 (Contributor) commented:

Please remove the debug message.

anaruse (Contributor, Author) replied:

Noted, will remove it.

@@ -101,7 +101,8 @@ class Convolution2D(link.Link):
     """ # NOQA

     def __init__(self, in_channels, out_channels, ksize=None, stride=1, pad=0,
-                 nobias=False, initialW=None, initial_bias=None, **kwargs):
+                 dilate=1, nobias=False, initialW=None, initial_bias=None,
ir5 (Contributor) commented:

A similar problem holds for the link.

anaruse (Contributor, Author) replied:

(Please look at my comment below)

@ir5 self-assigned this on Jun 19, 2017
@anaruse (Contributor, Author) commented on Jun 19, 2017

Thank you for the feedback.

> It seems you put the dilate option in Convolution2DFunction, but it is not supported in CPU mode. This is misleading for users. Further, if Convolution2DFunction fully supports the dilation feature, then DilatedConvolution2DFunction becomes a subset of it, which means we would have some overlapping code. I think there are two ways to fix this:
>
>   1. Do not modify Convolution2DFunction (i.e., do not add the dilate option there) and modify only the dilated function.
>   2. Fully implement the dilate argument (i.e., implement dilation in CPU mode as well) and make the current dilated function simply call Convolution2DFunction internally.

I was not sure which was better when I sent the PR, but now I think option 2 would be better, mainly because:

  • The code of Convolution2DFunction and DilatedConvolution2DFunction is very similar, with many duplicated lines; the main difference is the code that targets cuDNN v5.x or earlier (which does not support dilated convolution) and calls cuDNN's 1x1 convolutions multiple times to realize a dilated convolution (Dilated Convolution Layer #1335). That implementation for cuDNN v5.x or earlier is great, but it does not seem much faster than a CuPy-based implementation and does not look easy to maintain. Considering that dilated convolution is natively supported by cuDNN from v6, and that the dilation feature can be added to Convolution2DFunction without a large amount of code modification, I think it is a good time to merge Convolution2DFunction and DilatedConvolution2DFunction.

What do you think?
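[Editor's note] As a rough illustration of the merged design being discussed here (not code from this PR; every name and the version constants below are illustrative), the dispatch between a cuDNN v6 path and a fallback could look like this:

# Hypothetical sketch of how a merged convolution function might choose its
# implementation; names and version constants are illustrative only.
def _choose_conv_path(dilate, cudnn_enabled, cudnn_version):
    """Return which forward implementation a (possibly dilated) conv would use."""
    dilated = dilate != (1, 1)
    if cudnn_enabled and (not dilated or cudnn_version >= 6000):
        # cuDNN v6 accepts a dilation factor in the convolution descriptor,
        # so dilated and standard convolution can share a single cuDNN path.
        return 'cudnn'
    # Older cuDNN (or CPU): fall back to an im2col-based implementation, or to
    # the existing trick of calling 1x1 convolutions multiple times.
    return 'im2col'

assert _choose_conv_path((2, 2), True, 6000) == 'cudnn'
assert _choose_conv_path((2, 2), True, 5110) == 'im2col'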

@ir5 (Contributor) commented on Jun 20, 2017

Thank you for your opinion. I have discussed this with the Chainer core developers, since the merge is a fairly big change. They and I consider merging the two similar functions a good contribution, but it is somewhat different from the objective of this PR. So I recommend doing (1) implementing the cuDNN v6 dilated convolution and (2) merging the two functions in separate PRs; you may do either (1) or (2) first. If you feel that making two PRs is troublesome, however, doing both in one PR is probably acceptable.

@anaruse (Contributor, Author) commented on Jun 20, 2017

Thank you for the comment.

I agree that merging the two functions is not a small change. So, in this PR, I will focus mainly on implementing the dilated convolution feature in Convolution2DFunction for CPU and GPU, and try to keep the amount of code modification to DilatedConvolution2DFunction small.

@ir5 (Contributor) commented on Jun 21, 2017

So, I assume we are working only on (1) in this PR.
I guess we need to check whether the result of standard convolution with the dilate option is compatible with the existing dilated convolution.
I noticed you added a hidden flag _use_standard_convolution in dilated_convolution_2d.py, which is probably for local testing. When I set it to True and ran the tests in test_dilated_convolution_2d.py, however, some test cases failed. My cuDNN is a little old (4.0.7), so my environment probably differs from yours. My CuPy was at the latest master branch.
I feel that testing through _use_standard_convolution is a little awkward.
How about adding the tests for dilated convolution to standard convolution so that we can see the test results on CI?
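[Editor's note] A rough sketch of what such a compatibility test could look like, assuming the dilate keyword that this PR adds to F.convolution_2d; shapes and parameter values are arbitrary:

import unittest

import numpy as np

import chainer.functions as F
from chainer import testing


@testing.parameterize(*testing.product({
    'dilate': [1, 2, 3],
}))
class TestConvolution2DDilateCompat(unittest.TestCase):

    def test_matches_dilated_convolution_2d(self):
        x = np.random.uniform(-1, 1, (2, 3, 9, 9)).astype(np.float32)
        W = np.random.uniform(-1, 1, (4, 3, 3, 3)).astype(np.float32)
        # F.dilated_convolution_2d is the existing implementation; the dilate
        # keyword of F.convolution_2d is the code path added in this PR.
        y_ref = F.dilated_convolution_2d(x, W, dilate=self.dilate)
        y_new = F.convolution_2d(x, W, dilate=self.dilate)
        testing.assert_allclose(y_ref.data, y_new.data)


testing.run_module(__name__, __file__)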

@anaruse (Contributor, Author) commented on Jun 21, 2017

Thank you for the feedback.
As you noticed, the flag _use_standard_convolution is for local testing. I agree with you about making that flag adjustable from test scripts like test_dilated_convolution_2d.py, so I will fix the code.

@anaruse (Contributor, Author) commented on Jun 21, 2017

I've fixed test_dilated_convolution_2d.py so that it also tests the dilated convolution code in the Convolution2DFunction class.

@ir5 (Contributor) commented on Jun 22, 2017

Thank you for adding the tests. My comments are:

  • I feel it would be better to move the new tests in test_dilated_convolution_2d.py to test_convolution_2d.py, because they are about the dilate argument of Convolution2DFunction. I guess testing.parameterize will reduce code duplication.
  • We probably need to add an explanation of the dilate argument to the docstrings of both the function and the link. There are many ways to explain it, but I guess a simple description is enough; for example, adding an item to the "Args:" section such as "dilate (int or pair of ints): Dilation factor of filter applications. For details, please see :class:DilatedConvolution2DFunction." looks sufficient. It may also be worth noting in the docstrings that cuDNN is used for dilated convolution if its version is >= 6; we may write this either in the standard convolution or in the dilated convolution.
  • _use_standard_convolution is unused code, so I feel it is better to remove it from this PR.

@@ -61,6 +70,9 @@ def check_type_forward(self, in_types):
         )

     def forward_cpu(self, inputs):
+        if getattr(self, '_func', False):
ir5 (Contributor) commented:

Knowing that we will work on the merging task (2) right after this PR, this part is unused in this PR as far as I can see. So, to avoid confusion, it might be better to remove it in this PR. What do you think? The same applies to backward_cpu.

@anaruse (Contributor, Author) commented on Jun 23, 2017

Thank you for the comment.

I've updated the branch based on your comments:

  • moving some test code from test_dilated_convolution_2d.py to test_convolution_2d.py,
  • adding an explanation of the dilate argument to the function class Convolution2DFunction,
  • removing some unused code in the function class DilatedConvolution2DFunction, etc.

Note that I have not added a dilate argument to the link class Convolution2D yet, so I kept the test code for the link class unchanged. I think the addition of the dilate argument to the link class should be done as a next step.

BTW, no failure has been observed so far when running nosetests with cuDNN 5.1 and 6.0.

@@ -388,6 +398,8 @@ def convolution_2d(x, W, b=None, stride=1, pad=0, cover_all=False, **kwargs):
             ``pad=p`` and ``pad=(p, p)`` are equivalent.
         cover_all (bool): If ``True``, all spatial locations are convoluted
             into some output pixels.
+        dilate (int or pair of ints): Dilation factor of filter applications.
+            ``dilate=d`` and ``dilate=(d, d)`` are equivalent.
ir5 (Contributor) commented:

How about putting a reference to the DilatedConvolution2DFunction class here? I guess dilated convolution is not familiar to beginners.
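[Editor's note] For example, the entry could carry a Sphinx cross-reference, roughly as below (a sketch of the suggested wording only; the reference may need the fully qualified class path to resolve):

    Args:
        dilate (int or pair of ints): Dilation factor of filter applications.
            ``dilate=d`` and ``dilate=(d, d)`` are equivalent.
            For details, see :class:`DilatedConvolution2DFunction`.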

@ir5 (Contributor) commented on Jun 23, 2017

Thank you. The code now looks good to me. I left one minor comment.

@beam2d self-requested a review on June 23, 2017, 08:48
@okuta added the cat:feature label (implementation that introduces new interfaces) on Aug 20, 2017
@okuta self-assigned this on Sep 18, 2017
@okuta (Member) commented on Sep 18, 2017

@anaruse Could you resolve the conflicts?

Conflicts:
	chainer/functions/connection/convolution_2d.py
	chainer/functions/connection/deconvolution_2d.py
	tests/chainer_tests/functions_tests/connection_tests/test_dilated_convolution_2d.py
@okuta (Member) commented on Oct 16, 2017

jenkins, test this please.

@okuta (Member) commented on Oct 17, 2017

I got an error on a CUDA v5 environment. Please fix this line.

@anaruse (Contributor, Author) commented on Oct 17, 2017

Thank you for pointing it out. It has been fixed.

@okuta (Member) commented on Oct 20, 2017

jenkins, test this please.

@okuta (Member) commented on Oct 21, 2017

Please merge or rebase master, and fix get_deconv_outsize.
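[Editor's note] For reference, these are the usual output-size formulas once a dilation factor d is taken into account; this is a standalone sketch mirroring what chainer.utils.conv computes, not the PR's code, and the names are illustrative:

# Reference sketch of output-size formulas with dilation; names are illustrative.
def conv_outsize(size, k, s, p, d=1):
    # A k-tap kernel with dilation d spans d * (k - 1) + 1 input pixels.
    return (size + 2 * p - d * (k - 1) - 1) // s + 1

def deconv_outsize(size, k, s, p, d=1):
    # Inverse of conv_outsize for the transposed (de)convolution case.
    return s * (size - 1) + d * (k - 1) + 1 - 2 * p

assert conv_outsize(9, 3, 1, 0, d=2) == 5
assert deconv_outsize(5, 3, 1, 0, d=2) == 9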

@okuta added this to the v4.0.0b1 milestone on Oct 21, 2017
Conflicts:
	chainer/utils/conv.py
@anaruse (Contributor, Author) commented on Oct 23, 2017

Thanks for your comment. The conflict was resolved.

@okuta (Member) commented on Oct 23, 2017

jenkins, test this please.

@okuta added the st:test-and-merge label (state indicating that the pull request is approved by a reviewer and can be merged after CI passes) and removed the st:ready-for-review label on Oct 23, 2017
@okuta (Member) commented on Oct 25, 2017

I got the following error. Please fix the test of the dilated conv.

01:13:45 _____ TestDilatedConvolution2DCudnnCall_param_7.test_call_cudnn_backrward ______
01:13:45 
01:13:45 self = <chainer.testing.parameterized.TestDilatedConvolution2DCudnnCall_param_7 testMethod=test_call_cudnn_backrward>
01:13:45 
01:13:45     def test_call_cudnn_backrward(self):
01:13:45         with chainer.using_config('use_cudnn', self.use_cudnn):
01:13:45             y = self.forward()
01:13:45             y.grad = self.gy
01:13:45             name = 'cupy.cudnn.cudnn.convolutionBackwardData_v3'
01:13:45             with mock.patch(name) as func:
01:13:45                 y.backward()
01:13:45 >               self.assertEqual(func.called, self.expect)
01:13:45 E               AssertionError: False != True

@anaruse (Contributor, Author) commented on Oct 25, 2017

Could you tell me the CUDA version, cuDNN version, etc. when that error occurred?

@okuta (Member) commented on Oct 26, 2017

I got errors on CUDA 7 / cuDNN 4 and CUDA 8 / cuDNN 5.1.

@anaruse (Contributor, Author) commented on Oct 26, 2017

> I got the following error. Please fix the test of the dilated conv.

Please run the test again. I think the error has already been resolved by the last commit. I have confirmed that the error above does not happen with CUDA 8 and cuDNN 5.1.

@okuta (Member) commented on Oct 28, 2017

jenkins, test this please.

@okuta (Member) commented on Oct 28, 2017

LGTM!
