nd convolution and pooling with cuDNN #3983
base: master
Conversation
Do you plan to add MAX pooling? Also, it appears the checks have failed.
@jcpeterson I did not include MAX pooling in this PR, as it would change the behaviour of caffe. See the corresponding code snippet in layer_factory.cpp.
I don't know if the behaviour of caffe or cudnn has changed in the meantime, or how to solve this. The automatic build failed because of an issue with travis-ci (boost-1.56 was missing?). I don't know how to trigger another build without committing something. At least on my computer, every test passes.
I just added tests for 3D cudnn convolution and pooling. The new convolution tests are based on the already existing 3D caffe convolution tests, and the 3D pooling tests are based on the existing 2D tests. I used a MATLAB script to generate the hard-coded input and output values.
// dimensions in all spatial dimensions, or once per spatial dimension.
repeated uint32 pad = 4; // The padding size; defaults to 0
repeated uint32 kernel_size = 2; // The kernel size
repeated uint32 stride = 3; // The stride; defaults to 1
Is this a safe transformation (changing from optional to repeated) in general? If I have a binary-serialized NetParameter and I reload it, does it get correctly deserialized? I understand that text-formatted NetParameters are consistent, though.
http://stackoverflow.com/questions/20220459/what-do-i-need-to-be-careful-of-when-changing-protobuf-serialized-types-in-order seems to indicate it's not safe for Protobuf. I'm more familiar with Thrift, where this is definitely not a safe transformation and led to numerous site issues (to the point where we wrote a linter to verify diffs don't modify Thrift struct types without modifying the field ids as well).
I believe this is fine, and is the approach taken for N-D conv in #2049. My parse of the protobuf documentation on updating messages is that optional -> repeated is ok since "optional is compatible with repeated." However, it could be more clearly spelled out.
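To make the compatibility argument concrete, here is a minimal before/after sketch of the kind of change under discussion, using the field names and numbers from the PoolingParameter snippet above; the exact wording of the comments is mine, not taken from the diff.
// Before the change (2D interface): one value per parameter.
message PoolingParameter {
  optional uint32 kernel_size = 2;
  optional uint32 stride = 3;
  optional uint32 pad = 4;
}
// After the change (N-D interface): same field numbers and scalar type,
// only the label changes from optional to repeated. Per the protobuf docs
// cited above, an existing binary-serialized message still parses; an old
// single value is read back as a one-element list.
message PoolingParameter {
  repeated uint32 kernel_size = 2;
  repeated uint32 stride = 3;
  repeated uint32 pad = 4;
}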
Ah nice, it's clearly correct then, thanks for the link @shelhamer.
Is the compatibility with cuDNN v5 tested by Travis?
I've compiled this branch with CUDA 8.0 and cuDNN v5.0.5. All tests passed.
I just tried to rebase against the master branch. There are too many merge conflicts caused by 6028022 (compatibility with cudnn-v5), since master already supports cuDNN v5 via #4159, while there is only a single minor conflict when rebasing from commit 46261f7 (fix CUDNN_BAD_PARAM when using InnerProduct layer). It would be much easier to get this branch merged by separating the cuDNN v5 support into an independent PR.
The git diff file produced by cherry-picking 6028022 after first rebasing 46261f7 to master is attached.
@futurely thanks for testing and your comments. Unfortunately I am on vacation until July 21st, so I'm not able to rebase or make any changes to the code.
Training a c3d model with the rebased version of this branch and the video data layer of @chuckcho chuckcho/video-caffe#22 didn't converge. The network prototxt is adapted from this one, replacing NdConvolution and NdPooling with Convolution and Pooling. I would appreciate it a lot if you could add an example that showcases how to effectively train and test your nd convolution and pooling.
Fixes issue BVLC#3224.
support n-dimensional pooling for cudnn; caffe cpu and gpu pooling implementations do not work in this revision!
change height and width parameters of pooling layer to new vector interfaces
the 2 tests copied and adapted from 3D caffe convolution tests
test the same things as the 2D pooling tests
Fix Label Shape Check in LossLayer::Reshape
Thanks @christianpayer, this is very helpful. I would like to ask you something: when I use the MAX pooling method with the CUDNN engine in the prototxt, it runs normally. Is that OK, or should I avoid max pooling and stick with average pooling? Thanks!
@rogertrullo you're welcome! By default, caffe does not use CUDNN for max pooling, even if the engine is set to CUDNN (see the code snippet of layer_factory.cpp I posted on Apr 14). I did not look into the comment in this snippet, so I don't know exactly why CUDNN max pooling is never used.
@christianpayer see #3574 for why cuDNN max pooling is disabled by default.
@shelhamer thanks for the link!
@christianpayer Hi Christian, thank you for implementing the Nd-pooling layer. I want to try your code in my Caffe repo for videos, which has many modifications from the main BVLC repo. Can you recommend an easy way to integrate your changes? Thank you.
@antran89 Hi! Sorry for the late answer. I don't know how different the code in your repository is from the code in the main repository, so it is hard to estimate how difficult the merging will be. To get nd convolution and pooling to work, you would need to merge the corresponding files (cudnn, cudnn_conv_layer, pooling_layer, cudnn_pooling_layer), which could result in many conflicts, as there are lots of changes. The remaining files do not contain as many changes, so merging them should work fine.
So caffe can use 3d conv and 3d pooling now? If I want to use 3d conv, how should I write my prototxt? |
@christianpayer: Thanks for supporting ND Convolution. Does it support ND Deconvolution and ND ReLU also?
For ND Pooling
@oyxhust Caffe will use 3D or 2D convolutions depending on the input blob size. So if an input layer creates a 4D blob (image, channel, y, x), the following convolutions will use 2D filters, and if an input layer creates a 5D blob (image, channel, z, y, x), the following convolutions will be 3D. So if you want to use 3D convolutions, you need to use an input layer that creates 5D outputs (e.g. the HDF5 data layer). Look into thread #2049 for more discussion. @John1231983 ND convolution and ND deconvolution are already supported without my PR. This PR only adds ND convolution and ND pooling with CUDNN. So if you use ND deconvolution (with or without my PR), the default caffe implementation is used.
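For readers asking about the prototxt: below is a minimal, untested sketch of what a 3D convolution and pooling stack could look like with this branch. The layer names, the HDF5 list file path, and the parameter values are placeholders; it assumes the HDF5 source provides 5D blobs (num, channel, z, y, x) and that a single pad/kernel_size/stride value is applied to every spatial dimension.
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  hdf5_data_param {
    source: "train_h5_list.txt"  # placeholder: list of HDF5 files containing 5D data blobs
    batch_size: 4
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    engine: CUDNN
    num_output: 16
    kernel_size: 3   # applied to z, y and x
    pad: 1
    stride: 1
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    engine: CUDNN
    pool: AVE        # cuDNN N-D pooling in this PR only supports average pooling
    kernel_size: 2
    stride: 2
  }
}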
This branch implements n-dimensional convolution and pooling with cudnn, similar to #2824 and #3515. In contrast to #2824 and #3515, this PR does not create new layers but is built on top of the existing convolution and pooling layers.
The convolution layer already has an interface for nd convolutions (#2049), but the pooling layer does not. I changed the interface of the pooling layer to support nd pooling, similar to #2049. This new interface made changes to caffe's internal pooling and some test files necessary, but I did not change the caffe CPU and GPU pooling to support nd pooling. Nd pooling (2d, 3d and possibly more) only works for average pooling using cudnn.
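To illustrate the new interface (a sketch only; the values are arbitrary), a pooling layer can now either give one value that is repeated for every spatial dimension or one value per spatial dimension, mirroring the convolution parameters of #2049:
pooling_param {
  engine: CUDNN
  pool: AVE          # only average pooling is supported through cuDNN here
  kernel_size: 1     # z
  kernel_size: 2     # y
  kernel_size: 2     # x
  stride: 1          # z
  stride: 2          # y
  stride: 2          # x
}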
I also replaced some calls to the legacy shape accessors (num(), channels(), etc.) where I needed to. There are still many layers that use the legacy accessors and do not yet use the new shape() accessors. So if you encounter errors from blob.hpp saying "Cannot use legacy accessors on Blobs with > 4 axes.", check the layer for the legacy accessors and make the necessary changes.
I tested the code with cudnn v4 for 2d and 3d convolutions and poolings. I am not sure if older/other versions of cudnn would work.
All test cases pass with the new pooling interface. I have not yet created any new cases that test nd convolution and pooling, but I'm planning to implement some.