
Add convolution Function #2282

Merged — 25 commits, Jun 19, 2017
Conversation

@hedaoyuan hedaoyuan commented May 26, 2017

Follows the comments in issue #2196:

  • Define the data structures for the input, output, and filter.
  • Migrate the sgemm-based code into GemmConvOp.cpp.
  • Refactor ExpandConvBaseLayer, ExpandConvLayer, and ExpandConvTransLayer.

Code refactoring:

| # | Old code | New code | Notes |
|---|----------|----------|-------|
| 1 | ExpandConvTransLayer, ExpandConvLayer | ExpandConvLayer | Remove ExpandConvTransLayer; both exconv and exconvt are based on ExpandConvLayer. |
| 2 | ExpandConvBaseLayer | ConvFunctionBase | |
| 3 | ExpandConvBaseLayer::expandOneFrame, Matrix::convExpand | Im2ColFunctor | Refactored. |
| 4 | Matrix::convShrink | Col2ImFunctor | Refactored. |
| 5 | ExpandConvBaseLayer::expandFwdOnce | GemmConvFunction | Refactored. |
| 6 | ExpandConvBaseLayer::bpropActs | GemmConvGradInputFunction | Refactored. |
| 7 | ExpandConvBaseLayer::bpropWeights | GemmConvGradFilterFunction | Refactored. |
| 8 | | NaiveConvFunction | Add a naive convolution implementation. |
| 9 | FunctionCompare | Compare2Function, CpuGpuFuncCompare | Replace FunctionCompare with the Compare2Function type; the original FunctionCompare is actually a CpuGpuFuncCompare. |
| 10 | | ConvolutionTest | Add unit tests for the various convolution implementations. |
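The NaiveConvFunction added above is described only as "a naive convolution calculation". As a rough illustration, a direct nested-loop version might look like this (a minimal sketch under my own simplifications — single image, single input/output channel, stride 1, no padding; names are mine, not the PR's code):

```cpp
#include <cstddef>
#include <vector>

// Minimal sketch of a naive 2-D convolution (cross-correlation, as in most
// DL frameworks): slide the filter over the input and accumulate products.
std::vector<float> naiveConv2d(const std::vector<float>& in,
                               std::size_t inH, std::size_t inW,
                               const std::vector<float>& filter,
                               std::size_t fH, std::size_t fW) {
  std::size_t outH = inH - fH + 1;
  std::size_t outW = inW - fW + 1;
  std::vector<float> out(outH * outW, 0.0f);
  for (std::size_t oh = 0; oh < outH; ++oh) {
    for (std::size_t ow = 0; ow < outW; ++ow) {
      float sum = 0.0f;
      for (std::size_t kh = 0; kh < fH; ++kh) {
        for (std::size_t kw = 0; kw < fW; ++kw) {
          sum += in[(oh + kh) * inW + (ow + kw)] * filter[kh * fW + kw];
        }
      }
      out[oh * outW + ow] = sum;
    }
  }
  return out;
}
```

A reference like this is slow but easy to verify by hand, which is what makes it useful as the ground truth for the GEMM-based implementations in the unit tests.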

To do:

  1. Remove Matrix::convExpand and Matrix::convShrink, and refactor BlockExpandLayer (#2424).
  2. Output size calculation: whether caffeMode can be removed (#2460).
  3. Migrate the cudnn-based code into CudnnConvOp.cpp; refactor CudnnConvLayer, ConvProjection, and ConvTransProjection (#2425).

@hedaoyuan hedaoyuan requested a review from qingqing01 May 27, 2017 09:38
@qingqing01 qingqing01 left a comment

Haven't finished reviewing yet; I'll continue later :)

```cpp
FORWARD_TEST = 0,
BACKWARD_INPUT_TEST = 1,
BACKWARD_FILTER_TEST = 2,
};
```
hedaoyuan (author):

Done.

```cpp
if (padding >= filterSize) break;
size_t outputSize =
    (inputSize - filterSize + 2 * padding + stride) / stride;
LOG(INFO) << " batchSize=" << batchSize
```
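The output-size formula in this snippet can be sanity-checked in isolation (the helper name below is mine, not from the PR):

```cpp
#include <cstddef>

// Computes the convolution output size the same way the snippet does:
// outputSize = (inputSize - filterSize + 2 * padding + stride) / stride.
std::size_t convOutputSize(std::size_t inputSize, std::size_t filterSize,
                           std::size_t padding, std::size_t stride) {
  return (inputSize - filterSize + 2 * padding + stride) / stride;
}
```

For example, a 7-wide input with a 3-wide filter and no padding gives (7 - 3 + 0 + 1) / 1 = 5 at stride 1, and (7 - 3 + 0 + 2) / 2 = 3 at stride 2.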
qingqing01 (reviewer):

LOG(INFO) -> VLOG

hedaoyuan (author):

Done.

```cpp
for (size_t inputSize : {7, 14, 54}) {
  for (size_t filterSize : {1, 3, 5}) {
    for (size_t inputChannels : {3, 64}) {
      for (size_t outputChannels : {3, 64, 128}) {
```
qingqing01 (reviewer):

Do we need to add unit tests for rectangular input, output, and filter shapes?

hedaoyuan (author):

Adding them here would make the TEST run much longer; I'll add separate tests for rectangular shapes later.

hedaoyuan (author):

Done.

```cpp
int paddingWidth,
int outputHeight,
int outputWidth,
T* colData) {
```
qingqing01 (reviewer):

I think passing a TensorShape directly would make the interface simpler; then it wouldn't need so many parameters~

hedaoyuan (author):

If we change it to TensorShape, the interface would also need to check the parameters corresponding to the input/filter shapes. Besides, there is no batchSize dimension here, so we can't directly use the shapes of the arguments passed into the Function.

hedaoyuan (author):

Will be fixed in the next PR, #2449.

```cpp
class Im2ColFunctor {
public:
  void operator()(float* colData,
                  const float* imData,
                  const TensorShape& colShape,
                  const TensorShape& imShape,
                  int strideHeight,
                  int strideWidth,
                  int paddingHeight,
                  int paddingWidth);
};
```
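For context, the expansion such a functor performs can be sketched as follows (a simplified CPU sketch with plain int parameters instead of TensorShape — illustrative only, not the PR's actual implementation):

```cpp
#include <cstddef>
#include <vector>

// Simplified im2col: expands an image of shape [channels, height, width]
// into a column matrix of shape [channels * fH * fW, outH * outW], so that
// convolution reduces to one matrix multiplication with the filter matrix.
// Padded (out-of-image) positions are left as zero.
std::vector<float> im2col(const std::vector<float>& im,
                          int channels, int height, int width,
                          int fH, int fW,
                          int strideH, int strideW,
                          int padH, int padW) {
  int outH = (height + 2 * padH - fH) / strideH + 1;
  int outW = (width + 2 * padW - fW) / strideW + 1;
  std::vector<float> col((std::size_t)channels * fH * fW * outH * outW, 0.0f);
  // Each row of the column matrix corresponds to one (channel, kh, kw) tap.
  for (int c = 0; c < channels * fH * fW; ++c) {
    int kw = c % fW;
    int kh = (c / fW) % fH;
    int ic = c / (fW * fH);
    for (int oh = 0; oh < outH; ++oh) {
      for (int ow = 0; ow < outW; ++ow) {
        int ih = oh * strideH - padH + kh;
        int iw = ow * strideW - padW + kw;
        if (ih >= 0 && ih < height && iw >= 0 && iw < width) {
          col[((std::size_t)c * outH + oh) * outW + ow] =
              im[((std::size_t)ic * height + ih) * width + iw];
        }
      }
    }
  }
  return col;
}
```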

```cpp
* image channels, C is the number of input image channels,
* H and W is height and width of filter.
*
* If groups is greater than 1, the filter's data format should be GMCHW,
```
qingqing01 (reviewer):

groups -> groups_

hedaoyuan (author):

I don't think this comment needs to change to groups_; the configuration parameter in FuncConfig is also named groups.

qingqing01 (reviewer):

The reason I suggested the change above is that "If groups is greater than 1" doesn't read grammatically. It also reminds me to explain my comments more clearly in the future :)


```cpp
protected:
size_t getFilterHeight(const TensorShape& filter) const {
  if (filter.ndims() == 5) {
```
qingqing01 (reviewer):

(size_t)5

hedaoyuan (author):

Done.


```cpp
}

size_t getFilterWidth(const TensorShape& filter) const {
  if (filter.ndims() == 5) {
```
qingqing01 (reviewer):

(size_t)5

hedaoyuan (author):

Done.

```cpp
if (filter.ndims() == 5) {
  return filter[4];
} else {
  return filter[3];
```
qingqing01 (reviewer):

Could we add an interface like filter[-1], filter[-2]?

hedaoyuan (author):

I don't understand this comment; what do filter[-1] and filter[-2] refer to?

hedaoyuan (author):

Do -1 and -2 refer to the last and second-to-last values?

hedaoyuan (author):

Done.
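The negative-index accessor discussed in this thread could look something like this (a hypothetical sketch; Shape and dim() are illustrative names, not PaddlePaddle's TensorShape API):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical shape wrapper where dim(-1) is the last dimension and
// dim(-2) the one before it. With this, getFilterWidth could simply
// return filter.dim(-1) whether the shape has 4 or 5 dimensions,
// avoiding the branch on ndims() == 5.
class Shape {
 public:
  explicit Shape(std::vector<std::size_t> dims) : dims_(std::move(dims)) {}
  std::size_t ndims() const { return dims_.size(); }
  // Accepts negative indices counted from the end, Python-style.
  std::size_t dim(int i) const {
    std::size_t idx =
        i >= 0 ? (std::size_t)i : dims_.size() - (std::size_t)(-i);
    return dims_[idx];
  }

 private:
  std::vector<std::size_t> dims_;
};
```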


```cpp
std::shared_ptr<FunctionBase> getGpuFunction() const { return gpuFunc_; }
std::shared_ptr<FunctionBase> getGpuFunction() const { return function2_; }
```
qingqing01 (reviewer):

From the code above, function1_ isn't necessarily the CPU type and function2_ isn't necessarily the GPU type, right?

hedaoyuan (author):

Right, the interface names here need to be changed.

```cpp
* output_height, output_width]
*/
template <class T>
class Im2ColFunctor<DEVICE_TYPE_CPU, T> {
```
qingqing01 (reviewer):

Could Im2Col and Col2Im be standalone Functions? Then they could be used in other Functions, such as BlockExpand.

hedaoyuan (author):

In PR #2424 they will be moved into a separate file, but not as Functions; they will be called by ImageExpandFunction.

```cpp
size_t filterWidth = getFilterWidth(filter);
size_t outputChannels = output[1];
size_t outputHeight = output[2];
size_t outputWidth = output[3];
```
qingqing01 (reviewer):

Names like n, ci, hi, wi, kh, kw, co, ho, wo would be shorter, though the current names are easier to understand :)

hedaoyuan (author):

Personally I think the readability of the Function's code logic is more important. Abbreviations like n, ci, hi... aren't necessarily friendly to people unfamiliar with the code.

```cpp
std::vector<size_t> paddings = {(size_t)paddingY_[i], (size_t)padding_[i]};
std::vector<size_t> strides = {(size_t)strideY_[i], (size_t)stride_[i]};
createFunction(forward_,
               !isDeconv_ ? "GemmConv" : "GemmConvGradInput",
```
qingqing01 (reviewer):

isDeconv_ -> isConvTrans_

hedaoyuan (author):

isDeconv_ is the original name in ConvBaseLayer; would isConvTrans_ be better?

qingqing01 (reviewer):

Yes, naming it consistently with the user-facing interface, the convolution transpose layer, seems better, but this is fine too~

```cpp
inputs.addArg(*weights_[i]->getW(), filterShape_[i]);
outputs.addArg(*getOutputValue(),
               outputShape_[i],
               !isDeconv_ && i == 0 ? ASSIGN_TO : ADD_TO);
```
qingqing01 (reviewer):

Why is the isDeconv_ parameter needed here?

qingqing01 (reviewer):

I see. I feel a Function should be more like a BLAS library: adding a scale/beta coefficient would be more reasonable than ASSIGN_TO/ADD_TO~

@qingqing01 qingqing01 left a comment

LGTM. Much cleaner~

```cpp
int c_col = int(c * blockH * blockW) +
            (h - h_col * (int)strideH) * (int)blockW +
            (w - w_col * (int)strideW);
val += data_col[(c_col * height_col + h_col) * width_col + w_col];
```
qingqing01 (reviewer):

There are several different naming conventions in the cuda kernels~


@hedaoyuan hedaoyuan left a comment

For now there are only two assignment modes, ASSIGN_TO and ADD_TO. Changing this to a scale would increase the complexity of Function: the scale would have to be a parameter of the Function itself, whereas ASSIGN_TO and ADD_TO can be parameters of the BufferArg. Besides, whether the result overwrites the original value in the Arg or is added onto it is really a property of the BufferArg itself.
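The two designs under discussion can be contrasted in a few lines (an illustrative sketch only; writeOutput and writeOutputBeta are my names, not the PR's API):

```cpp
#include <cstddef>
#include <vector>

// With an ArgType on the output argument, the Function either overwrites
// or accumulates; a BLAS-style beta generalizes this: beta = 0 behaves
// like ASSIGN_TO and beta = 1 like ADD_TO.
enum ArgType { ASSIGN_TO, ADD_TO };

void writeOutput(std::vector<float>& out,
                 const std::vector<float>& result,
                 ArgType type) {
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = (type == ADD_TO ? out[i] : 0.0f) + result[i];
  }
}

// BLAS-like alternative: out = result + beta * out.
void writeOutputBeta(std::vector<float>& out,
                     const std::vector<float>& result,
                     float beta) {
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = result[i] + beta * out[i];
  }
}
```

The enum keeps the choice on the output BufferArg, as the comment above argues; the beta form would move it into the Function's parameters.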

@hedaoyuan hedaoyuan merged commit 17fe832 into PaddlePaddle:develop Jun 19, 2017
@hedaoyuan hedaoyuan added this to Convolution optimization in Embedded and Mobile Deployment Jul 24, 2017