
Add maxout operator. #5571

Merged: 31 commits into PaddlePaddle:develop on Nov 20, 2017

Conversation

@sweetsky0901 (Contributor) commented on Nov 11, 2017:

This change adds the maxout op. To get the code placed under the op math path to compile, the CMakeLists file under op was also modified (the pool op was used as a reference).

resolve #5570
resolve #3737
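
(Added note for context, not part of the original description: maxout over an NCHW input with group size g takes an element-wise maximum across g consecutive input channels, keeping the spatial size and producing C/g output channels:

y[n, c, h, w] = max over k in [0, g) of x[n, c*g + k, h, w] )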

@qingqing01 qingqing01 changed the title My maxout op Add Maxout operator. Nov 11, 2017
@qingqing01 qingqing01 changed the title Add Maxout operator. Add maxout operator. Nov 11, 2017
maxout_process.compute(ele,
input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f]);
}
maxout_process.finalize(ele, (static_cast<T>(groups)));
@qingqing01 (Contributor) commented on Nov 13, 2017:

I think the MaxOutProcess maxout_process is not necessary. The implementation can be expanded here.
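
For illustration only, a minimal sketch of what inlining could look like, using the variable names from the snippet above (this is not necessarily the PR's final code):

T ele = static_cast<T>(-FLT_MAX);  // running maximum for this output element
for (int ph = 0; ph < groups; ++ph) {
  T x = input_data[(new_bindex + new_cindex) * groups + ph * fea_size + f];
  ele = ele > x ? ele : x;  // keep the larger value instead of calling compute()
}
output_data[new_bindex + new_cindex + f] = ele;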

Contributor Author:

I still use the first two, but removed finalize, which was indeed not needed.

class MaxOutFunctor<platform::CPUPlace, MaxOutProcess, T> {
public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
Contributor:

framework::Tensor* for output.

Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
int groups, int num_channels, MaxOutProcess maxout_process) {
Contributor:

It seems num_channels can be obtained from the input tensor; there is no need to pass this argument.

Contributor Author:

done

const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = num_channels/groups;
Contributor:

output_channels can be obtained from the output tensor, and groups can then be computed from the channel numbers of the input and output:

const int output_channels = output.dims()[1];
const int group =  input.dims()[1] / output_channels;

Contributor Author:

groups is passed in from the very beginning and the output is computed from it, so I just keep carrying groups along.

const int output_channels = num_channels/groups;

int fea_size = input_height * input_width;
int c_size = fea_size * output_channels;
Contributor:

c_size -> out_size ?

Contributor Author:

done


#pragma once

#include "paddle/framework/eigen.h"
Contributor:

remove this unused header file.

Contributor Author:

done


#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "paddle/operators/math/math_function.h"
Contributor:

It seems this header file is also not used.

Contributor Author:

Is #include "paddle/framework/op_registry.h" also unused? Isn't it needed for op registration?




class TestMaxOut_Op(OpTest):
Contributor:

TestMaxOut_Op -> TestMaxOutOp

Contributor Author:

done




def maxout_forward_naive_2sweetsky(input, groups, num_channels):
Contributor:

If this function is not used, please remove it.

Contributor Author:

done

def test_check_grad(self):
print self.inputs
print self.outputs
self.check_grad(['X'], 'Out', max_relative_error=0.5)
Contributor:

The gradient check for the MaxOut op is likely to be unstable: after perturbing the input by +/- delta, the max value may change, unless the input is specially constructed so that the per-group max stays the same under the +/- delta perturbation.
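
(A note added for clarity, not part of the original comment: the numeric gradient used by the check is a central difference,

grad_i ≈ (f(x_i + delta) - f(x_i - delta)) / (2 * delta),

while the analytic gradient routes all of the upstream gradient to the current argmax of each group. If the +/- delta perturbation changes which element is the maximum, the two disagree, so the test needs either inputs whose per-group maxima win by a margin larger than delta, or a loose tolerance.)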

Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
int groups, int num_channels, MaxOutProcess maxout_process) {
Contributor:

Please reorder the parameters (Function_Parameter_Ordering).

Contributor Author:

For this ordering I removed num_channels; the other positions are unchanged for now. Generally inputs come first and outputs last, but groups is a global attribute.

Contributor:

I think it is better to put all input-only parameters before any output parameters.

Contributor Author:

I adjusted the inner ones, but I will leave this one for now; otherwise many places would need to change for little benefit. The ordering here is not strictly input-then-output; input and output refer mainly to the tensors.

framework::Tensor& input_grad,
const framework::Tensor& output,
const framework::Tensor& output_grad,
int groups, int num_channels) {
Contributor:

Contributor Author:

For this ordering I removed num_channels; the other positions are unchanged for now. Generally inputs come first and outputs last, but groups is a global attribute.

input_grad_data[input_idx] += output_grad_data[output_idx];
stop = true;
} else {
input_grad_data[input_idx] = 0;
Contributor:

Please remove the else { ... }; the value is already set at line 94.

Contributor Author:

done

template class MaxOutFunctor<platform::CPUPlace,
paddle::operators::math::MaxOut<float>, float>;
template class MaxOutFunctor<platform::CPUPlace,
paddle::operators::math::MaxOut<double>, double>;
Contributor:

paddle::operators:: can be removed, because MaxOutFunctor is already in the paddle::operators namespace.

Contributor Author:

done

const int input_height, const int input_width,
int groups, MaxOutProcess maxout_process) {
int size = input_height * input_width * channels / groups;
int featLen = input_height * input_width;
Contributor:

size and featLen should be const int. Please keep the variable naming convention consistent: featLen -> feat_len.

Contributor Author:

done

for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
index += blockDim.x * gridDim.x) {
int batch_idx = index / size;
int i = index % size;
Contributor:

Please do not use i here, especially in the loop; you can use temp instead.

Contributor Author:

done

int data_idx =
(batch_idx * size + channel_idx * featLen) * groups + feat_idx;
T ele = maxout_process.initial();
for (int g = 0; g < groups; g++) {
Contributor:

Using ++g is better.

Contributor Author:

done

Contributor:

Other parts of the code have similar usage; please correct them one by one.

Contributor Author:

done

for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
index += blockDim.x * gridDim.x) {
int batch_idx = index / size;
int i = index % size;
Contributor:

Same as above.

Contributor Author:

done

namespace math {

#define FLT_MAX \
__FLT_MAX__ // It might need to be placed in another file, but I'm still
Contributor:

#define FLT_MAX is also defined in pooling.h, and the two are in the same namespace (paddle::operators::math).

Contributor Author:

I tried to remove it, but the build then failed, so I added it back.
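
(Added note, an illustration only and not what this PR ended up doing: one common way to avoid the duplicate definition is to take FLT_MAX from the standard header and only define a fallback when it is missing.)

#include <cfloat>  // provides the standard FLT_MAX

#ifndef FLT_MAX
#define FLT_MAX __FLT_MAX__  // fallback only if <cfloat> did not define it
#endif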

@sweetsky0901 (Contributor Author) left a comment:

Updated.

@chengduoZH (Contributor):

Please read this document first.

@sweetsky0901 (Contributor Author) left a comment:

done thanks

* All tensors are in NCHW format.
* Ksize, strides, paddings are two elements. These two elements represent
* height and width, respectively.
*/
Contributor Author:

done

class MaxOutFunctor<platform::CPUPlace, MaxOutProcess, T> {
public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
int groups, int num_channels, MaxOutProcess maxout_process) {
Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
int groups, int num_channels, MaxOutProcess maxout_process) {
Contributor Author:

For this ordering I removed num_channels; the other positions are unchanged for now. Generally inputs come first and outputs last, but groups is a global attribute.

const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = num_channels/groups;
Contributor Author:

groups is passed in from the very beginning and the output is computed from it, so I just keep carrying groups along.


#pragma once

#include "paddle/framework/eigen.h"
Contributor Author:

done


#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "paddle/operators/math/math_function.h"
Contributor Author:

Is #include "paddle/framework/op_registry.h" also unused? Isn't it needed for op registration?




def maxout_forward_naive_2sweetsky(input, groups, num_channels):
Contributor Author:

done




class TestMaxOut_Op(OpTest):
Contributor Author:

done

def test_check_grad(self):
print self.inputs
print self.outputs
self.check_grad(['X'], 'Out', max_relative_error=0.5)
Contributor Author:

done

int data_idx =
(batch_idx * size + channel_idx * featLen) * groups + feat_idx;
T ele = maxout_process.initial();
for (int g = 0; g < groups; g++) {
Contributor:

Other parts of the code have similar usage; please correct them one by one.

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
int groups, int num_channels, MaxOutProcess maxout_process) {
Contributor:

I think it is better to put all input-only parameters before any output parameters.

__global__ void KernelMaxOut(const int nthreads, const T* input_data,
T* output_data, const int channels,
const int input_height, const int input_width,
int groups, MaxOutProcess maxout_process) {
Contributor:

Please note the order of parameters.

Contributor Author:

done


def test_check_grad(self):
print self.inputs
print self.outputs
Contributor:

Please remove debug code.

Contributor Author:

done

in_x_grad->mutable_data<T>(context.GetPlace());
auto temp = framework::EigenVector<T>::Flatten(*in_x_grad);
temp.device(context.GetEigenDevice<Place>()) =
temp.constant(static_cast<T>(0));
Contributor:

Please use this way to clean in_x_grad memory.

Contributor Author:

done

@sweetsky0901 (Contributor Author) left a comment:

done thanks


def test_check_grad(self):
print self.inputs
print self.outputs
Contributor Author:

done

int data_idx =
(batch_idx * size + channel_idx * featLen) * groups + feat_idx;
T ele = maxout_process.initial();
for (int g = 0; g < groups; g++) {
Contributor Author:

done

__global__ void KernelMaxOut(const int nthreads, const T* input_data,
T* output_data, const int channels,
const int input_height, const int input_width,
int groups, MaxOutProcess maxout_process) {
Contributor Author:

done

in_x_grad->mutable_data<T>(context.GetPlace());
auto temp = framework::EigenVector<T>::Flatten(*in_x_grad);
temp.device(context.GetEigenDevice<Place>()) =
temp.constant(static_cast<T>(0));
Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input, framework::Tensor& output,
int groups, int num_channels, MaxOutProcess maxout_process) {
Contributor Author:

I adjusted the inner ones, but I will leave this one for now; otherwise many places would need to change for little benefit. The ordering here is not strictly input-then-output; input and output refer mainly to the tensors.

const framework::Tensor& input,
framework::Tensor * output,
int groups,
MaxOutProcess maxout_process) {
Contributor:

I still think MaxOutProcess is unnecessary here. The MaxOut you define is used only once, so why not expand it? Implementing it directly at line 52 would be simpler; adding MaxOutProcess only makes it more complicated. I suggest removing MaxOutProcess.

Contributor Author:

done

const int output_channels = output->dims()[1];

int fea_size = input_height * input_width;
// c_size mean output one batch size
Contributor:

c_size means the output size of each sample.

Contributor Author:

done

maxout_process.compute(ele,
input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f]);
}
output_data[(new_bindex+new_cindex+f)] = ele;
Contributor:

I suggest removing maxout_process and just comparing the values directly here.

Contributor Author:

done

:param layer_attr: Extra Layer attribute.
:type layer_attr: ExtraLayerAttribute
:return: LayerOutput object.
:rtype: LayerOutput
Contributor:

Remove lines 61 - 81; no other op currently has a similar usage example.

Contributor Author:

done

limitations under the License. */

#pragma once
#include "paddle/framework/eigen.h"
Contributor:

remove the unused header file.

Contributor Author:

done

#include "paddle/framework/eigen.h"
#include "paddle/framework/tensor.h"
#include "paddle/platform/device_context.h"
#include "paddle/platform/hostdevice.h"
@qingqing01 (Contributor) commented on Nov 15, 2017:

If lines 26 - 48 are removed, please also remove this header.

Contributor Author:

done

int feat_idx = batch_offset % feat_len;
int data_idx =
(batch_idx * size + channel_idx * feat_len) * groups + feat_idx;
int maxIndex = -1;
Contributor:

maxIndex -> max_index

Contributor Author:

done


int groups = context.template Attr<int>("groups");


Contributor:

Too many blank lines.

Contributor Author:

done

self.op_type = "maxout"
self.init_test_case()
input = np.random.random(self.shape).astype("float32")
output = self.MaxOut_forward_naive(input, self.groups,
Contributor:

self.MaxOut_forward_naive -> maxout_forward_naive should be enough; that way line 32 can also be removed. Defining self.MaxOut_forward_naive does not seem necessary.

Contributor Author:

I looked at some similar test cases; they actually factor the computation out into a separate function, which is why I pulled it out like this.

bool stop = false;
int output_idx = blen + clen + f;
for (int g = 0; g < groups && !stop; ++g) {
input_idx = (blen + clen) * groups + fea_size * g + f;
Contributor:

To reduce redundant computation:

  • Change line 90 to: int input_idx0 = (blen + clen) * groups + f;
    and line 94 to: int input_idx = input_idx0 + fea_size * g;
  • Change line 91 to a boolean flag such as bool continue_match = true, and adjust lines 93 and 98 accordingly, so the check at line 93 does not need an extra negation each iteration.

Contributor Author:

done

MaxOutProcess maxout_process) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
Contributor:

Reformat line 30 as follows to avoid recomputing the stride every iteration:

int index = blockIdx.x * blockDim.x + threadIdx.x;
int offset = blockDim.x * gridDim.x;
for (int i  = index; i < nthreads; i += offset)
...

Contributor Author:

done

const int input_height, const int input_width, int groups) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
Contributor:

Same as above.

Contributor Author:

done

(batch_idx * size + channel_idx * feat_len) * groups + feat_idx;
int maxIndex = -1;
bool stop = false;
for (int g = 0; g < groups && !stop; ++g) {
Contributor:

Change line 61 to a boolean flag such as bool continue_match = true; same as above.

Contributor Author:

done





Contributor:

Remove the extra blank lines.

Contributor Author:

done

)DOC");
AddComment(R"DOC(
- Input: NCHW.
- Output: feature map size same as input. Channel is (input channel) / groups.
Contributor:

This comment needs fixing:

  • "feature map size same as input" has no verb; it should read "The feature map size of the output is the same as the input's."
  • "The output_channel is (input channel) / groups."

Contributor Author:

done

- Input: NCHW.
- Output: feature map size same as input. Channel is (input channel) / groups.
So groups should be larger than 1, and the num of channels should be able
to devided by groups.
Contributor:

to be divided

Contributor Author:

done

// check groups > 1
PADDLE_ENFORCE_GT(
groups, 1,
"in maxoutop groups should be larger than 1");
Contributor:

groups should be larger than 1 in maxout op

Contributor Author:

done

maxout_forward;
paddle::operators::math::MaxOut<T> maxout_process;
maxout_forward(context.device_context(), *in_x, out, groups,
maxout_process);
Contributor:

Please install pre-commit; the formatting of this code at lines 36 - 41 is incorrect.





Contributor:

Remove the extra blank lines.

Contributor Author:

done

@sweetsky0901 (Contributor Author) left a comment:

done thanks

const int output_channels = output->dims()[1];

int fea_size = input_height * input_width;
// c_size mean output one batch size
Contributor Author:

done

bool stop = false;
int output_idx = blen + clen + f;
for (int g = 0; g < groups && !stop; ++g) {
input_idx = (blen + clen) * groups + fea_size * g + f;
Contributor Author:

done

MaxOutProcess maxout_process) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
Contributor Author:

done

const int input_height, const int input_width, int groups) {
const int size = input_height * input_width * channels / groups;
const int feat_len = input_height * input_width;
for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
Contributor Author:

done

(batch_idx * size + channel_idx * feat_len) * groups + feat_idx;
int maxIndex = -1;
bool stop = false;
for (int g = 0; g < groups && !stop; ++g) {
Contributor Author:

done


int groups = context.template Attr<int>("groups");


Contributor Author:

done

So groups should be larger than 1, and the num of channels should be able
to devided by groups.

.. math::
Contributor Author:

done

T scale) {
dx += dy * (x == y);
}
};
Contributor Author:

done

maxout_process.compute(ele,
input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f]);
}
output_data[(new_bindex+new_cindex+f)] = ele;
Contributor Author:

done

const framework::Tensor& input,
framework::Tensor * output,
int groups,
MaxOutProcess maxout_process) {
Contributor Author:

done

// T ele = maxout_process.initial();
T ele = static_cast<T>(-FLT_MAX);
for (int ph = 0; ph < groups; ++ph) {
T x = input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f];
Contributor:

It seems you have not installed pre-commit; the code style is not right. Spaces are needed before and after + and *.

Install pre-commit by running the following command in your PaddlePaddle root path:

pip install pre-commit

Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input,
framework::Tensor& input_grad,
Contributor:

framework::Tensor& input_grad -> framework::Tensor* input_grad. It may also be better to put the output as the last argument.

Contributor Author:

done

for (int c = 0; c < output_channels; ++c) {
int new_cindex = fea_size * c;
for (int f = 0; f < fea_size; ++f) {
// T ele = maxout_process.initial();
Contributor:

remove this line.

Contributor Author:

done

/*
* All tensors are in NCHW format.
* groups mustbe > 1
*/
Contributor:

// All tensors are in NCHW format, and the groups must be greater than 1.

Contributor Author:

done

"width of feature.");
AddAttr<int>(
"groups",
R"DOC(The group number of input layer.
Contributor:

Specifies how many groups the input tensor will be split into along the channel dimension. The number of output channels is the number of input channels divided by groups.

Contributor Author:

done

paddle::operators::math::MaxOutGradFunctor<Place, T>
maxout_backward;
maxout_backward(context.device_context(), *in_x, *in_x_grad, *out,
*out_grad, groups);
Contributor:

indent line 59.

Contributor Author:

done

from op_test import OpTest


def maxout_forward_naive(input, groups,num_channels):
Contributor:

num_channels is not used; please remove it.

Contributor Author:

done

self.num_channels).astype("float32")

self.inputs = {'X': input}
self.attrs = {'groups': self.groups, 'num_channels': self.num_channels}
Contributor:

The attr num_channels has been removed from the C++ code; please remove it here as well.

Contributor Author:

done

self.MaxOut_forward_naive = maxout_forward_naive
self.shape = [100, 6, 2, 2]
self.groups=2
self.num_channels=6
Contributor:

remove this variable.

Contributor Author:

done

See the License for the specific language governing permissions and
limitations under the License. */

#define EIGEN_USE_GPU
Contributor:

Remove this line.

Also, since PR #5573, please change paddle/operators/maxout_op.cu to paddle/operators/maxout_op.cu.cc.

Contributor Author:

done

@sweetsky0901 (Contributor Author) left a comment:

done

/*
* All tensors are in NCHW format.
* groups mustbe > 1
*/
Contributor Author:

done

for (int c = 0; c < output_channels; ++c) {
int new_cindex = fea_size * c;
for (int f = 0; f < fea_size; ++f) {
// T ele = maxout_process.initial();
Contributor Author:

done

// T ele = maxout_process.initial();
T ele = static_cast<T>(-FLT_MAX);
for (int ph = 0; ph < groups; ++ph) {
T x = input_data[(new_bindex+new_cindex) * groups+ph*fea_size+f];
Contributor Author:

done

public:
void operator()(const platform::DeviceContext& context,
const framework::Tensor& input,
framework::Tensor& input_grad,
Contributor Author:

done

"width of feature.");
AddAttr<int>(
"groups",
R"DOC(The group number of input layer.
Contributor Author:

done

in_x_grad->mutable_data<T>(context.GetPlace());
zero(device_ctx, in_x_grad, static_cast<T>(0.0));
paddle::operators::math::MaxOutGradFunctor<Place, T>
maxout_backward;
Contributor Author:

done

paddle::operators::math::MaxOutGradFunctor<Place, T>
maxout_backward;
maxout_backward(context.device_context(), *in_x, *in_x_grad, *out,
*out_grad, groups);
Contributor Author:

done

from op_test import OpTest


def maxout_forward_naive(input, groups,num_channels):
Contributor Author:

done

self.num_channels).astype("float32")

self.inputs = {'X': input}
self.attrs = {'groups': self.groups, 'num_channels': self.num_channels}
Contributor Author:

done

self.MaxOut_forward_naive = maxout_forward_naive
self.shape = [100, 6, 2, 2]
self.groups=2
self.num_channels=6
Contributor Author:

done

int output_idx = blen + clen + f;
for (int g = 0; g < groups && continue_match; ++g) {
int input_idx = input_idx0 + fea_size * g;
input_grad_data[input_idx] = 0;
Contributor:

Please remove line 88.

Contributor Author:

It initializes to a fixed value; why should that be removed? Memory often contains stale data.

Contributor:

You can initialize it like this outside the loop.

Contributor Author:

input_idx changes inside the for loop, so this would only work if the whole buffer were memset outside first.

Contributor Author:

done
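
(Added note: a minimal sketch of the approach this thread converges on, assuming the zero functor quoted later in the review is a SetConstant-style functor as in other operators; the whole gradient buffer is cleared once, and the loop then only writes at the max index.)

in_x_grad->mutable_data<T>(context.GetPlace());
math::SetConstant<Place, T> zero;
zero(device_ctx, in_x_grad, static_cast<T>(0.0));  // one pass to clear any stale data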

const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = output->dims()[1];
Contributor:

You should check whether output_channels and input.dims()[1] / groups are equal.

Contributor Author:

output is constructed that way outside (its channel count is computed from groups), so I did not check it again here.

input_grad_data[input_idx] = 0;
if (input_data[input_idx] == output_data[output_idx]) {
input_grad_data[input_idx] += output_grad_data[output_idx];
continue_match = false;
Contributor:

  1. I don't think you should do cumulative operations here.
    input_grad_data[input_idx] += output_grad_data[output_idx]
  2. You can replace continue_match = false; with break;.
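
A sketch purely to illustrate the two points above, using the variable names quoted in this review (not necessarily the exact merged code):

int output_idx = blen + clen + f;
for (int g = 0; g < groups; ++g) {
  int input_idx = input_idx0 + fea_size * g;
  if (input_data[input_idx] == output_data[output_idx]) {
    input_grad_data[input_idx] = output_grad_data[output_idx];  // direct assignment, no accumulation
    break;  // only the first matching (max) element receives the gradient
  }
}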

Contributor Author:

done

for (int g = 0; g < groups && continue_match; ++g) {
if (input_data[data_idx + g * feat_len] == output_data[i]) {
max_index = data_idx + g * feat_len;
continue_match = false;
Contributor:

You can replace continue_match = false; with break;.

Contributor Author:

done

if (max_index != -1) {
// atomic add
platform::CudaAtomicAdd(input_grad + max_index, output_grad[index]);
}
Contributor:

I don't think the atomic operation is necessary here.

Contributor Author:

I saw that pooling uses it and could not find out why, so I carried it over; that code has been reviewed by many people and it is still written that way.

Contributor:

The atomic operation is indeed not needed here; I had not looked at this part carefully before, sorry...
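
(Added note on why a plain write is safe here: every input position belongs to exactly one output element's group, so different threads always resolve to different max_index values. A sketch using the names quoted above:)

if (max_index != -1) {
  // max_index is unique per thread, so no atomicAdd is needed
  input_grad[max_index] += output_grad[index];
}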

Contributor Author:

done

float>);
REGISTER_OP_CPU_KERNEL(maxout_grad,
ops::MaxOutGradKernel<paddle::platform::CPUPlace,
float>);
Contributor:

You also need to register a kernel for the double type.
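
A hedged sketch of that registration (assuming the registration macro accepts several kernel types, as other operators in the codebase do):

REGISTER_OP_CPU_KERNEL(maxout_grad,
                       ops::MaxOutGradKernel<paddle::platform::CPUPlace, float>,
                       ops::MaxOutGradKernel<paddle::platform::CPUPlace, double>);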

Contributor Author:

done

float>);
REGISTER_OP_GPU_KERNEL(maxout_grad,
ops::MaxOutGradKernel<paddle::platform::GPUPlace,
float>);
Contributor:

The same as mentioned above.

Contributor Author:

done

@sweetsky0901 (Contributor Author) left a comment:

done

float>);
REGISTER_OP_GPU_KERNEL(maxout_grad,
ops::MaxOutGradKernel<paddle::platform::GPUPlace,
float>);
Contributor Author:

done

float>);
REGISTER_OP_CPU_KERNEL(maxout_grad,
ops::MaxOutGradKernel<paddle::platform::CPUPlace,
float>);
Contributor Author:

done

if (max_index != -1) {
// atomic add
platform::CudaAtomicAdd(input_grad + max_index, output_grad[index]);
}
Contributor Author:

I saw that pooling uses it and could not find out why, so I carried it over; that code has been reviewed by many people and it is still written that way.

for (int g = 0; g < groups && continue_match; ++g) {
if (input_data[data_idx + g * feat_len] == output_data[i]) {
max_index = data_idx + g * feat_len;
continue_match = false;
Contributor Author:

done

input_grad_data[input_idx] = 0;
if (input_data[input_idx] == output_data[output_idx]) {
input_grad_data[input_idx] += output_grad_data[output_idx];
continue_match = false;
Contributor Author:

done

@sweetsky0901 (Contributor Author) left a comment:

Done. I do hope all the issues can be found in a single review pass; having problems pointed out at different times and then fixing and resubmitting each one separately costs a lot of build time, at least half an hour per round even when things go quickly!

const int batch_size = input.dims()[0];
const int input_height = input.dims()[2];
const int input_width = input.dims()[3];
const int output_channels = output->dims()[1];
Contributor Author:

output is constructed that way outside (its channel count is computed from groups), so I did not check it again here.

if (max_index != -1) {
// atomic add
platform::CudaAtomicAdd(input_grad + max_index, output_grad[index]);
}
Contributor Author:

done

@sweetsky0901 sweetsky0901 merged commit 7ce06c8 into PaddlePaddle:develop Nov 20, 2017
@sweetsky0901 sweetsky0901 deleted the my_maxout_op branch November 20, 2017 13:34

Successfully merging this pull request may close these issues.

add maxout op
MaxOut Operator. (TODO, Not Assigned. Welcome people to implement it.)