
Add dropout operator. #3817

Merged 10 commits into PaddlePaddle:develop on Sep 19, 2017

Conversation

xinghai-sun (Contributor):

Resolves #3816.

xinghai-sun (Author):

Dropout behaves differently in train and test modes; e.g., in test mode, dropout is usually turned off. How can the dropout operator know whether it is in test mode or train mode?
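
One possible approach (a sketch only; the attribute name "is_training" and the helper functions are assumptions for illustration, not necessarily the final interface) is to expose the phase as an operator attribute and branch in the kernel:

AddAttr<bool>("is_training", "True for the training phase, false for test.")
    .SetDefault(true);

// In the kernel (sketch): training applies the sampled mask; test mode
// passes the input through (inverted dropout) or scales it by
// (1 - dropout_prob) (classic dropout).
if (context.GetAttr<bool>("is_training")) {
  ApplyRandomMask(x, mask, y);  // hypothetical helper
} else {
  PassThroughOrScale(x, y);     // hypothetical helper
}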

DropoutOpMaker(framework::OpProto *proto,
               framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddAttr<float>("dropout_prob", "Dropout probability.").SetDefault(.5f);

Reviewer (Contributor):

The comment needs more detail: is an element set to 0 with probability dropout_prob, or with probability (1 - dropout_prob)? This needs to be explained clearly.

xinghai-sun (Author):

Done.
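
For reference, the clarified attribute description might read along these lines (the wording is illustrative):

AddAttr<float>("dropout_prob",
               "Probability that each element is set to zero, i.e. an element "
               "is dropped with probability dropout_prob and kept with "
               "probability (1 - dropout_prob).")
    .SetDefault(.5f);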

PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X"), "Input(X) must not be null.");
auto dims = ctx.Input<Tensor>("X")->dims();
ctx.Output<Tensor>("Out")->Resize(dims);
ctx.Output<Tensor>("Mask")->Resize(dims);

Reviewer (Contributor):

The attribute context.GetAttr<float>("dropout_prob") needs to be checked: it must be greater than 0.

xinghai-sun (Author):

Done.
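
A check along these lines would cover it (a sketch; the exact macro and message are assumptions):

// Validate the attribute in InferShape (sketch): dropout_prob must be a
// valid probability.
float dropout_prob = ctx.GetAttr<float>("dropout_prob");
PADDLE_ENFORCE(dropout_prob >= 0.0f && dropout_prob <= 1.0f,
               "Attribute dropout_prob must be in [0, 1].");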

  y_data[i] = 0;
} else {
  mask_data[i] = 1;
  y_data[i] = (1 - dropout_prob) * x_data[i];

Reviewer (Contributor):

For training this should be: y_data[i] = x_data[i]

xinghai-sun (Author):

Done.
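
After the fix, the kept branch passes the activation through unscaled during training; classic dropout defers the (1 - dropout_prob) scaling to inference:

} else {
  mask_data[i] = 1;
  y_data[i] = x_data[i];  // no (1 - dropout_prob) scaling in training
}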

const T* x_data = x->data<T>();

float dropout_prob = context.op_.GetAttr<float>("dropout_prob");
int seed = context.op_.GetAttr<int>("seed");

Reviewer (Contributor):

context.op_.GetAttr -> context.GetAttr

xinghai-sun (Author):

Done.

y->mutable_data<T>(context.GetPlace());

float dropout_prob = context.op_.GetAttr<float>("dropout_prob");
int seed = context.op_.GetAttr<int>("seed");

Reviewer (Contributor):

context.op_.GetAttr -> context.GetAttr

xinghai-sun (Author):

Done.

template <typename T>
struct MaskGenerator {
  float dropout_prob_;
  int seed_;

Reviewer (Contributor):

Struct data members should not have the trailing _; that convention is for class data members:

https://google.github.io/styleguide/cppguide.html#Variable_Names

xinghai-sun (Author):

Done.

template <typename T>
struct MaskGenerator {
  float dropout_prob_;
  int seed_;

Reviewer (Contributor):

Please follow the code style: dropout_prob_ --> dropout_prob, seed_ --> seed.

DropoutOpMaker(framework::OpProto *proto,
               framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddAttr<float>("dropout_prob", "Probability for dropping out units.")

Reviewer (Contributor):

The template type is needed for the attr:

template <typename AttrType>
class DropoutOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  DropoutOpMaker(framework::OpProto *proto,
                 framework::OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddAttr<AttrType>("dropout_prob", "Probability for dropping out units.")

xinghai-sun (Author):

Done.

T* y_data = y->mutable_data<T>(context.GetPlace());
const T* x_data = x->data<T>();

float dropout_prob = context.GetAttr<float>("dropout_prob");

Reviewer (Contributor):

The template type is needed for the attr.

xinghai-sun (Author):

As discussed above.

T* mask_data = mask->mutable_data<T>(context.GetPlace());
thrust::transform(index_sequence_begin, index_sequence_begin + size,
                  thrust::device_ptr<T>(mask_data),
                  MaskGenerator<T>(dropout_prob, seed));

Reviewer (Contributor):

Maybe the CPU kernel can be implemented in the same way; std::transform can be used.
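
For example (a self-contained sketch with illustrative names, not this PR's code):

#include <algorithm>
#include <random>

// Sketch: generate the dropout mask on the CPU with std::transform and
// apply it element-wise, mirroring the thrust::transform GPU path.
void DropoutForwardCPU(const float* x_data, float* mask_data, float* y_data,
                       int size, float dropout_prob, int seed) {
  std::minstd_rand engine(seed);
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  // Each element is dropped (mask = 0) with probability dropout_prob.
  std::transform(x_data, x_data + size, mask_data, [&](float) {
    return dist(engine) < dropout_prob ? 0.0f : 1.0f;
  });
  // y = x * mask.
  std::transform(x_data, x_data + size, mask_data, y_data,
                 [](float x, float m) { return x * m; });
}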

auto dims = grad_x->dims();
int size = static_cast<int>(framework::product(dims));
auto new_dims = framework::make_ddim({dims[0], size / dims[0]});
auto M = EigenMatrix<T>::From(*mask, new_dims);

Reviewer (Contributor):

EigenMatrix<int>

xinghai-sun (Author):

I think any of int, float, or T would be OK for the mask data type. T might be better because it avoids an implicit type conversion when the mask is multiplied with X.

Considering memory usage, bool (a single byte) or float16 (2 bytes) would be preferable, but that incurs the following error:

129: .terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
129:   what():  This type of tensor cannot be expose to Python at [/home/work/sunxinghai/git/Paddle/paddle/pybind/tensor_py.h:36]

What do you think?

Reviewer (Contributor):

Sorry, my comment was wrong; just use type T. The error occurs because the tensor in pybind does not support setting and getting tensors of bool type.

qingqing01 previously approved these changes on Sep 11, 2017.
qingqing01 added this to Doing in Port Operators on Sep 14, 2017.
// resize
auto dims = ctx.Input<Tensor>("X")->dims();
ctx.Output<Tensor>("Out")->Resize(dims);
ctx.Output<Tensor>("Mask")->Resize(dims);

Reviewer (Contributor):

After LoDTensor is merged in, the Output in InferShape needs to be changed to Output<framework::LoDTensor>.

xinghai-sun (Author):

Done.
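
That is, the resize calls become:

ctx.Output<framework::LoDTensor>("Out")->Resize(dims);
ctx.Output<framework::LoDTensor>("Mask")->Resize(dims);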

PADDLE_ENFORCE_EQ(x_dims, mask_dims,
                  "Dimensions of Input(X) and Mask must be the same.");
// resize
auto *x_grad = ctx.Output<Tensor>(framework::GradVarName("X"));

Reviewer (Contributor):

After LoDTensor is merged in, the Output in InferShape needs to be changed to Output<framework::LoDTensor>.

xinghai-sun (Author):

Done.
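
That is:

auto *x_grad = ctx.Output<framework::LoDTensor>(framework::GradVarName("X"));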

import unittest
import numpy as np
from gradient_checker import GradientChecker, create_op
from op_test_util import OpTestMeta

Reviewer (Contributor):

This needs to use the new unit test framework.

xinghai-sun (Author):

Done.

qingqing01 previously approved these changes on Sep 19, 2017.
xinghai-sun merged commit c7f91a9 into PaddlePaddle:develop on Sep 19, 2017.
xinghai-sun deleted the dropout branch on September 19, 2017 at 09:31.
qingqing01 moved this from Doing to Done in Port Operators on Sep 20, 2017.
heavengate pushed a commit to heavengate/Paddle that referenced this pull request Aug 16, 2021