Add dropout operator. #3817
Dropout behaves differently in train and test mode; for example, in test mode dropout is usually turned off. How can the dropout operator know whether it is in test mode or train mode?
paddle/operators/dropout_op.cc
Outdated
DropoutOpMaker(framework::OpProto *proto,
               framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddAttr<float>("dropout_prob", "Dropout probability.").SetDefault(.5f);
This needs a more detailed comment: are units set to 0 with probability dropout_prob, or with probability (1 - dropout_prob)? This needs to be explained clearly.
Done.
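To make the question above concrete, here is a minimal standalone sketch of one possible reading, assuming dropout_prob is the probability that a unit is set to 0 (so a unit is kept with probability 1 - dropout_prob). MakeDropoutMask is a hypothetical helper written for illustration, not code from this PR, and the [0, 1] range check is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the PR): builds a 0/1 mask in which each
// entry is 0 with probability dropout_prob and 1 otherwise.
std::vector<float> MakeDropoutMask(std::size_t size, float dropout_prob,
                                   unsigned seed) {
  // Assumption: a valid dropout probability lies in [0, 1].
  assert(dropout_prob >= 0.0f && dropout_prob <= 1.0f);
  std::mt19937 engine(seed);
  // drop(engine) returns true with probability dropout_prob.
  std::bernoulli_distribution drop(dropout_prob);
  std::vector<float> mask(size);
  for (std::size_t i = 0; i < size; ++i) {
    mask[i] = drop(engine) ? 0.0f : 1.0f;
  }
  return mask;
}
```

Under this reading, dropout_prob = 0 keeps every unit and dropout_prob = 1 drops every unit; the opposite convention would simply swap the two branches.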
paddle/operators/dropout_op.cc
Outdated
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X"), "Input(X) must not be null.");
auto dims = ctx.Input<Tensor>("X")->dims();
ctx.Output<Tensor>("Out")->Resize(dims);
ctx.Output<Tensor>("Mask")->Resize(dims);
Need to check that the attribute context.GetAttr<float>("dropout_prob") is greater than 0.
Done.
paddle/operators/dropout_op.h
Outdated
  y_data[i] = 0;
} else {
  mask_data[i] = 1;
  y_data[i] = (1 - dropout_prob) * x_data[i];
For training it should be: y_data[i] = x_data[i]
Done.
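The requested fix (no scaling during training) fits the non-inverted dropout convention, in which kept activations pass through unchanged during training and the output is scaled by (1 - dropout_prob) at inference. The sketch below assumes that convention. The is_training flag and the function name are hypothetical, since this thread leaves open how the operator learns which mode it is in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone forward pass (not the PR's kernel).
// Training:  y = x * mask, with no rescaling of the kept units.
// Inference: y = x * (1 - dropout_prob), so the expected magnitude of the
//            output matches what the next layer saw during training.
std::vector<float> DropoutForward(const std::vector<float>& x,
                                  const std::vector<float>& mask,
                                  float dropout_prob, bool is_training) {
  // The mask is only read in training mode.
  assert(!is_training || mask.size() == x.size());
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = is_training ? x[i] * mask[i] : x[i] * (1.0f - dropout_prob);
  }
  return y;
}
```

With x = {2, 4, 6}, mask = {1, 0, 1} and dropout_prob = 0.5, training gives {2, 0, 6} and inference gives {1, 2, 3}.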
paddle/operators/dropout_op.h
Outdated
const T* x_data = x->data<T>();

float dropout_prob = context.op_.GetAttr<float>("dropout_prob");
int seed = context.op_.GetAttr<int>("seed");
context.op_.GetAttr -> context.GetAttr
Done.
paddle/operators/dropout_op.h
Outdated
y->mutable_data<T>(context.GetPlace());

float dropout_prob = context.op_.GetAttr<float>("dropout_prob");
int seed = context.op_.GetAttr<int>("seed");
context.op_.GetAttr -> context.GetAttr
Done.
paddle/operators/dropout_op.h
Outdated
template <typename T>
struct MaskGenerator {
  float dropout_prob_;
  int seed_;
There is no trailing _ for the data members of a struct: https://google.github.io/styleguide/cppguide.html#Variable_Names
Done.
paddle/operators/dropout_op.h
Outdated
template <typename T>
struct MaskGenerator {
  float dropout_prob_;
  int seed_;
Please follow the code style: dropout_prob_ --> dropout_prob, seed_ --> seed.
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dropout.
paddle/operators/dropout_op.cc
Outdated
DropoutOpMaker(framework::OpProto *proto,
               framework::OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddAttr<float>("dropout_prob", "Probability for dropping out units.")
The template type is needed for attr.
template <typename AttrType>
class DropoutOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  DropoutOpMaker(framework::OpProto *proto,
                 framework::OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddAttr<AttrType>("dropout_prob", "Probability for dropping out units.")
Done.
paddle/operators/dropout_op.h
Outdated
T* y_data = y->mutable_data<T>(context.GetPlace());
const T* x_data = x->data<T>();

float dropout_prob = context.GetAttr<float>("dropout_prob");
template type is needed for attr.
As discussed above.
paddle/operators/dropout_op.h
Outdated
T* mask_data = mask->mutable_data<T>(context.GetPlace());
thrust::transform(index_sequence_begin, index_sequence_begin + size,
                  thrust::device_ptr<T>(mask_data),
                  MaskGenerator<T>(dropout_prob, seed));
Maybe the CPU kernel can be implemented the same way, using std::transform.
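A rough sketch of what that suggestion could look like on plain std::vector buffers, for illustration only. DropoutCpu is a hypothetical name, the tensors are replaced by vectors, and the sketch assumes a unit is dropped with probability dropout_prob. Unlike the thrust version, which maps an index sequence to mask values, this one advances a single engine sequentially, so it will not reproduce the GPU mask for the same seed.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical CPU variant (not the PR's kernel): fills mask with 0/1 draws
// and writes y = x * mask, both via std::transform.
template <typename T>
void DropoutCpu(const std::vector<T>& x, float dropout_prob, int seed,
                std::vector<T>* mask, std::vector<T>* y) {
  std::mt19937 engine(seed);
  // drop(engine) returns true with probability dropout_prob.
  std::bernoulli_distribution drop(dropout_prob);
  mask->resize(x.size());
  y->resize(x.size());
  // Unary transform: the input value is ignored, each output is a fresh draw.
  std::transform(x.begin(), x.end(), mask->begin(),
                 [&](const T&) { return drop(engine) ? T(0) : T(1); });
  // Binary transform: elementwise product of the input and the mask.
  std::transform(x.begin(), x.end(), mask->begin(), y->begin(),
                 std::multiplies<T>());
}
```

Keeping the mask and the product as two separate transforms mirrors the GPU snippet, where the mask is generated first and stored for the backward pass.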
paddle/operators/dropout_op.h
Outdated
auto dims = grad_x->dims();
int size = static_cast<int>(framework::product(dims));
auto new_dims = framework::make_ddim({dims[0], size / dims[0]});
auto M = EigenMatrix<T>::From(*mask, new_dims);
EigenMatrix<int>
I think any of int, float, or T for the mask data type is OK. T might be better because it avoids implicit type conversion when the mask is multiplied with X.
When considering memory usage, it would be better to use a bool (single byte) or float16 (2 bytes), but that will incur the following error:
129: .terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
129: what(): This type of tensor cannot be expose to Python at [/home/work/sunxinghai/git/Paddle/paddle/pybind/tensor_py.h:36]
What do you think?
Sorry, my comment was wrong. Just use type T. The error for the bool type occurs because the tensor binding in pybind does not support setting and getting tensors of bool type.
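A small sketch of why T is convenient for the mask, assuming the backward pass is an elementwise product of the output gradient and the saved mask. DropoutGrad is a hypothetical name and the buffers are plain vectors rather than tensors. Because the mask has the same element type as the gradient, the multiplication needs no conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical backward pass (not the PR's kernel): grad_x = grad_out * mask.
// The mask shares element type T with the gradient, so no cast is involved.
template <typename T>
std::vector<T> DropoutGrad(const std::vector<T>& grad_out,
                           const std::vector<T>& mask) {
  assert(grad_out.size() == mask.size());
  std::vector<T> grad_x(grad_out.size());
  for (std::size_t i = 0; i < grad_out.size(); ++i) {
    // Dropped units (mask 0) receive zero gradient, kept units pass it through.
    grad_x[i] = grad_out[i] * mask[i];
  }
  return grad_x;
}
```

With an int or bool mask the same loop would still compile, but each product would first convert the mask element to T.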
paddle/operators/dropout_op.cc
Outdated
// resize
auto dims = ctx.Input<Tensor>("X")->dims();
ctx.Output<Tensor>("Out")->Resize(dims);
ctx.Output<Tensor>("Mask")->Resize(dims);
After LoDTensor is merged, the Output in InferShape needs to be changed to: Output<framework::LoDTensor>
Done.
paddle/operators/dropout_op.cc
Outdated
PADDLE_ENFORCE_EQ(x_dims, mask_dims,
                  "Dimensions of Input(X) and Mask must be the same.");
// resize
auto *x_grad = ctx.Output<Tensor>(framework::GradVarName("X"));
After LoDTensor is merged, the Output in InferShape needs to be changed to: Output<framework::LoDTensor>
Done.
import unittest
import numpy as np
from gradient_checker import GradientChecker, create_op
from op_test_util import OpTestMeta
The new unit test framework needs to be used here.
Done.
Change type of dropout_prob to template typename.
Resolve #3816