Using LODTensor instead of Tensor in every operator. #3717
Just to add a link to #3693, where I have a protobuf message description of the LODTensor's shape.
Searching the current operators with ``grep -n '<Tensor>' `find paddle/operators/ -name '*_op.h' -o -name '*_op.cc'` ``, we found 128 usages. I just copy the usage from minus_op.h and minus_op.cc here:

```cpp
// In the Compute method
auto* left_tensor = context.Input<framework::Tensor>("X");
auto* right_tensor = context.Input<framework::Tensor>("Y");
auto* out_tensor = context.Output<framework::Tensor>("Out");
out_tensor->mutable_data<T>(context.GetPlace());
auto& dev = context.GetEigenDevice<Place>();
framework::EigenVector<T>::Flatten(*out_tensor).device(dev) =
    framework::EigenVector<T>::Flatten(*left_tensor) -
    framework::EigenVector<T>::Flatten(*right_tensor);
```

```cpp
// In the InferShape method
auto* left_tensor = ctx.Input<framework::Tensor>("X");
auto* right_tensor = ctx.Input<framework::Tensor>("Y");
PADDLE_ENFORCE_EQ(
    framework::product(left_tensor->dims()),
    framework::product(right_tensor->dims()),
    "Minus operator must take two tensor with same num of elements");
ctx.Output<framework::Tensor>("Out")->Resize(left_tensor->dims());
```

On the other hand, only
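The InferShape and Compute logic above can be sketched standalone (a minimal sketch with a hypothetical `Minus` helper, using plain `std::vector<float>` in place of the framework's `Tensor` and Eigen expressions):

```cpp
#include <stdexcept>
#include <vector>

// Stand-in for the minus op: enforce equal element counts (the
// InferShape check), then subtract element-wise (the Compute step).
std::vector<float> Minus(const std::vector<float>& left,
                         const std::vector<float>& right) {
  // Mirrors PADDLE_ENFORCE_EQ on framework::product(dims()).
  if (left.size() != right.size()) {
    throw std::invalid_argument(
        "Minus operator must take two tensor with same num of elements");
  }
  std::vector<float> out(left.size());
  for (size_t i = 0; i < left.size(); ++i) out[i] = left[i] - right[i];
  return out;
}
```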
Paddle's existing solution

Paddle has implemented highly optimized recurrent networks which can handle variable-length sequences without padding, and I believe we can learn something from the original design. The solution is as follows: Paddle uses Argument as the input and output of a Layer, and Argument is a struct holding both the value and the sequence info.

Different layers handle Argument differently. A layer that does not need sequence info just uses the data value and passes the sequence info along if necessary; a layer that does need sequence info uses both the value and the sequence info. Let's take SequencePoolLayer and FullyConnectedLayer as examples.

The forward method of SequencePoolLayer:

The forward method of FullyConnectedLayer:

SequencePoolLayer uses the sequence info, whereas FullyConnectedLayer just passes the sequence info to the next layer. For the transmission of sequence info, every derived layer class has to call the forward method of the base class, just like FullyConnectedLayer does.
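The two behaviors can be sketched side by side (a minimal sketch with hypothetical names: `SeqArg` stands in for Argument, `seq_starts` for its sequence start positions, and `ScaleAndForward` for an fc-like layer):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical Argument: offsets {0, 2, 5} mean two sequences,
// covering elements [0, 2) and [2, 5).
struct SeqArg {
  std::vector<float> value;
  std::vector<size_t> seq_starts;
};

// A pooling layer consumes the sequence info: one max per sequence.
std::vector<float> SeqMaxPool(const SeqArg& in) {
  std::vector<float> out;
  for (size_t i = 0; i + 1 < in.seq_starts.size(); ++i) {
    float m = in.value[in.seq_starts[i]];
    for (size_t j = in.seq_starts[i]; j < in.seq_starts[i + 1]; ++j)
      m = std::max(m, in.value[j]);
    out.push_back(m);
  }
  return out;
}

// An fc-like layer only transforms values and forwards the sequence info.
SeqArg ScaleAndForward(const SeqArg& in, float w) {
  SeqArg out;
  out.seq_starts = in.seq_starts;  // pass the sequence info unchanged
  for (float v : in.value) out.value.push_back(v * w);
  return out;
}
```

With value `{1, 2, 3, 5, 4}` and starts `{0, 2, 5}`, `SeqMaxPool` yields `{2, 5}`, while `ScaleAndForward` leaves the sequence structure untouched.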
In the Python API, there are actually 4 * 3 = 12 kinds of input data. Four data types:
Three sequence types:
And DataProviderConverter is defined to convert Python input data to a C++ Argument.

New solution

If we follow Paddle's former design, we can have an LOD Tensor like this:

After we replace Tensor in the current code with LODTensor, only sequence-related Ops will handle both the lod_ and data_ fields in their InferShape and Run methods; other Ops will handle the data_ field and pass the lod_ field along. LODTensor will be exposed to Python, so users can set sequence info and data directly. But for consistency with the v2 API, we need to implement a converter that takes a data reader in and produces a LODTensor. By using composition, we can unify the data type and avoid potential type deduction. The cost is an additional LOD pointer field, which takes 4 bytes: if we have 1000 tensors that contain no sequence info, it will use 4KB more memory.
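The composition idea can be sketched as follows (a minimal hypothetical layout, not the actual framework code: `LODTensorSketch` and `Scale` are illustrative names, and a `shared_ptr` stands in for the LOD pointer field):

```cpp
#include <memory>
#include <vector>

// Hypothetical LOD: one vector of offsets per nesting level.
using LOD = std::vector<std::vector<size_t>>;

// Composition: every tensor carries an optional LOD pointer. A null
// lod_ means "no sequence info", so a plain tensor pays only for the
// extra pointer field mentioned in the comment above.
struct LODTensorSketch {
  std::shared_ptr<LOD> lod_;  // shared, so pass-through ops can alias it
  std::vector<float> data_;
};

// A non-sequence op touches only data_ and forwards lod_ unchanged.
LODTensorSketch Scale(const LODTensorSketch& in, float w) {
  LODTensorSketch out;
  out.lod_ = in.lod_;  // pass the sequence info along
  for (float v : in.data_) out.data_.push_back(v * w);
  return out;
}
```

Note the pointer cost is 4 bytes on a 32-bit build as stated; on a 64-bit build it would be 8 bytes per tensor.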
All layers carry sequence info. In the example above, the sequence info of fc's output is, by default, the same as that of its input; see:

The purpose of this design: consider, for example, maxlayer1 -> fc -> maxlayer2.

So every Op needs to carry both the lod_ and data_ fields.
@luotao1 Thanks for pointing that out! I will update my comments accordingly.
@wangkuiyi @reyoung @Superjom Here is an example:

For these particular Ops, e.g. Op3, the input must be a LODTensor and the output data type must be a Tensor. For the other Ops, e.g. Op1 and Op2, the output type stays consistent with the input type: if the input is a Tensor, the output is a Tensor; if the input is a LODTensor, the output is a LODTensor. So we need an InferType step, and during InferType we have to check the type of the input data.

Moreover, InferType must finish before InferShape, because when InferShape sets a datum's size it needs to know what the datum's type is. But the type information is actually only available at run time, from the type of the data the user feeds in, while a variable's GetMutable interface must fix a type at compile time. Isn't that a contradiction?
There is no need for an InferType step. From the earlier discussion with @wangkuiyi @Superjom, we are going to change the inputs and outputs of all Ops to
So there are three things to do now:

@wangkuiyi @reyoung @Superjom Is my understanding correct?
Since we try to use `LODTensor` instead of `Tensor` as our basic data type for the Input/Output variables of operators, we should:

1. Use `LODTensor` instead of `Tensor` in the current code. For example, in the current implementation, like here, we should use `Input<LODTensor>()` instead of `Input<Tensor>()`.
2. Decide how an output inherits `LOD` information from one of the operator's inputs. Via `OpInfo`? E.g. register `Output["out"] inherits input["X"]'s lod information`?
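One possible shape for that registration, sketched minimally (hypothetical names throughout: `LodInheritInfo`, `LodRegistry`, and `RegisterLodInherit` are not existing Paddle APIs, just an illustration of recording "output inherits input's lod" per op type):

```cpp
#include <map>
#include <string>

// Hypothetical OpInfo-style record: for each op type, remember which
// input each output inherits its LOD information from, so the
// framework could copy lod info automatically after the op runs.
struct LodInheritInfo {
  std::map<std::string, std::string> inherits;  // output name -> input name
};

std::map<std::string, LodInheritInfo>& LodRegistry() {
  static std::map<std::string, LodInheritInfo> registry;
  return registry;
}

void RegisterLodInherit(const std::string& op_type, const std::string& out,
                        const std::string& in) {
  LodRegistry()[op_type].inherits[out] = in;
}
```

For example, `RegisterLodInherit("fc", "Out", "X")` would declare that fc's `Out` takes its lod from input `X`.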