Skip to content

Commit

Permalink
Update new_op_cn.md
Browse files Browse the repository at this point in the history
fix  #132
  • Loading branch information
tink2123 committed Oct 11, 2018
1 parent 9b15d29 commit f2972fb
Showing 1 changed file with 9 additions and 6 deletions.
15 changes: 9 additions & 6 deletions doc/fluid/dev/new_op_cn.md
Original file line number Diff line number Diff line change
Expand Up @@ -150,8 +150,9 @@ class MulOp : public framework::OperatorWithKernel {
protected:
void InferShape(const framework::InferShapeContext &ctx) const override {
auto dim0 = ctx.Input<Tensor>("X")->dims();
auto dim1 = ctx.Input<Tensor>("Y")->dims();
//never use Input or Output if you want a to get a LoDTensor.
auto dim0 = ctx.Input<LoDTensor>("X")->dims();
auto dim1 = ctx.Input<LoDTensor>("Y")->dims();
PADDLE_ENFORCE_EQ(dim0.size(), 2,
"input X(%s) should be a tensor with 2 dims, a matrix",
ctx.op_.Input("X"));
Expand All @@ -161,7 +162,7 @@ class MulOp : public framework::OperatorWithKernel {
PADDLE_ENFORCE_EQ(
dim0[1], dim1[0],
"First matrix's width must be equal with second matrix's height.");
ctx.Output<Tensor>("Out")->Resize({dim0[0], dim1[1]});
ctx.Output<LoDTensor>("Out")->Resize({dim0[0], dim1[1]});
}
};
```
Expand Down Expand Up @@ -201,16 +202,18 @@ MulOp(const std::string &type, const framework::VariableNameMap &inputs,
- 与`InferShapeContext`相比,`ExecutionContext`增加了设备类型,同样可获取到输入输出和属性参数。
- `Compute`函数里实现`OpKernel`的具体计算逻辑。
Op的输入和输出可分别通过ExecutionContext::Input()和ExecutionContext::Output()获得。注意:若op的输入/输出的变量类型是LoDTensor(fluid默认所有的Tensor默认都是LoDTensor类型),请写成ExecutionContext::Input()和ExecutionContext::Output(),不要写ExecutionContext::Input()和ExecutionContext::Output()。因为若实际的变量类型为SelectedRows,Input()和Output()方法会将SelectedRows类型特化为Tensor,导致潜在的错误。
下面是 `MulKernel` `Compute`的实现:
```cpp
template <typename DeviceContext, typename T>
class MulKernel : public framework::OpKernel {
public:
void Compute(const framework::ExecutionContext& context) const override {
auto* X = context.Input<Tensor>("X");
auto* Y = context.Input<Tensor>("Y");
auto* Z = context.Output<Tensor>("Out");
auto* X = context.Input<LoDTensor>("X");
auto* Y = context.Input<LoDTensor>("Y");
auto* Z = context.Output<LoDTensor>("Out");
Z->mutable_data<T>(context.GetPlace());
auto& device_context = context.template device_context<DeviceContext>();
math::matmul<DeviceContext, T>(*X, false, *Y, false, 1, Z, 0, device_context);
Expand Down

0 comments on commit f2972fb

Please sign in to comment.