
Simplify backward when inserting a sum operator to accumulate all duplicated variables #4703

Merged
1 commit merged on Oct 11, 2017

Conversation

Xreki
Contributor

@Xreki Xreki commented Oct 11, 2017

Since sum_op is now used instead of add_op, and it can accumulate a whole vector of variables at once, we no longer need to insert multiple operators to add the duplicated inputs one by one. All of the inputs can be accumulated by a single sum_op.

I made a simple test based on interp_op. See the diff of interp_op:

$ git diff interp_op.cc
diff --git a/paddle/operators/interp_op.cc b/paddle/operators/interp_op.cc
index d02b01c..08302b7 100644
--- a/paddle/operators/interp_op.cc
+++ b/paddle/operators/interp_op.cc
@@ -54,6 +54,10 @@ class InterpOp : public NetOp {
     // Out = MulOut + Y = (X - Y) * W + Y = X * W + Y * (1 - W)
     AppendOp(framework::OpRegistry::CreateOp("elementwise_add",
                                              {{"X", {mul_out}}, {"Y", {y}}},
+                                             {{"Out", {Output("SubOut")}}}, {}));
+
+    AppendOp(framework::OpRegistry::CreateOp("elementwise_add",
+                                             {{"X", {sub_out}}, {"Y", {y}}},
                                              {{"Out", {Output("Out")}}}, {}));
 
     CompleteAddOp(false);

The DebugString of interp_op:

147: Op(interp), inputs:{W[W], X[X], Y[Y]}, outputs:{MulOut[MulOut], Out[Out], SubOut[SubOut]}.
147: Op(elementwise_sub), inputs:{X[X], Y[Y]}, outputs:{Out[SubOut]}.
147: Op(elementwise_mul), inputs:{X[SubOut], Y[W]}, outputs:{Out[MulOut]}.
147: Op(elementwise_add), inputs:{X[MulOut], Y[Y]}, outputs:{Out[SubOut]}.
147: Op(elementwise_add), inputs:{X[SubOut], Y[Y]}, outputs:{Out[Out]}.

The DebugString of interp_grad_op in the previous implementation:

147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_add_grad), inputs:{Out[Out], Out@GRAD[Out@GRAD], X[SubOut], Y[Y]}, outputs:{X@GRAD[SubOut@GRAD], Y@GRAD[Y@GRAD]}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_add_grad), inputs:{Out[SubOut], Out@GRAD[SubOut@GRAD], X[MulOut], Y[Y]}, outputs:{X@GRAD[MulOut@GRAD], Y@GRAD[Y@GRAD]}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_mul_grad), inputs:{Out[MulOut], Out@GRAD[MulOut@GRAD], X[SubOut], Y[W]}, outputs:{X@GRAD[SubOut@GRAD], Y@GRAD[W@GRAD]}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_sub_grad), inputs:{Out[SubOut], Out@GRAD[SubOut@GRAD], X[X], Y[Y]}, outputs:{X@GRAD[X@GRAD], Y@GRAD[Y@GRAD]}.
147: Op(sum), inputs:{X[Y@GRAD@RENAME@0@0, Y@GRAD@RENAME@0@1]}, outputs:{Out[Y@GRAD@SHARED@0]}.
147: Op(sum), inputs:{X[Y@GRAD@RENAME@0@1, Y@GRAD@SHARED@0]}, outputs:{Out[Y@GRAD]}.
147: Op(sum), inputs:{X[SubOut@GRAD@RENAME@0@0, SubOut@GRAD@RENAME@0@1]}, outputs:{Out[SubOut@GRAD]}.

The DebugString of interp_grad_op in this PR:

147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_add_grad), inputs:{Out[Out], Out@GRAD[Out@GRAD], X[SubOut], Y[Y]}, outputs:{X@GRAD[SubOut@GRAD], Y@GRAD[Y@GRAD]}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_add_grad), inputs:{Out[SubOut], Out@GRAD[SubOut@GRAD], X[MulOut], Y[Y]}, outputs:{X@GRAD[MulOut@GRAD], Y@GRAD[Y@GRAD]}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_mul_grad), inputs:{Out[MulOut], Out@GRAD[MulOut@GRAD], X[SubOut], Y[W]}, outputs:{X@GRAD[SubOut@GRAD], Y@GRAD[W@GRAD]}.
147: Op(plain_net), inputs:{}, outputs:{}.
147: Op(elementwise_sub_grad), inputs:{Out[SubOut], Out@GRAD[SubOut@GRAD], X[X], Y[Y]}, outputs:{X@GRAD[X@GRAD], Y@GRAD[Y@GRAD]}.
147: Op(sum), inputs:{X[Y@GRAD@RENAME@0@0, Y@GRAD@RENAME@0@1, Y@GRAD@RENAME@0@2]}, outputs:{Out[Y@GRAD]}.
147: Op(sum), inputs:{X[SubOut@GRAD@RENAME@0@0, SubOut@GRAD@RENAME@0@1]}, outputs:{Out[SubOut@GRAD]}.

Contributor

@luotao1 luotao1 left a comment


LGTM. much simpler!

@luotao1 luotao1 merged commit 1cafe7b into PaddlePaddle:develop Oct 11, 2017
@Xreki Xreki deleted the core_optimize_backward branch November 14, 2018 02:45