Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix reduce min max backward bug #5651

Merged
merged 6 commits into from
Jul 31, 2021
Merged

Conversation

simonJJJ
Copy link
Contributor

When I referred to some implementations in reduce_min backward, I found there is different behaviour between oneflow and pytorch:
image
So I fix it!
image

const auto& bcast_like =
JUST(OpInterpUtil::Dispatch<Tensor>(*bcast_like_op_, {output, input}, bcast_attrs));
const auto& bcast_eq =
JUST(OpInterpUtil::Dispatch<Tensor>(*bcast_equal_op_, {input, bcast_like}));
const auto& cast_like = JUST(OpInterpUtil::Dispatch<Tensor>(*cast_like_op_, {bcast_eq, input}));
const auto& reduce_sum_ =
JUST(OpInterpUtil::Dispatch<Tensor>(*reduce_sum_op_, {cast_like}, reduce_sum_attrs));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里用functional接口重写一下吧,

const auto& reduce_sum_ = JUST(functional::ReduceSum(cast_like, ctx->axis, ctx->keepdims));

其他地方也一样改一下,CastLike目前还没有functional接口,也一起加一个。

@simonJJJ simonJJJ requested a review from hjchen2 July 29, 2021 05:10
@@ -19,6 +19,7 @@ limitations under the License.
#include "oneflow/core/framework/op_expr.h"
#include "oneflow/core/framework/op_expr_helper.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op_expr_helper.h 这个头文件可以删了

};

Maybe<void> ReduceMaxOrMinOp::Init(const OpExpr& op) {
const auto* fw_op_expr = dynamic_cast<const UserOpExpr*>(&op);
CHECK_NOTNULL_OR_RETURN(fw_op_expr);
base_attrs_ = MakeAttrMapFromUserOpConf(fw_op_expr->proto());
const std::string& op_name = fw_op_expr->op_name();
bcast_like_op_ =
JUST(op_expr_helper::BroadcastLikeOp(/*axis=*/{-1}, GradientOpName(op_name + "_bcast_like")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

顺便把这个文件中其他地方的op_expr_helper::XXXOp也一起改成functional的吧

const auto& bcast_eq = JUST(functional::BroadcastEqual(input, bcast_like));
const auto& cast_like = JUST(functional::CastLike(bcast_eq, input));
const auto& reduce_sum_ = JUST(functional::ReduceSum(cast_like, ctx->axis, ctx->keepdims));
const auto& bcast_div_ = JUST(functional::BroadcastDiv(dy, reduce_sum_));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bcast_div_ -> bcast_div

@oneflow-ci-bot oneflow-ci-bot self-requested a review July 31, 2021 13:55
@github-actions
Copy link
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 139.2ms (= 6958.9ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 126.8ms (= 6338.5ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 139.2ms / 126.8ms)

PyTorch resnet50 time: 84.3ms (= 4217.3ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.1ms (= 3705.4ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.14 (= 84.3ms / 74.1ms)

PyTorch resnet50 time: 58.8ms (= 2937.9ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 47.2ms (= 2360.5ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.24 (= 58.8ms / 47.2ms)

PyTorch resnet50 time: 49.1ms (= 2453.5ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.6ms (= 2078.5ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.18 (= 49.1ms / 41.6ms)

PyTorch resnet50 time: 42.6ms (= 2128.2ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 46.0ms (= 2302.3ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 0.92 (= 42.6ms / 46.0ms)

@oneflow-ci-bot oneflow-ci-bot merged commit c620c55 into master Jul 31, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the fix_reduce_minmax_backward branch July 31, 2021 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants