
MKLDNN layout: Support for sum operator #11102

Merged

Conversation

mozga-intel (Contributor) commented May 31, 2018

Waiting for support of the MKLDNN layout.

Please have a look at this sum operator supported by the MKLDNN layout and assess whether we can keep this design of the code.

This code uses the layout implementation that is available in the last pull request. Therefore some of the functions in this code are not available in this pull request, but are available here.

This version of the code can be merged into the main branch once the pull request with the layout is accepted.

The idea of splitting up the long code was proposed by @luotao1.

This pull request is related to #11040.

@mozga-intel mozga-intel requested a review from luotao1 May 31, 2018 14:42
@mozga-intel mozga-intel mentioned this pull request May 31, 2018
@mozga-intel mozga-intel force-pushed the mozga-intel/Sum_mkldnn_layout branch from da0b3f4 to 6c7c210 Compare June 1, 2018 10:40
@mozga-intel mozga-intel force-pushed the mozga-intel/Sum_mkldnn_layout branch 7 times, most recently from d99777e to fcd0dd1 Compare June 12, 2018 08:42
mozga-intel (Contributor Author):

@luotao1, @tensor-tang The code is ready for the code-review process.

mozga-intel (Contributor Author) commented Jun 12, 2018

@tensor-tang This creates an out-of-place sum_primitive_desc for the sum of n inputs, each multiplied by a scale, with the resulting output_desc memory descriptor.

This code supports only LoDTensor as the input format; otherwise I have rolled back to the naive version of this code. In the near future I will implement support for the other case.
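For illustration, a minimal sketch of how such an out-of-place sum primitive descriptor is built with the MKL-DNN 0.x API used here (the engine, dims, number of inputs, and nchw source format are illustrative assumptions, not the PR's exact code):

    mkldnn::engine eng(mkldnn::engine::cpu, 0);
    mkldnn::memory::dims tz = {2, 3, 4, 5};  // example n, c, h, w
    // One memory primitive descriptor per input (two inputs here).
    auto src_md = mkldnn::memory::desc(tz, mkldnn::memory::data_type::f32,
                                       mkldnn::memory::format::nchw);
    std::vector<mkldnn::memory::primitive_desc> srcs_mpd(
        2, mkldnn::memory::primitive_desc(src_md, eng));
    // Per-input scales; 1.0f everywhere gives a plain sum.
    std::vector<float> scales = {1.0f, 1.0f};
    // format::any lets the library pick the best destination layout.
    auto dst_md = mkldnn::memory::desc(tz, mkldnn::memory::data_type::f32,
                                       mkldnn::memory::format::any);
    // Out-of-place sum: dst = sum_i scales[i] * srcs[i].
    auto sum_pd = mkldnn::sum::primitive_desc(dst_md, scales, srcs_mpd);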

tensor-tang (Contributor) left a comment:

Do we have any unit test for these changes, since the use_mkldnn flag was added?

if (src_tz.size() == 1 && (input_format == memory::format::nchw ||
                           input_format == memory::format::nhwc)) {
  input_format = memory::format::x;
}
Contributor:

What about the else case? Then input_format is not set; should it be undef or any?

mozga-intel (Contributor Author) commented Jun 15, 2018:

Basically, the format is set at least once; I expect the rank of the input to be something other than 1 or 2, mainly 4. The code is triggered in this place.
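In other words, a commented restatement of the guard above (the comments are interpretive, not part of the PR):

    // A 1-D tensor can arrive tagged with a 4-D format (nchw/nhwc)
    // inherited from an upstream primitive; remap it to the 1-D
    // format 'x'. Higher-rank inputs keep the format they carry,
    // so no else branch is needed.
    if (src_tz.size() == 1 && (input_format == memory::format::nchw ||
                               input_format == memory::format::nhwc)) {
      input_format = memory::format::x;
    }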

auto& input0 = in_vars[0]->Get<LoDTensor>();
PADDLE_ENFORCE(input0.layout() == DataLayout::kMKLDNN &&
                   input0.format() != memory::format::format_undef,
               "Wrong layout/format for inputs[0]");
Contributor:

What about the other inputs? Why check only the first one?

Contributor Author:

The other inputs are checked inside the loop (lines 96-98):

auto& input = in_vars[i]->Get<LoDTensor>();
PADDLE_ENFORCE(input.layout() == DataLayout::kMKLDNN &&
                   input.format() != memory::format::format_undef,
               "Wrong layout/format for inputs");
Contributor:

So it will throw an error if the format is not set. But it seems you only set the format of one input; or does it mean you only accept one input?

Contributor Author:

If we want the best performance and a fluent flow in our graph of primitives, the input format should be compatible with the other primitives. When we have MKLDNN primitives, we can assume that the format is always defined as an MKLDNN format.


std::shared_ptr<memory> dst_mem;
if (in_place)
  dst_mem.reset(new memory(sum_pd.dst_primitive_desc()));
Contributor:

Maybe you should keep the style:

if (...) {
  ...
} else {
  ...
}

Contributor Author:

Done.
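The adjusted block then reads roughly like this (the else branch is illustrative; output_data stands in for whatever buffer the non-in-place path binds):

    std::shared_ptr<memory> dst_mem;
    if (in_place) {
      // In-place sum: reuse the destination from the primitive descriptor.
      dst_mem.reset(new memory(sum_pd.dst_primitive_desc()));
    } else {
      // Out-of-place sum: bind the output buffer explicitly.
      dst_mem.reset(new memory(sum_pd.dst_primitive_desc(), output_data));
    }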


auto sum_prim = mkldnn::sum(sum_pd, inputs, *dst_mem);
output_format =
    (memory::format)sum_pd.dst_primitive_desc().desc().data.format;
Contributor:

Maybe you could use a function to get the format from the pd.

Contributor Author:

Done.
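Such a helper might look like the following sketch (GetDstFormat is a hypothetical name, not necessarily the one the PR adds):

    // Hypothetical helper: pull the memory format out of a sum
    // primitive descriptor instead of casting inline at each call site.
    static mkldnn::memory::format GetDstFormat(
        const mkldnn::sum::primitive_desc& sum_pd) {
      return static_cast<mkldnn::memory::format>(
          sum_pd.dst_primitive_desc().desc().data.format);
    }

    // Usage: output_format = GetDstFormat(sum_pd);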


#ifdef PADDLE_WITH_MKLDNN
  if (library == framework::LibraryType::kPlain &&
      platform::CanMKLDNNBeUsed(ctx)) {
Contributor:

Maybe this CanMKLDNNBeUsed could directly return false when PADDLE_WITH_MKLDNN is not defined. Then we would not need to add https://github.com/PaddlePaddle/Paddle/pull/11102/files#diff-a8d897ab383e5245127bb03da3d9830cR74 everywhere.

Contributor Author:

@tensor-tang I think that is a splendid way to get slightly cleaner checking of when the primitive can be executed. If it is not a problem, I will take advantage of this idea and make it a separate pull request.
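A sketch of the reviewer's idea, with the guard folded into the helper; the body shown is an assumption about what CanMKLDNNBeUsed checks, not the actual Paddle implementation:

    bool CanMKLDNNBeUsed(const framework::ExecutionContext& ctx) {
    #ifdef PADDLE_WITH_MKLDNN
      // Assumed check: the op asked for MKLDNN and runs on a CPU place.
      return ctx.Attr<bool>("use_mkldnn") &&
             platform::is_cpu_place(ctx.GetPlace());
    #else
      // Without MKLDNN compiled in, no call site needs its own #ifdef.
      return false;
    #endif
    }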

@mozga-intel mozga-intel force-pushed the mozga-intel/Sum_mkldnn_layout branch 4 times, most recently from 707e7af to 60623a6 Compare June 18, 2018 13:36
mozga-intel (Contributor Author):

@luotao1 Could you continue reviewing this code, please?

luotao1 (Contributor) left a comment:

Could you add a unit test in test_sum_mkldnn_op.py?

@mozga-intel mozga-intel force-pushed the mozga-intel/Sum_mkldnn_layout branch from 60623a6 to 6715524 Compare June 19, 2018 07:31
@mozga-intel mozga-intel force-pushed the mozga-intel/Sum_mkldnn_layout branch from 6715524 to b88cda8 Compare June 19, 2018 10:02
mozga-intel (Contributor Author):

@luotao1 Done.

tensor-tang (Contributor) left a comment:

LGTM. Thanks @mozga-intel.
