[Work in progress] MKL-DNN 1.0 update #19510
Conversation
@luotao1 We posted this PR to let you know about the current state of this topic and to answer any questions.
We hope you can divide it, and not just by files or lines of code. If this new PR introduces CE/benchmark/memory problems, you would have to revert it entirely, and it would be hard to track and debug. I have some findings just for this PR:
We cannot change vectorize2int to vectorize, since mkldnn v0.2 uses int in its primitives while v1.0 uses int64_t. The function is also used in other places, so we can't change its behavior either. However, it is possible to work around this by creating a separate function that calls vectorize and using it in the mkldnn ops (as a separate PR), but that would generate additional, redundant code. @luotao1 Additionally, could you please test this version on your machine for some models in your spare time? I would like to know whether everything works fine on Baidu's machines.
@@ -96,10 +95,10 @@ void eltwise_forward(const framework::ExecutionContext &ctx,
       x->dims().size() == 2 || x->dims().size() == 3 || x->dims().size() == 4,
       "Input dim must be with 2, 3 or 4");

-  std::vector<int> src_tz = framework::vectorize2int(x->dims());
+  std::vector<int64_t> src_tz = paddle::framework::vectorize(x->dims());
How about:
auto src_tz = paddle::framework::vectorize(x->dims());
Maybe we can unify vectorize and vectorize2int first.
The code you proposed does work for v1.0, but it will crash in v0.2, because vectorize() returns an int64_t vector while the mkldnn v0.2 kernel expects an int vector. We can't unify these functions, since the CUDA kernels also use vectorize2int() and vectorize() is used in other places not maintained by us. Improvements such as using auto will be made during code cleanup and refactoring in the near future.
What if we made vectorize a template-parametrised function, so that by default it returns int64_t, but the data type the dimensions are returned in can be chosen through a template parameter?
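The templated vectorize suggested above might look like the minimal sketch below. Note this is only an illustration of the idea: DDim here is a hypothetical stand-in for framework::DDim, and the actual signature landing in Paddle may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal stand-in for framework::DDim, for illustration only.
struct DDim {
  std::vector<int64_t> dims;
  int size() const { return static_cast<int>(dims.size()); }
  int64_t operator[](int i) const { return dims[i]; }
};

// Defaults to int64_t (mkldnn v1.0), but callers on the v0.x path
// can request std::vector<int> via the template parameter.
template <typename T = int64_t>
std::vector<T> vectorize(const DDim& ddim) {
  std::vector<T> result(ddim.size());
  for (int i = 0; i < ddim.size(); ++i) {
    result[i] = static_cast<T>(ddim[i]);
  }
  return result;
}
```

With this, existing vectorize2int() call sites could migrate to vectorize&lt;int&gt;(dims) without behavior changes, while new mkldnn v1.0 code simply calls vectorize(dims).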
@Sand3r- I think it's OK. Could you create a separate PR to do this?
I'll do that
Done in #19611
It's OK.
Could you update this PR since there are a lot of conflicts here?
Is it important for you to have them resolved right now? This update is scheduled for version 1.7, so we are currently preparing the ground by developing a series of mkldnn handler update PRs to clean up a lot of code. But those will touch almost every mkldnn operator and create conflicts again. If you don't need it right now, I would wait until the mkldnn operator changes stabilize. @luotao1
Got it.
Closing this one since there is a newer version of this PR in #20162
Final version target: Release 1.7
Current state: The first functional version has been made. Its performance was tested on AVX2 and AVX512 machines (including a 6248 machine) and compared against the current develop branch. FP32 performance is mostly unchanged, while Int8 has improved slightly (~5%).
Significant changes: This version does not contain any significant changes to the PaddlePaddle core. The changes are mostly cosmetic, plus the MKL-DNN core update itself.
Future plans: We will maintain this PR and resolve conflicts as needed. Refactoring of some mkldnn ops is also planned.
This PR exceeds the file and line-count limits, but we cannot split it because the whole MKL-DNN infrastructure has been rebuilt.