Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[oneDNN] Improving Graph Rewrite Performance #50932

Conversation

mahmoud-abuzaina
Copy link
Contributor

This PR improves the latency of graph rewrite by avoiding doing kernel registry lookup more than once per kernel. Similar fix was to done before to eager mode, so the refactoring is done to manage and maintain that kernel registry info in one place (mkl_graph_util.h)

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Jul 24, 2021
@google-cla google-cla bot added the cla: yes label Jul 24, 2021
@gbaned gbaned self-assigned this Jul 26, 2021
@gbaned gbaned added the comp:core issues related to core part of tensorflow label Jul 26, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jul 26, 2021
@kkimdev kkimdev requested a review from penpornk July 27, 2021 20:44
@gbaned gbaned added the awaiting review Pull request awaiting review label Jul 29, 2021
@gbaned gbaned requested a review from bmzhao August 18, 2021 14:15
@gbaned
Copy link
Contributor

gbaned commented Sep 1, 2021

@penpornk Can you please review this PR ? Thanks!

Copy link
Member

@penpornk penpornk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the fix and I'm so sorry for my delay!

tensorflow/core/graph/mkl_graph_util.h Outdated Show resolved Hide resolved
bool kernel_registered = false;

if (kernel_element == registered_kernels_map.end()) {
string kernel = KernelsRegisteredForOp(op_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please make the variable name more meaningful for readability.

Suggested change
string kernel = KernelsRegisteredForOp(op_name);
string registered_kernels = KernelsRegisteredForOp(op_name);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 234 to 236
string search_string = label;
search_string += string(";") + string(" T in [");
search_string += DataType_Name(T) + string("]");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This can be done in one command. (Please help format the source. The line is probably too long.)

Suggested change
string search_string = label;
search_string += string(";") + string(" T in [");
search_string += DataType_Name(T) + string("]");
string search_string = label + string("; T in [") + DataType_Name(T) + string("]");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 151 to 153
// Map used to store kernel registration status
thread_local static auto* registered_kernels_map =
new absl::flat_hash_map<string, bool>();
Copy link
Member

@penpornk penpornk Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a global static variable? We try to only have static variable when they are trivially destructible. Can this map be part of some class? Or be inside of IsMklOp()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it can be in IsMklOp(). I made that change.

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Sep 14, 2021
@mahmoud-abuzaina
Copy link
Contributor Author

Thank you for reviewing the PR. I have addressed your comments.

Copy link
Member

@penpornk penpornk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes!

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Sep 17, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Sep 17, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 17, 2021
@penpornk penpornk removed the awaiting review Pull request awaiting review label Sep 17, 2021
@copybara-service copybara-service bot merged commit 2e17f5d into tensorflow:master Sep 20, 2021
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Sep 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes comp:core issues related to core part of tensorflow size:M CL Change Size: Medium
Projects
PR Queue
  
Approved by Reviewer
Development

Successfully merging this pull request may close these issues.

None yet

4 participants