Skip to content

Conversation

@Srini511
Copy link
Contributor

@Srini511 Srini511 commented Mar 4, 2021

This PR introduces a simple cost metric to determine if dnnl_sgemm needs to be run with 1 thread or the whole threadpool.

Please note: Please dont merge this PR until #47543 is merged

@Srini511 Srini511 requested a review from penpornk as a code owner March 4, 2021 22:08
@google-ml-butler google-ml-butler bot added the size:S CL Change Size: Small label Mar 4, 2021
@google-cla google-cla bot added the cla: yes label Mar 4, 2021
@gbaned gbaned self-assigned this Mar 5, 2021
@gbaned gbaned added the comp:mkl MKL related issues label Mar 5, 2021
@google-cla
Copy link

google-cla bot commented Mar 9, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: no and removed cla: yes labels Mar 9, 2021
@Srini511
Copy link
Contributor Author

Srini511 commented Mar 9, 2021

@penpornk I tried to fix the merge conflict online and it has cancelled the cla check. Can you please help resolve this?

@Srini511
Copy link
Contributor Author

Srini511 commented Mar 9, 2021

@googlebot I fixed it

@google-cla
Copy link

google-cla bot commented Mar 9, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@penpornk
Copy link
Member

penpornk commented Mar 9, 2021

Manually changing CLA to yes because the two commits in this PR are from the same github account which has CLA.

@penpornk penpornk added cla: yes and removed cla: no labels Mar 9, 2021
@gbaned gbaned added the awaiting review Pull request awaiting review label Mar 19, 2021
Copy link
Member

@penpornk penpornk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR and I'm sorry for the delay!

Comment on lines +162 to +163
// the kernel single threaded. Here we are coming up with a cost model based
// on based on L1 sizes. If we find that matrices are small enough, we will
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: based on appears twice.

Suggested change
// the kernel single threaded. Here we are coming up with a cost model based
// on based on L1 sizes. If we find that matrices are small enough, we will
// the kernel single threaded. Here we are coming up with a cost model based
// on L1 sizes. If we find that matrices are small enough, we will

// the kernel single threaded. Here we are coming up with a cost model based
// on based on L1 sizes. If we find that matrices are small enough, we will
// execute single threaded. This may need tuning.
bool single_threaded = ExecuteSingleThreadedGemm(m, n, k);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The call is short enough and the result is only used once. I don't think we need a variable for it.

Comment on lines +166 to +173
if (!single_threaded) {
dnnl::threadpool_interop::sgemm(char_transa, char_transb, m, n, k, alpha,
a, lda, b, ldb, beta, c, ldc, &eigen_tp);
} else {
// for now call single threaded gemm.
dnnl::threadpool_interop::sgemm(char_transa, char_transb, m, n, k, alpha,
a, lda, b, ldb, beta, c, ldc, nullptr);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can we swap the logic? E.g., start with single-threaded first so we don't have to negate the condition?

if (ExecuteSingleThreadedGemm(m, n, k)) {
  ...  // nullptr for thread pool
} else {
  ...  // &eigen_tp
}


namespace tensorflow {

#define L1_SIZE 32 * 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, TensorFlow can read cache sizes through Eigen, although that may come with some latency costs.

const auto cache_sizes = Eigen::internal::CacheSizes();

// a heuristic but what we are targetting are very small models whose
// total size is < few L1's. So we will do this simple calculation
// to determine if the MM should be run on a single thread.
return ((sizeof(float) * (m * n + k * (m + n))) < L1_SIZE * 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why is L1_SiZE multiplied by 8? If it's some magic constant, please say so in the comment (in the code).

@penpornk penpornk removed the awaiting review Pull request awaiting review label Mar 20, 2021
Copy link
Member

@penpornk penpornk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Srini511 Since the branch cut is in two days, I'll just pull the PR in and make the modifications myself to save time. I'd appreciate if you could help answer the question about L1_SIZE * 8 later when you have time though. :)

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Mar 23, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 23, 2021
@Srini511
Copy link
Contributor Author

@penpornk , Yes 8 is just a magic number that worked with internal workloads. I had made the changes internally and was testing them on specific models. Thanks!

@copybara-service copybara-service bot merged commit 178d7f8 into tensorflow:master Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes comp:mkl MKL related issues ready to pull PR ready for merge process size:S CL Change Size: Small

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants