Alternate implementation of Intel PRK DGEMM #15835

Merged
merged 22 commits into from
Jul 20, 2020

@rahulghangas (Member) commented Jun 13, 2020

An alternate implementation of Intel PRK DGEMM

  • Add performance testing
  • Add compopts for native and BLAS builds

NOTE: This implementation does not yet perform pipelined communication, which is an inherent part of SUMMA.
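For readers unfamiliar with SUMMA, here is a minimal single-process sketch of its block structure. It is written in Python for illustration and is not the Chapel code in this PR. The names `matmul`, `split_blocks`, and `summa` and the `grid` parameter are hypothetical. The "broadcast" is simulated by plain list indexing, so each step waits for the whole panel before computing, which is the non-pipelined behaviour the note describes. A pipelined variant would typically overlap the broadcast for one step with the computation of another.

```python
def matmul(X, Y):
    """Plain triple-loop product of two dense matrices stored as lists of rows."""
    inner = len(Y)
    return [[sum(X[i][t] * Y[t][j] for t in range(inner))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def split_blocks(M, grid):
    """Split a square matrix into grid x grid equally sized square blocks."""
    n = len(M)
    assert n % grid == 0, "matrix order must be divisible by the grid size"
    b = n // grid
    return [[[row[j * b:(j + 1) * b] for row in M[i * b:(i + 1) * b]]
             for j in range(grid)]
            for i in range(grid)]


def summa(A, B, grid):
    """Simulate non-pipelined SUMMA on a grid x grid arrangement of blocks."""
    n = len(A)
    b = n // grid
    Ab, Bb = split_blocks(A, grid), split_blocks(B, grid)
    # Each simulated process (i, j) owns one block of C, initialised to zero.
    Cb = [[[[0.0] * b for _ in range(b)] for _ in range(grid)]
          for _ in range(grid)]
    for k in range(grid):
        # Step k: block A[i][k] is "broadcast" along process row i and
        # block B[k][j] along process column j (simulated by indexing).
        row_panel = [Ab[i][k] for i in range(grid)]
        col_panel = [Bb[k][j] for j in range(grid)]
        for i in range(grid):
            for j in range(grid):
                # Local update: C[i][j] += A[i][k] * B[k][j]
                P = matmul(row_panel[i], col_panel[j])
                for r in range(b):
                    for c in range(b):
                        Cb[i][j][r][c] += P[r][c]
    # Gather the distributed blocks back into one matrix.
    C = [[0.0] * n for _ in range(n)]
    for i in range(grid):
        for j in range(grid):
            for r in range(b):
                for c in range(b):
                    C[i * b + r][j * b + c] = Cb[i][j][r][c]
    return C


# Usage: the blocked result should match the direct product.
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
assert summa(A, B, 2) == matmul(A, B)
```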

@rahulghangas
Member Author

@e-kayrakli

Contributor

@e-kayrakli e-kayrakli left a comment

I have some detailed comments inline. A couple of higher-level comments:

  • I think this shouldn't create a new dir under test/studies/prk. You can create a dir under test/studies/prk/DGEMM named distributed, summa, etc. Or you can put these files, with some suffix, in test/studies/prk.
  • We discussed with @ben-albrecht that we want this to be part of the nightly performance tracking suite. For example, see the XC suite here. Also see the multilocale performance testing doc. But doing that can be a follow-up. The sooner we begin measuring the performance, the better. You can track the impact of your future optimizations there.

test/studies/prk/DGEMM2/dgemm.chpl (4 inline review threads, outdated, resolved)
@rahulghangas
Member Author

rahulghangas commented Jul 3, 2020

@e-kayrakli I think it's ready for review.
@dgarvit @LouisJenkinsCS @ben-albrecht

Contributor

@e-kayrakli e-kayrakli left a comment

I think this looks good. I asked for a few things to be adjusted for better testing runs.

Moreover, it is important to describe the deviation from SUMMA in a comment somewhere.

I tested this on our correctness testing machine and our 16-locale XC performance testing machine. @ben-albrecht could you also do a quick sanity check, especially w.r.t. the multilocale testing files and BLAS testing? ML tests run only on Crays, so I think we don't need a skipif. But I have been wrong on similar topics before...

test/studies/prk/DGEMM/SUMMA/dgemm.chpl (inline review thread, resolved)
test/studies/prk/DGEMM/SUMMA/dgemm.compopts (2 inline review threads, outdated, resolved)
test/studies/prk/DGEMM/SUMMA/dgemm.ml-execopts (inline review thread, outdated, resolved)
@rahulghangas
Member Author

If someone can do a sanity check on the testing, I think this PR can be merged now.

@e-kayrakli e-kayrakli merged commit 19bc766 into chapel-lang:master Jul 20, 2020
6 participants