SGEMM benchmark config for Vega FE? #122
If you mean you were using sgemm_5760.yaml, that config produces HIP kernels, and 5 TFLOP/s does sound like the best that our compiler can do at this time.
If you use sgemm_asm.yaml, then Tensile will produce assembly kernels and you should see over 90% efficiency.
David E. Tanner
MTS Software Engineer | Radeon Technologies Group – Open Compute
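To put the 5 TFLOP/s and "over 90% efficiency" figures from this reply side by side, here is a rough sketch of the arithmetic. The hardware numbers (64 compute units, 64 lanes per unit, a peak clock of about 1.6 GHz) are assumptions about Vega FE that do not come from this thread, and the helper names are made up for illustration.

```python
def gemm_flops(m, n, k, batch=1):
    """FLOPs for C = A * B: one multiply and one add per (m, n, k) triple."""
    return 2 * m * n * k * batch


def peak_fp32_tflops(compute_units, lanes_per_cu, clock_ghz):
    """Theoretical peak, assuming one fused multiply-add (2 FLOPs) per lane per cycle."""
    return compute_units * lanes_per_cu * 2 * clock_ghz / 1000.0


def efficiency(achieved_tflops, peak_tflops):
    """Fraction of theoretical peak actually reached."""
    return achieved_tflops / peak_tflops


# ASSUMED Vega FE figures (not stated in the thread).
PEAK = peak_fp32_tflops(compute_units=64, lanes_per_cu=64, clock_ghz=1.6)

if __name__ == "__main__":
    size = 5760  # square problem size suggested by the sgemm_5760 config name
    work = gemm_flops(size, size, size)
    print(f"FLOPs per {size}^3 SGEMM: {work:.3e}")
    print(f"assumed peak: {PEAK:.2f} TFLOP/s")
    print(f"5 TFLOP/s is {efficiency(5.0, PEAK):.0%} of that peak")
    print(f"90% of peak would be {0.9 * PEAK:.1f} TFLOP/s")
```

Under those assumptions, the HIP-kernel result sits well below half of peak, which is consistent with the gap between the two configs described above.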
From: eqy [mailto:notifications@github.com]
Sent: Monday, October 16, 2017 2:26 PM
To: ROCmSoftwarePlatform/Tensile <Tensile@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Subject: [ROCmSoftwarePlatform/Tensile] SGEMM benchmark config for Vega FE? (#122)
Hi,
I'm experimenting with SGEMM performance tuning on Vega FE and get around 5 TFLOP/s max with the 5760 benchmark config. Is there a pointer to a current best config for Vega/Vega FE, closer to peak performance, that I could use as a starting point?
Thanks,
Eddie
Is it possible to run an apples-to-apples comparison (in terms of input size) between the sgemm_5760 config and the sgemm_asm.yaml config?
sgemm_5760.yaml is an example of non-batched gemm while sgemm_asm.yaml is an example of batched gemm. In the sgemm_asm.yaml file, find all instances of “Batched: True” and change to False. This should make the two much more similar.
David E. Tanner
MTS Software Engineer | Radeon Technologies Group – Open Compute
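The edit described in this reply (switching every "Batched: True" in sgemm_asm.yaml to False) can be scripted. The following is a minimal sketch that treats the config as plain text rather than parsing it as YAML, so comments and layout are left untouched. The function names are hypothetical, and the only key it relies on is the "Batched" entry named above.

```python
import re

# Matches "Batched: True" with flexible spacing around the colon.
_BATCHED_TRUE = re.compile(r"(\bBatched\s*:\s*)True\b")


def set_batched_false(yaml_text):
    """Return (new_text, replacements) with every 'Batched: True' set to False."""
    return _BATCHED_TRUE.subn(r"\g<1>False", yaml_text)


def rewrite_config(src_path, dst_path):
    """Read a config, flip the Batched entries, and write the result to dst_path."""
    with open(src_path, "r", encoding="utf-8") as src:
        text, count = set_batched_false(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return count
```

For example, rewrite_config("sgemm_asm.yaml", "sgemm_asm_nonbatched.yaml") would leave the original file in place and return how many entries were changed; the output file name here is only a suggestion.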
From: eqy [mailto:notifications@github.com]
Sent: Monday, October 16, 2017 3:32 PM
To: ROCmSoftwarePlatform/Tensile <Tensile@noreply.github.com>
Cc: Tanner, David <David.Tanner@amd.com>; Comment <comment@noreply.github.com>
Subject: Re: [ROCmSoftwarePlatform/Tensile] SGEMM benchmark config for Vega FE? (#122)
Is it possible to run an apples-to-apples comparison (in terms of input size) between the sgemm_5760 config and the sgemm_asm.yaml config?
I'm getting "terminate called after throwing an instance of 'std::bad_alloc'", so I'm not sure whether that's due to an input size that is too large.
Great, thanks!