Add TPL oneMKL GEMV support #1912
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection. The code must be inspected by a member of the Team before Testing/Merging.
@jczhang07 thanks for helping with this. I have some comments regarding the execution space that need to be addressed. The reinterpret_cast is fine as is; I just wanted to offer a different option that does not use the macro.
sycl::event event; \
\
Kokkos::Profiling::pushRegion(label); \
const STDSCALAR* a = reinterpret_cast<const STDSCALAR*>(A.data()); \
This is okay, but I would have looked into a compile time check like:
if constexpr (Kokkos::ArithTraits<SCALAR>::is_complex) {
const std::complex<Kokkos::ArithTraits<SCALAR>::mag_type>* a = reinterpret_cast<...>(A.data());
} else {
// no reinterpret cast, just use the pointers directly
}
I tried that but hit this problem: a is undefined at "CODE USING a", since each declaration of a is scoped to its own branch:
if constexpr (Kokkos::ArithTraits<SCALAR>::is_complex) {
const std::complex<Kokkos::ArithTraits<SCALAR>::mag_type>* a = reinterpret_cast<...>(A.data());
} else {
const SCALAR *a = A.data();
}
CODE USING a;
I would have to duplicate "CODE USING a" in both the if and else branches to solve the problem, so I took another approach using kokkos_to_std_type_map.
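The type-map idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the definition of kokkos_to_std_type_map from the PR (which is not shown in this thread): my_complex is a hypothetical stand-in for Kokkos::complex, and to_std_type, as_std_ptr, and demo are hypothetical names. The point is that a trait picks the pointer type at compile time, so a single declaration of a covers both the real and the complex case.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Hypothetical stand-in for Kokkos::complex<T>: two contiguous T values.
template <class T>
struct my_complex {
  T re, im;
};

// Hypothetical type map: real scalars map to themselves...
template <class T>
struct to_std_type {
  using type = T;
};

// ...and the stand-in complex type maps to std::complex<T>.
template <class T>
struct to_std_type<my_complex<T>> {
  using type = std::complex<T>;
};

// One cast expression serves both cases, so the code that uses the pointer
// does not need to be duplicated inside if/else branches.
template <class Scalar>
const typename to_std_type<Scalar>::type* as_std_ptr(const Scalar* p) {
  using std_type = typename to_std_type<Scalar>::type;
  static_assert(sizeof(std_type) == sizeof(Scalar),
                "mapped type must have the same size");
  return reinterpret_cast<const std_type*>(p);
}

// Usage sketch: the pointer type is deduced, the address is unchanged.
inline void demo() {
  const my_complex<double> z[1] = {{1.0, 2.0}};
  const auto* a = as_std_ptr(z);  // const std::complex<double>*
  assert(static_cast<const void*>(a) == static_cast<const void*>(z));
}
```

For a real scalar such as double the cast is an identity conversion, which matches the "no reinterpret cast, just use the pointers directly" branch of the if constexpr suggestion.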
6c11c96 to 9fd938d
exec.sycl_queue(), trans, M, N, alpha, a, LDA, x, 1, beta, y, 1); \
} else { \
event = oneapi::mkl::blas::column_major::gemv( \
exec.sycl_queue(), trans, M, N, alpha, a, LDA, x, 1, beta, y, 1); \
@lucbv I am unsure about this one. The oneMKL gemv call returns an event, but I don't wait on it because I don't want to synchronize the GPU. I saw that you do wait in SpMV, though. In other words, the event variable is unused here.
I believe in the case of SpMV I need to wait because the handle would otherwise be destroyed before the release function is called on it.
The SpMV TPL path will be optimized a bit so that the handle destruction is done in a non-blocking way, but I have not worked on that part yet...
Pipeline was clean, but anyway I removed the use of event.
Thanks : )
Let us retest and this should be good
We probably do not need to capture the return value of gemv, as the unused variable could lead to a compiler warning. Let's see what testing reports.
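The difference between capturing and dropping the returned event can be sketched without SYCL. Everything below is a stand-in under stated assumptions: fake_event plays the role of sycl::event and fake_gemv the role of the oneMKL call, and both names are hypothetical. When the event is not needed for synchronization, not declaring a variable at all leaves nothing for an unused-variable diagnostic to flag.

```cpp
#include <cassert>

// Hypothetical stand-in for sycl::event; real SYCL is not needed here.
struct fake_event {
  int id;
};

// Hypothetical stand-in for an asynchronous library call that enqueues work
// and returns an event. Here it only counts submissions.
inline fake_event fake_gemv(int& submitted) {
  ++submitted;
  return fake_event{submitted};
}

// Enqueue without keeping the event: the temporary is discarded at the end
// of the statement and no named variable is left unused.
inline int enqueue_without_event() {
  int submitted = 0;
  fake_gemv(submitted);  // return value intentionally dropped
  assert(submitted == 1);
  return submitted;
}
```

Whether a given compiler actually warns about a captured but unused event depends on the compiler and its flags, so this is a precaution rather than a guaranteed fix.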
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed.
Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job KokkosKernels_PullRequest_GNU1021 to start: Total Wait = 3603
9fd938d to e2162fd
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection. The code must be inspected by a member of the Team before Testing/Merging.
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Names:
KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930_Tpls_CLANG13CUDA10
KokkosKernels_PullRequest_GNU1021
KokkosKernels_PullRequest_GNU1021_Light_LayoutRight
KokkosKernels_PullRequest_Tpls_GNU1021
KokkosKernels_PullRequest_Tpls_INTEL19_solo
KokkosKernels_PullRequest_CLANG1001_solo
KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
KokkosKernels_PullRequest_A64FX_GCC1020
KokkosKernels_PullRequest_VEGA908_ROCM520
KokkosKernels_PullRequest_VEGA908_Tpls_ROCM520
Pull Request Author: jczhang07
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Jobs in this run (Jenkins build number):
KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight # 765
KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930_Tpls_CLANG13CUDA10 # 356
KokkosKernels_PullRequest_GNU1021 # 29
KokkosKernels_PullRequest_GNU1021_Light_LayoutRight # 28
KokkosKernels_PullRequest_Tpls_GNU1021 # 28
KokkosKernels_PullRequest_Tpls_INTEL19_solo # 34
KokkosKernels_PullRequest_CLANG1001_solo # 27
KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110 # 539
KokkosKernels_PullRequest_A64FX_GCC1020 # 534
KokkosKernels_PullRequest_VEGA908_ROCM520 # 533
KokkosKernels_PullRequest_VEGA908_Tpls_ROCM520 # 55
@jczhang07 the oneapi header needs to be guarded with
then you can remove lines 798 and 890
e2162fd to 171ef7d
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (the same eleven Jenkins jobs as in the previous run)
Pull Request Author: jczhang07
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED for the same eleven Jenkins jobs.
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ lucbv ]!
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge
…NENTS Workaround for kokkos#2023 Cross-reference kokkos#1912