Changes for new HIP compiler / runtime #3067
…nce of HIP-Clang compiler as proxy for version
…ce, instead use __HIP_DEVICE_COMPILE__ or the memory space macros - Add Host/Device overloading PP-define, and add dummy overloads of some of the shared alloc's needed for compilation
Can one of the admins verify this patch? |
OK to test |
I should also note that this PR was made to be compatible w/ the current HCC / ROCm 3.3 build. In the future, we may want to simply remove |
This looks good to me. There are a lot fewer changes than I expected.
If you can't find the problem, I am fine with disabling this test to get this PR merged. I know that we have problems with |
You also need to run |
@masterleinad -- any particular reason it has to be either way, I applied it in 775f9d4 |
No, we just picked one version when we started indenting with |
f7d6751
to
0375300
Looks OK to me.
0375300
to
6f7e8b6
Marking this as WIP, because even though HCC accepts
I need to test on an HPE machine where we actually have the compiler w/o a corresponding GPU. Will fix / finish tomorrow. |
some more minor things.
6f7e8b6
to
451bc0f
I am seeing
when compiling in |
@masterleinad -- with Strangely I can seem to get both to work. |
I see it when passing |
The |
Interestingly that test passes on my 3.5 system (hence why I didn't see it):
I'll disable for now, and we can re-evaluate skipped tests when you guys get access to a 3.5 install. |
Codecov Report
@@           Coverage Diff            @@
##           develop   #3067    +/-   ##
========================================
- Coverage     82.6%   82.4%   -0.3%
========================================
  Files          122     122
  Lines         8074    8095    +21
========================================
- Hits          6673    6672     -1
- Misses        1401    1423    +22
Continue to review full report at Codecov.
|
Looks OK to me. Since the LocalDeepCopy
tests never worked for me anyway, disabling them here, for now, is fine with me.
Sorry for bumping a merged PR, but I was just wondering if there has been any progress on the HIP side for asynchronous kernel launches (same question as over here: ROCm/HIP#2066 (comment))? Are there possibly WIP changes to HIP that already make this possible? |
Hi Mikael, we’ve implemented a scheme to make a good portion of launches
asynchronous internally at AMD. I’m anticipating upstreaming it in the
next week or so.
…-Nick
On Fri, Nov 13, 2020 at 6:28 AM Mikael Simberg ***@***.***> wrote:
Sorry for bumping a merged PR, but I was just wondering if there has been
any progress on the HIP side for asynchronous kernel launches (same
question as over here: ROCm/HIP#2066 (comment))?
Are there possibly WIP changes to HIP that already make this possible?
|
@arghdos thanks for the quick reply, and that's great news! Looking forward to seeing that in action. |
Changes required to compile / run with the new compiler/runtime.
Major changes:
__host__/__device__ overloading, while HCC was closer to NVCC (i.e., this document is largely applicable). This requires implementing a new KOKKOS_ENABLE_OVERLOAD_HOST_DEVICE, which mostly serves the same purpose as the KOKKOS_IMPL_CUDA_CLANG_WORKAROUND. It may be worth discussing whether to unify these two.
SharedAllocationRecord that are essentially no-ops. I am not 100% sure why these are not required for your testing of CUDA-Clang. I suspect it is because our compiler tracks fairly close to upstream master (i.e., we're reporting clang 11.0), while from the Jenkins CI it appears you guys are typically using LLVM_VERSION=8.0 for CUDA-Clang.
Known issues:
Minor changes:
--cuda-host-only for GTest compilation.
__constant__ must be global.
Minor notes:
__HIP__ implies HIP-Clang is the front-end compiler
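For readers unfamiliar with the two overloading models, here is a minimal sketch of the distinction the description refers to. It is not the code from this PR: EXAMPLE_OVERLOAD_HOST_DEVICE is a hypothetical stand-in for KOKKOS_ENABLE_OVERLOAD_HOST_DEVICE, and where() and check_host_pass() are illustrative names. The sketch assumes that __HIP__ is defined only when HIP-Clang is the front end and that __HIP_DEVICE_COMPILE__ is defined only during the device pass. Empty fallbacks for __host__ and __device__ are provided so the snippet also builds with a plain host compiler.

```cpp
#include <cassert>
#include <cstring>

// Pull in the HIP runtime only when a HIP compiler is driving the build.
#if defined(__HIP__) || defined(__HCC__)
#include <hip/hip_runtime.h>
#endif

// Assumption: with a plain host compiler the attributes do not exist,
// so define them away to keep this sketch buildable anywhere.
#if !defined(__HIP__) && !defined(__HCC__) && !defined(__CUDACC__)
#define __host__
#define __device__
#endif

// Hypothetical stand-in for KOKKOS_ENABLE_OVERLOAD_HOST_DEVICE:
// enabled only when HIP-Clang is the front-end compiler.
#if defined(__HIP__)
#define EXAMPLE_OVERLOAD_HOST_DEVICE
#endif

#if defined(EXAMPLE_OVERLOAD_HOST_DEVICE)
// Clang-style: host and device versions are separate overloads
// that share one signature.
__host__ inline const char* where() { return "host"; }
__device__ inline const char* where() { return "device"; }
#else
// NVCC/HCC-style: one __host__ __device__ function whose body
// branches on the current compilation pass.
__host__ __device__ inline const char* where() {
#if defined(__HIP_DEVICE_COMPILE__) || defined(__CUDA_ARCH__)
  return "device";
#else
  return "host";
#endif
}
#endif

// Host-side sanity check: called from host code, where() must pick
// the host version under either model.
inline bool check_host_pass() {
  const bool on_host = std::strcmp(where(), "host") == 0;
  assert(on_host);
  return on_host;
}
```

Calling check_host_pass() from ordinary host code should return true under either model. The device branch can only be observed from inside a kernel, which this sketch deliberately leaves out.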