Clang 9 - Triple amdgcn-amd-amdpal produces incomplete binary #79
Comments
amdpal kernels aren't intended to work with rocm. These use different binary metadata formats, so they aren't expected to be identically sized. Why are you trying to use amdpal with rocm?
Sorry for the confusion. To make it short: no, I am not trying to run the PAL kernel on ROCm.
Using the PAL triple, I want to load the kernel with amdgpu-pro 19.30, but that fails even for simple kernels, while the HSA triple kernels look fine to me when I inspect their disassembly (I am not able to test that on ROCm yet, but will soon). The only connection to ROCm is that I used the bitcode files I found there, since I found no other source for them.
The generated code itself seems OK, as said above, but the amdgpu-pro runtime fails to load the binary. Looking at the kernel disassembly, the metadata with the kernel information does not appear, unlike in PAL binaries compiled with the integrated OpenCL offline compiler.
I can provide a minimal example later today.
Matt Arsenault <notifications@github.com> wrote on Fri, 10 Jan 2020, 00:26:
… amdpal kernels aren't intended to work with rocm. These use different
binary metadata formats, so they aren't expected to be identically sized.
Why are you trying to use amdpal with rocm?
searlmc1 referenced this issue in ROCm/llvm-project on Aug 23, 2023: Disable Auto provides for CPACK RPM
jeffreytan81 pushed a commit to jeffreytan81/llvm-project that referenced this issue on Sep 21, 2023:
… provider
We noticed a performance issue in lldb-vscode when grabbing the name of an SBValue. Profiling shows that SBValue::GetName() can cause the synthetic children provider of shared/unique_ptr to dereference the underlying object and complete its type. This patch lazily moves the dereference from the synthetic child provider's Update() method to GetChildAtIndex() so that SBValue::GetName() won't trigger the slow code path.
Here is the culprit slow code path:
```
...
frame #59: 0x00007ff4102e0660 liblldb.so.15`SymbolFileDWARF::CompleteType(this=<unavailable>, compiler_type=0x00007ffdd9829450) at SymbolFileDWARF.cpp:1567:25 [opt]
...
frame #67: 0x00007ff40fdf9bd4 liblldb.so.15`lldb_private::ValueObject::Dereference(this=0x0000022bb5dfe980, error=0x00007ffdd9829970) at ValueObject.cpp:2672:41 [opt]
frame #68: 0x00007ff41011bb0a liblldb.so.15`(anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::Update(this=0x000002298fb94380) at LibStdcpp.cpp:403:40 [opt]
frame #69: 0x00007ff41011af9a liblldb.so.15`lldb_private::formatters::LibStdcppSharedPtrSyntheticFrontEndCreator(lldb_private::CXXSyntheticChildren*, std::shared_ptr<lldb_private::ValueObject>) [inlined] (anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::LibStdcppSharedPtrSyntheticFrontEnd(this=0x000002298fb94380, valobj_sp=<unavailable>) at LibStdcpp.cpp:371:5 [opt]
...
frame #78: 0x00007ff40fdf6e42 liblldb.so.15`lldb_private::ValueObject::CalculateSyntheticValue(this=0x000002296c66a500) at ValueObject.cpp:1836:27 [opt]
frame #79: 0x00007ff40fdf1939 liblldb.so.15`lldb_private::ValueObject::GetSyntheticValue(this=<unavailable>) at ValueObject.cpp:1867:3 [opt]
frame #80: 0x00007ff40fc89008 liblldb.so.15`ValueImpl::GetSP(this=0x0000022c71b90de0, stop_locker=0x00007ffdd9829d00, lock=0x00007ffdd9829d08, error=0x00007ffdd9829d18) at SBValue.cpp:141:46 [opt]
frame #81: 0x00007ff40fc7d82a liblldb.so.15`lldb::SBValue::GetSP(ValueLocker&) const [inlined] ValueLocker::GetLockedSP(this=0x00007ffdd9829d00, in_value=<unavailable>) at SBValue.cpp:208:21 [opt]
frame #82: 0x00007ff40fc7d817 liblldb.so.15`lldb::SBValue::GetSP(this=0x00007ffdd9829d90, locker=0x00007ffdd9829d00) const at SBValue.cpp:1047:17 [opt]
frame #83: 0x00007ff40fc7da6f liblldb.so.15`lldb::SBValue::GetName(this=0x00007ffdd9829d90) at SBValue.cpp:294:32 [opt]
...
```
Differential Revision: https://reviews.llvm.org/D159542
RevySR pushed a commit to revyos/llvm-project that referenced this issue on Apr 3, 2024:
* [Clang][XTHeadVector] implement 12.8 `vmin/vmax`
* [Clang][XTHeadVector] test 12.8 `vmin/vmax`
Hello,
I usually compile my kernels for the AMD ROCm platform (Vega 64) with clang-9 using the following call:
The kernel builds and runs without a problem on ROCm 2.10 (Ubuntu 18.04). Changing the target to amdgcn-amd-amdpal works without any warning or error message, but the resulting binary is only about 2/3 the size (196.4 kB for the ROCm build vs. 128.1 kB for the amdpal build) and the kernels do not run (tested with amdgpu-pro 19.30, also on Ubuntu). Disassembling the .so file shows valid GCN asm, but all metadata, kernel names and so on seem to be missing.
I am not sure whether this is a bug or whether I am doing something wrong, but I tried to follow the instructions for the OpenCL back end as closely as possible.
PS: The kernels do not use any inline asm or similar; it's mostly stock OpenCL 1.2.
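One way to make the "metadata is missing" observation concrete is to list the ELF sections of both binaries with their sizes and compare them side by side. The following is only a minimal sketch, not part of the original report: it assumes both outputs are ELF64 little-endian images, and the function names `list_sections` and `print_sections` as well as the file names in the usage note are hypothetical. As far as I understand, both the HSA and the PAL metadata travel in ELF note records, so a missing or much smaller `.note` section would be the thing to look for.

```python
import struct

# Fixed layouts from the ELF-64 specification (little-endian).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # file header, 64 bytes
SECTION_HEADER = struct.Struct("<IIQQQQIIQQ")    # section header, 64 bytes


def list_sections(data: bytes) -> list[tuple[str, int]]:
    """Return (name, size in bytes) for every section of an ELF64 LE image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")

    fields = ELF_HEADER.unpack_from(data, 0)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = (
        fields[6], fields[11], fields[12], fields[13])

    headers = [
        SECTION_HEADER.unpack_from(data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    if not headers:
        return []

    # The section-name string table is itself a section (index e_shstrndx).
    strtab_offset, strtab_size = headers[e_shstrndx][4], headers[e_shstrndx][5]
    strtab = data[strtab_offset:strtab_offset + strtab_size]

    sections = []
    for header in headers:
        name_offset, size = header[0], header[5]
        name_end = strtab.index(b"\0", name_offset)
        name = strtab[name_offset:name_end].decode("ascii", "replace")
        sections.append((name, size))
    return sections


def print_sections(path: str) -> None:
    """Print one line per section for the ELF file at `path`."""
    with open(path, "rb") as handle:
        image = handle.read()
    print(path)
    for name, size in list_sections(image):
        print(f"  {name or '<null>':<24} {size:>10}")
```

Calling `print_sections("kernel_hsa.so")` and `print_sections("kernel_pal.so")` (placeholder file names) prints the two section tables, which should show where the size difference comes from.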