Clang 9 - Triple amdgcn-amd-amdpal produces incomplete binary #79

Closed
Lolliedieb opened this issue Dec 25, 2019 · 2 comments

Comments

Lolliedieb commented Dec 25, 2019

Hello,

I usually compile my kernels for the AMD ROCm platform (Vega 64) with clang-9 using the following command:

clang-9 -x cl -Xclang -finclude-default-header \
-target amdgcn-amd-amdhsa -mcpu=gfx900 \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/opencl.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/ocml.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/ockl.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/oclc_correctly_rounded_sqrt_off.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/oclc_daz_opt_off.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/oclc_finite_only_off.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/oclc_unsafe_math_off.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/oclc_wavefrontsize64_off.amdgcn.bc \
-Xclang -mlink-bitcode-file -Xclang ./opencl/bitcode/oclc_isa_version_900.amdgcn.bc \
kernel.cl -o kernel.so

The kernel builds and runs without problems on ROCm 2.10 (Ubuntu 18.04). Changing the target to amdgcn-amd-amdpal compiles without any warning or error message, but the resulting binary is only about two-thirds the size (128.1 kB vs. 196.4 kB) and the kernels do not run (tested with amdgpu-pro 19.30, also on Ubuntu). Disassembling the .so file shows valid GCN assembly, but all metadata, kernel names, and so on appear to be missing.

I am not sure whether this is a bug or whether I am doing something wrong, but I tried to follow the instructions for the OpenCL back end as closely as possible.

PS: The kernels do not use any inline assembly or similar; it's mostly stock OpenCL 1.2.

Contributor

arsenm commented Jan 9, 2020

amdpal kernels aren't intended to work with rocm. These use different binary metadata formats, so they aren't expected to be identically sized. Why are you trying to use amdpal with rocm?
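
(Editorial aside, not part of the original comment.) One way to see the difference in metadata formats is to list the ELF note records in each output file and compare them. The following is only a minimal sketch: it assumes the compiler output is a 64-bit little-endian ELF image, and the helper name `list_elf_notes` is made up for illustration. It reports the owner name, note type, and descriptor size of each note and does not decode the metadata itself.

```python
import struct

SHT_NOTE = 7  # ELF section type used for note sections


def list_elf_notes(data: bytes):
    """Return (owner_name, note_type, desc_size) for each note record.

    Hypothetical helper; assumes a 64-bit little-endian ELF image.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")

    # Section header table location and geometry from the ELF header.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)

    notes = []
    for i in range(e_shnum):
        base = e_shoff + i * e_shentsize
        (sh_type,) = struct.unpack_from("<I", data, base + 4)
        if sh_type != SHT_NOTE:
            continue
        sh_offset, sh_size = struct.unpack_from("<QQ", data, base + 0x18)

        # Walk the note records: a 12-byte header, then name and
        # descriptor, each padded to a 4-byte boundary.
        pos, end = sh_offset, sh_offset + sh_size
        while pos + 12 <= end:
            namesz, descsz, ntype = struct.unpack_from("<III", data, pos)
            pos += 12
            name = data[pos:pos + namesz].rstrip(b"\0").decode("ascii", "replace")
            pos += (namesz + 3) & ~3
            notes.append((name, ntype, descsz))
            pos += (descsz + 3) & ~3
    return notes
```

Calling `list_elf_notes(open("kernel.so", "rb").read())` on the amdhsa build and on the amdpal build and comparing the two lists should show whether note records are present in each file and under which owner names.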

arsenm closed this as completed Jan 9, 2020
Author

Lolliedieb commented Jan 10, 2020 via email

searlmc1 referenced this issue in ROCm/llvm-project Aug 23, 2023
jeffreytan81 pushed a commit to jeffreytan81/llvm-project that referenced this issue Sep 21, 2023
… provider

We noticed a performance issue in lldb-vscode when grabbing the name of an SBValue.
Profiling shows that SBValue::GetName() can cause the synthetic children provider of shared/unique_ptr
to dereference the underlying object and complete its type.

This patch lazily moves the dereference from synthetic child provider's Update() method to
GetChildAtIndex() so that SBValue::GetName() won't trigger the slow code path.

Here is the culprit slow code path:
```
...
frame #59: 0x00007ff4102e0660 liblldb.so.15`SymbolFileDWARF::CompleteType(this=<unavailable>, compiler_type=0x00007ffdd9829450) at SymbolFileDWARF.cpp:1567:25 [opt]
...
frame #67: 0x00007ff40fdf9bd4 liblldb.so.15`lldb_private::ValueObject::Dereference(this=0x0000022bb5dfe980, error=0x00007ffdd9829970) at ValueObject.cpp:2672:41 [opt]
frame #68: 0x00007ff41011bb0a liblldb.so.15`(anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::Update(this=0x000002298fb94380) at LibStdcpp.cpp:403:40 [opt]
frame #69: 0x00007ff41011af9a liblldb.so.15`lldb_private::formatters::LibStdcppSharedPtrSyntheticFrontEndCreator(lldb_private::CXXSyntheticChildren*, std::shared_ptr<lldb_private::ValueObject>) [inlined] (anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::LibStdcppSharedPtrSyntheticFrontEnd(this=0x000002298fb94380, valobj_sp=<unavailable>) at LibStdcpp.cpp:371:5 [opt]
...
frame #78: 0x00007ff40fdf6e42 liblldb.so.15`lldb_private::ValueObject::CalculateSyntheticValue(this=0x000002296c66a500) at ValueObject.cpp:1836:27 [opt]
frame #79: 0x00007ff40fdf1939 liblldb.so.15`lldb_private::ValueObject::GetSyntheticValue(this=<unavailable>) at ValueObject.cpp:1867:3 [opt]
frame #80: 0x00007ff40fc89008 liblldb.so.15`ValueImpl::GetSP(this=0x0000022c71b90de0, stop_locker=0x00007ffdd9829d00, lock=0x00007ffdd9829d08, error=0x00007ffdd9829d18) at SBValue.cpp:141:46 [opt]
frame #81: 0x00007ff40fc7d82a liblldb.so.15`lldb::SBValue::GetSP(ValueLocker&) const [inlined] ValueLocker::GetLockedSP(this=0x00007ffdd9829d00, in_value=<unavailable>) at SBValue.cpp:208:21 [opt]
frame #82: 0x00007ff40fc7d817 liblldb.so.15`lldb::SBValue::GetSP(this=0x00007ffdd9829d90, locker=0x00007ffdd9829d00) const at SBValue.cpp:1047:17 [opt]
frame #83: 0x00007ff40fc7da6f liblldb.so.15`lldb::SBValue::GetName(this=0x00007ffdd9829d90) at SBValue.cpp:294:32 [opt]
...
```

Differential Revision: https://reviews.llvm.org/D159542
RevySR pushed a commit to revyos/llvm-project that referenced this issue Apr 3, 2024
* [Clang][XTHeadVector] implement 12.8 `vmin/vmax`

* [Clang][XTHeadVector] test 12.8 `vmin/vmax`