Hipify PARBOIL benchmark suite #3
Comments
I don't understand; PARBOIL already seems feature complete in its OpenCL branch. |
Good question. Porting to HIP offers several advantages over porting to OpenCL: you can use C++ features including templates, the port requires fewer lines of code, and HIP is an easier port from CUDA. In this case Parboil has already been ported to CUDA, OpenCL, and OpenMP, so the HIP port would be for comparison with the other targets (on performance, coding complexity, and features) and would be useful as an example of what we might see with other programs. |
Wow, impressive, so HIP is not just an OpenCL translator? You say C++ features; are you hinting at SYCL? If HIP software is better than OpenCL software, you should promote this fact more! |
Right - on the AMD path HIP code is compiled with "HCC" (Heterogeneous Compute Compiler), which is a C++ compiler that generates code directly for the GPU. HCC does not translate through the OpenCL kernel language - the Clang front-end generates HSAIL or GCN ISA directly. |
Oh thanks a lot ! |
Where can I find the Parboil benchmark ported to HIP? Thanks |
Hi trinayan, this is still in the "up for grabs" category. The 0.86 HIP package next week will include a clang version of hipify that will fully automate many of the kernel signatures that require manually adding hipLaunchParm today. It could be interesting to try (if you are interested). |
I'm going to give this a try. So far, I got the sgemm_base HIP version to compile on my Kaveri desktop machine running ROCm-1.1 (https://github.com/briansp2020/Parboil/tree/hip_sgemm). However, it is producing incorrect results. I'm a bit puzzled as to what may be wrong, since it is a simple algorithm and the porting work was fairly straightforward. I guess it's time for me to get familiar with the ROCm debugging tools. :) Does HIP or hcc support printing to the screen or saving to a file from a GPU kernel? The compiler did not like printf in GPU code. A few issues I have run into so far
|
Could you verify whether all the samples and direct tests for HIP execute correctly on your system? I don't think the Kaveri desktop is in the list of chipsets supported by ROCm: |
@sunway513 |
@sunway513
I was running my iGPU at 900MHz. Maybe that caused some damage. When I backed off the GPU frequency to 750, test #21 started passing. But it's still intermittent. :( BTW, I did not know about this test and did not run it with ROCm-1.0. So, maybe it never worked reliably on my machine... |
Just like HIP test 21 is intermittent on my machine, sgemm hip port is intermittent now on my machine. So, I think I'm having problems with hipStream and my sgemm hip port is OK... I wish I had a Fury Nano so I can continue my porting work. wink. wink. @bensander |
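One way to separate an intermittent runtime problem from a porting bug is to compare the GPU output against a host-side reference on every run and count the mismatches. The following is only a minimal sketch of that idea, not code from the Parboil port: the function names `sgemm_reference` and `count_mismatches`, the row-major layout, and the relative-error tolerance are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Host reference: C = alpha * A * B + beta * C.
// Assumption for this sketch: A is m x k, B is k x n, C is m x n, all row-major.
// The Parboil sgemm kernel may use a different layout or transposition.
void sgemm_reference(std::size_t m, std::size_t n, std::size_t k,
                     float alpha, const std::vector<float>& a,
                     const std::vector<float>& b, float beta,
                     std::vector<float>& c) {
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}

// Count elements of `actual` that differ from `expected` by more than
// `tol`, relative to max(|expected|, 1). NaN values count as mismatches.
// A size mismatch is reported as if every element were wrong.
std::size_t count_mismatches(const std::vector<float>& expected,
                             const std::vector<float>& actual,
                             float tol) {
    if (expected.size() != actual.size()) {
        return std::max(expected.size(), actual.size());
    }
    std::size_t bad = 0;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        const float diff = std::fabs(expected[i] - actual[i]);
        const float scale = std::max(std::fabs(expected[i]), 1.0f);
        if (!(diff <= tol * scale)) {  // negated comparison also catches NaN
            ++bad;
        }
    }
    return bad;
}
```

Running the same input several times and logging the mismatch count per run would show whether the failures move around between runs (which points at the runtime or hardware) or are stable (which points at the port).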
Can someone help me convert the stencil CUDA code to HIP? Its kernel uses a dynamic shared memory variable whose size is determined at kernel launch time.
Hipify took care of the launch code using lp.groupMemBytes. But I'm not sure how to convert the kernel code properly. Hipify left the extern shared as is but hcc complains about it.
I could not find documentation on how to do it. Can someone help? |
HIP test still fails for me using ROCm-1.1.1. I assume that I'll have to wait till ROCm-1.2 which will support Hawaii since Kaveri & Hawaii both use second generation GCN? |
Dynamic LDS not yet supported. ROCm is working fine on Kaveri. |
@adityaatluri ,
|
Yes |
@adityaatluri Thanks! |
Hi, |
Whatever is causing test 21 to fail is causing my sgemm port to fail. I want to make sure that the cause is not my configuration/BIOS version before going out to get new hardware. Can you tell me your BIOS version? Sometimes, regressions do happen. Thanks. |
Can you put the output in git gist and share the link? Thanks |
Output of what? Test output is already posted above. |
sgemm port fail. |
https://gist.github.com/briansp2020/e9cb4f7a2dcb6be3f4d67bb2d6b412e7 I put the output file for sgemm there. The output values change on each run. I don't think it's the sgemm code. As I said earlier, it sometimes passes, just like test 21 sometimes passes. My hipified sgemm code is at |
Different problem.
Can someone take a look? My code is here. |
@briansp2020 I'm suspecting this relates to an issue I've been working on recently. Could you help do:
|
@briansp2020 It does look pretty similar to the issue I've been working on. Until there's a fix on the compiler side, could you try the following workaround? The code in question is around: Please change it to:
|
dump.promote.ll is on gist here. I also tried your suggested workaround and I still get the same error.
Thanks. |
This is indeed the issue I've been investigating lately. The PHI node in the IR contains incoming pointers from different address spaces. It's because I'm still investigating a way to come up with a proper fix. Until then we need to change the application to get around this issue. Am I right that the way to reproduce this issue is to invoke:
in |
@briansp2020 Here's a patch which could get around the issue. I've verified that it gets past the error you were facing. However, the program still won't compile, and we'll hit another known limitation in the compiler backend where values in global segment (ex: |
@whchung
in I'll try your patch when I go home today. Thanks for your help. |
I set up an Intel i7-6700 + dual Fiji Nano system and tried my sgemm HIP port. To my surprise, it failed exactly the same way it did on my Kaveri system. I'll wait for the next release and try it again, but I don't think I'm dealing with a hardware failure any more. Also, hcc/hip seems a bit more buggy when used with Fiji. I have not done thorough testing yet; I'll do more thorough testing of what I have tried so far once the next release is out. In particular, I want to make sure that the Udacity problem set (https://github.com/briansp2020/cs344/tree/master/Problem%20Sets/Problem%20Set%201) I ported to HIP works. At the moment, PS1 fails with a core dump. |
Hi Brian – our next HIP release will include several stability fixes, specifically in the areas of signal management and managing inflight dependencies. |
My codes are in my github account. I'll test when the next version is released. |
It looks like ROCm 1.2 fixed my sgemm problem. I still get a core dump when running the cs344 problem set port. |
Hi,
Could someone help? I also ran into some other issues that I reported here. It seems the steps to run the HIP tests have changed. Following the steps here. In fact, there is no CMakefiles.txt in the src directory any more. |
Is the 'new' keyword supported? The malloc/free approach works fine, but new/delete does not. If lines 45-46 are added, the compiler error is the following:
ndr@ndr-ROCM16:~/Desktop/square/new$ make clean && make
rm -f *.o square
/opt/rocm/hip/bin/hipcc --amdgpu-target=gfx900 square.cpp -o square
Referencing function in another module!
%call6.i.i = tail call i8* @_Znam(i64 1024) #3
; ModuleID = '<stdin>'
i8* (i64)* @_Znam
; ModuleID = '
#0 0x000000000142b5ea llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142b5ea)
#1 0x000000000142968e llvm::sys::RunSignalHandlers() (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142968e)
#2 0x00000000014297dc SignalHandler(int) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x14297dc)
#3 0x00007f22f9e4c390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
#4 0x0000000000f81eb9 void llvm::VerifierSupport::CheckFailed<llvm::Instruction*, llvm::Module const*, llvm::GlobalValue*, llvm::Module*>(llvm::Twine const&, llvm::Instruction* const&, llvm::Module const* const&, llvm::GlobalValue* const&, llvm::Module* const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf81eb9)
#5 0x0000000000f8c8bc (anonymous namespace)::Verifier::visitInstruction(llvm::Instruction&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8c8bc)
#6 0x0000000000f8f7b2 (anonymous namespace)::Verifier::verifyCallSite(llvm::CallSite) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8f7b2)
#7 0x0000000000f919f5 (anonymous namespace)::Verifier::visitCallInst(llvm::CallInst&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf919f5)
#8 0x0000000000f95381 llvm::InstVisitor<(anonymous namespace)::Verifier, void>::visit(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf95381)
#9 0x0000000000f97264 (anonymous namespace)::Verifier::verify(llvm::Function const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf97264)
#10 0x0000000000f9831d (anonymous namespace)::VerifierLegacyPass::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf9831d)
#11 0x0000000000f4459a llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf4459a)
#12 0x0000000000f44643 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44643)
#13 0x0000000000f44104 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44104)
#14 0x0000000000643b74 main (/opt/rocm/hcc-1.0/compiler/bin/opt+0x643b74)
#15 0x00007f22f8ba9830 __libc_start_main /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:325:0
#16 0x000000000068f729 _start (/opt/rocm/hcc-1.0/compiler/bin/opt+0x68f729)
Stack dump:
0. Program arguments: /opt/rocm/hcc-1.0/compiler/bin/opt -load /opt/rocm/hcc-1.0/compiler/bin/../lib/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o /tmp/tmp.vqMJlUNjk9/kernel-gfx900.hsaco.promote.bc
1. Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'Module Verifier' on function '@_ZZ4mainEN67HIP_kernel_functor_name_begin_unnamed_HIP_kernel_functor_name_end_419__cxxamp_trampolineEPfS0_m'
/opt/rocm/hcc-1.0/compiler/bin/clamp-device: line 140: 18412 Segmentation fault (core dumped) $OPT -load $LIB/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o $2.promote.bc < $1
Generating AMD GCN kernel failed in HCC-specific opt passes for target: gfx900
/opt/rocm/hcc/bin/hcc(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x2a)[0x1674f1a]
/opt/rocm/hcc/bin/hcc(_ZN4llvm3sys17RunSignalHandlersEv+0x3e)[0x1672fbe]
/opt/rocm/hcc/bin/hcc[0x167310c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f69bbc98390]
[0x7f69bc0c8a10]
Stack dump:
0. Program arguments: /opt/rocm/hcc/bin/hcc -hc -D__HIPCC__ -I/opt/rocm/hcc/include -I/opt/rocm/hip/include/hip/hcc_detail/cuda -I/opt/rocm/hsa/include -Wno-deprecated-register -I/opt/rocm/profiler/CXLActivityLogger/include -I/opt/rocm/hip/include -DHIP_VERSION_MAJOR=1 -DHIP_VERSION_MINOR=2 -DHIP_VERSION_PATCH=17284 -D__HIP_ARCH_GFX900__=1 -Wl,--rpath=/opt/rocm/hip/lib /opt/rocm/hip/lib/libhip_hcc.so /opt/rocm/hip/lib/libhip_device.a -hc -std=c++amp -L/opt/rocm/hcc-1.0/lib -Wl,--rpath=/opt/rocm/hcc-1.0/lib -ldl -lm -lpthread -lunwind -Wl,--whole-archive -lmcwamp -Wl,--no-whole-archive -lsupc++ -L/opt/rocm/hsa/lib -L/opt/rocm/lib -lhsa-runtime64 -lhc_am -lhsakmt -L/opt/rocm/profiler/CXLActivityLogger/bin/x86_64 -lCXLActivityLogger -Wl,--rpath=/opt/rocm/profiler/CXLActivityLogger/bin/x86_64 -lm --amdgpu-target=gfx900 --amdgpu-target=gfx900 square.cpp -o square
Died at /opt/rocm/hip/bin/hipcc line 452.
Makefile:19: recipe for target 'square' failed
make: *** [square] Error 255
With delete [], the error is:
ndr@ndr-ROCM16:~/Desktop/square/new$ make clean && make
rm -f *.o square
/opt/rocm/hip/bin/hipcc --amdgpu-target=gfx900 square.cpp -o square
Referencing function in another module!
%call6.i.i = tail call i8* @_Znam(i64 1024) #3
; ModuleID = '<stdin>'
i8* (i64)* @_Znam
; ModuleID = '
#0 0x000000000142b5ea llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142b5ea)
#1 0x000000000142968e llvm::sys::RunSignalHandlers() (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142968e)
#2 0x00000000014297dc SignalHandler(int) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x14297dc)
#3 0x00007f84d4a09390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
#4 0x0000000000f81eb9 void llvm::VerifierSupport::CheckFailed<llvm::Instruction*, llvm::Module const*, llvm::GlobalValue*, llvm::Module*>(llvm::Twine const&, llvm::Instruction* const&, llvm::Module const* const&, llvm::GlobalValue* const&, llvm::Module* const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf81eb9)
#5 0x0000000000f8c8bc (anonymous namespace)::Verifier::visitInstruction(llvm::Instruction&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8c8bc)
#6 0x0000000000f8f7b2 (anonymous namespace)::Verifier::verifyCallSite(llvm::CallSite) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8f7b2)
#7 0x0000000000f919f5 (anonymous namespace)::Verifier::visitCallInst(llvm::CallInst&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf919f5)
#8 0x0000000000f95381 llvm::InstVisitor<(anonymous namespace)::Verifier, void>::visit(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf95381)
#9 0x0000000000f97264 (anonymous namespace)::Verifier::verify(llvm::Function const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf97264)
#10 0x0000000000f9831d (anonymous namespace)::VerifierLegacyPass::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf9831d)
#11 0x0000000000f4459a llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf4459a)
#12 0x0000000000f44643 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44643)
#13 0x0000000000f44104 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44104)
#14 0x0000000000643b74 main (/opt/rocm/hcc-1.0/compiler/bin/opt+0x643b74)
#15 0x00007f84d3766830 __libc_start_main /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:325:0
#16 0x000000000068f729 _start (/opt/rocm/hcc-1.0/compiler/bin/opt+0x68f729)
Stack dump:
0. Program arguments: /opt/rocm/hcc-1.0/compiler/bin/opt -load /opt/rocm/hcc-1.0/compiler/bin/../lib/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o /tmp/tmp.LeiH3VuY4Q/kernel-gfx900.hsaco.promote.bc
1. Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'Module Verifier' on function '@_ZZ4mainEN67HIP_kernel_functor_name_begin_unnamed_HIP_kernel_functor_name_end_419__cxxamp_trampolineEPfS0_m'
/opt/rocm/hcc-1.0/compiler/bin/clamp-device: line 140: 18860 Segmentation fault (core dumped) $OPT -load $LIB/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o $2.promote.bc < $1
Generating AMD GCN kernel failed in HCC-specific opt passes for target: gfx900
clang-5.0: error: command failed with exit code 139 (use -v to see invocation)
Died at /opt/rocm/hip/bin/hipcc line 452.
Makefile:19: recipe for target 'square' failed
make: *** [square] Error 139
Hi @bensander, this ticket has been open since 2016. Is there still work left to be done on this ticket? Thanks. |
Closing this ticket as it is stale. Please open a new ticket if work is still needed to hipify PARBOIL. Thanks. |
Port PARBOIL benchmark suite from CUDA to HIP.
http://impact.crhc.illinois.edu/parboil/parboil.aspx
http://impact.crhc.illinois.edu/parboil/parboil_download_page.aspx