Hipify PARBOIL benchmark suite #3

Closed
bensander opened this issue Jan 23, 2016 · 38 comments

@bensander
Contributor

Port PARBOIL benchmark suite from CUDA to HIP.

http://impact.crhc.illinois.edu/parboil/parboil.aspx
http://impact.crhc.illinois.edu/parboil/parboil_download_page.aspx

@bensander changed the title from "Port PARBOIL implementation to use HIP" to "Hipify PARBOIL benchmark suite" on Jan 23, 2016
@LifeIsStrange

I don't understand. PARBOIL already seems feature-complete in its OpenCL branch.

@bensander
Contributor Author

Good question. Porting to HIP offers several advantages over porting to OpenCL: you can use C++ features including templates, the port requires fewer lines of code, and HIP is an easier port from CUDA. In this case Parboil already has CUDA, OpenCL, and OpenMP versions, so the HIP port would be for comparison with the other targets (on performance, coding complexity, and features) and would be useful as an example of what we might see with other programs.
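
As a rough illustration of the template point, here is a host-only C++ sketch (not code from Parboil or HIP). One templated routine serves both float and double, whereas OpenCL C kernels, which lack templates, usually fall back on macros or duplicated kernels. The names `axpy_element` and `axpy` are hypothetical, and the loop merely stands in for a kernel launch; in a HIP port the per-element body would sit inside a `__global__` kernel with the index derived from `hipBlockIdx_x` and `hipThreadIdx_x`.

```cpp
#include <cstddef>
#include <vector>

// Per-element body: y[i] = a * x[i] + y[i].
// Hypothetical name; written once and reused for every element type.
template <typename T>
void axpy_element(std::size_t i, T a, const T* x, T* y) {
    y[i] = a * x[i] + y[i];
}

// Host loop standing in for a kernel launch with one "thread" per element.
template <typename T>
void axpy(T a, const std::vector<T>& x, std::vector<T>& y) {
    for (std::size_t i = 0; i < y.size() && i < x.size(); ++i) {
        axpy_element(i, a, x.data(), y.data());
    }
}
```

Calling `axpy(2.0f, xf, yf)` and `axpy(2.0, xd, yd)` instantiates the same source for two element types without any duplication.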

@LifeIsStrange

Wow, impressive. So HIP is not just an OpenCL translator? When you say C++ features, are you hinting at SYCL? If HIP software is better than OpenCL software, you should promote this fact more!

@bensander
Contributor Author

Right. On the AMD path, HIP code is compiled with "HCC" (Heterogeneous Compute Compiler), which is a C++ compiler that generates code directly for the GPU. HCC does not translate through the OpenCL kernel language; the Clang front end generates HSAIL or GCN ISA directly.
We talk a bit about the pros and cons in the HIP FAQ, see for example here:
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/blob/master/docs/markdown/hip_faq.md#how-does-hip-compare-with-opencl

@LifeIsStrange

Oh, thanks a lot!

@trinayan

Where can I find the Parboil benchmark ported to HIP? Thanks.

@bensander
Contributor Author

Hi trinayan, this is still in the "up for grabs" category. The 0.86 HIP package next week will include a clang-based version of hipify that will fully automate many of the kernel signatures that require manually adding hipLaunchParm today. It could be interesting to try, if you are interested.

@briansp2020

I'm going to give this a try. So far, I got the sgemm_base HIP version to compile on my Kaveri desktop machine running ROCm-1.1 (https://github.com/briansp2020/Parboil/tree/hip_sgemm). However, it is producing incorrect results. I'm a bit puzzled as to what may be wrong, since it is a simple algorithm and the porting work was fairly straightforward. I guess it's time for me to get familiar with the ROCm debugging tools. :) Does HIP or hcc support printing to the screen or saving to a file from a GPU kernel? The compiler did not like printf in GPU code.

A few issues I have run into so far:

  1. The ported code produces incorrect results. This is a bit puzzling since the code looks correct to me.
  2. parboil.h assumes that the environment has CL/cl.h (abduld/Parboil#1).
  3. hipcc can't be used to compile C code. So, by default, parboil_hip.c is compiled with gcc and the rest of the files are compiled with hipcc. Having a hipcc-like front end for C code would also make it simpler to work around the second issue above, since I wouldn't have to add __HIP_PLATFORM_HCC__ manually in the Makefile.
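
One way to narrow down an incorrect result like the first item is to compare the GPU output against a plain CPU reference with a tolerance. The sketch below is host-only C++ and is not taken from Parboil: the row-major layout, the function names `sgemm_reference` and `count_mismatches`, and the default tolerance are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference C = alpha * A * B + beta * C with A (m x k), B (k x n), C (m x n).
// Row-major storage is an assumption here, not necessarily Parboil's layout.
void sgemm_reference(std::size_t m, std::size_t n, std::size_t k, float alpha,
                     const std::vector<float>& A, const std::vector<float>& B,
                     float beta, std::vector<float>& C) {
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}

// Count elements that differ by more than tol, scaled by max(1, |expected|).
// A size mismatch counts every element of the larger vector as a mismatch,
// and NaN values always count as mismatches.
std::size_t count_mismatches(const std::vector<float>& got,
                             const std::vector<float>& want,
                             float tol = 1e-4f) {
    if (got.size() != want.size()) {
        return std::max(got.size(), want.size());
    }
    std::size_t bad = 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        const float diff = std::fabs(got[i] - want[i]);
        const float scale = std::fmax(1.0f, std::fabs(want[i]));
        if (!(diff <= tol * scale)) {
            ++bad;
        }
    }
    return bad;
}
```

If the mismatch count changes from run to run on identical input, that points toward a synchronization or runtime problem rather than an arithmetic error in the kernel.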

@sunway513
Contributor

Could you verify that all the samples and direct tests for HIP run correctly on your system?

I don't think the Kaveri desktop is on the list of chipsets supported by ROCm:
ROCm Supported Chipsets for Compute Tasks
GPU Family Chipset Name
Volcanic Islands Fiji R9 Fury X, R9 Fury, R9 Nano, S9300x2

@briansp2020

@sunway513
Thanks for your suggestion. I had to do some manual tweaking to get Kaveri working, since the ROCm packages now install by default with the native GCN ISA backend (ROCm/ROCm#4). When I did that, I tested other applications and they seemed to run fine. But that was ROCm-1.0. When I get home, I'll try the samples and tests again.

@briansp2020

@sunway513
The samples seem to run fine, but some of the tests are failing. I'll report it to the ROCm project and see what they say.

Running tests...
Test project /home/briansp/git/HIP/tests/build
      Start  1: hip_ballot.tst
 1/31 Test  #1: hip_ballot.tst .........................***Failed  Required regular expression not found.Regex=[PASSED
]  0.03 sec
      Start  2: hip_anyall.tst
 2/31 Test  #2: hip_anyall.tst .........................***Failed  Required regular expression not found.Regex=[PASSED
]  0.03 sec
      Start  3: hip_popc.tst
 3/31 Test  #3: hip_popc.tst ...........................   Passed    0.03 sec
      Start  4: hip_brev.tst
 4/31 Test  #4: hip_brev.tst ...........................   Passed    0.03 sec
      Start  5: hip_clz.tst
 5/31 Test  #5: hip_clz.tst ............................   Passed    0.04 sec
      Start  6: hip_ffs.tst
 6/31 Test  #6: hip_ffs.tst ............................   Passed    0.03 sec
      Start  7: hipEventRecord--iterations10.tst
 7/31 Test  #7: hipEventRecord--iterations10.tst .......   Passed    0.15 sec
      Start  8: hipMemset.tst
 8/31 Test  #8: hipMemset.tst ..........................   Passed    0.05 sec
      Start  9: hipMemset--N10--memsetval0x42.tst
 9/31 Test  #9: hipMemset--N10--memsetval0x42.tst ......   Passed    0.03 sec
      Start 10: hipMemset--N10013--memsetval0x5a.tst
10/31 Test #10: hipMemset--N10013--memsetval0x5a.tst ...   Passed    0.03 sec
      Start 11: hipMemset--N256M--memsetval0xa6.tst
11/31 Test #11: hipMemset--N256M--memsetval0xa6.tst ....   Passed    1.01 sec
      Start 12: hipGridLaunch.tst
12/31 Test #12: hipGridLaunch.tst ......................   Passed    0.13 sec
      Start 13: hipEnvVarDriver.tst
13/31 Test #13: hipEnvVarDriver.tst ....................   Passed    0.21 sec
      Start 14: hipMultiThreadStreams2.tst
14/31 Test #14: hipMultiThreadStreams2.tst .............   Passed    0.06 sec
      Start 15: hipMemcpy_simple.tst
15/31 Test #15: hipMemcpy_simple.tst ...................   Passed    0.34 sec
      Start 16: hipMemcpy-modes
16/31 Test #16: hipMemcpy-modes ........................   Passed    6.04 sec
      Start 17: hipMemcpy-size
17/31 Test #17: hipMemcpy-size .........................   Passed    9.81 sec
      Start 18: hipMemcpy-multithreaded
18/31 Test #18: hipMemcpy-multithreaded ................   Passed    0.66 sec
      Start 19: hipHostAlloc.tst
19/31 Test #19: hipHostAlloc.tst .......................   Passed    0.04 sec
      Start 20: hipHcc.tst
20/31 Test #20: hipHcc.tst .............................   Passed    0.03 sec
      Start 21: hipStreamL5.tst
21/31 Test #21: hipStreamL5.tst ........................***Exception: Other  1.41 sec
      Start 22: hipRandomMemcpyAsync.tst
22/31 Test #22: hipRandomMemcpyAsync.tst ...............   Passed    0.05 sec
      Start 23: hipMemoryAllocate.tst
23/31 Test #23: hipMemoryAllocate.tst ..................   Passed    0.21 sec
      Start 24: hipFuncSetDeviceFlags.tst
24/31 Test #24: hipFuncSetDeviceFlags.tst ..............   Passed    0.03 sec
      Start 25: hipFuncGetDevice.tst
25/31 Test #25: hipFuncGetDevice.tst ...................   Passed    0.03 sec
      Start 26: hipFuncSetDevice.tst
26/31 Test #26: hipFuncSetDevice.tst ...................   Passed    0.03 sec
      Start 27: hipFuncDeviceSynchronize.tst
27/31 Test #27: hipFuncDeviceSynchronize.tst ...........   Passed    0.04 sec
      Start 28: hipMultiThreadDevice-serial
28/31 Test #28: hipMultiThreadDevice-serial ............   Passed    0.52 sec
      Start 29: hipMultiThreadDevice-pyramid
29/31 Test #29: hipMultiThreadDevice-pyramid ...........   Passed    2.20 sec
      Start 30: hipMultiThreadDevice-nearzero
30/31 Test #30: hipMultiThreadDevice-nearzero ..........   Passed    6.27 sec
      Start 31: specialFunc.cu.tst
31/31 Test #31: specialFunc.cu.tst .....................   Passed    0.02 sec

90% tests passed, 3 tests failed out of 31

Total Test time (real) =  29.64 sec

The following tests FAILED:
      1 - hip_ballot.tst (Failed)
      2 - hip_anyall.tst (Failed)
     21 - hipStreamL5.tst (OTHER_FAULT)

I was running my iGPU at 900MHz. Maybe that caused some damage. When I backed the GPU frequency off to 750, test #21 started passing. But it's still intermittent. :( BTW, I did not know about this test and did not run it with ROCm-1.0. So maybe it never worked reliably on my machine...

@briansp2020

Just like HIP test 21, my sgemm HIP port now fails intermittently on my machine. So I think I'm having problems with hipStream and my sgemm HIP port is OK...

I wish I had a Fury Nano so I can continue my porting work. wink. wink. @bensander

@briansp2020

Can someone help me with converting the stencil CUDA code to HIP? Its kernel uses a dynamic shared memory variable whose size is determined at kernel launch time.

extern __shared__ float sh_A0[];

Hipify took care of the launch code using lp.groupMemBytes. But I'm not sure how to convert the kernel code properly. Hipify left the extern __shared__ declaration as is, but hcc complains about it.

Parboil parallel benchmark suite, version 0.2

/opt/rocm/hip/bin/hipcc  -I/home/briansp/git/Parboil/common/include -I/opt/rocm/hip//include  -c src/hip/kernels.cc -o build/hip_default/kernels.o
src/hip/kernels.cc:24:12: error: cannot combine with previous 'extern'
      declaration specifier
    extern __shared__ float sh_A0[];
           ^
/opt/rocm/hip//include/hip/hcc_detail/host_defines.h:53:24: note: expanded from
      macro '__shared__'
#define __shared__     tile_static
                       ^
/home/briansp/git/hcc-build/include/hc_defines.h:46:21: note: expanded from
      macro 'tile_static'
#define tile_static static __attribute__((section("clamp_opencl_local")))
                    ^
1 error generated.
Died at /opt/rocm/hip/bin/hipcc line 230.
make: *** [build/hip_default/kernels.o] Error 2
Run failed!

I could not find documentation on how to do it. Can someone help?

@briansp2020

The HIP test still fails for me with ROCm-1.1.1. I assume that I'll have to wait until ROCm-1.2, which will support Hawaii, since Kaveri and Hawaii both use second-generation GCN?
Does the test also fail on Kaveri at AMD? I want to eliminate my hardware as the source of the problem. If it is my hardware, I want to get new hardware so I can continue my work.

@aditya4d1
Contributor

Dynamic LDS is not yet supported. ROCm is working fine on Kaveri.
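
One possible workaround, not confirmed anywhere in this thread, is to fix the shared-array size at compile time and select an instantiation from the runtime size. The sketch below models that idea in host-only C++: the local array stands in for a statically sized `__shared__` buffer, and `process_tile`, `run_with_static_tile`, and the candidate sizes 64, 256, and 1024 are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for a kernel body that stages data in a fixed-size tile.
// TILE_FLOATS plays the role of a compile-time shared-memory size;
// the caller must guarantee count <= TILE_FLOATS.
template <std::size_t TILE_FLOATS>
float process_tile(const float* in, std::size_t count) {
    float tile[TILE_FLOATS] = {};  // models `__shared__ float sh_A0[TILE_FLOATS];`
    for (std::size_t i = 0; i < count; ++i) {
        tile[i] = in[i];
    }
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        sum += tile[i];
    }
    return sum;
}

// Pick the smallest instantiation that fits the runtime request.
float run_with_static_tile(const std::vector<float>& in) {
    const std::size_t n = in.size();
    if (n <= 64) return process_tile<64>(in.data(), n);
    if (n <= 256) return process_tile<256>(in.data(), n);
    if (n <= 1024) return process_tile<1024>(in.data(), n);
    throw std::invalid_argument("tile request exceeds largest static size");
}
```

The trade-off is that each instantiation reserves its full tile even when the request is smaller, so on a GPU this approach may cost occupancy compared with a true dynamic allocation.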

@briansp2020

@adityaatluri ,
Thanks for your response. Just to double-check: on other Kaveri machines the following 3 tests are passing, correct?

The following tests FAILED:
      1 - hip_ballot.tst (Failed)
      2 - hip_anyall.tst (Failed)
     21 - hipStreamL5.tst (OTHER_FAULT)

@aditya4d1
Contributor

Yes

@briansp2020

@adityaatluri
Can you tell me the motherboard and BIOS version used for your Kaveri testing? I'm currently using A88X-PRO with BIOS 2603 (released 2016/04/08). I just want to check all the variables before getting new hardware.

Thanks!

@aditya4d1
Contributor

Hi,
The one I am using is older than yours. Seems like most of the tests are running fine. What issue are you trying to solve?

@briansp2020

briansp2020 commented Jun 8, 2016

Whatever is causing test 21 to fail is causing my sgemm port to fail. I want to make sure that the cause is not my configuration/BIOS version before going out to get new hardware. Can you tell me your BIOS version? Sometimes, regressions do happen.

Thanks.

@aditya4d1
Contributor

Can you put the output in a GitHub gist and share the link? Thanks

@briansp2020

Output of what? Test output is already posted above.

@aditya4d1
Contributor

The output of the failing sgemm port.

@briansp2020

https://gist.github.com/briansp2020/e9cb4f7a2dcb6be3f4d67bb2d6b412e7

I put the output file for sgemm there. The output values change on each run. I don't think it's the sgemm code. As I said earlier, it sometimes passes, just like test 21 sometimes passes. My hipified sgemm code is at
https://github.com/briansp2020/Parboil/tree/hip_sgemm/benchmarks/sgemm/src/hip_base

@briansp2020

Different problem.
This time it's with the bfs port to HIP. Initially, I had a problem which is described here. I tried the fix, since the issue was closed. But now I'm getting the following error.

@Kaveri-Ubuntu:~/git/Parboil$ ./parboil run bfs hip 1M
Parboil parallel benchmark suite, version 0.2

mkdir -p build/hip_default
/opt/rocm/hip/bin/hipcc -ffast-math -I/home/briansp/git/Parboil/common/include -I/opt/rocm/hip//include  -c src/hip/main.cc -o build/hip_default/main.o
/opt/rocm/hip/bin/hipcc -ffast-math -I/home/briansp/git/Parboil/common/include -I/opt/rocm/hip//include  -c src/hip/kernel.cc -o build/hip_default/kernel.o
gcc -ffast-math -D __HIP_PLATFORM_HCC__ -ffast-math -I/home/briansp/git/Parboil/common/include -I/opt/rocm/hip//include  -c /home/briansp/git/Parboil/common/src/parboil_hip.c -o build/hip_default/parboil_hip.o
/home/briansp/git/Parboil/common/src/parboil_hip.c: In function ‘pb_AddSubTimer’:
/home/briansp/git/Parboil/common/src/parboil_hip.c:555:28: warning: embedded ‘\0’ in format [-Wformat-contains-nul]
   sprintf(subtimer->label, "%s\0", label);
                            ^
/opt/rocm/hip/bin/hipcc build/hip_default/main.o build/hip_default/kernel.o build/hip_default/parboil_hip.o -o build/hip_default/bfs -L/opt/rocm/hip/lib/ -lm -lpthread 
/home/briansp/git/hcc/build/compiler/bin/llvm-as: /tmp/tmp.TXt5g3Xhsc/kernel.brig.promote.ll:880:41: error: '%95' defined with type 'i32 addrspace(1)*'
  %pid.0.in.i = phi i32 addrspace(3)* [ %95, %94 ], [ %97, %96 ]
                                        ^
input LLVM IR /tmp/tmp.TXt5g3Xhsc/kernel.brig is not valid
error: failed to load brig /tmp/tmp.TXt5g3Xhsc/kernel.brig
Error 2 (No such file or directory) opening "/tmp/tmp.TXt5g3Xhsc/kernel.brig"
clang-3.5: error: linker command failed with exit code 1 (use -v to see invocation)
Died at /opt/rocm/hip/bin/hipcc line 237.
/home/briansp/git/Parboil/common/mk/hip.mk:85: recipe for target 'build/hip_default/bfs' failed
make: *** [build/hip_default/bfs] Error 2
Run failed!

Can someone take a look? My code is here.

@whchung
Contributor

whchung commented Jun 25, 2016

@briansp2020 I suspect this relates to an issue I've been working on recently. Could you help by doing the following:

  1. export KMDUMPLLVM=1
  2. rebuild your application
  3. post dump.promote.ll in a gist

@whchung
Contributor

whchung commented Jun 25, 2016

@briansp2020 It does look pretty similar to the issue I've been working on. Until there's a fix on the compiler side, could you try the following workaround?

The problematic code is around:
https://github.com/briansp2020/Parboil/blob/hip_bfs/benchmarks/bfs/src/hip/kernel.cc#L225

Please change it to:

    int tid = hipBlockIdx_x*MAX_THREADS_PER_BLOCK + hipThreadIdx_x;
    int pid = q1[tid];
    if( tid<no_of_nodes)
    {
      if(tot_sum == 0)//this is the first BFS level of current kernel call
        ;
      else
        pid = next_wf[tid];//read the current frontier info from last level's propagation

      // Visit a node from the current frontier; update costs, colors, and
      // output queue
      visit_node(pid, hipThreadIdx_x & MOD_OP, g_graph_nodes, g_graph_edges,
                 local_q, overflow, g_color, g_cost, gray_shade);
    }

@briansp2020

dump.promote.ll is in a gist here.

I also tried your suggested workaround and I still get the same error.

@Kaveri-Ubuntu:~/git/Parboil$ ./parboil run bfs hip 1M
Parboil parallel benchmark suite, version 0.2

/opt/rocm/hip/bin/hipcc -ffast-math -I/home/briansp/git/Parboil/common/include -I/opt/rocm/hip//include  -c src/hip/kernel.cc -o build/hip_default/kernel.o
/opt/rocm/hip/bin/hipcc build/hip_default/main.o build/hip_default/kernel.o build/hip_default/parboil_hip.o -o build/hip_default/bfs -L/opt/rocm/hip/lib/ -lm -lpthread 
/home/briansp/git/hcc/build/compiler/bin/llvm-as: /tmp/tmp.w7GcMSgjwp/kernel.brig.promote.ll:877:41: error: '%92' defined with type 'i32 addrspace(1)*'
  %pid.0.in.i = phi i32 addrspace(3)* [ %92, %91 ], [ %96, %95 ]
                                        ^
cp: cannot stat '/tmp/tmp.w7GcMSgjwp/kernel.brig': No such file or directory
input LLVM IR /tmp/tmp.w7GcMSgjwp/kernel.brig is not valid
error: failed to load brig /tmp/tmp.w7GcMSgjwp/kernel.brig
Error 2 (No such file or directory) opening "/tmp/tmp.w7GcMSgjwp/kernel.brig"
clang-3.5: error: linker command failed with exit code 1 (use -v to see invocation)
Died at /opt/rocm/hip/bin/hipcc line 237.
/home/briansp/git/Parboil/common/mk/hip.mk:85: recipe for target 'build/hip_default/bfs' failed
make: *** [build/hip_default/bfs] Error 2
Run failed!

Thanks.

@whchung
Contributor

whchung commented Jun 27, 2016

This is indeed the issue I've been investigating lately. The PHI node in the IR contains incoming pointers from different address spaces. That is because pid may contain values from either the global segment (q1) or the group segment (next_wf). Unfortunately, InstCombinePHI is executed prior to address space annotation in the compiler, which causes incorrect PHI nodes to be produced.

I'm still investigating how to come up with a proper fix. Until then we need to change the application to get around this issue. Am I right that the way to reproduce this issue is to invoke:

./parboil run bfs hip 1M

in hip_bfs branch?
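
The two source shapes involved in the PHI problem described above can be modelled on the host. In the sketch below (plain C++ with hypothetical function names; the two arrays merely stand in for a global-segment queue and a group-segment frontier), the first function merges two pointers and loads once, which is the form that corresponds to a single pointer-typed PHI with incoming values from two address spaces. The second loads inside each branch so that only an integer is merged. As reported earlier in the thread, a source rewrite of this kind did not by itself avoid the error with the compiler of the time, so treat this as an illustration of the two shapes rather than as a fix.

```cpp
#include <cstddef>
#include <vector>

// Shape 1: merge two pointers, then load once through the merged pointer.
int load_via_selected_pointer(const int* global_q, const int* group_q,
                              bool first_level, std::size_t tid) {
    const int* src = first_level ? global_q : group_q;  // pointer merge
    return src[tid];
}

// Shape 2: load in each branch and merge only the integer result.
int load_then_select_value(const int* global_q, const int* group_q,
                           bool first_level, std::size_t tid) {
    int pid;
    if (first_level) {
        pid = global_q[tid];
    } else {
        pid = group_q[tid];
    }
    return pid;
}
```

Both functions return the same value for the same inputs; they differ only in whether a pointer or an integer is the merged quantity, and an optimizer is free to turn one form into the other.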

@whchung
Contributor

whchung commented Jun 27, 2016

@briansp2020 Here's a patch which could get around the issue. I've verified that it gets past the error you were facing.

However, the program still won't compile, and we'll hit another known limitation in the compiler backend where values in the global segment (e.g. no_of_nodes_vol) can't be initialized yet. I believe a patch to LLVM has been submitted and is waiting for review now. Perhaps @tstellarAMD could chime in here.

@briansp2020

@whchung
Yes. You should be able to reproduce the issue by invoking:

./parboil run bfs hip 1M

in hip_bfs branch.

I'll try your patch when I go home today.

Thanks for your help.

@briansp2020

I set up an Intel i7-6700 + dual Fiji Nano system and tried my sgemm HIP port. To my surprise, it failed exactly the same way it did on my Kaveri system. I'll wait for the next release and try it again. But I don't think I'm dealing with a hardware failure any more.

Also, hcc/HIP seems a bit more buggy when used with Fiji. I have not done thorough testing yet. I'll wait for the next release and do more thorough testing with what I have tried so far. In particular, I want to make sure that the Udacity problem set I ported to HIP works. At the moment, PS1 fails with a core dump.

@bensander
Contributor Author

Hi Brian – our next HIP release will include several stability fixes, specifically in the areas of signal management and managing inflight dependencies.
Are you able to share your code?

@briansp2020

My code is in my GitHub account.
Parboil : https://github.com/briansp2020/Parboil/tree/hip_port
cs344 Problem Set 1 : https://github.com/briansp2020/cs344/tree/master/Problem%20Sets/Problem%20Set%201

I'll test when the next version is released.

@briansp2020

briansp2020 commented Aug 14, 2016

It looks like ROCm 1.2 fixed my sgemm problem. I still get a core dump when running the cs344 problem set port.

@briansp2020

Hi,
I updated my Parboil port to use ROCm 1.5 and have noticed some problems.
The sgemm HIP port is occasionally failing. This reminds me of the problem I had when using 1.1 & 1.2.
Usually it passes when run a second time.

briansp@FijiX2:~/git/Parboil$ ./parboil run sgemm hip small
Parboil parallel benchmark suite, version 0.2

make: Nothing to be done for 'default'.
Resolving HIP runtime library...
#@ LD_LIBRARY_PATH=/opt/rocm/hip/lib/ ldd build/hip_default/sgemm | grep hip
Opening file:/home/briansp/git/Parboil/datasets/sgemm/small/input/matrix1.txt
Matrix dimension: 128x96
Opening file:/home/briansp/git/Parboil/datasets/sgemm/small/input/matrix2t.txt
Matrix dimension: 160x96
Opening file:/home/briansp/git/Parboil/benchmarks/sgemm/run/small/matrix3.txt for write.
Matrix dimension: 128x160
GFLOPs = 21.0276
IO        : 0.017469
Kernel    : 0.000187
Copy      : 0.000125
Driver    : 0.000187
Compute   : 0.000212
CPU/Kernel Overlap: 0.000187
Timer Wall Time: 0.018012
Computed values do not match the expected values

Mismatch
Output checking tool detected a mismatch
briansp@FijiX2:~/git/Parboil$ ./parboil run sgemm hip small
Parboil parallel benchmark suite, version 0.2

make: Nothing to be done for 'default'.
Resolving HIP runtime library...
#@ LD_LIBRARY_PATH=/opt/rocm/hip/lib/ ldd build/hip_default/sgemm | grep hip
Opening file:/home/briansp/git/Parboil/datasets/sgemm/small/input/matrix1.txt
Matrix dimension: 128x96
Opening file:/home/briansp/git/Parboil/datasets/sgemm/small/input/matrix2t.txt
Matrix dimension: 160x96
Opening file:/home/briansp/git/Parboil/benchmarks/sgemm/run/small/matrix3.txt for write.
Matrix dimension: 128x160
GFLOPs = 21.0276
IO        : 0.018051
Kernel    : 0.000187
Copy      : 0.000129
Driver    : 0.000188
Compute   : 0.000281
CPU/Kernel Overlap: 0.000187
Timer Wall Time: 0.018670
Pass
briansp@FijiX2:~/git/Parboil$ 

Could someone help?

I also ran into some other issues that I reported here.

It seems the steps to run the HIP tests have changed; following the steps here no longer works. In fact, there is no CMakeLists.txt in the src directory any more.

MikalaiDrabovich added a commit to MikalaiDrabovich/HIP that referenced this issue Aug 4, 2017
Is the 'new' keyword supported? The malloc/free way works fine, but new/delete does not.

If lines 45-46 are added, the compiler error is the following:

ndr@ndr-ROCM16:~/Desktop/square/new$ make clean && make
rm -f *.o square
/opt/rocm/hip/bin/hipcc --amdgpu-target=gfx900 square.cpp -o square
Referencing function in another module!
  %call6.i.i = tail call i8* @_Znam(i64 1024) #3
; ModuleID = '<stdin>'
i8* (i64)* @_Znam
; ModuleID = '#0 0x000000000142b5ea llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142b5ea)
#1 0x000000000142968e llvm::sys::RunSignalHandlers() (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142968e)
#2 0x00000000014297dc SignalHandler(int) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x14297dc)
#3 0x00007f22f9e4c390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
#4 0x0000000000f81eb9 void llvm::VerifierSupport::CheckFailed<llvm::Instruction*, llvm::Module const*, llvm::GlobalValue*, llvm::Module*>(llvm::Twine const&, llvm::Instruction* const&, llvm::Module const* const&, llvm::GlobalValue* const&, llvm::Module* const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf81eb9)
#5 0x0000000000f8c8bc (anonymous namespace)::Verifier::visitInstruction(llvm::Instruction&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8c8bc)
#6 0x0000000000f8f7b2 (anonymous namespace)::Verifier::verifyCallSite(llvm::CallSite) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8f7b2)
#7 0x0000000000f919f5 (anonymous namespace)::Verifier::visitCallInst(llvm::CallInst&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf919f5)
#8 0x0000000000f95381 llvm::InstVisitor<(anonymous namespace)::Verifier, void>::visit(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf95381)
#9 0x0000000000f97264 (anonymous namespace)::Verifier::verify(llvm::Function const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf97264)
#10 0x0000000000f9831d (anonymous namespace)::VerifierLegacyPass::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf9831d)
#11 0x0000000000f4459a llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf4459a)
#12 0x0000000000f44643 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44643)
#13 0x0000000000f44104 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44104)
#14 0x0000000000643b74 main (/opt/rocm/hcc-1.0/compiler/bin/opt+0x643b74)
#15 0x00007f22f8ba9830 __libc_start_main /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:325:0
#16 0x000000000068f729 _start (/opt/rocm/hcc-1.0/compiler/bin/opt+0x68f729)
Stack dump:
0.	Program arguments: /opt/rocm/hcc-1.0/compiler/bin/opt -load /opt/rocm/hcc-1.0/compiler/bin/../lib/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o /tmp/tmp.vqMJlUNjk9/kernel-gfx900.hsaco.promote.bc 
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'Module Verifier' on function '@_ZZ4mainEN67HIP_kernel_functor_name_begin_unnamed_HIP_kernel_functor_name_end_419__cxxamp_trampolineEPfS0_m'
/opt/rocm/hcc-1.0/compiler/bin/clamp-device: line 140: 18412 Segmentation fault      (core dumped) $OPT -load $LIB/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o $2.promote.bc < $1
Generating AMD GCN kernel failed in HCC-specific opt passes for target: gfx900
/opt/rocm/hcc/bin/hcc(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x2a)[0x1674f1a]
/opt/rocm/hcc/bin/hcc(_ZN4llvm3sys17RunSignalHandlersEv+0x3e)[0x1672fbe]
/opt/rocm/hcc/bin/hcc[0x167310c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f69bbc98390]
[0x7f69bc0c8a10]
Stack dump:
0.	Program arguments: /opt/rocm/hcc/bin/hcc -hc -D__HIPCC__ -I/opt/rocm/hcc/include -I/opt/rocm/hip/include/hip/hcc_detail/cuda -I/opt/rocm/hsa/include -Wno-deprecated-register -I/opt/rocm/profiler/CXLActivityLogger/include -I/opt/rocm/hip/include -DHIP_VERSION_MAJOR=1 -DHIP_VERSION_MINOR=2 -DHIP_VERSION_PATCH=17284 -D__HIP_ARCH_GFX900__=1 -Wl,--rpath=/opt/rocm/hip/lib /opt/rocm/hip/lib/libhip_hcc.so /opt/rocm/hip/lib/libhip_device.a -hc -std=c++amp -L/opt/rocm/hcc-1.0/lib -Wl,--rpath=/opt/rocm/hcc-1.0/lib -ldl -lm -lpthread -lunwind -Wl,--whole-archive -lmcwamp -Wl,--no-whole-archive -lsupc++ -L/opt/rocm/hsa/lib -L/opt/rocm/lib -lhsa-runtime64 -lhc_am -lhsakmt -L/opt/rocm/profiler/CXLActivityLogger/bin/x86_64 -lCXLActivityLogger -Wl,--rpath=/opt/rocm/profiler/CXLActivityLogger/bin/x86_64 -lm --amdgpu-target=gfx900 --amdgpu-target=gfx900 square.cpp -o square 
Died at /opt/rocm/hip/bin/hipcc line 452.
Makefile:19: recipe for target 'square' failed
make: *** [square] Error 255

With delete[], the error is:


ndr@ndr-ROCM16:~/Desktop/square/new$ make clean && make
rm -f *.o square
/opt/rocm/hip/bin/hipcc --amdgpu-target=gfx900 square.cpp -o square
Referencing function in another module!
  %call6.i.i = tail call i8* @_Znam(i64 1024) #3
; ModuleID = '<stdin>'
i8* (i64)* @_Znam
; ModuleID = '#0 0x000000000142b5ea llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142b5ea)
#1 0x000000000142968e llvm::sys::RunSignalHandlers() (/opt/rocm/hcc-1.0/compiler/bin/opt+0x142968e)
#2 0x00000000014297dc SignalHandler(int) (/opt/rocm/hcc-1.0/compiler/bin/opt+0x14297dc)
#3 0x00007f84d4a09390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
#4 0x0000000000f81eb9 void llvm::VerifierSupport::CheckFailed<llvm::Instruction*, llvm::Module const*, llvm::GlobalValue*, llvm::Module*>(llvm::Twine const&, llvm::Instruction* const&, llvm::Module const* const&, llvm::GlobalValue* const&, llvm::Module* const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf81eb9)
#5 0x0000000000f8c8bc (anonymous namespace)::Verifier::visitInstruction(llvm::Instruction&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8c8bc)
#6 0x0000000000f8f7b2 (anonymous namespace)::Verifier::verifyCallSite(llvm::CallSite) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf8f7b2)
#7 0x0000000000f919f5 (anonymous namespace)::Verifier::visitCallInst(llvm::CallInst&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf919f5)
#8 0x0000000000f95381 llvm::InstVisitor<(anonymous namespace)::Verifier, void>::visit(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf95381)
#9 0x0000000000f97264 (anonymous namespace)::Verifier::verify(llvm::Function const&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf97264)
#10 0x0000000000f9831d (anonymous namespace)::VerifierLegacyPass::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf9831d)
#11 0x0000000000f4459a llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf4459a)
#12 0x0000000000f44643 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44643)
#13 0x0000000000f44104 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/rocm/hcc-1.0/compiler/bin/opt+0xf44104)
#14 0x0000000000643b74 main (/opt/rocm/hcc-1.0/compiler/bin/opt+0x643b74)
#15 0x00007f84d3766830 __libc_start_main /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:325:0
#16 0x000000000068f729 _start (/opt/rocm/hcc-1.0/compiler/bin/opt+0x68f729)
Stack dump:
0.	Program arguments: /opt/rocm/hcc-1.0/compiler/bin/opt -load /opt/rocm/hcc-1.0/compiler/bin/../lib/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o /tmp/tmp.LeiH3VuY4Q/kernel-gfx900.hsaco.promote.bc 
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'Module Verifier' on function '@_ZZ4mainEN67HIP_kernel_functor_name_begin_unnamed_HIP_kernel_functor_name_end_419__cxxamp_trampolineEPfS0_m'
/opt/rocm/hcc-1.0/compiler/bin/clamp-device: line 140: 18860 Segmentation fault      (core dumped) $OPT -load $LIB/LLVMEraseNonkernel.so -inline -inline-threshold=1048576 -erase-nonkernels -dce -globaldce -o $2.promote.bc < $1
Generating AMD GCN kernel failed in HCC-specific opt passes for target: gfx900
clang-5.0: error:  command failed with exit code 139 (use -v to see invocation)
Died at /opt/rocm/hip/bin/hipcc line 452.
Makefile:19: recipe for target 'square' failed
make: *** [square] Error 139
@nartmada

Hi @bensander, this ticket has been open since 2016. Is there still work left to be done on this ticket? Thanks.

@nartmada

nartmada commented Mar 2, 2024

Closing this ticket as it is stale. Please open a new ticket if work is still needed to hipify PARBOIL. Thanks.

@nartmada closed this as not planned on Mar 2, 2024