runtime error: aqlprofile API table load failed #29

Closed
cgleggett opened this issue Jun 26, 2020 · 7 comments

@cgleggett

After updating ROCm from 3.3.0 to 3.5.1 and rebuilding rocprofiler and roctracer, I get the following error when profiling an executable (which uses an AMD Vega 56 GPU):

> rocprof --stats -o rocpf_stat.csv the_prog
RPL: on '200626_131451' from '/opt/rocm-3.5.1/rocprofiler/rocprofiler' in '/home/leggett/work/fcs/bk_hip'
RPL: profiling '"runTFCSSimulation"'
RPL: input file ''
RPL: output dir '/tmp/rpl_data_200626_131451_50543'
RPL: result dir '/tmp/rpl_data_200626_131451_50543/input_results_200626_131451'
ROCProfiler: input from "/tmp/rpl_data_200626_131451_50543/input.xml"
  0 metrics
aqlprofile API table load failed: HSA_STATUS_ERROR: A generic error has occurred.
( program exits )

I see a similar error when using --hsa-trace:

rocprof --hsa-trace -o rocpf_hsa.csv the_prog
RPL: on '200626_131810' from '/opt/rocm-3.5.1/rocprofiler/rocprofiler' in '/home/leggett/work/fcs/bk_hip'
RPL: profiling '"runTFCSSimulation"'
RPL: input file ''
RPL: output dir '/tmp/rpl_data_200626_131810_50607'
RPL: result dir '/tmp/rpl_data_200626_131810_50607/input_results_200626_131810'
ROCProfiler: input from "/tmp/rpl_data_200626_131810_50607/input.xml"
  0 metrics
ROCTracer (pid=50626): 
    HSA-trace()
    HSA-activity-trace()
aqlprofile API table load failed: HSA_STATUS_ERROR: A generic error has occurred.
File 'rocpf_hsa.hsa_stats.csv' is generating

File 'rocpf_hsa.json' is generating

File 'rocpf_hsa.json' is generating

This is on a CentOS 7 host.

@eshcherb

Please set:
$ export LD_LIBRARY_PATH=/opt/rocm/hsa-amd-aqlprofile/lib

This will be fixed in the 3.6 release.

@cgleggett
Author

There is no /opt/rocm/hsa-amd-aqlprofile/lib directory. Which package is supposed to install it?

@eshcherb

eshcherb commented Jun 27, 2020

The package is 'hsa-amd-aqlprofile'.
Do you have /opt/rocm? You might instead have /opt/rocm-<rev>, something like /opt/rocm-3.5.1.
If so, set the path to /opt/rocm-3.5.1/hsa-amd-aqlprofile/lib.
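
A minimal sketch of that lookup, for anyone unsure which prefix they have: it tries /opt/rocm and any /opt/rocm-<rev> directory and prepends the first hsa-amd-aqlprofile/lib it finds to LD_LIBRARY_PATH. The function name `find_aqlprofile_lib` and the `ROCM_ROOTS` override are hypothetical, introduced only for this illustration.

```shell
# Sketch only: locate hsa-amd-aqlprofile/lib under /opt/rocm or /opt/rocm-<rev>.
# ROCM_ROOTS is a hypothetical override for the list of install prefixes.
find_aqlprofile_lib() {
    # Left unquoted on purpose so the list is split and the glob is expanded.
    for root in ${ROCM_ROOTS:-/opt/rocm /opt/rocm-*}; do
        dir="$root/hsa-amd-aqlprofile/lib"
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

if aql_dir=$(find_aqlprofile_lib); then
    # Prepend, keeping any existing LD_LIBRARY_PATH entries.
    export LD_LIBRARY_PATH="$aql_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
else
    echo "hsa-amd-aqlprofile/lib not found under /opt/rocm*" >&2
fi
```

Unlike the plain `export` above, this keeps whatever was already in LD_LIBRARY_PATH; whether that matters depends on the environment.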

@cgleggett
Author

Ah, OK. It was installed by hsa-amd-aqlprofile-1.0.0-1.x86_64, but it got wiped out when I removed the 3.3 release of ROCm to upgrade to 3.5.
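
For anyone hitting the same thing on an RPM-based host, a rough sketch of how the package state could be checked with `rpm -q` and `rpm -ql`. The helper name `pkg_lib_dirs` and the `RPM_CMD` override are hypothetical, and the sketch assumes the package installs its files under a `lib/` directory.

```shell
# Sketch only: report whether an RPM package is installed and, if it is,
# print the directories holding its files under a lib/ path.
# RPM_CMD is a hypothetical override for the rpm binary.
pkg_lib_dirs() {
    pkg="$1"
    rpm_cmd="${RPM_CMD:-rpm}"
    if ! "$rpm_cmd" -q "$pkg" >/dev/null 2>&1; then
        echo "$pkg is not installed" >&2
        return 1
    fi
    "$rpm_cmd" -ql "$pkg" | grep '/lib/' |
        while IFS= read -r f; do dirname "$f"; done | sort -u
}

# Example: pkg_lib_dirs hsa-amd-aqlprofile
```

If the package turns out to be missing, reinstalling it through the package manager (for example with yum, assuming the ROCm repository is configured) should bring the directory back.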

@cgleggett
Author

ok, looks good now.

thanks!

@eshcherb

Could you close the ticket?

@cgleggett
Author

BTW, I did need to turn on code object tracking; otherwise I got a core dump:

error(4096) "QueryKernelName(), Error: V3 code object detected - code objects tracking should be enabled
"
/opt/rocm-3.5.1/bin/rocprof: line 275: 32039 Aborted                 (core dumped) "runTFCSSimulation"

not the most elegant exit scenario, but at least the error message is clear ;-)
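
For reference, a small sketch of how tracking can be switched on per run, assuming the setting meant here is rocprof's `--obj-tracking on` option. The wrapper name `rocprof_tracked` and the `ROCPROF` override are hypothetical.

```shell
# Sketch only: run rocprof with code object tracking enabled.
# Assumes the relevant switch is rocprof's --obj-tracking <on|off> option.
# ROCPROF is a hypothetical override for the rocprof binary path.
rocprof_tracked() {
    "${ROCPROF:-rocprof}" --obj-tracking on "$@"
}

# Example: rocprof_tracked --stats -o rocpf_stat.csv the_prog
```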
