Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot assemble correct binary for Polaris 10 #6

Closed
OhGodAPet opened this issue Sep 18, 2016 · 29 comments
Closed

Cannot assemble correct binary for Polaris 10 #6

OhGodAPet opened this issue Sep 18, 2016 · 29 comments

Comments

@OhGodAPet
Copy link

Since I'm unable to use the LLVM toolchain, this is the only way I can properly work on kernels - it'd be nice to have Polaris 10 support.

@matszpk
Copy link
Member

matszpk commented Sep 18, 2016

Can you give some failed examples (source code or excerpts)?

@OhGodAPet
Copy link
Author

Anything I disassemble with CLRX - the only example I have right now is long, but I could make a simple anything and compile it using the AMDGPU OCL runtime, and then disassemble, reassemble, and try to load it - says failed to load HSA code object.

@matszpk
Copy link
Member

matszpk commented Sep 19, 2016

Thank you. Can you give me a driver kind and driver version which you are using? I didn't check all devices types given by latest Windows/Linux drivers, so problem can be occurred.

@OhGodAPet
Copy link
Author

16.30 AMDGPU drivers - Linux, but since the binaries are cross compatible, I would expect the same from Windows. Polaris 10 (Ellesmere) was my target.

@matszpk
Copy link
Member

matszpk commented Sep 19, 2016

Thank you for answer. I got this same error while trying to run simple example (GenericPtr and VectorAdd) when OpenCL 2.0 code enabled. clinfo got info that my device (Bonaire) doesn't support OpenCL 2.0 (only OpenCL 1.2 C compiler supported). Check whether AMD GPU PRO supports OpenCL 2.0 for your device. I will be trying to other configs to know how these drivers behaves with CLRX.

@OhGodAPet
Copy link
Author

It works with -cl-std=CL2.0, but I can try different options. From what I see in your disassembler's output, it's using 2.0 automatically.

@matszpk
Copy link
Member

matszpk commented Sep 20, 2016

Few days I ran OpenCL 2.0 example (GenericPtr) on Win 8.1 with latest Crimson 16.9.1 beta drivers. It worked correctly (I was testing older version of CLRX). This is problem only on AMDGPU-PRO driver.

@matszpk
Copy link
Member

matszpk commented Oct 2, 2016

Answer at https://community.amd.com/thread/205476. AMDGPU-PRO doesn't support OpenCL 2.0. This same info is in Releases Notes:
....
Highlights

    Provides support for Radeon™ RX 480, Radeon™ RX 470 and Radeon™ RX 460 series products
    Supported APIs:
        OpenGL 4.5 and GLX 1.4
        OpenCL 1.2
        Vulkan 1.0
        VDPAU
        Vulkan support for DOTA2

....
CLRX already support OpenCL 2.0 format for AMDGPU-PRO, however driver does not.
I removed my old gfx card (Pitcairn GCN 1.0) from motherboard and I ran ubuntu with AMDGPU-PRO. clinfo returned that my Bonaire supports only OpenCL 1.2.

@OhGodAPet
Copy link
Author

So how do I get this to disassemble/assemble as OpenCL 1.2?

@matszpk
Copy link
Member

matszpk commented Oct 5, 2016

I ran VectorAdd example for OpenCL 1.2 and it works good on AMDGPU-PRO. Just, you can try (without build binaries without option '-cl-std=CL2.0' or force use OpenCL 1.2 '-cl-std=CL1.2').

@OhGodAPet
Copy link
Author

I just forced -cl-std=CL1.2 - and it disassembles as .amdcl2.

@matszpk
Copy link
Member

matszpk commented Oct 5, 2016

oh.... I forgot about one. New OpenCL drivers just use OpenCL 2.0 format while using OpenCL 1.2. This is normal behaviour (same as later Windows drivers). Thus, the CLRX recognizes stored binaries as .amdcl2. Maybe later, I will find more time to solve this problem.

@OhGodAPet
Copy link
Author

I'd be willing to donate, if that would help. I really want the ability to use this for my Polaris GPUs on linux.

@matszpk
Copy link
Member

matszpk commented Oct 5, 2016

I apologize for my postponed reaction, but I have really urgent task to do, I couldn't properly understand clue of the problem.

@OhGodAPet
Copy link
Author

That's okay - some other time.

@matszpk
Copy link
Member

matszpk commented Oct 5, 2016

I fixed detection AMDGPU-PRO driver while compilation (by adding extra path to AMDOpenCL library). Now, samples (with OpenCL 2.0 format, even with option "-cl-std=CL2.0") works correctly. With latest version CLRX detect an OpenCL libraries in 2117.07 version. However, disassembled codes should be work correctly, because disassembler includes info about driver version '.driver_version 203603'.
I successfully disassembled simple code compiled for devices Stoney, Ellesmere, Baffin (Polaris). If last fix doesn't solve your problem, please report me about that.
Edit: You can check rapidly that by setting environment variable CLRX_AMDOCL_PATH to "/usr/lib/x86_64-linux-gnu/amdgpu-pro". However, I recommended to download latest sources, build them and try them.

@OhGodAPet
Copy link
Author

Disassembling a binary made with AMD's OCL runtime (AMDGPU-PRO), and then assembling it produces a binary that, if you attempt to load it with AMD's OCL runtime, it throws this error:

[06:19:02] Error -11: Building Program (clBuildProgram)
[06:19:02] Error: AMD HSA Code Object loading failed.

@matszpk
Copy link
Member

matszpk commented Oct 6, 2016

Some hint for you: to disassemble code for assemble you should use options -a (--all) and optionally '-C' (-config). like:
clrxdisasm -aC binary
That output of clrxdisasm can be used to rebuild binary. Option '-a' just add all important info to construct binary by assembler.

@OhGodAPet
Copy link
Author

Just tried -aC - did not work.

@matszpk
Copy link
Member

matszpk commented Oct 6, 2016

Do you check also only '-a' option for clrxdisasm (option '-C' stores kernel configuration in human-readable form, without this option disassembler stores closely this same config as in input file)?
I was comparing differences between original binary generated by AMDGPU-PRO driver and reconstructed binary by CLRX disassembler and assembler (for device Ellesmere). I couldn't find important differences which would be have impact while loading file (kernel config and other metadata).
Can you give a binary for simple kernel (including disassembly code and clrxasm binary) (like VectorAdd) to track differences? What a version of CLRX do you use?

@OhGodAPet
Copy link
Author

I just downloaded it from git last night. I'll try only -a.

@OhGodAPet
Copy link
Author

This is the binary produced by AMD: https://ottrbutt.com/tmp/ethashEllesmeregw256l8.bin
This is the one produced by CLRX: https://ottrbutt.com/tmp/ethashEllesmeregw256l8.bin.clrx
This is the disassembly: https://ottrbutt.com/tmp/ethashEllesmeregw256l8.asm

@matszpk
Copy link
Member

matszpk commented Oct 6, 2016

Thank you. Very likely CLRX assembler stores incorrectly info about constant data (sections, offset, symbols). If I will find free time I will fix that bug.

@OhGodAPet
Copy link
Author

Thank you!

@matszpk
Copy link
Member

matszpk commented Oct 6, 2016

Ok. Can you do yet another thing? Just write a very simple kernel (like vectorAdd), try to rebuild with CLRX and load via OpenCL runtime. Just check, whether CLRX can correctly rebuild simpliest kernel for the AMDGPU-PRO driver. Just like that:

kernel void vectorAdd(uint n, const global float* a, const global float* b,
            global float* c)
{
    uint i = get_global_id(0);
    c[i] = a[i] + b[i];
}

@matszpk
Copy link
Member

matszpk commented Oct 7, 2016

I have some doubts, whether your problems caused by a wrong storing of constant datas by CLRX while building binaries. Can you make simple test described above?

@matszpk
Copy link
Member

matszpk commented Oct 8, 2016

I have no a RX 480, however I was trying load by runtime (AMDGPUPRO) a delivered (by you) binaries. I modified device (to my Bonaire) type in ELF header (no other changes) on both binaries (CLRX and original). Results are:

  • originally generated file by driver couldn't be loaded by my driver installation
  • file regenerated by CLRX was succesfully loaded by my driver installation.

I created simple kernel that have this same amount of constant data (192 bytes) and I was trying to load both original binary and regenerated binary (both compiled for Bonaire). Both was loaded successfully by my driver installation.
Maybe AMDGPU-PRO treats in another way binaries for GCN 1.1 and GCN 1.2? I don't know.

EDIT: I could load (by clBuildProgram) listed above binaries either with "-cl-std=CL1.2", "-cl-std=CL2.0" and without and options.

@matszpk
Copy link
Member

matszpk commented Oct 8, 2016

Very likely, problem has been fixed. I added a missing arch minor, stepping and set architecture type in '.note' section. Very likely should work with GCN 1.2 code.

@matszpk matszpk closed this as completed Oct 8, 2016
@OhGodAPet
Copy link
Author

It appears to work! Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants