llvm toolchain #202

Closed
talhaber opened this issue Jul 30, 2018 · 13 comments

Comments

@talhaber

I tried cross-compiling the library for ARMv8 with the LLVM 3.8 toolchain (it supports the NEON intrinsics well).
It seems the compiler does not recognize the assembly in the NE10 library.
Is there a version of the sources that fits the LLVM toolchain?
Is there a script I can use to convert the assembly into a supported form?

@lieff

lieff commented Jul 30, 2018

Try to disable NE10_ASM_OPTIMIZATION parameter.

@talhaber
Author

Let me explain my setup:

  1. I am trying to compile the library for the VxWorks operating system (hoping the library is agnostic to the OS, since it is supposed to be agnostic to the platform). The LLVM toolchain (3.8.1.1) supports NEON intrinsics well.
  2. After having a lot of problems integrating cmake with the LLVM toolchain integrated in the VxWorks IDE,
    I tried to take the source code and build it manually without cmake (cherry-picking the ARMv8 parts from the sources).
  3. I noticed that compiling any assembly source produces mostly errors (the C sources seem fine), even on the comments (apparently @ is not recognized as a comment character by LLVM). LLVM does not understand any of the assembly (it is GNU assembly, I think); evidently it was written for a different assembler.
  4. I found the Python script in the tools folder that converts from GAS to clang. It didn't work for my LLVM toolchain.

Is there a script (or something) I can use?
Or am I going in the wrong direction completely?

@lieff

lieff commented Jul 30, 2018

You can use the intrinsic versions instead of asm. Here is where it's configured in cmake:

    if(NE10_ASM_OPTIMIZATION)
        set(NE10_DSP_NEON_SRCS
            ${NE10_DSP_NEON_SRCS}
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_float32.neon.s
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_int32.neon.s
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_int16.neon.s
        )
        set(NE10_DSP_INTRINSIC_SRCS
            ${NE10_DSP_INTRINSIC_SRCS}
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_float32.neon.c
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_int32.neon.c
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_int16.neon.c
        )
    else()
        add_definitions(-DNE10_UNROLL_LEVEL=1)
        set(NE10_DSP_INTRINSIC_SRCS
            ${NE10_DSP_INTRINSIC_SRCS}
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_float32.neonintrinsic.c
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_int32.neonintrinsic.c
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_fft_int16.neonintrinsic.c
            ${PROJECT_SOURCE_DIR}/modules/dsp/NE10_rfft_float32.neonintrinsic.c
            )
    endif()

@talhaber
Author

Thank you very much for your quick answer.

So I understand that with the intrinsics version I don't need any of the assembly sources, am I right?
How is it, performance-wise, compared to the assembly version? Is the difference significant?

Thank you in advance.

@Phillip-Wang
Contributor

You don't need the assembly sources if you use intrinsics. The performance of the intrinsics version depends very much on the compiler. There are cases where the intrinsics version is a little bit faster. Usually, with a recent compiler, the difference is not significant.
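
For readers unfamiliar with the distinction, here is a minimal sketch of what an intrinsics source looks like: ordinary C that the compiler itself turns into NEON instructions, so no separate assembler syntax is involved. `add_f32` is a hypothetical helper written for illustration, not a Ne10 function. The scalar loop doubles as a fallback so the file still compiles when the target has no NEON.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not part of Ne10): dst[i] = a[i] + b[i]. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process four floats per iteration with NEON intrinsics. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif

    /* Scalar tail; also the whole loop when NEON is not available. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

Because the register allocation and scheduling are left to the compiler, the generated code quality varies between compilers and versions, which is consistent with the remark above that performance depends on the compiler.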

@talhaber
Author

talhaber commented Aug 7, 2018

OK, building the library with the VxWorks toolchain doesn't work with the intrinsics (it fails at build time), and even Wind River themselves can't build it. I am trying a different approach: cross-compiling it with cmake (the way it is supposed to be built).

My setup is:
PC: Intel with Windows 7
Target arch: ARM Cortex-A, ARMv8. I added the cmake variable NE10_LINUX_TARGET_ARCH "aarch64".
cmake: 3.10.3
I downloaded LLVM 3.8 for Windows from the pre-compiled binaries.
The make generator is MinGW.

I edited the toolchain file as follows (after successfully cross-compiling with gcc Linaro as a reference in a Linux virtual machine):
set CMAKE_SYSTEM_NAME Generic
set CMAKE_SYSTEM_PROCESSOR arm
set CMAKE_CROSSCOMPILE 1

I skipped the compiler testing (it fails when cmake uses operating system utilities, which I don't care about because I need to integrate with a different OS):
set CMAKE_CXX_SYSTEM_COMPILER_WORKS TRUE
set CMAKE_C_SYSTEM_COMPILER_WORKS TRUE
set CMAKE_ASM_SYSTEM_COMPILER_WORKS TRUE

and set the C compiler to clang-cl and the CXX compiler to clang++.

It builds until it gets to a NEON intrinsics object file (NE10_fft_float32.neonintrinsic.c.obj)
and fails at line 32.
The error is "NEON support is not enabled".

What do I do now?
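
An error of this kind usually means the compiler was not targeting an ARM architecture with Advanced SIMD when it read `arm_neon.h`, for example because no target triple was passed and it defaulted to the host. One hedged way to check is to compile a tiny probe with exactly the same flags as the failing file and see which predefined macros are set. The function names below are hypothetical; the macros are the standard ACLE and compiler ones.

```c
/* Hypothetical probe: build it with the same compiler and flags as the
   failing source to see what the compiler thinks it is targeting. */

/* Returns 1 if the compiler advertises NEON (Advanced SIMD), else 0. */
int neon_enabled(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Returns a short name for the architecture being compiled for. */
const char *target_arch(void)
{
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__)
    return "arm";
#else
    return "other (probably the host, not the ARM target)";
#endif
}
```

If `target_arch()` reports anything other than an ARM architecture, the toolchain file is likely not passing the target to the compiler, and the NEON headers cannot work regardless of the library's own settings.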

@lieff

lieff commented Aug 7, 2018

So your compiler supports neither intrinsics nor gcc asm syntax. You can only build the generic C version using these tools, then. But why are you using clang if your target is Cortex? Why not gcc-aarch64-linux-gnu? Here is a basic sample: https://github.com/jserv/armv8-hello

@lieff

lieff commented Aug 7, 2018

I've checked that the NDK's clang supports intrinsics. Some intrinsics have been relocated to a different include file:

#include <arm_acle.h>

So, if you can work around the ABI differences, you can try the NDK clang compiler.

@talhaber
Author

talhaber commented Aug 7, 2018

I managed to run only the pure C implementation. For example, I succeeded with ne10_mulmat_3x3f_c and failed with ne10_mulmat_3x3f_neon (and ne10_mulmat_3x3f_asm). It seems my problem is not just the NEON assembly but GNU assembly in general.

I did compile the library with gcc-linaro successfully and tried linking it with the VxWorks kernel (I have to compile the kernel with the Wind River toolchain for ARMv8-A, which is based on LLVM 3.8), but when I call functions from the library I get undefined references (I assume it is an ABI incompatibility, because I can see that the linker links the library).
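
The `_c`, `_neon` and `_asm` suffixes above name per-back-end variants of one operation. A common way to expose such variants behind a single name is a function pointer that an init routine fills in. The sketch below shows that pattern with hypothetical names (`add_float`, `add_float_c`, `demo_init`); whether Ne10's `ne10_init` wires its unsuffixed names exactly this way is an assumption here, so check the library headers before relying on it.

```c
#include <stddef.h>

/* Hypothetical signature shared by all variants of one operation. */
typedef int (*add_fn)(float *dst, const float *a, const float *b, size_t n);

/* Plain C variant; a _neon variant would have the same signature. */
static int add_float_c(float *dst, const float *a, const float *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
    return 0;
}

/* Public entry point: a pointer that stays NULL until init has run. */
add_fn add_float = NULL;

/* Hypothetical init routine: picks the variant for this build. */
int demo_init(void)
{
    add_float = add_float_c;
    return 0;
}
```

With this layout the unsuffixed name is a data symbol defined next to the init routine, so a build that leaves out the init sources would leave both unresolved at link time even though the per-variant functions are present in the archive.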

@lieff
Copy link

lieff commented Aug 7, 2018

Hmm, it could be a float ABI difference. I remember the same kind of link problem when code was compiled with a different -mfloat-abi. But in that case the linker correctly pointed to the ABI difference rather than just reporting a symbol not found. Maybe your linker just ignores the symbol silently.

@lieff

lieff commented Aug 7, 2018

As far as I remember, ABI information is stored in the .ARM.attributes section. You can compare this section in the output of the two compilers; they should be the same.

@talhaber
Author

talhaber commented Aug 9, 2018

OK, a little progress:

I managed to build the library with LLVM as well.
I did it by adding the following to the C compiler flags:
--target=aarch64-arm-none-eabi -DCPU=_VX_ARMARCH8A -I {VSB_PATH}\krnl\h\public -MM _MG.

I am trying to link the library into the VIP project. When I add calls to functions from the .a library I get an undefined reference error.
I generated a linker address map file and I see that the library is loaded, but I don't see any symbols from the library.

It is the same phenomenon I saw when I built the library with gcc-linaro (but now I build the kernel and the NE10 library with the same toolchain)!
Maybe it isn't an ABI problem after all?
The test I am trying to run is NE10_sample_matrix_multiply.c.
I added the inc, common and test/include paths to my Eclipse include paths.
The unresolved symbols are ne10_init and ne10_mulmat.

@talhaber
Author

SOLVED.

I used the cmake GUI to set all the tools of the LLVM toolchain (I needed the linker).
Additionally, I added some flags for the target and include paths to some OS source folders, and the objects now build.
It then fails when linking the objects, so I edited link.txt to run the archiver and ranlib.

It built dsp and imageproc, but I need math as well.
I think that is a matter for another issue, so I will close this one.
