Building on Raspberry Pi with clang 7 does not detect NEON or NEON7 correctly #221

Closed
ast opened this issue Dec 25, 2018 · 12 comments

Comments

@ast
Contributor

ast commented Dec 25, 2018

Why is neon not in the available machines? The tests pass.
And why are avx512f;avx512cd there?

```
-- Performing Test neon_compile_result - Success
-- Performing Test have_neonv7_result
-- Performing Test have_neonv7_result - Success
-- Performing Test have_neonv8_result
-- Performing Test have_neonv8_result - Failed
-- CPU is armv7, Overruled arch neonv8
-- ORC support not found, Overruled arch orc
-- Available architectures: generic;abm;popcount;norc;avx512f;avx512cd
-- Available machines: generic
```
ast changed the title from "Building on Raspberry Pi with clang 7 disables neon support" to "Building on Raspberry Pi with clang 7 does not detect NEON or NEON7 correctly" on Dec 25, 2018
@ast
Contributor Author

ast commented Dec 25, 2018

Digging deeper: I believe this is because volk's CMake tests for neon but never actually adds it to available_archs!? After adding the lines below (around line 271 of lib/CMakeLists.txt) it builds correctly and works.

I'm using clang 7.0 on a Raspberry Pi 3.

    if (have_neonv7_result)
        OVERRULE_ARCH(neonv8 "CPU is armv7")
        list(APPEND available_archs "neon7") # added this!
        list(APPEND available_archs "neon") # added this!
    endif()

    if (have_neonv8_result)
        OVERRULE_ARCH(neonv7 "CPU is armv8")
        list(APPEND available_archs "neonv8") # added this!
        list(APPEND available_archs "neon") # added this!
    endif()

@ast
Contributor Author

ast commented Dec 25, 2018

volk_16i_max_star_horizontal_16i.s fails to build with this configuration but the rest works.

volk_profile shows significant improvements.

@balister
Contributor

balister commented Dec 26, 2018

Pretty sure neon is no longer an arch. There are two different flavours now. I'll have to poke around some. For the intel stuff, check if volk is doing compiler tests or looking at flag settings.

@ast
Contributor Author

ast commented Dec 26, 2018

Hmm ok.. well it worked as a workaround but maybe it's not optimal? Anyhow cmake fails to enable neon after detecting it.

I'm not so well versed in the build system so I'm not able to suggest a good solution...

Tell me if there's anything I can do to help.

@balister
Contributor

balister commented Dec 26, 2018

Ah yes, neon selects intrinsics. Can you build with gcc and compare results? I think volk enables all available archs and then disables them when tests fail.
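
The "enable everything, then disable on failure" behaviour suggested here can be modelled in a few lines. This is only a toy sketch in Python, not volk's actual CMake logic: the function `available_archs` and the `check_results` values are hypothetical, and only the arch names are taken from the logs in this thread. It shows how, under that model, an arch that no check ever overrules simply stays in the list.

```python
# Toy model of "enable every arch, then overrule on failure".
# The arch names come from the logs in this thread; the check results and
# the function itself are illustrative assumptions, not volk's real code.

def available_archs(all_archs, check_results):
    """Keep every arch unless a check for it ran and failed.

    all_archs: ordered list of arch names that start out enabled.
    check_results: dict mapping arch name -> bool for archs that have a check.
    """
    kept = []
    for arch in all_archs:
        # An arch with no entry in check_results is never overruled.
        if check_results.get(arch, True):
            kept.append(arch)
    return kept


all_archs = ["generic", "neon", "neonv7", "neonv8", "avx512f", "avx512cd"]
# Hypothetical results: the NEON checks ran, the AVX-512 ones never did.
check_results = {"neon": True, "neonv7": True, "neonv8": False}

print(available_archs(all_archs, check_results))
```

Under this model an x86 arch such as avx512f would survive on an ARM host simply because nothing disables it. That would be consistent with the clang output reported in this issue, though the thread does not confirm that this is the cause.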

@ast
Contributor Author

ast commented Dec 26, 2018

Tried four different compilers: Clang 3.8, Clang 7.0.0, GCC 8 and GCC 6 (the default with Raspbian).

Both GCC versions add the correct machines, as you suspected. However, two assembly files fail to compile.

CC=gcc CXX=g++ cmake -DCMAKE_C_FLAGS='-mfpu=neon -mfloat-abi=hard' ..

-- Available architectures: generic;hardfp;neon;neonv7;norc
-- Available machines: generic;neon;neonv7_hardfp

Both clang 3.8 (included with Raspbian) and clang 7:

CC=clang CXX=clang++ cmake -DCMAKE_C_FLAGS='-mfpu=neon -mfloat-abi=hard' ..

-- Available architectures: generic;abm;popcount;norc;avx512f;avx512cd
-- Available machines: generic

If I remove these files and comment out the references to them, everything compiles fine with gcc 6 and 8.

volk_32f_x2_dot_prod_32f_a_neonasm_opts.s_nop
volk_32f_x2_dot_prod_32f_a_neonasm.s_nop

/home/pi/src/volk/kernels/volk/asm/neon/volk_32f_x2_dot_prod_32f_a_neonasm_opts.s: Assembler messages:
/home/pi/src/volk/kernels/volk/asm/neon/volk_32f_x2_dot_prod_32f_a_neonasm_opts.s:46: Error: selected processor does not support `sbfx r11,r1,#2,#1' in ARM mode
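
For reference, `sbfx` (signed bit-field extract) was introduced with ARMv6T2, so this assembler error usually indicates that the assembler is targeting an older architecture level than the file expects. The sketch below only illustrates what the rejected instruction computes. The Python helper `sbfx` is a hypothetical stand-in and not part of volk.

```python
def sbfx(value, lsb, width):
    """Emulate ARM SBFX: take `width` bits of a 32-bit `value` starting at
    bit `lsb` and sign-extend the extracted field."""
    value &= 0xFFFFFFFF  # treat the input as a 32-bit register
    field = (value >> lsb) & ((1 << width) - 1)
    sign_bit = 1 << (width - 1)
    # XOR/subtract trick: a set top bit in the field yields a negative result.
    return (field ^ sign_bit) - sign_bit


# `sbfx r11, r1, #2, #1` reads bit 2 of r1 as a 1-bit signed field,
# so r11 becomes 0 when that bit is clear and -1 when it is set.
print(sbfx(0b0000, 2, 1))
print(sbfx(0b0100, 2, 1))
```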

@ast
Contributor Author

ast commented Dec 26, 2018

Wouldn't it be easier to migrate neon to intrinsics only?

@ast
Contributor Author

ast commented Dec 26, 2018

Clang is missing here:

```
  <flag compiler="gnu">-funsafe-math-optimizations</flag>
  <alignment>16</alignment>
  <check name="has_neon"></check>
</arch>

<arch name="neonv7">
  <flag compiler="gnu">-mfpu=neon</flag>
  <flag compiler="gnu">-funsafe-math-optimizations</flag>
  <alignment>16</alignment>
  <check name="has_neonv7"></check>
</arch>
```

And as for avx512 I believe it's missing in the overrule list.
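
To make the missing clang entry concrete, here is a minimal Python sketch that reads the `neonv7` definition quoted above and lists the flags tagged for a given compiler. It assumes flags are selected by an exact match on the `compiler` attribute. The helper `flags_for` is hypothetical, and how volk's own generator treats an arch with no matching flags is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# The neonv7 arch definition as quoted in this issue.
ARCH_XML = """
<arch name="neonv7">
  <flag compiler="gnu">-mfpu=neon</flag>
  <flag compiler="gnu">-funsafe-math-optimizations</flag>
  <alignment>16</alignment>
  <check name="has_neonv7"></check>
</arch>
"""


def flags_for(arch_xml, compiler):
    """Return the flag strings whose compiler attribute equals `compiler`."""
    arch = ET.fromstring(arch_xml.strip())
    return [flag.text for flag in arch.findall("flag")
            if flag.get("compiler") == compiler]


print(flags_for(ARCH_XML, "gnu"))    # both flags are tagged for gnu
print(flags_for(ARCH_XML, "clang"))  # nothing is tagged for clang
```

With exact matching, a clang build finds no `-mfpu=neon` flag for this arch, which fits the behaviour observed with clang 3.8 and 7.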

@ast
Contributor Author

ast commented Dec 26, 2018

Indeed clang is missing in the definitions here:

<flag compiler="gnu">-funsafe-math-optimizations</flag>

After I've added it, it works as expected (just copied the gnu arguments).

volk_32f_x2_dot_prod_32f_a_neonasm_opts.s_nop
volk_32f_x2_dot_prod_32f_a_neonasm.s_nop

These now build fine with clang but not with gcc (it's related to thumb2/thumb somehow). I also found another bug (and fixed it, I hope) in:

volk_16i_max_star_horizontal_16i_a_neonasm.s

#222

@balister
Contributor

@ast
Contributor Author

ast commented Dec 27, 2018

But it seems clang has to be in the XML or neon won't be included at all.. Or are you referring to thumb?

@ast
Contributor Author

ast commented Aug 11, 2019

Resolved by
#271

ast closed this as completed on Aug 11, 2019