Aarch32, GCC 4.9 and internal compiler error: in expand_shift_1, at expmed.c:2318 #233

Closed
noloader opened this issue Jul 28, 2016 · 1 comment
Labels
arm Bug gcc-4 GCC compiler version 4

We got an Aarch32 test case together with the help of GCC and SO. Aarch32 is basically the 32-bit instruction set from ARMv8. It's slightly different from the traditional 32-bit instruction set of ARMv7 and below. Also see ARM NEON programming quick reference.

We are experiencing a compiler crash on a Raspberry Pi 3 under GCC 4.9.2:

$ make cpu.o
g++ -DNDEBUG -g2 -O3 -march=armv8-a+crc -mtune=cortex-a53 -mfpu=crypto-neon-fp-armv8 -fPIC -c cpu.cpp
cpu.cpp: In function ‘bool CryptoPP::TryPMULL()’:
cpu.cpp:476:16: internal compiler error: in expand_shift_1, at expmed.c:2318
   result = (r1 != r2);
                ^
Please submit a full bug report...

Here's the relevant code from cpu.cpp. It simply tests for the availability of PMULL and PMULL2:

const poly64_t a1={1}, b1={2};
const poly64x2_t a2={1}, b2={2};
const poly128_t r1 = vmull_p64(a1, b1);
const poly128_t r2 = vmull_high_p64(a2, b2);

result = (r1 != r2);

According to the GCC folks, the comparison is dodgy. Also see Issue 72738 - internal compiler error: in expand_shift_1, at expmed.c:2318.

I kind of knew it was dodgy, but I did not pay it much mind. It was a low priority because I was having trouble finding the intrinsic to convert it to a *_u64 or *_u32 type. The code also worked under ARMv7 and ARMv8/Aarch64. Finally, there were no diagnostics at -Wall -Wextra.

Once the result is in a more conventional type such as uint64x2_t, we can simply compare values after lane extraction. For example, the other tests perform:

result = !!(vgetq_lane_u64(x3,0) | vgetq_lane_u64(x4,1));

noloader commented Jul 29, 2016

The fix was easy enough: stop using comparison operators (== and !=) on poly128_t types. We could not use vreinterpretq_u64_p128 to convert r1 to t1 because Linaro is missing it, so a C-style cast is used instead:

const poly64_t a1={2}, b1={3};
const poly128_t r1 = vmull_p64(a1, b1);
const uint64x2_t& t1 = (uint64x2_t)(r1);  // {6,0}
...
result = !!(vgetq_lane_u64(t1,0) == 0x06 && vgetq_lane_u64(t1,1) == 0x00 ...);
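
For anyone wondering where the expected lanes {6,0} come from, here is a minimal portable sketch of a 64x64 carry-less (polynomial) multiply, which is the operation vmull_p64 performs in hardware. The names clmul64 and lanes_match are hypothetical helpers made up for this illustration; they are not part of Crypto++ or the NEON headers, and this is not the code from the commit.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper (illustration only): carry-less multiply of two
// 64-bit polynomials over GF(2). Returns {low 64 bits, high 64 bits},
// which corresponds to lanes 0 and 1 of the 128-bit product.
inline std::pair<std::uint64_t, std::uint64_t>
clmul64(std::uint64_t a, std::uint64_t b)
{
    std::uint64_t lo = 0, hi = 0;
    for (unsigned i = 0; i < 64; ++i)
    {
        if ((b >> i) & 1)
        {
            // XOR replaces addition, so no carries propagate.
            lo ^= a << i;
            // Bits shifted out of the low word land in the high word.
            // Skip i == 0 because shifting by 64 is undefined behaviour.
            if (i != 0)
                hi ^= a >> (64 - i);
        }
    }
    return {lo, hi};
}

// Hypothetical helper: compare the product lane by lane against known
// constants, instead of comparing two 128-bit values with == or !=.
inline bool lanes_match(const std::pair<std::uint64_t, std::uint64_t>& r,
                        std::uint64_t lane0, std::uint64_t lane1)
{
    return r.first == lane0 && r.second == lane1;
}
```

With a1=2 (the polynomial x) and b1=3 (the polynomial x+1), the product is x^2+x, which is 0x06 in the low lane and 0x00 in the high lane. That matches the constants checked after the cast to uint64x2_t.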

Also see Commit 0db3a4e5d7b65e98.
