undefined behavior warning on arm with -mfpu=neon #88
Comments
Hmm, looks like it tries to vectorize the loop and runs into an aliasing issue. Can you try the -fno-strict-aliasing option?
No effect. Lowering optimization to -O1 does eliminate the warning, so I'm now trying a binary search of the optimization settings that differ between -O1 and -O2; perhaps the specific flag that triggers it will give a clue.
and of course
Hmmm.
Nothing to do with aliasing. Looks like a gcc bug after all.
Yup. Further analysis (all tests done with
The "offending" call is this one in L3_decode:
The warning can also be suppressed by any of the following changes to
float vl0 = left [i], vl1 = left [i+1], vl2 = left [i+2], vl3 = left [i+3];
float vr0 = right[i], vr1 = right[i+1], vr2 = right[i+2], vr3 = right[i+3];
left [i] = vl0 + vr0; left [i+1] = vl1 + vr1; left [i+2] = vl2 + vr2; left [i+3] = vl3 + vr3;
right[i] = vl0 - vr0; right[i+1] = vl1 - vr1; right[i+2] = vl2 - vr2; right[i+3] = vl3 - vr3;
right[i] = left[i]; right[i+1] = left[i+1]; right[i+2] = left[i+2]; right[i+3] = left[i+3];
However, none of the following changes to
In retrospect there was no need to try all these variations of the SIMD loop body, but the inconsistent behaviour from gcc puzzled me greatly.
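For readers without the original code to hand, the scalar variation quoted above (the `vl0`..`vr3` lines) can be dropped into a complete function roughly as follows. This is only a sketch: the function name `midside_scalar_sketch`, its signature, and the tail loop are assumptions made for illustration, not code from minimp3.

```c
#include <stddef.h>

/* Hypothetical helper for illustration; not minimp3's actual function.
   Rewrites the two buffers in place as (left + right, left - right). */
void midside_scalar_sketch(float *left, float *right, size_t n)
{
    size_t i = 0;

    /* Four samples per iteration, mirroring the four-lane SIMD loop body. */
    for (; i + 3 < n; i += 4) {
        float vl0 = left [i], vl1 = left [i+1], vl2 = left [i+2], vl3 = left [i+3];
        float vr0 = right[i], vr1 = right[i+1], vr2 = right[i+2], vr3 = right[i+3];
        left [i] = vl0 + vr0; left [i+1] = vl1 + vr1; left [i+2] = vl2 + vr2; left [i+3] = vl3 + vr3;
        right[i] = vl0 - vr0; right[i+1] = vl1 - vr1; right[i+2] = vl2 - vr2; right[i+3] = vl3 - vr3;
    }

    /* Scalar tail for lengths that are not a multiple of four (an addition
       for this sketch; the quoted lines only cover the unrolled body). */
    for (; i < n; i++) {
        float vl = left[i], vr = right[i];
        left[i]  = vl + vr;
        right[i] = vl - vr;
    }
}
```

Loading all eight values before storing any of them keeps the result independent of the order of the stores, which is why the sum and difference can safely be written back into the same buffers.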
Workaround for issue lieff#88
Issue is known and being worked on: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100801
Thanks for such detailed research :)
That works too, of course, but my suggested patch is also pretty simple and has the benefit that it doesn't affect optimization, avoiding any need to check the performance impact. This seemed desirable considering this isn't a case of misoptimization; it's merely a spurious warning.
I've checked code sizes for x64 and looks like your workaround is better:
It's strange that noinline increases code size, but OK, picking the best one :)
Agreed. I'm also surprised my change makes a noticeable difference; that doesn't really make sense either. Compilers are weird.
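For context, a noinline-style workaround of the kind compared above generally looks something like the sketch below. The exact change that was tried in the repository is not shown in this thread, so every name here (`SKETCH_NOINLINE`, `sketch_add_sub`, `sketch_decode_step`) is hypothetical; only the `__attribute__((noinline))` function attribute itself is a real GCC feature.

```c
/* GCC and Clang accept this attribute; other compilers get an empty macro. */
#if defined(__GNUC__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

#define SKETCH_MAX_SAMPLES 576 /* iteration bound mentioned in the report */

/* Hypothetical helper: marked noinline so the compiler emits it as a
   separate function instead of merging its loop into the caller. */
static SKETCH_NOINLINE void sketch_add_sub(float *left, float *right, int n)
{
    for (int i = 0; i < n; i++) {
        float l = left[i], r = right[i];
        left[i]  = l + r;
        right[i] = l - r;
    }
}

/* Hypothetical caller standing in for the decoder's call site. */
void sketch_decode_step(float *left, float *right, int n)
{
    if (n > SKETCH_MAX_SAMPLES)
        n = SKETCH_MAX_SAMPLES; /* clamp to the assumed buffer size */
    sketch_add_sub(left, right, n);
}
```

Because the helper is no longer folded into its caller, the optimizer analyses its loop on its own, which is presumably why this kind of change can silence a warning that only appears after inlining. It also changes the generated code, which is the trade-off discussed above.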
Resolved, so closing.
When building on armhf with -mfpu=neon using gcc 10.2, I get the following failure:
This seems really puzzling... the loop it's referring to can never do more than 576 iterations. But perhaps it's talking about the loop after it's done structural transformations and there's an actual issue somewhere higher up? Either way, regardless of whether this is a minimp3 bug or a gcc bug, it's making me nervous.