New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resample: port resample_neon.h to aarch64 #8
Conversation
port optimized inner_product_single and WORD2INT(x) for fixed and floating point from 32 bit armv7 NEON to aarch64 NEON.
#if defined(__aarch64__) | ||
static inline int32_t saturate_float_to_16bit(float a) { | ||
int32_t ret; | ||
asm ("fmov s0, %w[a]\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting an assembler error with gcc 4.9 here:
Error: operand 2 should be an integer register -- fmov s0,v1
Why not:
asm ("fcvtas s0, %s[a]\n"
"sqxtn h0, s0\n"
"sxtl v0.4s, v0.4h\n"
"fmov %w[ret], s0\n"
: [ret] "=r" (ret)
: [a] "w" (a)
: "v0");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Mon, Aug 8, 2016 at 8:26 PM, Tristan Matthews notifications@github.com
wrote:
In libspeexdsp/resample_neon.h
#8 (comment):+#elif defined(FLOATING_POINT)
+#if defined(aarch64)
+static inline int32_t saturate_float_to_16bit(float a) {
- int32_t ret;
- asm ("fmov s0, %w[a]\n"
I'm getting an assembler error with gcc 4.9 here:
Error: operand 2 should be an integer register -- fmov s0,v1doh! My bad. I've reproduced the error with gcc 4.9. It was supposed to
be:
asm ("fmov s0, *%s[a]\*n"
..
Why not:
asm ("fcvtas s0, %s[a]\n" "sqxtn h0, s0\n" "sxtl v0.4s, v0.4h\n" "fmov %w[ret], s0\n" : [ret] "=r" (ret) : [a] "w" (a) : "v0");
at the time I had trouble with clang... it said it couldnt allocate a
register. perhaps because the float is in s0, and I used s0. Reguardless,
I'm able to build your code with clang 3.8 so should be fine.
Changing it to use v1 saves a mov
asm ("fcvtas s1, %s[a]\n"
"sqxtn h1, s1\n"
"sxtl v1.4s, v1.4h\n"
"fmov %w[ret], s1\n"
: [ret] "=r" (ret)
: [a] "w" (a)
: "v1");
0: 5e21c801 fcvtas s1, s0
4: 5e614821 sqxtn h1, s1
8: 0f10a421 sxtl v1.4s, v1.4h
c: 1e260020 fmov w0, s1
10: d65f03c0 ret
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
https://github.com/xiph/speexdsp/pull/8/files/7c20961d4688ca1a486557d88fefcfe4f6c11412#r73993202,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AOOYDR-ch7wMFbOcSTADN7F1NWm2nvKeks5qd_ODgaJpZM4JSTPL
.
Probably @sasshka might help on that. |
The 64 bit NEON in this pull request, is a direct port of the 32 bit version. I tested it with clang, but if you're still seeing saturate_float_to_16bit build errors (compiler bug), try a simple '=r' for the output, which will generate 2 mov's. Or use tristans suggestion, which failed on older clang versions, but the bug is fixed on current versions of clang. |
@fbarchard are you able to reproduce this issue by any chance? |
@fbarchard I squashed your patch and subsequent fix for GCC into this branch: |
I tested on the odroid and opened #17 to make the experience a little nicer. |
Merged as 8ce055a (finally!), thanks |
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com>
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com> (cherry picked from commit dec17eb)
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com> (cherry picked from commit dec17eb)
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com>
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com>
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com>
For speexdsp, support for NEON on aarch64 was added in 1.2.0[1]. [1]: xiph/speexdsp#8 Signed-off-by: Jeffery To <jeffery.to@gmail.com>
port optimized inner_product_single and WORD2INT(x) for fixed
and floating point from 32 bit armv7 NEON to aarch64 NEON.