New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use compiler intrinsics for byte swapping #20139
Conversation
#ifdef __x86_64__ | ||
# define LEGACY_REGS "=Q" | ||
#if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || __GNUC_MINOR__ >= 8)) | ||
#define bswap_16(x) __builtin_bswap16(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be available in https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins.html which is older than our minimum gcc version requirement so remove the version check (or at least use it on 4.7)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linux seemed to think 4.8 was the first one that had it. https://github.com/torvalds/linux/blob/master/include/linux/compiler-gcc.h#L261
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you are right. bswap16 is only in 4.8....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently gcc 4.7 is mostly-in-theory the minimum for llvm since it wasn't the default system compiler on many particularly commonly tested distros.
#define bswap_32(x) _byteswap_ulong(x) | ||
#define bswap_64(x) _byteswap_uint64(x) | ||
#elif defined(__INTEL_COMPILER) | ||
#define bswap_16(X) _bswap16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(X)
?
There is really no advantage to messing with inline assembly here, but plenty of disadvantages. The compiler can't optimize this for whatever domain the data happens to be in (integer vs SSE+ registers), some of the code violates the C standard, and it generates warnings on newer gcc platforms (on i686 at least). Just get rid of it, use the compiler intrinsics, and fall back to generic C instructions for unknown compilers.
There is really no advantage to messing with inline assembly here,
but plenty of disadvantages. The compiler can't optimize this for
whatever domain the data happens to be in (integer vs SSE+ registers),
some of the code violates the C standard, and it generates warnings
on newer gcc platforms (on i686 at least). Just get rid of it, use
the compiler intrinsics, and fall back to generic C instructions for
unknown compilers.