New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace in-house rand with absl::Random for backoff with HWAES disabled #26479
Conversation
grpc_build_artifacts_multiplatform: failed for mac but it looks irrelavant? |
linux/aws/grpc_aws_basictests_php: passed |
return [ABSEIL_PATH + "/" + f for f in files if f.endswith(".cc")] | ||
srcs = [ABSEIL_PATH + "/" + f for f in files if f.endswith(".cc")] | ||
# Replace absl/random/internal/randen_hwaes.cc with | ||
# src/abseil-cpp/randen_hwaes_nohw.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you write randen_hwaes_nohw.c
by hand or how did you obtain it?
Since what you're doing seems quite dirty, I think there needs to be a more detailed description of what you've done in a comment. Right now I kind of know what was the point but I wouldn't be able to do it myself since important information seems to be missing (so maintaining this hack in the future will be difficult).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right. I wrote this randen_hwaes_nohw.cc
which is meant to forward all operations to the non-hw one. My plan is to use this one-off hack for a few month (proving this is a working solution) and work with the Abseil team to have an option to disable HW acceleration completely. I guess we'll be able to replace it with the formal one from the next LTS version of Abseil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your plan sounds fine but a detailed description of the workaround and the plan needs to be provided in a comment.
Also please make sure the src/abseil-cpp/randen_hwaes_nohw.cc has an introductory comment that mentions the context or links to a place where context can be found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I'll do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! PTAL
namespace random_internal { | ||
|
||
void RandenHwAes::Generate(const void* keys, void* state_void) { | ||
return RandenSlow::Generate(keys, state_void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If absl already provides RandenSlow():
- why can't we use it?
- isn't there already a complilation flag for absl that allows you to fallback to using RandenSlow automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RandenSlow is an implementation detail then not exposed so it's not possible for us to use it directly. Abseil automatically choose either fast or slow one by detecting whether the fast one is feasible at compile time or runtime.
There is no flag to control this. That's what I want to suggest to the Abseil team to explicitly disable it.
# to disable HWAES completely. (https://github.com/grpc/grpc/issues/26478) | ||
try: | ||
i = srcs.index(ABSEIL_PATH + "/absl/random/internal/randen_hwaes.cc") | ||
srcs[i] = "src/abseil-cpp/randen_hwaes_nohw.cc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing to investigate here is to:
-
see whether the necessary HW instructions (I think that's NEON for ARM64) are already available in the ARM64 architectures we support (we support aarch64 / ARM64 v8) - if they are, it should be fine to just build on ARM64 with neon instructions enabled and everything is going to work, without needing to resort to a hack
-
alternatively, we could use InsecureBitGen from ABSL, which IMHO wouldn't require HW instructions to be present so even if we enable NEON at compile time, the instructions wouldn't be used at runtime anyway (which avoids the problems with some processors potentially not supporting them).
I think it make sense to thing about the two paths above before resorting to using a hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that NEON is supported by the all ARM architectures that gRPC supports. Challenge here is that gRPC needs to provide proper compilation option depending on the architecture. Otherwise it will throw a compilation error. So we need to update all build environments (makefile, php, python, ...) to have that thing. Given the fact that we don't need to have super-fast BitGen, it looks too much burden to maintain.
That's why I changed it to use InsecureBitGen instead. It's already super fast even without HW acceleration. But we still have this PR because it's about compilation. Although BitGen is not used by gRPC, its code needs to be compiled.
Older CPUs that do not have SSE4.1 would crash with the Ruby native gem due to an illegal instruction exception. The Abseil random library isn't being used at the moment (grpc#26476), and there's no reason gRPC needs to force SSE4.1 instructions on all platforms at the moment. There are other hardware-specific issues that need to be ironed out for this to work: grpc#26479 When the `-msse4` compiler flag was enabled, the Abseil code started using the `pinsrb` instruction: ``` $ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o MODE64 (ret) CMOV (cmovne) SSE2 (movdqa) SSE41 (pinsrb) SSE1 (movaps) CPU Generation: Penryn ``` Closes grpc#27095
Older CPUs that do not have SSE4.1 would crash with the Ruby native gem due to an illegal instruction exception. The Abseil random library isn't being used at the moment (grpc#26476), and there's no reason gRPC needs to force SSE4.1 instructions on all platforms at the moment. There are other hardware-specific issues that need to be ironed out for this to work: grpc#26479 When the `-msse4` compiler flag was enabled, the Abseil code started using the `pinsrb` instruction: ``` $ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o MODE64 (ret) CMOV (cmovne) SSE2 (movdqa) SSE41 (pinsrb) SSE1 (movaps) CPU Generation: Penryn ``` Closes grpc#27095
* Replace in-house rand with absl::Random for backoff * Run sanity * Added bscrypt dependency
Use insecure-rand for backoff
let me know once this is ready for review. |
@jtattermusch Abseil recently updated so that it can be used in gRPC without this problem. (#27193) Rather having this workaround, let's wait until they release the next LTS release. |
Good call. I wasn't fan of the workarounds in this PR, and not needing a workaround sounds much better. Thanks for investigating this! |
Older CPUs that do not have SSE4.1 would crash with the Ruby native gem due to an illegal instruction exception. The Abseil random library isn't being used at the moment (grpc#26476), and there's no reason gRPC needs to force SSE4.1 instructions on all platforms at the moment. There are other hardware-specific issues that need to be ironed out for this to work: grpc#26479 When the `-msse4` compiler flag was enabled, the Abseil code started using the `pinsrb` instruction: ``` $ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o MODE64 (ret) CMOV (cmovne) SSE2 (movdqa) SSE41 (pinsrb) SSE1 (movaps) CPU Generation: Penryn ``` This was previously needed because gcc 4.8 wouldn't compile without the `-msse4` and `-maes` flags. However, per gcc-mirror/gcc@97db2bf gcc 5.0+ automatically detects whether these options are enabled. clang still needs `-maes` since including `wmmintrin.h` expects the AES option to be enabled. Closes grpc#27095
* Drop ABSL_RANDOM_HWAES_FLAGS compiler flags Older CPUs that do not have SSE4.1 would crash with the Ruby native gem due to an illegal instruction exception. The Abseil random library isn't being used at the moment (#26476), and there's no reason gRPC needs to force SSE4.1 instructions on all platforms at the moment. There are other hardware-specific issues that need to be ironed out for this to work: #26479 When the `-msse4` compiler flag was enabled, the Abseil code started using the `pinsrb` instruction: ``` $ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o MODE64 (ret) CMOV (cmovne) SSE2 (movdqa) SSE41 (pinsrb) SSE1 (movaps) CPU Generation: Penryn ``` Closes #27095 * Revert "Drop ABSL_RANDOM_HWAES_FLAGS compiler flags" This reverts commit 3b7cc74. * Drop -msse4 compiler flag Older CPUs that do not have SSE4.1 would crash with the Ruby native gem due to an illegal instruction exception. The Abseil random library isn't being used at the moment (#26476), and there's no reason gRPC needs to force SSE4.1 instructions on all platforms at the moment. There are other hardware-specific issues that need to be ironed out for this to work: #26479 When the `-msse4` compiler flag was enabled, the Abseil code started using the `pinsrb` instruction: ``` $ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o MODE64 (ret) CMOV (cmovne) SSE2 (movdqa) SSE41 (pinsrb) SSE1 (movaps) CPU Generation: Penryn ``` This was previously needed because gcc 4.8 wouldn't compile without the `-msse4` and `-maes` flags. However, per gcc-mirror/gcc@97db2bf gcc 5.0+ automatically detects whether these options are enabled. clang still needs `-maes` since including `wmmintrin.h` expects the AES option to be enabled. Closes #27095
Replace
randen_hwaes.cc
of abseil withranden_hwaes_nohw.cc
becauseranden_hwaes.cc
uses HW-specific instructions which require particular compiler options which are not easy to add for certain build environments. This is a workaround and I'll work with Abseil team to have more maintainable way to address this.Related to #26478
Fixes #26476 and fix-forward #26463