Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace in-house rand with absl::Random for backoff with HWAES disabled #26479

Closed
wants to merge 3 commits into from

Conversation

veblush
Copy link
Contributor

@veblush veblush commented Jun 14, 2021

Replace randen_hwaes.cc of abseil with randen_hwaes_nohw.cc because randen_hwaes.cc uses HW-specific instructions which require particular compiler options which are not easy to add for certain build environments. This is a workaround and I'll work with Abseil team to have more maintainable way to address this.

Related to #26478
Fixes #26476 and fix-forward #26463

@veblush veblush added kind/bug lang/core release notes: no Indicates if PR should not be in release notes labels Jun 14, 2021
@veblush
Copy link
Contributor Author

veblush commented Jun 14, 2021

grpc_build_artifacts_multiplatform: failed for mac but it looks irrelavant?
grpc_build_artifacts_multiplatform: running

@veblush
Copy link
Contributor Author

veblush commented Jun 14, 2021

@veblush veblush marked this pull request as ready for review June 15, 2021 00:13
@veblush veblush changed the title Disable absl::random HWAES when used for makefile and wrapped languages. Replace in-house rand with absl::Random for backoff with HWAES disabled Jun 15, 2021
return [ABSEIL_PATH + "/" + f for f in files if f.endswith(".cc")]
srcs = [ABSEIL_PATH + "/" + f for f in files if f.endswith(".cc")]
# Replace absl/random/internal/randen_hwaes.cc with
# src/abseil-cpp/randen_hwaes_nohw.c
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you write randen_hwaes_nohw.c by hand or how did you obtain it?
Since what you're doing seems quite dirty, I think there needs to be a more detailed description of what you've done in a comment. Right now I kind of know what was the point but I wouldn't be able to do it myself since important information seems to be missing (so maintaining this hack in the future will be difficult).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I wrote this randen_hwaes_nohw.cc which is meant to forward all operations to the non-hw one. My plan is to use this one-off hack for a few month (proving this is a working solution) and work with the Abseil team to have an option to disable HW acceleration completely. I guess we'll be able to replace it with the formal one from the next LTS version of Abseil.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your plan sounds fine but a detailed description of the workaround and the plan needs to be provided in a comment.
Also please make sure the src/abseil-cpp/randen_hwaes_nohw.cc has an introductory comment that mentions the context or links to a place where context can be found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I'll do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! PTAL

namespace random_internal {

void RandenHwAes::Generate(const void* keys, void* state_void) {
return RandenSlow::Generate(keys, state_void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If absl already provides RandenSlow():

  • why can't we use it?
  • isn't there already a complilation flag for absl that allows you to fallback to using RandenSlow automatically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RandenSlow is an implementation detail then not exposed so it's not possible for us to use it directly. Abseil automatically choose either fast or slow one by detecting whether the fast one is feasible at compile time or runtime.

There is no flag to control this. That's what I want to suggest to the Abseil team to explicitly disable it.

# to disable HWAES completely. (https://github.com/grpc/grpc/issues/26478)
try:
i = srcs.index(ABSEIL_PATH + "/absl/random/internal/randen_hwaes.cc")
srcs[i] = "src/abseil-cpp/randen_hwaes_nohw.cc"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing to investigate here is to:

  • see whether the necessary HW instructions (I think that's NEON for ARM64) are already available in the ARM64 architectures we support (we support aarch64 / ARM64 v8) - if they are, it should be fine to just build on ARM64 with neon instructions enabled and everything is going to work, without needing to resort to a hack

  • alternatively, we could use InsecureBitGen from ABSL, which IMHO wouldn't require HW instructions to be present so even if we enable NEON at compile time, the instructions wouldn't be used at runtime anyway (which avoids the problems with some processors potentially not supporting them).

I think it make sense to thing about the two paths above before resorting to using a hack.

Copy link
Contributor Author

@veblush veblush Jun 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that NEON is supported by the all ARM architectures that gRPC supports. Challenge here is that gRPC needs to provide proper compilation option depending on the architecture. Otherwise it will throw a compilation error. So we need to update all build environments (makefile, php, python, ...) to have that thing. Given the fact that we don't need to have super-fast BitGen, it looks too much burden to maintain.

That's why I changed it to use InsecureBitGen instead. It's already super fast even without HW acceleration. But we still have this PR because it's about compilation. Although BitGen is not used by gRPC, its code needs to be compiled.

stanhu added a commit to stanhu/grpc that referenced this pull request Aug 25, 2021
Older CPUs that do not have SSE4.1 would crash with the Ruby native gem
due to an illegal instruction exception.

The Abseil random library isn't being used at the moment
(grpc#26476), and there's no reason gRPC
needs to force SSE4.1 instructions on all platforms at the moment. There
are other hardware-specific issues that need to be ironed out for this
to work: grpc#26479

When the `-msse4` compiler flag was enabled, the Abseil code started
using the `pinsrb` instruction:

```
$ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o
MODE64 (ret)
CMOV (cmovne)
SSE2 (movdqa)
SSE41 (pinsrb)
SSE1 (movaps)
CPU Generation: Penryn
```

Closes grpc#27095
stanhu added a commit to stanhu/grpc that referenced this pull request Aug 25, 2021
Older CPUs that do not have SSE4.1 would crash with the Ruby native gem
due to an illegal instruction exception.

The Abseil random library isn't being used at the moment
(grpc#26476), and there's no reason gRPC
needs to force SSE4.1 instructions on all platforms at the moment. There
are other hardware-specific issues that need to be ironed out for this
to work: grpc#26479

When the `-msse4` compiler flag was enabled, the Abseil code started
using the `pinsrb` instruction:

```
$ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o
MODE64 (ret)
CMOV (cmovne)
SSE2 (movdqa)
SSE41 (pinsrb)
SSE1 (movaps)
CPU Generation: Penryn
```

Closes grpc#27095
* Replace in-house rand with absl::Random for backoff

* Run sanity

* Added bscrypt dependency
Use insecure-rand for backoff
@jtattermusch
Copy link
Contributor

let me know once this is ready for review.

@veblush
Copy link
Contributor Author

veblush commented Sep 8, 2021

@jtattermusch Abseil recently updated so that it can be used in gRPC without this problem. (#27193) Rather having this workaround, let's wait until they release the next LTS release.

@veblush veblush closed this Sep 8, 2021
@jtattermusch
Copy link
Contributor

@jtattermusch Abseil recently updated so that it can be used in gRPC without this problem. (#27193) Rather having this workaround, let's wait until they release the next LTS release.

Good call. I wasn't fan of the workarounds in this PR, and not needing a workaround sounds much better. Thanks for investigating this!

stanhu added a commit to stanhu/grpc that referenced this pull request Sep 10, 2021
Older CPUs that do not have SSE4.1 would crash with the Ruby native gem
due to an illegal instruction exception.

The Abseil random library isn't being used at the moment
(grpc#26476), and there's no reason gRPC
needs to force SSE4.1 instructions on all platforms at the moment. There
are other hardware-specific issues that need to be ironed out for this
to work: grpc#26479

When the `-msse4` compiler flag was enabled, the Abseil code started
using the `pinsrb` instruction:

```
$ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o
MODE64 (ret)
CMOV (cmovne)
SSE2 (movdqa)
SSE41 (pinsrb)
SSE1 (movaps)
CPU Generation: Penryn
```

This was previously needed because gcc 4.8 wouldn't compile without the
`-msse4` and `-maes` flags.

However, per
gcc-mirror/gcc@97db2bf
gcc 5.0+ automatically detects whether these options are enabled.

clang still needs `-maes` since including `wmmintrin.h` expects the AES
option to be enabled.

Closes grpc#27095
jtattermusch pushed a commit that referenced this pull request Oct 5, 2021
* Drop ABSL_RANDOM_HWAES_FLAGS compiler flags

Older CPUs that do not have SSE4.1 would crash with the Ruby native gem
due to an illegal instruction exception.

The Abseil random library isn't being used at the moment
(#26476), and there's no reason gRPC
needs to force SSE4.1 instructions on all platforms at the moment. There
are other hardware-specific issues that need to be ironed out for this
to work: #26479

When the `-msse4` compiler flag was enabled, the Abseil code started
using the `pinsrb` instruction:

```
$ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o
MODE64 (ret)
CMOV (cmovne)
SSE2 (movdqa)
SSE41 (pinsrb)
SSE1 (movaps)
CPU Generation: Penryn
```

Closes #27095

* Revert "Drop ABSL_RANDOM_HWAES_FLAGS compiler flags"

This reverts commit 3b7cc74.

* Drop -msse4 compiler flag

Older CPUs that do not have SSE4.1 would crash with the Ruby native gem
due to an illegal instruction exception.

The Abseil random library isn't being used at the moment
(#26476), and there's no reason gRPC
needs to force SSE4.1 instructions on all platforms at the moment. There
are other hardware-specific issues that need to be ironed out for this
to work: #26479

When the `-msse4` compiler flag was enabled, the Abseil code started
using the `pinsrb` instruction:

```
$ elfx86exts abseil-cpp/absl/time/internal/cctz/src/time_zone_libc.o
MODE64 (ret)
CMOV (cmovne)
SSE2 (movdqa)
SSE41 (pinsrb)
SSE1 (movaps)
CPU Generation: Penryn
```

This was previously needed because gcc 4.8 wouldn't compile without the
`-msse4` and `-maes` flags.

However, per
gcc-mirror/gcc@97db2bf
gcc 5.0+ automatically detects whether these options are enabled.

clang still needs `-maes` since including `wmmintrin.h` expects the AES
option to be enabled.

Closes #27095
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug lang/core release notes: no Indicates if PR should not be in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants