Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

envoy: fix build for Linux #82077

Merged
merged 1 commit into from
Jul 30, 2021
Merged

envoy: fix build for Linux #82077

merged 1 commit into from
Jul 30, 2021

Conversation

cho-m
Copy link
Member

@cho-m cho-m commented Jul 28, 2021

  • Have you followed the guidelines for contributing?
  • Have you ensured that your commits follow the commit style guide?
  • Have you checked that there aren't other open pull requests for the same formula update/change?
  • Have you built your formula locally with brew install --build-from-source <formula>, where <formula> is the name of the formula you're submitting?
  • Is your test running fine brew test <formula>, where <formula> is the name of the formula you're submitting?
  • Does your build pass brew audit --strict <formula> (after doing brew install --build-from-source <formula>)? If this is a new formula, does it pass brew audit --new <formula>?

@cho-m cho-m added do not merge in progress Stale bot should stay away linux to homebrew-core Migration of linuxbrew-core to homebrew-core CI-long-timeout [DEPRECATED] Use longer GitHub Actions CI timeout. CI-force-linux [DEPRECATED] Don't pass --skip-unbottled-linux to brew test-bot. CI-no-fail-fast Continue CI tests despite failing GitHub Actions matrix builds. labels Jul 28, 2021
@cho-m cho-m marked this pull request as draft July 28, 2021 20:04
@BrewTestBot BrewTestBot added go Go use is a significant feature of the PR or issue no ARM bottle Formula has no ARM bottle labels Jul 28, 2021
@cho-m cho-m force-pushed the envoy-linux branch 6 times, most recently from ea6776f to cc223e3 Compare July 28, 2021 22:28
@cho-m
Copy link
Member Author

cho-m commented Jul 28, 2021

Failure with GCC 11 (cc223e3 ~15min):

[1,531 / 4,866] Compiling src/google/protobuf/repeated_field.cc; 2s processwrapper-sandbox ... (2 actions running)
INFO: From CcCmakeMakeRule bazel/foreign_cc/nghttp2/include:
rules_foreign_cc: Cleaning temp directories
INFO: From CcCmakeMakeRule bazel/foreign_cc/ares/include:
rules_foreign_cc: Cleaning temp directories
ERROR: /tmp/envoy-20210728-8806-1776cas/.brew_home/_bazel/198d9fb500d90bb37d44573bb4174b7e/external/boringssl/BUILD:162:11: Compiling src/crypto/curve25519/curve25519.c failed: (Exit 1): gcc-11 failed: error executing command 
  (cd /tmp/envoy-20210728-8806-1776cas/.brew_home/_bazel/198d9fb500d90bb37d44573bb4174b7e/sandbox/processwrapper-sandbox/1510/execroot/envoy && \
  exec env - \
    BAZEL_LINKLIBS=-l%:libstdc++.a \
    BAZEL_LINKOPTS=-lm \
    CC=gcc-11 \
    CXX=g++-11 \
    PATH=/home/linuxbrew/.linuxbrew/opt/python@3.9/libexec/bin:/home/linuxbrew/.linuxbrew/bin:/usr/bin:/bin \
    *** \
  /home/linuxbrew/.linuxbrew/bin/gcc-11 -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections -MD -MF bazel-out/k8-opt/bin/external/boringssl/_objs/crypto/curve25519.pic.d '-frandom-seed=bazel-out/k8-opt/bin/external/boringssl/_objs/crypto/curve25519.pic.o' -fPIC -gsplit-dwarf -iquote external/boringssl -iquote bazel-out/k8-opt/bin/external/boringssl -isystem external/boringssl/src/include -isystem bazel-out/k8-opt/bin/external/boringssl/src/include -fPIC -fexceptions -Wa,--noexecstack '-D_XOPEN_SOURCE=700' -Wall -Werror '-Wformat=2' -Wsign-compare -Wmissing-field-initializers -Wwrite-strings -Wshadow -fno-common '-std=c11' -Wmissing-prototypes -Wold-style-definition -Wstrict-prototypes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/boringssl/src/crypto/curve25519/curve25519.c -o bazel-out/k8-opt/bin/external/boringssl/_objs/crypto/curve25519.pic.o)
Execution platform: @local_config_platform//:host

Use --sandbox_debug to see verbose messages from the sandbox
external/boringssl/src/crypto/curve25519/curve25519.c:503:57: error: argument 2 of type 'const uint8_t[32]' {aka 'const unsigned char[32]'} with mismatched bound [-Werror=array-parameter=]
  503 | int x25519_ge_frombytes_vartime(ge_p3 *h, const uint8_t s[32]) {
      |                                           ~~~~~~~~~~~~~~^~~~~
In file included from external/boringssl/src/crypto/curve25519/curve25519.c:33:
external/boringssl/src/crypto/curve25519/internal.h:109:58: note: previously declared as 'const uint8_t *' {aka 'const unsigned char *'}
  109 | int x25519_ge_frombytes_vartime(ge_p3 *h, const uint8_t *s);
      |                                           ~~~~~~~~~~~~~~~^
external/boringssl/src/crypto/curve25519/curve25519.c:823:57: error: argument 2 of type 'const uint8_t *' {aka 'const unsigned char *'} declared as a pointer [-Werror=array-parameter=]
  823 | void x25519_ge_scalarmult_base(ge_p3 *h, const uint8_t *a) {
      |                                          ~~~~~~~~~~~~~~~^
In file included from external/boringssl/src/crypto/curve25519/curve25519.c:33:
external/boringssl/src/crypto/curve25519/internal.h:117:56: note: previously declared as an array 'const uint8_t[32]' {aka 'const unsigned char[32]'}
  117 | void x25519_ge_scalarmult_base(ge_p3 *h, const uint8_t a[32]);
      |                                          ~~~~~~~~~~~~~~^~~~~
cc1: all warnings being treated as errors
Target //source/exe:envoy-static failed to build
INFO: Elapsed time: 629.287s, Critical Path: 29.96s
INFO: 1646 processes: 131 internal, 1514 processwrapper-sandbox, 1 worker.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

@cho-m cho-m force-pushed the envoy-linux branch 4 times, most recently from 4d8c84b to 6b8fcce Compare July 29, 2021 02:20
@cho-m
Copy link
Member Author

cho-m commented Jul 29, 2021

Also adding note that attempt with LLVM 12 failed - 4d8c84b ~18min
Upstream only runs CI build+tests on LLVM/Clang 10. They do officially list GCC support and do check CI build (no test) on GCC 9 (see: https://github.com/envoyproxy/envoy/tree/main/ci#build-image-base-and-compiler-versions).

[1,440 / 4,914] Compiling src/compiler/python_generator.cc [for host]; 4s processwrapper-sandbox ... (2 actions, 1 running)
INFO: From Generating C++ proto_library @com_google_googleapis//google/devtools/cloudtrace/v2:cloudtrace_proto:
google/devtools/cloudtrace/v2/trace.proto:19:1: warning: Import google/api/annotations.proto is unused.
google/devtools/cloudtrace/v2/tracing.proto:26:1: warning: Import google/protobuf/timestamp.proto is unused.
ERROR: /tmp/envoy-20210729-10250-1dlkcus/.brew_home/_bazel/9ea9a869639bb884de78ea3b634500d5/external/com_google_absl/absl/base/BUILD.bazel:66:11: output 'external/com_google_absl/absl/base/_objs/raw_logging_internal/raw_logging.dwo' was not created
ERROR: /tmp/envoy-20210729-10250-1dlkcus/.brew_home/_bazel/9ea9a869639bb884de78ea3b634500d5/external/com_google_absl/absl/base/BUILD.bazel:66:11: Compiling absl/base/internal/raw_logging.cc failed: not all outputs were created or valid
Target //source/exe:envoy-static failed to build
INFO: Elapsed time: 726.592s, Critical Path: 23.36s
INFO: 1515 processes: 247 internal, 1267 processwrapper-sandbox, 1 worker.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

Currently trying gcc@10. Has been running for 4+ hours so either working or stuck.
EDIT: Failed at different location (6b8fcce ~5hr):

external/com_google_absl/absl/container/internal/inlined_vector.h: In constructor 'Envoy::Http::HeaderString::HeaderString()':
external/com_google_absl/absl/container/internal/inlined_vector.h:469:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' is used uninitialized in this function [-Werror=uninitialized]
  469 |     data_ = other_storage.data_;
      |     ^~~~~
cc1plus: all warnings being treated as errors
Target //source/exe:envoy-static failed to build
INFO: Elapsed time: 18251.508s, Critical Path: 6492.21s
INFO: 4617 processes: 139 internal, 4477 processwrapper-sandbox, 1 worker.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

@cho-m cho-m added the CI-force-arm [DEPRECATED] Don't pass --skip-unbottled-arm to brew test-bot. label Jul 29, 2021
@cho-m cho-m force-pushed the envoy-linux branch 3 times, most recently from 64e1486 to bbf8d23 Compare July 29, 2021 09:31
@cho-m cho-m removed the CI-force-arm [DEPRECATED] Don't pass --skip-unbottled-arm to brew test-bot. label Jul 29, 2021
@cho-m
Copy link
Member Author

cho-m commented Jul 29, 2021

Finally built by using gcc@9.
Test failure due to Homebrew issue. Will add as test dependency with comment that Homebrew issue.

==> brew test --verbose envoy
==> FAILED
Error: test failed
==> Testing envoy
Error: envoy: failed
An exception occurred within a child process:
  CompilerSelectionError: envoy cannot be built with any available compilers.
Install Clang or run `brew install gcc`.

@cho-m cho-m marked this pull request as ready for review July 29, 2021 19:26
@cho-m cho-m added ready to merge PR can be merged once CI is green and removed do not merge in progress Stale bot should stay away labels Jul 29, 2021
Copy link
Member

@SMillerDev SMillerDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cho-m !

@Homebrew/linux could you take a look at the brew issue?

@iMichka
Copy link
Member

iMichka commented Jul 30, 2021

I'm fine as-is. @cho-m can you open an issue in Homebrew/brew with a link to this pull request, so that we keep track of it. Not saying that it will be fixed in the next days, but it's something we need to address whenever someone finds time to look into that.

@chenrui333 chenrui333 merged commit 4978848 into Homebrew:master Jul 30, 2021
@chenrui333
Copy link
Member

@BrewTestBot
Copy link
Member

@chenrui333 bottle request for envoy failed.

@cho-m
Copy link
Member Author

cho-m commented Jul 30, 2021

@cho-m can you open an issue in Homebrew/brew with a link to this pull request, so that we keep track of it.

@iMichka Opened Homebrew/brew#11795.
A similar failure happened in dotnet for depends_on "llvm" => :build.
And I think we saw the failure when semgrep was tested as dependent.


Dispatched, https://github.com/Homebrew/homebrew-core/actions/runs/1083018391

@chenrui333 envoy takes 5 hours to build, so it hits the 1 hour timeout.
There is probably another way to dispatch/upload the bottle, but I am not aware of steps.

@iMichka
Copy link
Member

iMichka commented Jul 30, 2021

I dispatched a new build: https://github.com/Homebrew/homebrew-core/actions/runs/1083411661
You can change the timeout from 60 to 360 in the dispatch options.

And I think you can use "Linux" as build agent instead of "ubuntu-latest" to use the self-hosted runner if the build takes more than 360 minutes.

@cho-m cho-m deleted the envoy-linux branch August 23, 2021 04:52
@cho-m cho-m mentioned this pull request Aug 25, 2021
@codefromthecrypt codefromthecrypt mentioned this pull request Oct 9, 2021
6 tasks
# GCC added as a test dependency to work around Homebrew issue. Otherwise `brew test` fails.
# CompilerSelectionError: envoy cannot be built with any available compilers.
depends_on "gcc@9" => [:build, :test]
depends_on "python@3.9" => :build
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cho-m do you think this was due to bazel? If so, it recentlyupdated to python 3.10 and I should update and comment in #86755

Copy link
Member Author

@cho-m cho-m Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python is one of the always-versioned formulae (similar to openssl).

So, how Homebrew handles this is the dependency is set to the latest version of python that existed & worked at the time (e.g. python@3.9).

When a new version of Python comes out, it goes through a tedious migration process, e.g.

This is in-part due to over 400+ formulae that use Python 3.9 and the amount of breakage that occurs due to API changes.


We usually don't add comments all over the place.

It should be pretty obvious that Python is needed to build bazel. The versioning doesn't matter as previously mentioned since 3.9 essentially means "latest" (and to be bumped on future migration whenever anyone has a chance).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The odd part is that this version is only in the linux section. that it is also overridden is a cognitive glitch. Ex you have to think through why this is here. I still don't know why the path is overridden. I'm guessing maybe because there is another python in the system path, and that problem doesn't exist in macOS.

The part about which python is a separate issue and while obvious to you, and I certainly understand now, it is distracting especially when bazel has a different version than this.

I'm not trying to argue, just suggesting not everyone are as experienced as you with nuance and so comments for others can be helpful sometime even if it isn't helpful to you. Ex we took time to document why GCC is there, but not why python is added and also prefixed into the path and only on linux.

I'd still like to document this even if it doesn't help you. If I wrote "In linux, we need to override the path to avoid using system python. The version here should always be latest".. would that be accurate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually never mind, I think I've taken enough time on this. I highly doubt anyone will maintain this stuff except the two of us anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like to document this even if it doesn't help you. If I wrote "In linux, we need to override the path to avoid using system python. The version here should always be latest".. would that be accurate?

I would like that to go into the https://docs.brew.sh/Formula-Cookbook then, maybe in a section with Linux hints. It's much easier to find there than in a formula comment somewhere.

Copy link
Member Author

@cho-m cho-m Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall, the problem in Linux was the lack of a "python" binary along with no guarantee on what Python is installed on someone's particular distro/OS. This is a common situation across multiple packages(formulae) that do something like #!/usr/bin/env python or expect the python command to exist, so it isn't often commented.

In Homebrew's Ubuntu 16.04 container, the system Python 3.5 only provides /usr/bin/python3, /usr/bin/python3.5 and similar.

Homebrew's formula also only links the python3 binary by default, but it provides the python binary inside the libexec path, which is the extra PATH added to environment.

On macOS, system python exists (/usr/bin/python) and points to Python 2.7.


On the other hand, the GCC issue is commented because it isn't behaving as we want it to. On Linux, Homebrew prefers to use the CI's system gcc-5. If it doesn't work, then Homebrew's unversioned gcc is preferred.

The problem here is that the newer gcc versions (v10, v11) were failing, so I had to keep dropping version all the way to gcc@9, which is not ideal. This is something that needs to be revisited and hopefully removed (if it works with older gcc-5 or future Ubuntu 18 system gcc-7) or replaced with unversioned depends_on "gcc"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. I think I understand what's going on better. it is probably water under the bridge now, but in hindsight I would have suggested to install and override the path to python uniformly as doing that is less if statements and also clarifies that python is in no way coupled to the notes about gcc. That python, and a specific version of python was there and not in darwin raised all sorts of questions why, which were all not really about GCC. Again this is hindsight, but if it were uniformly using python, that dep would be near bazel and since there'd be no if statement, the env drift that happened wouldn't have. In other words, it is more coherent cognitively and less fragile. You'd still wonder "why python 3.9?", but not "why wouldn't we do it on darwin? does 3.9 break something on darwin?"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can update anyway, but the whole thing is odd because basilisk has this: #!/usr/bin/env python3

https://github.com/bazelbuild/bazelisk/blob/280b07a854a70f2ab0887d644e364c8e8c1c6187/bazelisk.py#L1

I'll remove this override and see if the build actually fails as it is confusing and also duplicate effort (to maintain Formula["python@3.10"] both as an install dep and also as a path override).

I tried to remove the problem entirely by deleting the linux build, but it seems that's required to be retained, as well hacks around getting linux to remain working #87142

@codefromthecrypt codefromthecrypt mentioned this pull request Oct 13, 2021
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI-force-linux [DEPRECATED] Don't pass --skip-unbottled-linux to brew test-bot. CI-long-timeout [DEPRECATED] Use longer GitHub Actions CI timeout. CI-no-fail-fast Continue CI tests despite failing GitHub Actions matrix builds. go Go use is a significant feature of the PR or issue linux to homebrew-core Migration of linuxbrew-core to homebrew-core no ARM bottle Formula has no ARM bottle ready to merge PR can be merged once CI is green
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants