Envoy 1.20 #86755

Closed


codefromthecrypt
Contributor

@codefromthecrypt codefromthecrypt commented Oct 7, 2021

  • Have you followed the guidelines for contributing?
  • Have you ensured that your commits follow the commit style guide?
  • Have you checked that there aren't other open pull requests for the same formula update/change?
  • Have you built your formula locally with brew install --build-from-source <formula>, where <formula> is the name of the formula you're submitting?
  • Is your test running fine brew test <formula>, where <formula> is the name of the formula you're submitting?
  • Does your build pass brew audit --strict <formula> (after doing brew install --build-from-source <formula>)? If this is a new formula, does it pass brew audit --new <formula>?

@BrewTestBot BrewTestBot added the automerge-skip `brew pr-automerge` will skip this pull request label Oct 7, 2021
@chenrui333
Member

Duplicate of #86741

@chenrui333 chenrui333 marked this as a duplicate of #86741 Oct 7, 2021
@codefromthecrypt
Contributor Author

@chenrui333 you might notice from the commits here that #86741 is missing some steps, including updating the alias and bumping the minor versions. That's why I don't use bump automation yet.

@codefromthecrypt
Contributor Author

It will take a while for me to mark this ready for review as I need to test locally and envoy takes ages to build. Here's what I am doing:

$ cd /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core
$ gh pr checkout 86755
$ brew install --build-from-source envoy@1.19
$ envoy --version
$ brew install --build-from-source envoy
$ envoy --version

@codefromthecrypt
Contributor Author

envoy@1.19 is ok locally

$ cd /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core
$ gh pr checkout 86755
$ brew install --build-from-source envoy@1.19
$ brew test envoy@1.19
$ brew audit --strict envoy@1.19
$ brew audit --new envoy@1.19
$ /usr/local/Cellar/envoy\@1.19/1.19.1/bin/envoy --version

@codefromthecrypt
Contributor Author

codefromthecrypt commented Oct 7, 2021

Envoy still doesn't build on M1 due to an unmerged Bazel-related change, so it will continue to fail here. I would also warn that even if it is merged, Envoy typically doesn't backport build-related changes to release branches, so it might not take effect until 1.21 or later: envoyproxy/envoy#17438

@codefromthecrypt
Contributor Author

codefromthecrypt commented Oct 7, 2021

I ran the following and it all worked, except for a warning about patches in a new formula. I checked and found that the patch did land in 1.20, so I removed it and am running the commands below again. If they pass, I will mark this ready for review.

$ brew install --build-from-source envoy
$ brew test envoy
$ brew audit --strict envoy
$ brew audit --new envoy
$ envoy --version

@codefromthecrypt
Contributor Author

OK, both work locally!

@codefromthecrypt codefromthecrypt marked this pull request as ready for review October 7, 2021 05:29
@codefromthecrypt
Contributor Author

another one for you @carlocab 😎

@carlocab carlocab added CI-long-timeout [DEPRECATED] Use longer GitHub Actions CI timeout. CI-no-fail-fast Continue CI tests despite failing GitHub Actions matrix builds. labels Oct 7, 2021
@SMillerDev
Member

Why is ARM no longer supported in this version?

@codefromthecrypt
Contributor Author

Sorry @SMillerDev, I was unclear in my comment. I should have emphasized that arm64 didn't build before either. My note was just re-checking its status: arm64 support is near, but it isn't here yet.

@codefromthecrypt
Contributor Author

PS: I've noticed this on a few builds over the last few months. Besides bloat in the HTML caused by so many logos (17k), I can't see anything particularly amiss with Envoy's homepage (served by Netlify). Is there a way to allow retries in the linter to avoid this?

image

@codefromthecrypt
Contributor Author

@cho-m looks like the linux build broke again? Since you might be familiar, any clues from #82077?

 Error: An exception occurred within a child process:
  CompilerSelectionError: envoy@1.19 cannot be built with any available compilers.
Install Clang or run `brew install gcc`.

I suspect this would need to be done on both 1.19 and current (1.20)...

@codefromthecrypt
Contributor Author

Never mind @cho-m, I found the part I missed:

  on_linux do
    # GCC added as a test dependency to work around Homebrew issue. Otherwise `brew test` fails.
    # CompilerSelectionError: envoy cannot be built with any available compilers.
    depends_on "gcc@9" => [:build, :test]
    depends_on "python@3.9" => :build
  end

@codefromthecrypt
Contributor Author

The current failure looks like either tar isn't found when extracting the Go SDK, or the file it is looking for isn't found. It seems unlikely that tar is missing.

INFO: Repository go_sdk instantiated at:
  /tmp/envoyA1.19-20211009-12166-1s4jbbp/WORKSPACE:21:25: in <toplevel>
  /tmp/envoyA1.19-20211009-12166-1s4jbbp/bazel/dependency_imports.bzl:19:27: in envoy_dependency_imports
  /tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl:453:28: in go_register_toolchains
  /tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl:129:21: in go_download_sdk
Repository rule _go_download_sdk defined at:
  /tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl:116:35: in <toplevel>
ERROR: An error occurred during the fetch of repository 'go_sdk':
   Traceback (most recent call last):
	File "/tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl", line 100, column 16, in _go_download_sdk_impl
		_remote_sdk(ctx, [url.format(filename) for url in ctx.attr.urls], ctx.attr.strip_prefix, sha256)
	File "/tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl", line 196, column 17, in _remote_sdk
		fail("error extracting Go SDK:\n" + res.stdout + res.stderr)
Error in fail: error extracting Go SDK:
src/main/tools/process-wrapper-legacy.cc:80: "execvp(tar, ...)": No such file or directory
ERROR: Error fetching repository: Traceback (most recent call last):
	File "/tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl", line 100, column 16, in _go_download_sdk_impl
		_remote_sdk(ctx, [url.format(filename) for url in ctx.attr.urls], ctx.attr.strip_prefix, sha256)
	File "/tmp/envoyA1.19-20211009-12166-1s4jbbp/.brew_home/_bazel/32ea98d9ad756d02b6517b5463b86b30/external/io_bazel_rules_go/go/private/sdk.bzl", line 196, column 17, in _remote_sdk
		fail("error extracting Go SDK:\n" + res.stdout + res.stderr)
Error in fail: error extracting Go SDK:
src/main/tools/process-wrapper-legacy.cc:80: "execvp(tar, ...)": No such file or directory
ERROR: Analysis of target '//source/exe:envoy-static' failed; build aborted: error extracting Go SDK:
src/main/tools/process-wrapper-legacy.cc:80: "execvp(tar, ...)": No such file or directory

@codefromthecrypt
Contributor Author

I asked for help on envoyproxy/envoy#17438 with the failure extracting Go seen here.

@codefromthecrypt
Contributor Author

The error message seems to match this code from Bazel's rules_go:

        ctx.download(
            url = urls,
            sha256 = sha256,
            output = "go_sdk.tar.gz",
        )
        res = ctx.execute(["tar", "-xf", "go_sdk.tar.gz", "--strip-components=1"])
        if res.return_code:
            fail("error extracting Go SDK:\n" + res.stdout + res.stderr)
        ctx.delete("go_sdk.tar.gz")

bazelbuild/rules_go@e01a4e8

If this is the right Bazel code, the command invoked was 'tar -xf go_sdk.tar.gz --strip-components=1', and 'No such file or directory' could have come from tar itself if "go_sdk.tar.gz" didn't actually download. With the tar on my Ubuntu host, the message is longer but ends in 'No such file or directory', whereas a missing binary is reported as 'not found' or 'command not found'.

@codefromthecrypt
Contributor Author

I'm currently of the opinion that tar wasn't found, perhaps due to some sandboxing, though I can't explain why. Bazel's own test for a missing binary expects the same message:

function test_execvp_error_message() {
  local code=0
  $process_wrapper --stdout=$OUT --stderr=$ERR /bin/notexisting &> $TEST_log || code=$?
  assert_equals 1 "$code"
  assert_contains "\"execvp(/bin/notexisting, ...)\": No such file or directory" "$ERR"
}

https://github.com/bazelbuild/bazel/blob/4.1.0/src/test/shell/integration/process-wrapper_test.sh#L221-L226
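
A minimal sketch of why that message points at a missing binary rather than a missing archive. It uses Ruby's Process.spawn as a stand-in for Bazel's process wrapper, which is an assumption made for illustration and not the code Bazel runs. The binary name is hypothetical and chosen so that it won't exist on the PATH. When a program is executed directly, without a shell, a failed lookup surfaces as ENOENT, whose text is "No such file or directory", while a shell would print "not found" or "command not found" instead.

```ruby
# Returns the error text raised when `command` is executed directly
# (no shell involved), or nil if the command could be started.
def direct_exec_error(command)
  # Passing the command and its argument separately bypasses the shell.
  pid = Process.spawn(command, "--version", out: File::NULL, err: File::NULL)
  Process.wait(pid)
  nil
rescue Errno::ENOENT => e
  # ENOENT is the same errno that execvp reports for a missing binary.
  e.message
end

# Hypothetical binary name, assumed not to be installed anywhere on PATH.
puts direct_exec_error("tar-that-is-not-installed")
```

The printed message names the missing command, much as the process-wrapper error above names tar, which fits the theory that tar itself could not be resolved.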

Could someone who actually works on Envoy please troubleshoot this? Things like this are a good reason why Homebrew should be part of the upstream release process and not punted to community members who have little experience figuring these things out. See envoyproxy/envoy#17500

@codefromthecrypt
Contributor Author

@carlocab @SMillerDev I also opened a frank discussion, as continually losing weekends over this is less than ideal. Unless Envoy starts actively caring about success here, I don't think we can afford to maintain two operating systems: Homebrew/discussions#2271

Formula/envoy@1.19.rb Outdated
@cho-m cho-m added the CI-linux-self-hosted Build on Linux self-hosted runner label Oct 10, 2021
@codefromthecrypt
Contributor Author

codefromthecrypt commented Oct 10, 2021

Summary:

Envoy's build uses Go to generate files from protobuf sources. Following Bazel norms, it nests the download and extraction of a pinned Go SDK. Recently, the Bazel rules changed to use the system tar instead of a built-in extractor. So downloading would work, but if the system tar can't be found, the build fails.

The default PATH changed recently when on_mac/on_linux was phased out and replaced with OS.mac?/OS.linux?. The substitution wasn't tested on Linux, though it went fine on macOS.

Linux ended up with a partial failure: #{Formula["python@3.9"].opt_libexec}/bin is placed first in the PATH, so that part worked, but the rest of the PATH was broken. This meant only what's in python's bin directory ended up on the PATH, which delayed other PATH failures. When Bazel went to extract Go, it tried to fork tar and failed. It probably would have failed forking other things too, but tar failed first.

The main takeaway is that this formula can't use mechanical substitution because of its branching logic. Every change needs to be built from source on all supported operating systems.

FWIW, I still recommend removing Linux support (though maybe not in this PR). It is true that this formula will still be patched and complicated even without Linux, but less so. More importantly, the burden of change is lower: instead of having to test on Linux due to issues like this, possibly via Docker, which is painfully slow, tests are limited to "just macOS". Even if that still takes an hour or more to build locally, it is an improvement, and in this case it would have saved days.

Thanks to @cho-m for spending time to get to the bottom of this. Hopefully, this passes now except for the known issue where arm64 won't yet work on macOS.
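
The PATH failure described in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the formula's code: the helpers resolvable?, prepend_path and make_fake_bin are hypothetical, the directories are throwaway stand-ins for python's libexec/bin and the system bin directory, and the exact way the real PATH got broken is not shown in this thread. The sketch only contrasts a PATH that contains nothing but the python directory with one where that directory is prepended to the existing entries.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: true if `command` is an executable file in one of
# the directories listed in `path`.
def resolvable?(command, path)
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, command)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Hypothetical helper: puts `dir` first while keeping every existing entry.
def prepend_path(dir, path)
  ([dir] + path.split(File::PATH_SEPARATOR)).uniq.join(File::PATH_SEPARATOR)
end

# Hypothetical helper: creates a directory holding one stand-in executable.
def make_fake_bin(root, name, command)
  dir = File.join(root, name)
  FileUtils.mkdir_p(dir)
  file = File.join(dir, command)
  File.write(file, "#!/bin/sh\n")
  FileUtils.chmod(0o755, file)
  dir
end

Dir.mktmpdir do |root|
  python_bin = make_fake_bin(root, "python-libexec-bin", "python")
  system_bin = make_fake_bin(root, "system-bin", "tar")

  broken = python_bin                           # only python's bin survives
  fixed  = prepend_path(python_bin, system_bin) # python first, rest intact

  puts "broken PATH finds python: #{resolvable?("python", broken)}"
  puts "broken PATH finds tar:    #{resolvable?("tar", broken)}"
  puts "fixed PATH finds python:  #{resolvable?("python", fixed)}"
  puts "fixed PATH finds tar:     #{resolvable?("tar", fixed)}"
end
```

With the broken value, python resolves and tar does not, which is consistent with the build getting as far as the Go SDK extraction before failing.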

@codefromthecrypt
Contributor Author

updated the comments on patches and remaining upstream issues

# GCC added as a test dependency to work around Homebrew issue. Otherwise `brew test` fails.
# CompilerSelectionError: envoy cannot be built with any available compilers.
depends_on "gcc@9" => [:build, :test]
depends_on "python@3.9" => :build
Contributor Author

my 2p: this deserves a comment explaining why python@3.9 is needed, especially as it is the reason the PATH is patched for Linux below.

Formula/envoy.rb Outdated
@@ -0,0 +1,83 @@
class EnvoyAT119 < Formula
Member

Since this is the version that's failing to build, why don't we leave 1.18 be for now? It's not terribly popular anyway:

==> Analytics
install: 13 (30 days), 73 (90 days), 73 (365 days)
install-on-request: 13 (30 days), 73 (90 days), 73 (365 days)
build-error: 0 (30 days)

Contributor Author

Sorry to take your time on this one; I just don't fully understand which course to take.

Do you mean leave envoy@1.18 while adding envoy@1.19 and envoy (aliased to @1.20)?
Or do you mean leave envoy@1.18, drop envoy@1.19, and add envoy (aliased to @1.20)?

The main issue with Envoy releases is that many are incompatible with each other in some way, even if basic usage is rarely impacted. If you meant to drop envoy@1.19, this could interfere with upgrading, because that version won't be downloadable when envoy.rb switches from 1.19 to 1.20. Ignore this whole paragraph if that's not what you meant; I'm just elaborating for the sake of timezones.

Contributor Author

Actually, your opinion might be correct: if you read the release notes carefully, it seems like 1.19 will be a dead release. Either that, or they suggest skipping it but will still release patches. In any case, I think you are right on this one.

https://www.envoyproxy.io/docs/envoy/v1.20.0/version_history/current

dns_filter: dns_filter protobuf fields have been renumbered to restore compatibility with Envoy 1.18, breaking compatibility with Envoy 1.19.0 and 1.19.1. The new field numbering allows control planes supporting Envoy 1.18 to gracefully upgrade to dns_resolution_config, provided they skip over Envoy 1.19.0 and 1.19.1. Control planes upgrading from Envoy 1.19.0 and 1.19.1 will need to vendor the corresponding protobuf definitions to ensure that the renumbered fields have the types expected by those releases.

@codefromthecrypt
Contributor Author

One of the builds failed like this:

==> Cleaning
Error: File exists @ dir_s_mkdir - /usr/local/opt/envoy@1.19/.brew

==> brew audit envoy@1.19 --online --new-formula
==> FAILED envoy@1.19
Error: install failed

I'll restore the bottle block and 🤞

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Michael Cho <cho-m@tuta.io>
@codefromthecrypt
Contributor Author

I reverted dropping 1.18 in favor of 1.19 after thinking through the implications of #86755 (comment). The summary is that if the project team suggests skipping 1.19.0 and 1.19.1, there's no value in retaining 1.19. Implicitly, 1.18.x has higher priority.

@codefromthecrypt
Contributor Author

I feel the macOS 10.14 failure might be due to old certs, as it consistently fails with: The homepage URL https://www.envoyproxy.io/index.html is not reachable

I'm surprised arm64 is passing.. I'll try it on my M1

@codefromthecrypt
Contributor Author

"Successful in 19s" should have tipped me off that arm64 didn't build and was just skipped gracefully. I tried manually, and the known arm64 issues discussed and documented earlier still exist.

@cho-m @carlocab are either of you keen to merge? Envoy had a conference this week and 1.20 was a part of that. It would be nice to have this version available to users.

@BrewTestBot
Member

:shipit: @carlocab has triggered a merge.

Member

@carlocab carlocab left a comment

Ok, let's try it. The weird state of the Linux runner might make @BrewTestBot unhappy though.

@carlocab
Member

Dispatched a bottle for Linux: https://github.com/Homebrew/homebrew-core/actions/runs/1335583065

@codefromthecrypt codefromthecrypt deleted the envoy-1.20 branch October 13, 2021 01:49
@codefromthecrypt
Contributor Author

thanks @carlocab. I'll keep an eye out!

codefromthecrypt pushed a commit to tetratelabs/archive-envoy that referenced this pull request Oct 13, 2021
1.19 was dropped by Homebrew/homebrew-core#86755

Signed-off-by: Adrian Cole <adrian@tetrate.io>
codefromthecrypt added a commit to tetratelabs/archive-envoy that referenced this pull request Oct 13, 2021
1.19 was dropped by Homebrew/homebrew-core#86755

Signed-off-by: Adrian Cole <adrian@tetrate.io>
@github-actions github-actions bot added the outdated PR was locked due to age label Nov 14, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 14, 2021