Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change in macOS CROSSTOOL breaks rules_go #6283

Closed
katre opened this issue Oct 1, 2018 · 32 comments
Closed

Change in macOS CROSSTOOL breaks rules_go #6283

katre opened this issue Oct 1, 2018 · 32 comments
Assignees
Labels
P1 I'll work on this now. (Assignee required) platform: apple type: bug z-team-Apple Deprecated. Send to rules_apple, or label team-Rules-CPP + platform:apple

Comments

@katre
Copy link
Member

katre commented Oct 1, 2018

The commit 19a401c has caused an error during the build for rules_go. Logs are available here: https://buildkite.com/bazel/bazel-with-downstream-projects-bazel/builds/472#f9e3ac01-fa0a-4903-b10d-09a2f5550c88

(std::__1::ostreambuf_iterator<char, std::__1::char_traits<char> >, char const*, char const*, char const*, std::__1::ios_base&, char) in libversion.a(cxx_version.o)
--
  | Dwarf Exception Unwind Info (__eh_frame) in libversion.a(cxx_version.o)
  | ld: symbol(s) not found for architecture x86_64
  | clang: error: linker command failed with exit code 1 (use -v to see invocation)
  |  
  | GoLink: error running subcommand: exit status 2
  | ERROR: /Users/buildkite/builds/buildkite-imacpro-5-1/bazel-downstream-projects/rules_go/tests/legacy/examples/cgo/BUILD.bazel:38:1: Couldn't build file tests/legacy/examples/cgo/darwin_amd64_stripped/cgo_lib_test: GoLink tests/legacy/examples/cgo/darwin_amd64_stripped/cgo_lib_test failed (Exit 1) link failed: error executing command bazel-out/host/bin/external/io_bazel_rules_go/go/tools/builders/darwin_amd64_stripped/link -sdk external/go_sdk -installsuffix darwin_amd64 -arc ... (remaining 18 argument(s) skipped)
  |  
  | Use --sandbox_debug to see verbose messages from the sandbox
  | external/go_sdk/pkg/tool/darwin_amd64/link: running external/local_config_cc/cc_wrapper.sh failed: exit status 1
  | Undefined symbols for architecture x86_64:

Commit 19a401c was identified by bisecting the Bazel source.

@katre katre added type: bug P1 I'll work on this now. (Assignee required) platform: apple z-team-Apple Deprecated. Send to rules_apple, or label team-Rules-CPP + platform:apple labels Oct 1, 2018
@katre
Copy link
Member Author

katre commented Oct 1, 2018

CC @jayconrod for rules_go. This looks to be bazel-internal, but please let me know if you have any ideas.

Specifically, I'd like to know if this is a release blocker for 0.19.0 (#6116).

@aragos
Copy link
Contributor

aragos commented Oct 1, 2018

From a colleague who took a look:

https://github.com/bazelbuild/rules_go/blob/6915510b32e2d61702ae8208b0e64070e52056d2/go/private/context.bzl#L336
probs this?
it looks like -lc++ was coming from linker_flags
possibly mostly_static_link_options does not include the new feature configured flag

@jayconrod
Copy link
Contributor

I reproduced this and poked around a bit. I can confirm what @aragos said: the only flags I'm getting from mostly_static_link_options are:

-headerpad_max_install_names -no-canonical-prefixes

@trybka
Copy link
Contributor

trybka commented Oct 1, 2018

mostly_static_link_options fall into the legacy CROSSTOOL fields, I think:
https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disable-legacy-c-toolchain-api

There are potentially other link flags that are missing at this point?

@tburgin
Copy link

tburgin commented Oct 1, 2018

Looks like we may need to add another action here: https://github.com/bazelbuild/bazel/blob/master/tools/osx/crosstool/CROSSTOOL.tpl#L174-L178

@tburgin
Copy link

tburgin commented Oct 1, 2018

Actually there is already special handling for the c++ linking flags in the go_rules. The rules know if there is c++ source, adding the linker flag here seems like a good fit.

diff --git a/go/private/rules/cgo.bzl b/go/private/rules/cgo.bzl
index 631facc..9c34200 100644
--- a/go/private/rules/cgo.bzl
+++ b/go/private/rules/cgo.bzl
@@ -207,6 +207,8 @@ def _cgo_codegen_impl(ctx):
     have_cc = len(source.cxx) + len(source.objc) + len(ctx.attr.deps) > 0
     if not have_cc:
         linkopts = [o for o in linkopts if o not in ("-lstdc++", "-lc++")]
+    else:
+        linkopts.append("-lc++")

     tool_args.add_all(["-objdir", out_dir])

The failing tests passes with this patch. Running all go_rules tests now.

@tburgin
Copy link

tburgin commented Oct 1, 2018

Running on macOS all tests pass with the ^^ patch. One downside to changing the go rules is all platforms are effected and should be thoroughly tested.

@jayconrod
Copy link
Contributor

@tburgin I'm not sure rules_go can figure that out in general. The C++ standard library name varies by toolchain, and there are even multiple implementations that can be used with the same compiler. It would be better for that flag to come from the toolchain.

I'm not entirely comfortable with the code above because I don't have a full list of C++ library names. We needed that hack though for Go programs that run in containers where libstdc++.so isn't available.

@tburgin
Copy link

tburgin commented Oct 1, 2018

Gotcha, I am really only familiar with darwin ;) The darwin toolchain prior to 19a401c was unconditionally adding -lc++ to all link actions.
Adding Go's linking action to the link_libc++ feature would get the darwin toolchain back to where it was. I am not super familiar with the go rules, is there a c++-link-* equivalent for go (go-link-executable etc.)?

I don't fully understand @trybka's comment, but that may also be another way forward.

@jayconrod
Copy link
Contributor

@tburgin I'm unfortunately not that familiar with CROSSTOOL and features therein.

The Go linker logic is primarily within https://github.com/bazelbuild/rules_go/blob/master/go/private/actions/link.bzl.

Go linking happens in two stages: internal linking (with the Go linker) and external linker (with the system linker, typically whatever CC or in the Bazel world, cpp.compiler_executable). For pure Go programs, only internal linking is needed: that can produce a statically linked ELF / Mach-O / PE executable directly. For Go programs that incorporate some C code the external linker is used on the file produced by the internal linker to link with libc. In Bazel, we can also have dependencies on cc_library and objc_library targets, so we need -lc++ if we have those.

Not sure if that helps.

rules_go is still using ctx.fragments.cpp. I tried upgrading to cc_common pretty recently (maybe in 0.15.x), but not everything was there yet. I'd like to upgrade to cc_common soon, but I'm not sure whether it will be possible to backport that to older versions of rules_go, and I'd like to continue to support older versions for a while.

@tburgin
Copy link

tburgin commented Oct 1, 2018

Yeah, makes sense.
Okay, sounds like we need to find a way to add back the c++ linking flag to the darwin toolchain via CROSSTOOL, while still being able to disable it for certain rules in rules_apple.

I will take a look tomorrow.

@katre
Copy link
Member Author

katre commented Oct 2, 2018

This issue manifested with rules_go, but wouldn't any set of rules that tries to use the linker from a CC toolchain run into something similar?

We need to make sure this is solved for all Starlark C++ API users, not just for rules_go.

@tburgin
Copy link

tburgin commented Oct 2, 2018

@katre Only rules that do not align with these actions: https://github.com/bazelbuild/bazel/blob/master/tools/osx/crosstool/CROSSTOOL.tpl#L174-L178

@katre
Copy link
Member Author

katre commented Oct 2, 2018

I see, thanks for clarifying.

@hlopko
Copy link
Member

hlopko commented Oct 2, 2018

Hi all,

@trybka is right, go rules are using legacy Skylark API to the C++ toolchain that doesn't get flags from features. The principled fix is to migrate to the new toolchain API (https://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disable-legacy-c-toolchain-api). While it's not set in stone, it's safe to assume the legacy API will be removed in Bazel 0.20.0.

To whom it helps, changes to the internal go rules for this happened in cl/210533647 and cl/213225869. I'll gladly help, just let me know.

Last commit to the API that is relevant here was released in Bazel 0.17.1 (b9c7269) so you should be able to use the complete API with current Bazel.

@hlopko
Copy link
Member

hlopko commented Oct 2, 2018

@katre I'd prefer the fix on the go rules side, and if not viable, rolling back 19a401c is the only option. Legacy Skylark C++ API didn't change, and that it doesn't get the flags from features was the reason to introduce the new API.

@katre
Copy link
Member Author

katre commented Oct 2, 2018

I agree that having rules_go upgrade to the latest API is good, but if we go that route, rules_go will need to complete that and release it before we can release 0.19.0, or else their users will be unable to build on macOS.

Can we fix this on the Bazel side to give rules_go and their users a grace period to make this transition smoother?

@hlopko
Copy link
Member

hlopko commented Oct 2, 2018

Sure, by rolling back 19a401c. Let me create a rollback.

@katre
Copy link
Member Author

katre commented Oct 2, 2018

@jayconrod If we can roll this back for 0.19.0, can rules_go be updated to the latest C++ api by the time we cut 0.20.0?

@sergiocampama
Copy link
Contributor

Is there any kind of existing infrastructure that would have detected this breakage in a timely manner, ideally before the change is even submitted? If so, are there docs on how to set it up? If not, can resources be added to such an effort?

It seems like we keep hitting similar issues again and again because there is not enough testing coverage for the multitude of scenarios where changes can break, wasting too many engineering hours.

I would expect a post mortem from the Bazel team after this is resolved (or at least a retrospective), because the current integration testing situation is frankly untenable for rule maintainers.

@katre
Copy link
Member Author

katre commented Oct 2, 2018

The CI system tests downstream projects with the latest bazel built from HEAD nightly: https://buildkite.com/bazel/bazel-with-downstream-projects-bazel

It looks like rules_go has been red on this for at least 5 days (I can't see history further back).

However, as you can see, a lot of projects are red. I know the Green Team (@buchgr is sheriff this week) is working to stabilize the system.

@hlopko
Copy link
Member

hlopko commented Oct 2, 2018

Yup, green team was supposed to catch this in O(days), but with rules_go it took us (well, not even us, it was John who investigated the failure) 11 days. This is not acceptable and we'll make sure it doesn't happen again.

@hlopko
Copy link
Member

hlopko commented Oct 2, 2018

Awesome! Look into bazelbuild/rules_swift@88ac6c3#diff-1394f96b7e9b2b96f5e8eff729be2e55 for backwards compatible implementation. Feel free to hardcode action names (you cannot load skylark files conditionally).

bazel-io pushed a commit that referenced this issue Oct 2, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
@katre
Copy link
Member Author

katre commented Oct 3, 2018

Last night's test of rules_go is clean (https://buildkite.com/bazel/bazel-with-downstream-projects-bazel/builds/475#b0c7777f-2466-489a-b8f6-1937a2e61b03), so I am closing this and will cherrypick 20bfdc6 into #6116.

@katre katre closed this as completed Oct 3, 2018
katre pushed a commit that referenced this issue Oct 3, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
katre pushed a commit that referenced this issue Oct 3, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
katre pushed a commit that referenced this issue Oct 4, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
@hlopko
Copy link
Member

hlopko commented Oct 6, 2018

Thanks John!

katre pushed a commit that referenced this issue Oct 16, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
katre pushed a commit that referenced this issue Oct 17, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
@jayconrod
Copy link
Contributor

@katre @mhlopko I released a change yesterday in rules_go which gathers flags from cc_common instead of CcToolchainInfo. I believe we are no longer accessing the deprecated fields of CcToolchainInfo. I built Bazel at 19a401c and ran the rules_go tests at 0.16.0, 0.15.5, and 0.14.3 (the newly released versions) on macOS. Everything passes.

Hopefully, it should be safe to roll this change forward.

@hlopko
Copy link
Member

hlopko commented Oct 18, 2018

Awesome! thanks!

@katre
Copy link
Member Author

katre commented Oct 18, 2018

I think we still need to make sure downstream projects of rules_go update before I can roll forward the change. I'll take care of coordinating and testing that, I have a list of projects available.

@katre
Copy link
Member Author

katre commented Oct 18, 2018

Actually, wait, I was thinking of a different change in rules_go. :)

I will roll this forward with tests regardless.

katre added a commit that referenced this issue Oct 18, 2018
…lid kext.

This is the rollback of commit 20bfdc6, which in turn rolled back the original commit 19a401c.

This was originally rolled back because breaks external go_rules (#6283)

This is now safe as rules_go has been updated and released.

PiperOrigin-RevId: 217694011
Change-Id: Ib452a98dc0a0882aa15ec9608d13e54f2bd83c28
@katre
Copy link
Member Author

katre commented Oct 18, 2018

@jayconrod After testing the rollfoward, I see this error:

Information about the C++ toolchain API is not accessible anymore through ctx.fragments.cpp (see --incompatible_disable_legacy_cpp_toolchain_skylark_api on http://docs.bazel.build/versions/master/skylark/backward-compatibility.html#disable-legacy-c-configuration-api for migration notes). Use CcToolchainInfo instead.

(From https://buildkite.com/bazel/bazel-with-downstream-projects-bazel/builds/514#916b75fd-9952-47dc-b399-b4f89dad8799).

Does this mean we do need to update projects to use the latest rules_go release before this CL goes in?

Affected projects in Bazel CI:

  • buildtools
  • rules_docker
  • rules_k8s
  • rules_typescript

katre pushed a commit that referenced this issue Oct 23, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
katre pushed a commit that referenced this issue Oct 23, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
katre pushed a commit that referenced this issue Oct 24, 2018
*** Reason for rollback ***

Breaks external go_rules (#6283)

*** Original change description ***

macos_bundle: Make "apple_product_type.kernel_extension" produce a valid kext.
PiperOrigin-RevId: 215447343
katre added a commit to katre/bazel that referenced this issue Nov 1, 2018
…lid kext.

This is the rollback of commit 20bfdc6, which in turn rolled back the original commit 19a401c.

This was originally rolled back because breaks external go_rules (bazelbuild#6283)

This is now safe as rules_go has been updated and released.

PiperOrigin-RevId: 217694011
Change-Id: Ib452a98dc0a0882aa15ec9608d13e54f2bd83c28
bazel-io pushed a commit that referenced this issue Nov 1, 2018
…lid kext.

This is the rollback of commit 20bfdc6, which in turn rolled back the original commit 19a401c.

This was originally rolled back because breaks external go_rules (#6283)

This is now safe as rules_go has been updated and released.
Full downstream test run: https://buildkite.com/bazel/bazel-with-downstream-projects-bazel/builds/551

PiperOrigin-RevId: 219678937
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) platform: apple type: bug z-team-Apple Deprecated. Send to rules_apple, or label team-Rules-CPP + platform:apple
Projects
None yet
Development

No branches or pull requests

7 participants