
Build kubectl binary with cgo enabled #469

Closed
pwittrock opened this Issue Nov 23, 2017 · 37 comments

@pwittrock (Member) commented Nov 23, 2017

cgo is required for certain go std libraries to work. See this article for details.

This breaks kubectl name resolution for some cases. See kubernetes/kubectl#48 for details.

@ixdy (Member) commented Nov 29, 2017

This starts to get tricky in a hurry; e.g. if you want to include kubectl in a container and it's been built with cgo, you now need to make sure you have the correct libc in your container. Similarly, some older Linux distributions may have a libc that is too old.

@BenTheElder (Member) commented Dec 1, 2017

Has anyone mentioned a critical part besides DNS? As far as I've heard there's some hand waving about "critical parts" but it seems this is only DNS so far in my (limited) experience. Perhaps another option is that we can "just" get a better pure-go DNS implementation?

@chrislovecnm (Member) commented Dec 1, 2017

@BenTheElder we had to shut off cgo DNS in kops for macOS; it was causing problems. The Go implementation is working much better. We will need to test that issue.

@pwittrock Was that issue addressed in Go 1.9? I will take a look when I have a chance.

@fejta-bot commented Mar 1, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@fejta-bot commented Mar 31, 2018

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@jkemp101 commented Apr 27, 2018

@chrislovecnm Do you have any more info on the issue kops had when using native DNS? It seems like things should work better using native DNS resolution, not worse. This issue is becoming a bigger pain for us as we roll out VPN access to more team members. Some of us are looking at running dnsmasq locally to control name resolution so the k8s tools work properly.

@BenTheElder (Member) commented Apr 27, 2018 (comment minimized; body not shown)

@jkemp101 commented Apr 27, 2018

@BenTheElder Custom builds of every version of k8s and kops? We obviously don't need every version, but it kind of sucks not being able to just download a new release and give it a try. So I compile the "standard" one we will use most often to administer our clusters, and if I need something special I normally "adjust" my kube config so I can run it through an SSH tunnel.

I also don't think hacking build scripts to do custom compiles of my k8s tools is following best practices for managing production clusters. I should be relying on a build system that runs tests and builds the releases in a consistent manner. Running dnsmasq would let me use the same build of the tools everyone else is running.

Can someone point me to the code that actually does the name resolution for normal kubectl commands? I've tried to find it before but get kind of lost. You mentioned using a different DNS implementation. I gave that a try also here miekg/dns#665

@BenTheElder (Member) commented Apr 27, 2018

@BenTheElder Custom builds of every version of k8s and kops?

You'd only need to compile the binary(ies) that actually run on a Mac, not all of k8s/kops (i.e. the kubectl and kops CLIs), which is pretty quick and simpler than running dnsmasq. We also support skewed kubectl, so you don't necessarily even need to build for every release.

I also don't think hacking build scripts to do custom compiles of my k8s tools is following the best practices to manage production clusters.

You don't need to make any modifications or do any real hacking. One obvious way is `make generated_files && CGO_ENABLED=1 go install k8s.io/kubernetes/cmd/kubectl`.

I should be relying on a build system that runs test and builds the releases in a consistent manner. Running dnsmasq would let me use the same build of the tools everyone else is running.

Well, the build tools run in Linux containers (usually on top of Kubernetes!), and everyone actually uses many different builds of kubectl. Just in CI we often use:

  • freshly built kubectl
  • kubectl built by CI from other release branches (skew testing)
  • kubectl fresh installed from the gcloud sdk (which notably only installs the "current" version, because of skew support)
  • kubectl baked into the test image

You mentioned using a different DNS implementation.

Er, I was being a bit hopeful that the Go stdlib implementation could be improved; I'm not sure how trivially DNS can be swapped out in kubectl.

@dims (Member) commented Apr 27, 2018

@BenTheElder @pwittrock maybe this tip will help? golang/go#12503 (comment)

It's possible to intercept DNS requests generated by the net package by modifying the Dial func of DefaultResolver
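A minimal sketch of that tip, assuming a hypothetical internal DNS server address; the `Dial` hook on `net.Resolver` is a real standard-library field (Go 1.9+), but the server and hostname below are made up for illustration:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newResolver returns a net.Resolver that sends every DNS query to a
// fixed server (e.g. an internal resolver reachable over a VPN),
// regardless of what /etc/resolv.conf says. "10.8.0.1:53" is a
// hypothetical address, not a recommendation.
func newResolver(server string) *net.Resolver {
	return &net.Resolver{
		// PreferGo forces the pure Go resolver so the Dial hook is honored.
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			// Ignore the address the resolver chose; dial our server instead.
			return d.DialContext(ctx, network, server)
		},
	}
}

func main() {
	// Swap the process-wide default resolver, as the tip suggests.
	net.DefaultResolver = newResolver("10.8.0.1:53")
	addrs, err := net.LookupHost("example.internal")
	fmt.Println(addrs, err)
}
```

This would only help a program that controls its own resolver, of course; it doesn't change how an already-built kubectl behaves.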

@jkemp101 commented Apr 27, 2018

Remember, I picked up this thread with a question about what issues kops had when they tried using cgo and ended up switching back to pure Go. Not knowing what those issues were, compiling with cgo anyway and running it against my production clusters isn't something I would do without more information.

The whole recompile versus using dnsmasq does highlight an observation I've had in other k8s issues. For operations/infrastructure folks running dnsmasq locally is trivial and might be something they are doing anyway to have fine control over their DNS environment. But a programmer looks at recompiling as the trivial solution versus diving into low level DNS configurations. The easier fix is really based on your background/skills.

@BenTheElder (Member) commented Apr 27, 2018 (comment minimized; body not shown)

@BenTheElder (Member) commented Apr 27, 2018

cc @justinsb @chrislovecnm for Kops

FWIW, compiling kubectl with CGO enabled really is pretty quick and trivial; our "official releases" don't mean all that much, since every cloud is doing its own build and release separately anyhow and kubectl is widely compatible with them.

The only difference for kubectl with or without cgo should be in portability of the binary and the DNS resolution.

@greenboxal commented May 1, 2018

The problem with this is that there's no documentation at all about the problem. I was only able to track down this issue because I'm aware of Go's shenanigans.

Ignoring the discussion about whether the default build should work with the system DNS or not (which, IMHO, not working is completely absurd), this should be documented somewhere, including ways of circumventing the issue, probably with instructions on how to build with CGO.

For people having problems with macOS, I created a helper that forces resolv.conf to use the system DNS configuration, it might help someone: https://github.com/greenboxal/dns-heaven

@dims (Member) commented May 1, 2018

underlying issue from golang - golang/go#16345

@bitglue commented May 1, 2018

@greenboxal Go name resolution on macOS does work, it only doesn't work with some builds of kubectl because it's built with the netgo build tag. I would suggest for dns-heaven that in lieu of a bunch of shell parsing and whatnot, you simply call net.LookupIP. Running with GODEBUG=netdns=9 is informative.


I'm unable to find a record of what the "issues" with the cgo resolver were, though portability seems like a legitimate one. I would imagine this is mostly a concern for container builds, and is a non-issue on macOS, where people don't often decide to recompile the entire OS with an alternate libc.

I'd propose then that kubernetes releases use netgo for linux builds, where people do crazy things with libc (or where libc may not exist at all in the container), and the go resolver behavior more closely matches what most libc resolvers do, and users have a higher tolerance for fiddling. Do not use netgo on macOS builds, where libc is homogeneous, and there's more likely an expectation that tools work without fiddling (so more time can be spent doing important things, like yak shaving on Linux).

This seems to be the philosophy adopted by the go resolver which presently uses a libc resolver by default on macOS, but under usual circumstances will use a pure go resolver on Linux.

@BenTheElder (Member) commented May 1, 2018 (comment minimized; body not shown)

@greenboxal commented May 1, 2018

@bitglue The idea was to provide something similar to systemd-resolved on macOS, and use the stub DNS server on resolv.conf. The problem with net.LookupIP is that I can only proxy A or AAAA queries.

@bitglue commented May 1, 2018

another underlying golang issue: golang/go#12524

@ixdy (Member) commented May 2, 2018

/remove-lifecycle rotten

@ixdy (Member) commented May 2, 2018

The kubernetes release process currently builds all binaries for all platforms (linux, mac, windows) from a docker image running on a linux machine, using cross tools as necessary to cross-build C dependencies.

We explicitly disable cgo on all platforms for kubectl (along with a few other binaries); this makes distribution easier, since we don't have to worry about what particular libc is being used in a deployment.

I suspect the only way to correctly use cgo and link against libc on mac would be to run the build on a mac, which is currently infeasible for the project.

If someone has a better alternative, I'm happy to review PRs.

@greenboxal commented May 2, 2018

I don't think the solution here has anything to do with Kubernetes... Having to compile with CGO is not a real solution.

I don't know the current status of this issue on Golang's side, but the main argument is that they wanted consistent DNS resolution across platforms. Being pragmatic, that's a bad idea. If I'm running anything on macOS (natively), I expect everything to work as it does on macOS, not as on Linux or any other platform.

The problem is, in macOS's case Golang is not 100% to blame. The system does a poor job of having a homogeneous DNS stack: resolv.conf is populated with the primary DNS server from the primary interface, even though the system supports much more complex configuration than that.

@jkemp101 what was your DNS config that didn't work without cgo? Scoped queries?

@jkemp101 commented May 2, 2018

I am using an OpenVPN based VPN with split DNS. K8s names need to be resolved against private Route53 zones over the VPN but everything else uses my default DNS server of 8.8.8.8.

@bitglue commented May 2, 2018

the main argument is that they wanted to have consistent DNS resolution across platforms.

I don't think that's it, because the default behavior of Go on macOS is to use the libc resolver, which is the "normal" behavior of any program, Go or otherwise.

Go also has logic to disable the pure Go resolver in cases where it would not provide the behavior expected on the platform. From https://golang.org/pkg/net/#hdr-Name_Resolution:

By default the pure Go resolver is used, because a blocked DNS request consumes only a goroutine, while a blocked C call consumes an operating system thread. When cgo is available, the cgo-based resolver is used instead under a variety of conditions: on systems that do not let programs make direct DNS requests (OS X), when the LOCALDOMAIN environment variable is present (even if empty), when the RES_OPTIONS or HOSTALIASES environment variable is non-empty, when the ASR_CONFIG environment variable is non-empty (OpenBSD only), when /etc/resolv.conf or /etc/nsswitch.conf specify the use of features that the Go resolver does not implement, and when the name being looked up ends in .local or is an mDNS name.


The system does a poor job of having a homogenous DNS stack.

I don't think that's it, either. Almost nothing on macOS reads /etc/resolv.conf. Exceptions are dig, host, and nslookup. But these are tools for generating DNS queries, not resolving hostnames. Their divergent behavior is not limited to macOS: you'll also find they don't read /etc/hosts or /etc/nsswitch.conf on Linux.

Building with cgo seems like a perfectly fine solution: it's a shame none of the documented install methods use it. I imagine it wouldn't be difficult to get homebrew to build with cgo, at least (though I acknowledge this isn't the place). It sounds like the binary releases are more difficult since they are cross-compiled, though that too seems possible.

Alternately the pure go resolver may evolve behavior that more closely matches the libc resolver's behavior, though that doesn't seem likely to happen soon. I'm not sure if it's even possible, since I'm not aware of a way to integrate with the system configuration framework which does not involve linking with a C library.

@ixdy (Member) commented May 2, 2018

@bitglue the cross-compiling gist you linked doesn't discuss linux -> darwin.

From what I recall, using cgo on darwin requires using Xcode to compile, which is a bit less trivial than just installing the gcc crosstool chain in a container image.

@bitglue commented May 2, 2018

You're right 😞

Maybe the gonative tool linked in the original issue is more feasible? I haven't tried it yet, but it seems the issue here is really just the net package; if a binary release of that package for OS X can be obtained and injected into the Linux build container, there are no additional cross-compiling headaches, since the problematic part is already compiled.

https://inconshreveable.com/04-30-2014/cross-compiling-golang-programs-with-native-libraries/

bks7 added a commit to bks7/homebrew-core that referenced this issue May 23, 2018

Add support to build kubectl with cgo
Modify formula to support build dynamic binary with CGO_ENABLED=1.
Building kubectl without cgo breaks dns name resolution since it
uses /etc/resolv.conf for getting resolvers while nothing in macos
should use it for name resolution.
More information about this issue can be obtained here -
kubernetes/release#469

@bks7 referenced this issue May 23, 2018 (closed): Add support to build kubectl with cgo #28187

BrewTestBot added a commit to BrewTestBot/homebrew-core that referenced this issue May 23, 2018

Add support to build kubectl with cgo

@bitglue commented May 23, 2018

It's easy to get homebrew to build with cgo enabled, but they are reluctant to accept the patch. Could a kubernetes/release member bless it? Homebrew/homebrew-core#28187

@cblecker (Member) commented May 23, 2018

Just saw this as I was pinged by the homebrew maintainers.

My initial reaction: Ugh, this sucks. It's a go "problem", but not an easy one to solve. I'm interested in gonative as a solution, but would need to understand how it works. Also curious if anyone from @kubernetes/sig-cli-maintainers has any experience with it.

I'm initially reluctant to diverge how the binaries are distributed between the official releases and homebrew. The patch in the linked PR isn't as bad because it's not the default, but I'm still uneasy about it.

Interested in the opinion of @ixdy, @BenTheElder, @pwittrock, @kubernetes/sig-cli-maintainers.

@BenTheElder (Member) commented May 23, 2018

IMHO the ideal (not necessarily feasible) situations from most desirable to least are:

  • someone makes the non-CGO DNS resolution match the native expectations (seems unlikely...)
  • someone works out using a solution for the upstream containerized builds with a tool like gonative to build kubectl with the CGO net package, we use this in the cross build / official builds
    • this should probably just be kubectl for mac specifically, regressing to non-static binaries elsewhere would probably cause many headaches
  • downstream builds kubectl with CGO enabled, with minimal build forking

On that note, Homebrew/homebrew-core#28187 seems pretty reasonable to me; it would be nice if we provided a better way to accomplish the override currently accomplished by:

# Delete kubectl binary from KUBE_STATIC_LIBRARIES
inreplace "hack/lib/golang.sh", " kubectl", ""

@ixdy (Member) commented May 23, 2018

I think you're on the right track with option 3 (and the homebrew PR).

There's already code to allow explicitly building a binary with CGO disabled (ironically giving kubectl as an example, even though it's in KUBE_STATIC_LIBRARIES):

  # Allow individual overrides--e.g., so that you can get a static build of
  # kubectl for inclusion in a container.
  if [ -n "${KUBE_STATIC_OVERRIDES:+x}" ]; then
    for e in "${KUBE_STATIC_OVERRIDES[@]}"; do [[ "$1" == *"/$e" ]] && return 0; done;
  fi

(source)

I'd be happy to extend this with a KUBE_DYNAMIC_OVERRIDES or KUBE_CGO_OVERRIDES environment variable (or whatever, naming is hard) which the homebrew script could set.

Another option would be to automatically enable CGO on kubectl when compiling for darwin from darwin.
That might be a little too magical, though, since it wouldn't be enabled when cross-compiled.

For bazel (which isn't really used for any official releases yet), it'd be easy to have kubectl-pure and kubectl-cgo targets which enable/disable cgo, and you could depend on whichever you prefer for the environment in question.

@ixdy (Member) commented May 23, 2018

(There's some precedent for automagic cgo settings when cross compiling from linux/amd64 to other linuxes here, but that's not quite the same issue being discussed.)

@cblecker (Member) commented May 23, 2018

I'm continuing reading, but came across this: golang/go@b615ad8

Still pondering..

@ixdy (Member) commented May 23, 2018

Created kubernetes/kubernetes#64219 to support overriding the cgo-disabled state of kubectl at build time.

@bitglue commented May 23, 2018

If the homebrew formula is still under consideration I'd like to advocate for cgo enabled by default. Since the only reason for disabling it seems to be to do a cross-compile, and that's a non-issue with homebrew, I don't see a reason to make users go through obscure hoops to get expected DNS behavior.

Regarding gonative, there's a bit on how it works here: https://inconshreveable.com/04-30-2014/cross-compiling-golang-programs-with-native-libraries/

My understanding is the crux of it is extracting net.a from the macOS release of go.

@ixdy

This comment has been minimized.

Copy link
Member

ixdy commented May 23, 2018

My worry with gonative is that it doesn't appear to have been touched in 2-3 years, so it's not clear if it still works or will continue working in the future.

jingax10 pushed a commit to jingax10/kubernetes that referenced this issue May 31, 2018

Kubernetes Submit Queue
Merge pull request kubernetes#64219 from ixdy/hack-lib-golang-cgo-overrides

Automatic merge from submit-queue (batch tested with PRs 64338, 64219, 64486, 64495, 64347). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Add KUBE_CGO_OVERRIDES env var to force enabling CGO

**What this PR does / why we need it**: as detailed in kubernetes/release#469 (and elsewhere), there is a desire to have `kubectl` built with CGO enabled on mac OS.

There currently isn't a great way to do this in our official cross builds, but we should allow mac users to build their own kubectl with CGO enabled if they desire, e.g. through homebrew.

This change enables that; you can now do `KUBE_CGO_OVERRIDES=kubectl make WHAT=cmd/kubectl` and get a cgo-enabled `kubectl`.

The default build outputs remain unchanged.

**Release note**:

```release-note
kubectl built for darwin from darwin now enables cgo to use the system-native C libraries for DNS resolution. Cross-compiled kubectl (e.g. from an official kubernetes release) still uses the go-native netgo DNS implementation. 
```

/assign @BenTheElder @cblecker 
cc @bks7 @bitglue

@fejta-bot commented Aug 21, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@cblecker (Member) commented Aug 21, 2018

This was resolved :)

/remove-lifecycle stale
/close

marpaia pushed a commit to marpaia/release that referenced this issue Feb 21, 2019

Merge pull request kubernetes#469 from pivotal-k8s/RelBranchMgmtShadows
Add 1.14 release branch management shadows