Umbrella issue: migrate kubernetes to Go 1.12.1 #75372

Closed
spiffxp opened this Issue Mar 14, 2019 · 33 comments

Comments

@spiffxp
Member

spiffxp commented Mar 14, 2019

The Go release team hopes to have 1.12.1 out sometime between today and tomorrow #74890 (comment)

We believe this will fix an issue we're working around in one area of our upgrade tests, but may be lurking in all of our vendored code #75305

Let's use this issue to put together the steps to roll forward. Here's what I gathered from #sig-testing; comment with anything I've left out or gotten wrong and I'll edit the issue description.

when 1.12.1 hits the road, looks like there are 3 pieces that we need to update?

also the kubernetes/kubernetes WORKSPACE file:

go_version = "1.12",
(but maybe you fold that into update rules_go)
similar to #74632

/milestone v1.14
/priority critical-urgent
/kind cleanup

@spiffxp

Member Author

spiffxp commented Mar 14, 2019

FYI @bradfitz
FYI @justinsb @liggitt @neolit123 @dims @tpepper
I don't want to voluntold one of you to lead this but you've all got more context than I
/assign @amwat
I know release team's test-infra role needs to be involved

/sig release
/sig testing

@spiffxp

Member Author

spiffxp commented Mar 14, 2019

/assign @munnerz
since @dims pinged you in #sig-testing

@ixdy

Member

ixdy commented Mar 14, 2019

@jayconrod is pretty good about publishing new rules_go releases as soon as a new upstream go version is released, but it's pretty much up to us to update our workspace dependencies.

@jayconrod

jayconrod commented Mar 14, 2019

bazelbuild/rules_go#1994 will add support for the new Go releases. I'll cherry-pick that back to the 0.18, 0.17, and 0.16 branches when CI is happy and tag new releases after that. Should be ready tomorrow.

@bradfitz

bradfitz commented Mar 14, 2019

Go 1.12.1 is out: https://golang.org/dl/

(And a Go 1.11.x update.)

@jayconrod

jayconrod commented Mar 14, 2019

rules_go 0.18.1, 0.17.2, and 0.16.8 are now out with support for the new Go versions.

@neolit123

Member

neolit123 commented Mar 15, 2019

revert for the workaround is on hold here:
#75393

@liggitt

Member

liggitt commented Mar 15, 2019

looks like the golang:1.12.1 docker image isn't available yet. once that's available, we'll need to rebuild our cross-build image, then #75390 can proceed

@liggitt

Member

liggitt commented Mar 15, 2019

golang:1.12.1 image publish is pending at docker-library/official-images#5550

@mariantalla

Contributor

mariantalla commented Mar 15, 2019

List of related issues that we believe will be unblocked:

@dims

Member

dims commented Mar 15, 2019

Plan of record after the sig-release burndown meeting:

  • The 1.12.1 golang image is not yet ready (we are waiting on docker-library/official-images#5550)
  • We can't wait till Monday for the 1.12.1 image
  • So, let's try to hack up the kube-cross build to download the 1.12.1 binaries while leaving FROM golang:1.12.0 as-is
  • @BenTheElder has the pen on the hack and @dims will review
  • @amwat will help publish the new kube-cross image and update test-infra
  • @hoegaarden will check if there are any Go references elsewhere in release tooling
  • @spiffxp to write up the steps if we have to do the worst case scenario - "switch everything back to 1.11.5"
  • NOTE: we will not go to 1.11.6 as that is a new unknown entity as well.

@hoegaarden

Member

hoegaarden commented Mar 15, 2019

@hoegaarden will check if there is any go references elsewhere in release tooling

I couldn't find any reference to a specific version of go in anago/gcb/gcbmgr.
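
A check like this can be approximated with a recursive grep over a local checkout of the tooling. The following is only a sketch: the function name `find_go_pins` and the three patterns are assumptions made for illustration, not what was actually run, and they will miss versions written in other forms.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print file:line:text for lines that look like a pinned
# Go version. The patterns are illustrative guesses, not an exhaustive list:
#   golang:1.12        (Docker base image tags)
#   go_version = "1    (Bazel WORKSPACE setting)
#   go1.12             (release tarball names)
find_go_pins() {
  local dir="$1"
  grep -rnE 'golang:[0-9]+\.[0-9]+|go_version *= *"[0-9]|go1\.[0-9]+' "$dir"
}

# Example (grep exits non-zero when nothing matches):
#   find_go_pins ./release || echo "no pinned Go versions found"
```

An empty result only means none of these patterns matched; it does not prove the tooling is free of version pins.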

@BenTheElder

Member

BenTheElder commented Mar 15, 2019

Upgrade to go1.12.1 is almost ready to go, just waiting on the kube-cross image to get promoted #75413

@imkin

Contributor

imkin commented Mar 15, 2019

@dims

Member

dims commented Mar 15, 2019

Here's the test-infra change from @amwat kubernetes/test-infra#11804

@dims

Member

dims commented Mar 15, 2019

docker-library/official-images#5550 has merged, but @amwat, @BenTheElder and I are going ahead with what we had planned earlier.

@BenTheElder is in the middle of a CL request for kube-cross with a hacked Dockerfile in #75413

@amwat has a kubekins-test image, but that's under control (different GCS bucket)

@dims

Member

dims commented Mar 15, 2019

@dims

Member

dims commented Mar 15, 2019

docker pull gcr.io/google-containers/kube-cross:v1.12.1-1 works ..
docker pull k8s.gcr.io/kube-cross:v1.12.1-1 does not work yet ... probably need to wait for some more time

docker pull k8s.gcr.io/kube-cross:v1.12.0-1 works for sure (that's the existing image that we use)
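
Rather than re-running the pull by hand while the image propagates, the wait can be scripted. This is a minimal sketch, not something used in this thread: `wait_for_image` and the `PULL_CMD` override are hypothetical names introduced here, and the retry count and delay defaults are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a pull until the image becomes available.
# PULL_CMD is an assumption added so the command can be swapped out;
# it defaults to "docker pull".
wait_for_image() {
  local image="$1" attempts="${2:-30}" delay="${3:-60}"
  local i
  for i in $(seq 1 "$attempts"); do
    if ${PULL_CMD:-docker pull} "$image" >/dev/null 2>&1; then
      echo "available after $i attempt(s): $image"
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "still unavailable after $attempts attempt(s): $image" >&2
  return 1
}

# Example: try up to 30 times, one minute apart.
#   wait_for_image k8s.gcr.io/kube-cross:v1.12.1-1 30 60
```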

@dims

Member

dims commented Mar 15, 2019

looks like the image made it all the way now

@dims

Member

dims commented Mar 15, 2019

unholding #75393 so that can merge

@dims

Member

dims commented Mar 15, 2019

waiting on pull-kubernetes-kubemark-e2e-gce-big to go green in #75413

@dims

Member

dims commented Mar 15, 2019

@liggitt can you please approve #75413 ?

@dims

Member

dims commented Mar 15, 2019

@neolit123

Member

neolit123 commented Mar 16, 2019

EDIT: looks like a false alarm according to @BenTheElder
not all changes are up yet.

@spiffxp @dims

the SIGABRT is back after using 1.12.1:

I0316 02:37:10.603] fatal error: sync: inconsistent mutex state
I0316 02:37:10.606] 
I0316 02:37:10.606] goroutine 292 [running]:
I0316 02:37:10.606] runtime.throw(0x4b8393e, 0x1e)
I0316 02:37:10.606] 	/usr/local/go/src/runtime/panic.go:617 +0x72 fp=0xc002065e30 
...

https://prow.k8s.io/log?id=2847&job=ci-kubernetes-e2e-gce-new-master-upgrade-cluster
https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-new-master-upgrade-cluster

see my comment from here:
#75305 (comment)

so this could simply be a case of resource exhaustion for other unknown reasons, somehow matching the 1.12 switch timing.

for now, my suggestion is to apply our fix:
#75305

roll the release with 1.12.1 and investigate better fixes post-release.

@justinsb

Member

justinsb commented Mar 16, 2019

I'd hope we haven't had enough time yet @neolit123 but I'm trying to figure it out....

@tpepper

Contributor

tpepper commented Mar 16, 2019

We've finally got #75413 (updates to 1.12.1) and #75393 (revert workaround) merged.

The revert merged first and was caught in a CI run, which in a way nicely failed:
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-new-master-upgrade-cluster/2847
"fatal error: inconsistent mutex state"
This is nice because it is expected and again gives us a feeling the problem is common somehow in this one test case.

With the updates to 1.12.1 bits merged we're awaiting soak of this e2e over the weekend and watching for failures in test runs >= 2848. We're specifically hoping for green runs and no "fatal error: inconsistent mutex state". Output will arrive at https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-new-master-upgrade-cluster/2848 with that test finishing probably by 10am Mar 16 Pacific time, and if the issue is not fixed we could see the old error message before midnight tonight even (hoping this is not the case though!).
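
Checking a finished run for the crash comes down to searching its build log for the error string. A minimal sketch, assuming the log has already been saved locally (the file name `build-log.txt` and the function name `has_mutex_failure` are hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the saved log contains the crash.
# The pattern accepts both forms seen in this thread:
#   "fatal error: sync: inconsistent mutex state"
#   "fatal error: inconsistent mutex state"
has_mutex_failure() {
  grep -q 'fatal error: .*inconsistent mutex state' "$1"
}

# Example:
#   if has_mutex_failure build-log.txt; then
#     echo "crash signature still present"
#   else
#     echo "no crash signature found"
#   fi
```

A clean result for one run is weak evidence on its own, which is why the plan above is to watch several runs over the weekend.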

@neolit123

Member

neolit123 commented Mar 16, 2019

ok, looks like https://prow.k8s.io/log?id=2848&job=ci-kubernetes-e2e-gce-new-master-upgrade-cluster
doesn't have the problem with go 1.12.1.

@dims

Member

dims commented Mar 16, 2019

Whew!!

@dims

Member

dims commented Mar 16, 2019

@spiffxp

Member Author

spiffxp commented Mar 16, 2019

Checking in now that https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-new-master-upgrade-cluster/2848 has completed. The "inconsistent mutex state" failure didn't crop up. One flake that appears unrelated (has happened prior to go1.12 migration)

Agree we are clear to move to go1.12.1 the "correct" way.

@liggitt

Member

liggitt commented Mar 19, 2019

we're now on 1.12.1 official images

/close

@k8s-ci-robot

Contributor

k8s-ci-robot commented Mar 19, 2019

@liggitt: Closing this issue.

In response to this:

we're now on 1.12.1 official images

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
