Upgrade k8s-infra prow build clusters from v1.14 to v1.15 #1120
We're sitting on 1.14.10-gke.42 for the control plane, and 1.14.10-gke.37 for the main node pool
We're still sitting on 1.14. I am waiting until we've drained the bulk of outstanding v1.20 PRs (ref: https://groups.google.com/g/kubernetes-dev/c/YXGBa6pxLzo/discussion) before explicitly triggering this. There's still a chance it'll happen when we're not watching, though.
Upgrading k8s-infra-prow-build's control plane from 1.14.10-gke.42 to 1.15.12-gke.17
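For reference, a control-plane-only upgrade like this is roughly the following (a sketch; the `--region` value is a placeholder, not taken from this thread):

```sh
# Upgrade only the control plane (master) to the target version;
# node pools are upgraded separately
gcloud container clusters upgrade k8s-infra-prow-build \
  --master \
  --cluster-version 1.15.12-gke.17 \
  --region us-central1
```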
Control plane upgraded, next the greenhouse nodepool. This may disrupt bazel-based jobs, though they should fall back to not using the cache when it's unavailable.
Greenhouse nodepool upgraded. Waiting until kubernetes/test-infra#19182 (comment) is resolved before proceeding with the main nodepool
/assign
Upgrading k8s-infra-prow-build's default node pool from 1.14.10-gke.42 to 1.15.12-gke.17
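The in-place node pool upgrade is the same command pointed at a pool instead of the master (a sketch; the pool name and region are placeholders):

```sh
# Upgrade the nodes in one pool; GKE drains and recreates them one at a time
gcloud container clusters upgrade k8s-infra-prow-build \
  --node-pool pool1 \
  --cluster-version 1.15.12-gke.17 \
  --region us-central1
```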
29/41 nodes done
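A progress count like that can be pulled straight from kubectl, which reports each node's kubelet version in the VERSION column:

```sh
# Nodes already on the new version...
kubectl get nodes --no-headers | grep -c 'v1.15.12-gke.17'
# ...out of the total
kubectl get nodes --no-headers | wc -l
```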
And everything is at 1.15.12-gke.17 after ~4 hours ... so I think that node-by-node upgrade and cluster-autoscaling aren't the best of friends. I suspect this would have gone more quickly if we had spun up an entirely new nodepool that was on 1.15.12 to begin with, and cordoned the old nodepool. I would like to move up to v1.16 next, but I think we'll leave it here for the weekend.
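That alternative migration path would look something like this (a sketch; pool names, sizing, and autoscaler bounds are placeholders):

```sh
# 1. Create a replacement pool already at the target version
gcloud container node-pools create pool-v115 \
  --cluster k8s-infra-prow-build \
  --node-version 1.15.12-gke.17 \
  --num-nodes 3 --enable-autoscaling --min-nodes 1 --max-nodes 40

# 2. Stop scheduling onto the old pool, then drain it
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=pool1 -o name); do
  kubectl cordon "$node"
done
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=pool1 -o name); do
  kubectl drain "$node" --ignore-daemonsets --delete-local-data
done

# 3. Delete the old pool once it is empty
gcloud container node-pools delete pool1 --cluster k8s-infra-prow-build
```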
As a rough guess of "how disruptive was this?" I filtered down to kubernetes/kubernetes jobs on prow.k8s.io. Accepting that some presubmits are just gonna fail (on top of whatever flakiness may be out there), this looks reasonably non-disruptive. Specifically, I'm looking at the left half of the graph (now - 6h ago). Another guess: look at the plank dashboard. No increase in jobs hitting failure state.
/close
@spiffxp: Closing this issue.
/reopen
https://prow.k8s.io/tide-history?repo=kubernetes%2Fkubernetes&branch=master shows tide last issued a TRIGGER action around 1:50pm PT
@spiffxp: Reopened this issue.
Looking at that node e2e job, I get this for the prowjob YAML: https://prow.k8s.io/prowjob?prowjob=a56a5463-f464-11ea-a7c8-9eb8089ce657. Pulling some useful fields from that:
let's go look at the build cluster
let's see, are there other pods stuck in Terminating status?
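One way to check (note that "Terminating" is not an actual pod phase; it's what kubectl prints for pods that have a deletionTimestamp set, so grepping the human-readable output works):

```sh
kubectl get pods --all-namespaces | grep Terminating
```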
Possible mitigations: […]
Also: […]
Definitely think migrating to a new node pool is the upgrade path to use next time
12:26pm: pod create call. What's trying to delete it?
ok, does manually deleting do any better?
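Trying something along these lines (a sketch; the pod name is a placeholder, and test-pods as the job namespace is an assumption):

```sh
# Force removal without waiting for kubelet confirmation
kubectl delete pod <pod-name> -n test-pods --grace-period=0 --force
```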
no
How about that node?
12:37pm: node status is NodeNotReady
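The node's conditions can be dumped directly (the node name is a placeholder):

```sh
kubectl get node <node-name> \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
```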
Seems like we're running into kubernetes/kubernetes#72226. Issuing a /test command will create a new pod. Still not clear how to get rid of the pods stuck in Terminating.
Would it be reasonable for these kinds of workloads to go down the […] route?
Issuing that /test command caused the pod to disappear. Last entry in the log for that pod:
So then I manually edited the finalizer for another pod:
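That edit amounts to clearing the pod's finalizer list, which lets the API server finish the deletion. As a sketch (pod name is a placeholder, test-pods namespace is an assumption):

```sh
kubectl patch pod <pod-name> -n test-pods \
  --type merge -p '{"metadata":{"finalizers":null}}'
```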
@alejandrox1 I tried that (see #1120 (comment)) and it didn't delete
Everything that had a deletionTimestamp was hung (this was intended for a markdown table, but the formatting looked worse, so fixed-width it is):
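A listing like that can be produced with kubectl and jq (a sketch; the test-pods namespace is an assumption):

```sh
# List pods that have a deletionTimestamp, plus the node they were on
kubectl get pods -n test-pods -o json | jq -r \
  '.items[] | select(.metadata.deletionTimestamp != null)
   | [.metadata.name, .spec.nodeName, .metadata.deletionTimestamp] | @tsv'
```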
Patched in empty finalizers for everything that had a deletionTimestamp.
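As a sketch, that bulk patch could look like the following loop (same assumptions as above; this is a reconstruction, not the exact command used):

```sh
# For every pod with a deletionTimestamp, clear its finalizers
for pod in $(kubectl get pods -n test-pods -o json \
    | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | .metadata.name'); do
  kubectl patch pod "$pod" -n test-pods --type merge -p '{"metadata":{"finalizers":null}}'
done
```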
Nothing stuck in terminating anymore
Looks like most of the pods that were stuck in Terminating are now running. Calling it a night, we'll see if prow/tide end up picking up their results later.
(also definitely cordoning and migrating to a new pool next time)
/close
@spiffxp: Closing this issue.
v1.14 deprecation was announced here: https://cloud.google.com/kubernetes-engine/docs/release-notes#coming-soon-20200722
The clusters should have been automatically upgraded: https://cloud.google.com/kubernetes-engine/docs/release-notes#scheduled_automatic_upgrades
This issue is to confirm whether they have been, and if not, to initiate such an upgrade.
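Confirming could be as simple as checking the cluster's reported versions (a sketch; the region is a placeholder):

```sh
# Report current control-plane and node versions for the cluster
gcloud container clusters describe k8s-infra-prow-build --region us-central1 \
  --format='value(currentMasterVersion,currentNodeVersion)'
```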
/area prow
/wg k8s-infra
/sig testing