
Upgrading a node from kubelet 1.1.3 to 1.2.0 results in containers getting destroyed and recreated #23104

Closed
ghodss opened this issue Mar 17, 2016 · 65 comments
Labels
area/kubelet, lifecycle/stale, priority/important-soon, sig/cluster-lifecycle, sig/node

Comments

@ghodss
Contributor

ghodss commented Mar 17, 2016

We upgrade our nodes by upgrading an RPM that stops the kubelet, updates the binary and starts it again. The behavior I'm seeing (100% consistently and reproducibly) is that when I upgrade the RPM, the new kubelet starts up, and after maybe 3-5 seconds starts killing and restarting the containers on the machine. Usually it does all of them, but sometimes it does a subset.

You can see the log of the new 1.2 kubelet starting up, up to the point it starts killing containers at https://gist.github.com/ghodss/c579579e53355aed6508.

The pods are not evicted from apiserver, and their age does not change.

Downgrading to 1.1 has the same effect of killing and recreating the containers. Note that I had the exact same thing happen when we upgraded from 1.0.3 to 1.1.3. Let me know if you need any other details.

@bgrant0607 @dchen1107 @yujuhong
(marking as P0 per @bgrant0607)

@ghodss added the priority/critical-urgent, area/kubelet, and sig/node labels on Mar 17, 2016
@bgrant0607
Member

Looks like #17234 and #19206.

We need a documented policy about docker label and container-name changes.

For 1.2, at minimum, we need to document this under "Action required" in the release notes, and maybe recommend kubectl drain before upgrading each node.

cc @Random-Liu

@Random-Liu
Member

@bgrant0607 Thanks, will do it today.

@dchen1107
Member

Sorry for the breakage introduced by this. I did raise the concern when we first worked on #15089, but unfortunately docker doesn't support mutable container labels. We decided to move on with it because we thought upgrading the kubelet binary without kubectl drain has been unsupported since the 1.1 release, and our e2e tests never test this scenario. We didn't realize that customers build their own live-upgrade tools.

@Random-Liu
Member

I don't think the docker-label-related change has anything to do with the container killing; in fact, we currently only rely on the labels to get RestartCount and TerminationMessagePath.

Here is the log for one of the pods, hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70):

I0317 00:40:56.146042   14517 config.go:393] Receiving a new pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)"
I0317 00:40:56.271126   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f: non-existent -> running
I0317 00:40:56.271145   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/ec6bcafdf896317482884bdc04cc2a554e1d65d6435e941d4d4f8c73a2496e5a: non-existent -> exited
I0317 00:40:56.271161   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b: non-existent -> exited
I0317 00:40:56.271177   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626: non-existent -> running
I0317 00:40:56.309904   14517 manager.go:324] Container inspect result: {ID:4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f Created:2016-03-17 07:38:48.982536096 +0000 UTC Path:/hello-www Args:[-cert-file /certs/local/server.crt -key-file /certs/local/server.key] Config:0xc8204dae00 State:{Running:true Paused:false Restarting:false OOMKilled:false Pid:13560 ExitCode:0 Error: StartedAt:2016-03-17 07:38:49.64624299 +0000 UTC FinishedAt:0001-01-01 00:00:00 +0000 UTC} Image:sha256:8efe20f45ba398ca4ec15162d1137f7471a47bbdcf2f555660bca6c4bfbca3eb Node:<nil> NetworkSettings:0xc820419e00 SysInitPath: ResolvConfPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/resolv.conf HostnamePath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hostname HostsPath:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts LogPath:/var/lib/docker/containers/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f-json.log Name:/k8s_hello-www.5e13f7ac_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_5aa6d563 Driver:devicemapper Mounts:[{Name: Source:/box/var/log/metrics/hello-www Destination:/var/metrics Driver: Mode: RW:true} {Name: Source:/box/var/log/raw/hello-www Destination:/var/log/service Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/volumes/kubernetes.io~secret/default-token-7pgja Destination:/var/run/secrets/kubernetes.io/serviceaccount Driver: Mode:ro RW:false} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts Destination:/etc/hosts Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/containers/hello-www/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f Destination:/dev/termination-log Driver: Mode: RW:true}] Volumes:map[] VolumesRW:map[] HostConfig:0xc820760000 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0317 00:40:56.310608   14517 manager.go:324] Container inspect result: {ID:ec6bcafdf896317482884bdc04cc2a554e1d65d6435e941d4d4f8c73a2496e5a Created:2016-03-17 06:52:48.553561904 +0000 UTC Path:/hello-www Args:[-cert-file /certs/local/server.crt -key-file /certs/local/server.key] Config:0xc8204db180 State:{Running:false Paused:false Restarting:false OOMKilled:false Pid:0 ExitCode:2 Error: StartedAt:2016-03-17 06:52:49.90519467 +0000 UTC FinishedAt:2016-03-17 07:38:48.902221695 +0000 UTC} Image:sha256:8efe20f45ba398ca4ec15162d1137f7471a47bbdcf2f555660bca6c4bfbca3eb Node:<nil> NetworkSettings:0xc8200a8200 SysInitPath: ResolvConfPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/resolv.conf HostnamePath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hostname HostsPath:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts LogPath:/var/lib/docker/containers/ec6bcafdf896317482884bdc04cc2a554e1d65d6435e941d4d4f8c73a2496e5a/ec6bcafdf896317482884bdc04cc2a554e1d65d6435e941d4d4f8c73a2496e5a-json.log Name:/k8s_hello-www.b7cd18c3_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_726e5496 Driver:devicemapper Mounts:[{Name: Source:/box/var/log/raw/hello-www Destination:/var/log/service Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/volumes/kubernetes.io~secret/default-token-7pgja Destination:/var/run/secrets/kubernetes.io/serviceaccount Driver: Mode:ro RW:false} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts Destination:/etc/hosts Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/containers/hello-www/726e5496 Destination:/dev/termination-log Driver: Mode: RW:true} {Name: Source:/box/var/log/metrics/hello-www Destination:/var/metrics Driver: Mode: RW:true}] Volumes:map[] VolumesRW:map[] HostConfig:0xc8207602c0 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0317 00:40:56.311228   14517 manager.go:324] Container inspect result: {ID:d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b Created:2016-03-17 04:52:34.644130351 +0000 UTC Path:/hello-www Args:[-cert-file /certs/local/server.crt -key-file /certs/local/server.key] Config:0xc8204db500 State:{Running:false Paused:false Restarting:false OOMKilled:false Pid:0 ExitCode:2 Error: StartedAt:2016-03-17 04:52:35.496405052 +0000 UTC FinishedAt:2016-03-17 06:52:48.098149192 +0000 UTC} Image:sha256:8efe20f45ba398ca4ec15162d1137f7471a47bbdcf2f555660bca6c4bfbca3eb Node:<nil> NetworkSettings:0xc8200a8500 SysInitPath: ResolvConfPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/resolv.conf HostnamePath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hostname HostsPath:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts LogPath:/var/lib/docker/containers/d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b/d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b-json.log Name:/k8s_hello-www.5e13f7ac_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_63a831f5 Driver:devicemapper Mounts:[{Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/containers/hello-www/d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b Destination:/dev/termination-log Driver: Mode: RW:true} {Name: Source:/box/var/log/metrics/hello-www Destination:/var/metrics Driver: Mode: RW:true} {Name: Source:/box/var/log/raw/hello-www Destination:/var/log/service Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/volumes/kubernetes.io~secret/default-token-7pgja Destination:/var/run/secrets/kubernetes.io/serviceaccount Driver: Mode:ro RW:false} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts Destination:/etc/hosts Driver: Mode: RW:true}] Volumes:map[] VolumesRW:map[] HostConfig:0xc820760580 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0317 00:40:56.311864   14517 manager.go:324] Container inspect result: {ID:40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626 Created:2016-03-17 01:58:20.143609537 +0000 UTC Path:/pause Args:[] Config:0xc8204dba40 State:{Running:true Paused:false Restarting:false OOMKilled:false Pid:17057 ExitCode:0 Error: StartedAt:2016-03-17 01:58:20.541583289 +0000 UTC FinishedAt:0001-01-01 00:00:00 +0000 UTC} Image:sha256:b22161b02377eca14b84e9542259a15baa268043116197ecec8dd8c373f7b0ac Node:<nil> NetworkSettings:0xc8200a8600 SysInitPath: ResolvConfPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/resolv.conf HostnamePath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hostname HostsPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hosts LogPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626-json.log Name:/k8s_POD.c4072b55_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_956170d6 Driver:devicemapper Mounts:[] Volumes:map[] VolumesRW:map[] HostConfig:0xc820760840 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0317 00:40:56.311929   14517 generic.go:299] PLEG: Write status for hello-www-weqj8/default: &{ID:6fbdc794-ebe3-11e5-9bd3-005056894f70 Name:hello-www-weqj8 Namespace:default IP:10.20.5.131 ContainerStatuses:[0xc8204b7420 0xc8204b7880 0xc8200ac0e0 0xc8200ac460]} (err: <nil>)
I0317 00:40:59.181815   14517 manager.go:802] Added container: "/docker/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626" (aliases: [k8s_POD.c4072b55_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_956170d6 40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626], namespace: "docker")
I0317 00:41:00.093877   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/ec6bcafdf896317482884bdc04cc2a554e1d65d6435e941d4d4f8c73a2496e5a: exited -> non-existent
I0317 00:41:00.093894   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b: exited -> non-existent
I0317 00:41:00.560830   14517 manager.go:324] Container inspect result: {ID:4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f Created:2016-03-17 07:38:48.982536096 +0000 UTC Path:/hello-www Args:[-cert-file /certs/local/server.crt -key-file /certs/local/server.key] Config:0xc820392540 State:{Running:true Paused:false Restarting:false OOMKilled:false Pid:13560 ExitCode:0 Error: StartedAt:2016-03-17 07:38:49.64624299 +0000 UTC FinishedAt:0001-01-01 00:00:00 +0000 UTC} Image:sha256:8efe20f45ba398ca4ec15162d1137f7471a47bbdcf2f555660bca6c4bfbca3eb Node:<nil> NetworkSettings:0xc820232700 SysInitPath: ResolvConfPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/resolv.conf HostnamePath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hostname HostsPath:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts LogPath:/var/lib/docker/containers/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f-json.log Name:/k8s_hello-www.5e13f7ac_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_5aa6d563 Driver:devicemapper Mounts:[{Name: Source:/box/var/log/metrics/hello-www Destination:/var/metrics Driver: Mode: RW:true} {Name: Source:/box/var/log/raw/hello-www Destination:/var/log/service Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/volumes/kubernetes.io~secret/default-token-7pgja Destination:/var/run/secrets/kubernetes.io/serviceaccount Driver: Mode:ro RW:false} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/etc-hosts Destination:/etc/hosts Driver: Mode: RW:true} {Name: Source:/var/lib/kubelet/pods/6fbdc794-ebe3-11e5-9bd3-005056894f70/containers/hello-www/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f Destination:/dev/termination-log Driver: Mode: RW:true}] Volumes:map[] VolumesRW:map[] HostConfig:0xc8203c8000 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0317 00:41:00.561804   14517 manager.go:324] Container inspect result: {ID:40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626 Created:2016-03-17 01:58:20.143609537 +0000 UTC Path:/pause Args:[] Config:0xc8204da700 State:{Running:true Paused:false Restarting:false OOMKilled:false Pid:17057 ExitCode:0 Error: StartedAt:2016-03-17 01:58:20.541583289 +0000 UTC FinishedAt:0001-01-01 00:00:00 +0000 UTC} Image:sha256:b22161b02377eca14b84e9542259a15baa268043116197ecec8dd8c373f7b0ac Node:<nil> NetworkSettings:0xc820419a00 SysInitPath: ResolvConfPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/resolv.conf HostnamePath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hostname HostsPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/hosts LogPath:/var/lib/docker/containers/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626/40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626-json.log Name:/k8s_POD.c4072b55_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_956170d6 Driver:devicemapper Mounts:[] Volumes:map[] VolumesRW:map[] HostConfig:0xc820556b00 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0317 00:41:00.561888   14517 generic.go:299] PLEG: Write status for hello-www-weqj8/default: &{ID:6fbdc794-ebe3-11e5-9bd3-005056894f70 Name:hello-www-weqj8 Namespace:default IP:10.20.5.131 ContainerStatuses:[0xc8204b62a0 0xc8207a6540]} (err: <nil>)
I0317 00:41:02.422951   14517 manager.go:802] Added container: "/docker/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f" (aliases: [k8s_hello-www.5e13f7ac_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_5aa6d563 4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f], namespace: "docker")
I0317 00:41:06.269721   14517 volumes.go:234] Making a volume.Cleaner for volume kubernetes.io~secret/default-token-7pgja of pod 6fbdc794-ebe3-11e5-9bd3-005056894f70
I0317 00:41:06.270010   14517 kubelet.go:1974] volume "6fbdc794-ebe3-11e5-9bd3-005056894f70/default-token-7pgja", still has a container running "6fbdc794-ebe3-11e5-9bd3-005056894f70", skipping teardown
I0317 00:41:06.270149   14517 kubelet.go:2391] SyncLoop (UPDATE, "api"): "kubernetes-kubelet-prometheus-collector-sandbox-compute-node01.dev.box.net_default(5018811e-ec13-11e5-9bd3-005056894f70), hello-www-wh1xp_default(6fbdc6f1-ebe3-11e5-9bd3-005056894f70), nginx-r1ywb_default(dd6a5996-ec11-11e5-9bd3-005056894f70), hello-www-iz7od_default(6fbe6659-ebe3-11e5-9bd3-005056894f70), octoproxy-8n13f_default(01e665c4-ebe9-11e5-9bd3-005056894f70), nginx-2-59scr_default(bde29b54-ec0b-11e5-9bd3-005056894f70), hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)"
I0317 00:41:06.272904   14517 kubelet.go:3238] Generating status for "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)"
I0317 00:41:06.273385   14517 manager.go:1679] Found pod infra container for "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)"
I0317 00:41:06.273420   14517 manager.go:1692] Pod infra container looks good, keep it "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)"
I0317 00:41:06.273457   14517 manager.go:1716] pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)" container "hello-www" exists as 4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f
I0317 00:41:06.273484   14517 manager.go:1736] pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)" container "hello-www" hash changed (1578366892 vs 3083671747), it will be killed and re-created.
I0317 00:41:06.273494   14517 manager.go:1784] Got container changes for pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)": {StartInfraContainer:false InfraChanged:false InfraContainerId:40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626 ContainersToStart:map[0:pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)" container "hello-www" hash changed (1578366892 vs 3083671747), it will be killed and re-created.] ContainersToKeep:map[40c80cadd4678457ea579e41afd9440d7828c09c0c562240175f6789ad051626:-1]}
I0317 00:41:06.273513   14517 manager.go:1813] Killing unwanted container "hello-www"(id={"docker" "4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f"}) for pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)"
I0317 00:41:06.311979   14517 manager.go:855] Destroyed container: "/docker/4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f" (aliases: [k8s_hello-www.5e13f7ac_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_5aa6d563 4f135abee653c45663d245a568bc5a368b569d5fe172f92c509160d8271f022f], namespace: "docker")
I0317 00:41:06.402666   14517 manager.go:1921] Creating container &{Name:hello-www Image:docker-registry-vip.dev.box.net/jenkins/hello-www:0.0.0-4.2016_02_16_17_01_12.b98c8 Command:[/hello-www -cert-file /certs/local/server.crt -key-file /certs/local/server.key] Args:[] WorkingDir: Ports:[{Name: HostPort:0 ContainerPort:80 Protocol:TCP HostIP:} {Name: HostPort:0 ContainerPort:443 Protocol:TCP HostIP:}] Env:[{Name:POD_NAME Value: ValueFrom:0xc82046f020}] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:metrics-logs ReadOnly:false MountPath:/var/metrics} {Name:service-logs ReadOnly:false MountPath:/var/log/service} {Name:default-token-7pgja ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount}] LivenessProbe:<nil> ReadinessProbe:<nil> Lifecycle:<nil> TerminationMessagePath:/dev/termination-log ImagePullPolicy:IfNotPresent SecurityContext:<nil> Stdin:false StdinOnce:false TTY:false} in pod hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)

Looks like the pod contains 4 containers.
2 of them are dead at the start of the log:

I0317 00:40:56.271145   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/ec6bcafdf896317482884bdc04cc2a554e1d65d6435e941d4d4f8c73a2496e5a: non-existent -> exited
I0317 00:40:56.271161   14517 generic.go:138] GenericPLEG: 6fbdc794-ebe3-11e5-9bd3-005056894f70/d651c655bb50ee11d3d2ddce44bab22d1bb7b5a2984f43602e53abb0b41a471b: non-existent -> exited

One of them is killed because the container hash has changed:

I0317 00:41:06.273484   14517 manager.go:1736] pod "hello-www-weqj8_default(6fbdc794-ebe3-11e5-9bd3-005056894f70)" container "hello-www" hash changed (1578366892 vs 3083671747), it will be killed and re-created.

Why has the container hash changed?

@yujuhong
Contributor

Clarification: We still get most of the information (UID, hash, name, etc) from the docker name itself, specifically to handle backwards compatibility.

EDIT: strike the following statement, which is untrue.
However, I think one breaking change is #20615 (for the future docker v1.10), which changed the docker name.
I am still looking at the logs to see if there are any other issues.
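
For readers following along, the metadata in question is visible in the inspect output above: the kubelet encodes it in the docker container name, in the form k8s_<container>.<hash>_<pod>_<namespace>_<podUID>_<suffix>. Below is a minimal Go sketch of parsing that layout; the field order is inferred from the names in the log, not copied from the kubelet's dockertools package.

package main

import (
	"fmt"
	"strings"
)

// kubeletContainerName holds the fields the kubelet encodes in a docker
// container name. The layout (k8s_<container>.<hash>_<pod>_<namespace>_<podUID>_<suffix>)
// is inferred from the inspect output above.
type kubeletContainerName struct {
	Container string
	Hash      string
	Pod       string
	Namespace string
	PodUID    string
}

func parseDockerName(name string) (*kubeletContainerName, error) {
	// Docker reports names with a leading "/".
	parts := strings.Split(strings.TrimPrefix(name, "/"), "_")
	if len(parts) != 6 || parts[0] != "k8s" {
		return nil, fmt.Errorf("not a kubelet-managed container name: %q", name)
	}
	nameAndHash := strings.SplitN(parts[1], ".", 2)
	parsed := &kubeletContainerName{
		Container: nameAndHash[0],
		Pod:       parts[2],
		Namespace: parts[3],
		PodUID:    parts[4],
	}
	if len(nameAndHash) == 2 {
		parsed.Hash = nameAndHash[1]
	}
	return parsed, nil
}

func main() {
	n, err := parseDockerName("/k8s_hello-www.5e13f7ac_hello-www-weqj8_default_6fbdc794-ebe3-11e5-9bd3-005056894f70_5aa6d563")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", n) // container "hello-www", hash "5e13f7ac", pod UID "6fbdc794-..."
}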

@dchen1107
Member

Then it could indeed be a new defaulting function that changes the PodSpec / ContainerSpec from 1.1 to 1.2, causing the kubelet to kill the containers.

@Random-Liu
Member

@ghodss What is the previous version you were using before upgrading?

@dchen1107
Member

It is kubelet 1.1.3, per the issue's title.

@Random-Liu
Member

@dchen1107 Silly me... Saw that... I tried to find it in the issue content... :)

@yujuhong
Contributor

I uploaded PR #23141 to fix one particular problem: kubelet starts the pod cleanup routine prematurely and kills desired pods.

The remaining problem is that the UID/container hash of the static pods seems to have changed after the upgrade, as @dchen1107 said in #23104 (comment).

@ghodss
Contributor Author

ghodss commented Mar 17, 2016

I too am inclined to believe it's not the labels issue. I recall seeing similar statements about the labels in the logs before and after I actually did the upgrade, but unfortunately I don't still have those logs.

Does kubelet do defaulting? I thought that was in the apiserver.

@yujuhong
Contributor

Does kubelet do defaulting? I thought that was in the apiserver.

We use the same library to decode and validate a pod from non-apiserver sources.

func tryDecodeSinglePod(data []byte, defaultFn defaultFunc) (parsed bool, pod *api.Pod, err error) {

I too am inclined to believe it's not the labels issue. I recall seeing similar statements about the labels in the logs before and after I actually did the upgrade, but unfortunately I don't still have those logs.

Yeah, the label-related messages are loud but harmless. We plan to switch to relying on labels (and stop polluting the docker names) in the future. We will also be able to filter pods by labels, etc.

@yujuhong
Contributor

More background: For pods from non-apiserver sources, kubelet hashes the entire pod object and generates a UID for it. Any change in the pod spec leads to a new UID, and the pod is considered a different pod. This is necessary because kubelet doesn't know how to validate updates to pods, so treating the result as a new pod is simpler and safer.
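
To illustrate the mechanism, here is a minimal sketch with illustrative types (not the kubelet's actual hashing code): the identity is a hash over the whole decoded object, similar in spirit to the kubernetes.io/config.hash annotation quoted later in the thread, so any field that changes, whether edited by the user or filled in by a new defaulting function, yields a new UID.

package main

import (
	"fmt"
	"hash/fnv"
)

// staticPod is an illustrative stand-in for the decoded api.Pod.
type staticPod struct {
	Name      string
	Namespace string
	Image     string
	Command   []string
}

// configHash deep-hashes the decoded object; the kubelet derives the UID of a
// non-apiserver pod from a hash like this.
func configHash(p staticPod) string {
	h := fnv.New32a()
	fmt.Fprintf(h, "%#v", p)
	return fmt.Sprintf("%08x", h.Sum32())
}

func main() {
	p := staticPod{Name: "hello-www", Namespace: "default", Image: "hello-www:0.0.0", Command: []string{"/hello-www"}}
	fmt.Println(configHash(p))

	// Any change to the decoded object, including one introduced purely by a
	// new default in a newer kubelet, produces a different hash and hence a
	// "new" pod that replaces the old one.
	p.Command = append(p.Command, "-v")
	fmt.Println(configHash(p))
}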

@yujuhong
Contributor

Forgot to paste the difference between a v1.1.3 and a v1.2 pod:

5,7c5,7
<     kubernetes.io/config.hash: 6c5833c13a26d9feec183f5c07e2616c
<     kubernetes.io/config.mirror: 6c5833c13a26d9feec183f5c07e2616c
<     kubernetes.io/config.seen: 2016-03-17T20:02:13.569465999Z
---
>     kubernetes.io/config.hash: c4b327659415ab12151d07426a992997
>     kubernetes.io/config.mirror: c4b327659415ab12151d07426a992997
>     kubernetes.io/config.seen: 2016-03-17T20:42:17.167752241Z
9c9
<   creationTimestamp: 2016-03-17T20:03:24Z
---
>   creationTimestamp: 2016-03-17T20:43:06Z
12c12
<   resourceVersion: "281317"
---
>   resourceVersion: "282589"
14c14
<   uid: 4e1a1dae-ec7b-11e5-8837-42010af00002
---
>   uid: d9ef3d90-ec80-11e5-bc73-42010af00002
36a37
>   securityContext: {}
48c49
<     lastTransitionTime: null
---
>     lastTransitionTime: 2016-03-17T20:42:53Z
52c53
<   - containerID: docker://892363f197fcc40827f575a66628c933a7f65c8753465d0ec626fe492b238d31
---
>   - containerID: docker://86bb9f34d1b6c00213134df62b8dde5741e75466bb9b9e0941151142b403c209
61c62
<         startedAt: 2016-03-17T20:03:14Z
---
>         startedAt: 2016-03-17T20:42:53Z
64c65
<   podIP: 10.245.2.4
---
>   podIP: 10.245.2.5

The non-nil securityContext seems to be the problem. The rest (status and annotations) is not relevant.
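
To make the diagnosis concrete, here is a minimal sketch (illustrative types, not the real api.Container or the kubelet's hash helper) of how a field that defaulting flips from nil to an empty struct alters the container hash even though the manifest never changed; this is the "hash changed (1578366892 vs 3083671747)" situation in the log above.

package main

import (
	"fmt"
	"hash/fnv"
)

// securityContext and container are illustrative stand-ins for the API types.
type securityContext struct{}

type container struct {
	Name            string
	Image           string
	SecurityContext *securityContext
}

// containerHash formats the container deterministically and hashes it,
// roughly what the kubelet does to detect spec changes.
func containerHash(c container) uint32 {
	h := fnv.New32a()
	sc := "nil"
	if c.SecurityContext != nil {
		sc = fmt.Sprintf("%#v", *c.SecurityContext)
	}
	fmt.Fprintf(h, "%s|%s|%s", c.Name, c.Image, sc)
	return h.Sum32()
}

func main() {
	before := container{Name: "hello-www", Image: "hello-www:0.0.0"}                                     // 1.1: SecurityContext left nil
	after := container{Name: "hello-www", Image: "hello-www:0.0.0", SecurityContext: &securityContext{}} // 1.2: defaulted to {}

	// The hashes differ, so the kubelet concludes the container spec changed
	// and kills/recreates it, exactly as in the log above.
	fmt.Println(containerHash(before), containerHash(after))
}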

@ghodss
Contributor Author

ghodss commented Mar 18, 2016

  1. Have we added something to the release notes saying that you should expect your pods to bounce if you upgrade the kubelet?
  2. Do we have ideas on how to prevent this in 1.3? We can handle a full cluster bounce today, but I'd really rather not have to deal with it in future upgrades.

@yujuhong
Contributor

Have we added something to the release notes saying that you should expect your pods to bounce if you upgrade the kubelet?

This applies only for the static pods (manifest files), but yes, we'll add it to the release note.

Do we have ideas on how to prevent this in 1.3? We can handle a full cluster bounce today, but I'd really rather not have to deal with it in future upgrades.

We've discussed some options, e.g., hash the versioned object before converting to the internal objects. I am not sure there are no corner cases, and we'll need to think about it some more.
/cc @dchen1107 @lavalamp @bgrant0607

@lavalamp
Member

  1. We should figure out something to test here and start testing it so that we don't do this again. (Maybe that static pods don't get bounced on kubelet upgrade in one of the upgrade tests?)
  2. I think hashing the versioned object instead of the internal object (i.e., before defaulting and conversion functions) is the best way to go.
  3. Hashing the bytes of the file is my second choice.
  4. Either way, when we make this stable, there'll be one more bounce event unless we go to pains to prevent it.
  5. If necessary, we could make a patch to fix this that undoes the new default before hashing. But applying that would probably bounce @ghodss's pods again, too, at this point.
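
For illustration only (assumed file path and helper names, not a concrete proposal): option 3 above amounts to deriving a static pod's identity from the manifest bytes on disk, so the identity changes only when the file itself changes, never when a newer kubelet's decode/convert/default pipeline fills in extra fields.

package main

import (
	"crypto/sha256"
	"fmt"
	"os"
)

// staticPodID derives an identity from the raw manifest bytes (option 3).
// Because no decoding, conversion, or defaulting happens before hashing, a
// kubelet upgrade that introduces a new default cannot change the result.
func staticPodID(manifestPath string) (string, error) {
	data, err := os.ReadFile(manifestPath)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return fmt.Sprintf("%x", sum[:8]), nil
}

func main() {
	// Hypothetical manifest path, for illustration only.
	id, err := staticPodID("/etc/kubernetes/manifests/hello-www.yaml")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("static pod id:", id)
}

The trade-off raised in the next comment still applies to either option 2 or option 3: a change in defaults alone would then no longer restart the pod.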

@yujuhong
Contributor

I think hashing the versioned object instead of the internal object (i.e., before defaulting and conversion functions) is the best way to go.
Hashing the bytes of the file is my second choice.

Both options indicate that we won't restart the pod when defaults change. I am not sure that's a desired behavior.

@lavalamp
Member

Right. That's something for @dchen1107 to decide.

@lavalamp
Member

But if you do want to restart on default change, then this is WAI...

@bgrant0607
Member

Defaults are also represented in the serialized, versioned form.

@ghodss
Contributor Author

ghodss commented Mar 18, 2016

This applies only for the static pods (manifest files), but yes, we'll add it to the release note.

I don't think this is right. In my example above, the pods come from apiserver, not static manifest files.

@yujuhong
Contributor

I don't think this is right. In my example above, the pods come from apiserver, not static manifest files.

That was mainly because of a race condition in kubelet, and it should be fixed by #23141.
Also, if the pods were drained from the node before the upgrade (which is recommended), non-static pods will be fine.

@yujuhong
Contributor

I don't think this is right. In my example above, the pods come from apiserver, not static manifest files.

My bad. @dchen1107 pointed out that since the SecurityContext is per container, the container hash also changes for pods from the apiserver. The containers in regular pods will then be recreated.
However, draining the pods from the node will help in this case.

@dchen1107
Member

I sent #23227 to update the release notes to include this as a known issue for the 1.2 release. We are going to continue discussing how to handle this. So far, none of the solutions proposed above resolves the current problem caused by the defaulting without draining.

But it is terribly bad that we didn't catch this issue, due to a lack of understanding of our users. Our assumption that no one does live upgrades because kube-push has been broken since the 1.1 release was totally wrong.

@yujuhong reopened this on Mar 25, 2016
@yujuhong
Contributor

In general though, given the delta in work required, I would wholeheartedly request a generic solution to this problem that prevented pod restarts on upgrade.

IMO, we should first establish the best practices for upgrading a cluster, so that we can test against them :-)

@ghodss
Contributor Author

ghodss commented Jun 28, 2016

@lavalamp @bgrant0607 Do you know if upgrading kubelet from 1.2.0 to 1.3.0 will bounce the containers?

@lavalamp
Member

@davidopp did our manual upgrade testing include looking for this?

@davidopp
Member

Yes, upgrading kubelet from 1.2 to 1.3 restarts the containers on the node. I thought this was WAI? We can look into it if that's not supposed to happen.

@ghodss
Contributor Author

ghodss commented Jun 29, 2016

It would be much preferable if it didn't restart the pods. That would make upgrade/downgrade maintenance much easier to schedule, perform, and roll back, especially in bare-metal environments.

@piosz
Member

piosz commented Jun 29, 2016

If you are using the cluster/gce/upgrade.sh script, this is expected behavior. We use the MIG rolling-update feature, which basically recreates the VMs using the new startup script. kube-push was supposed to do in-place upgrades, but it doesn't work.

@davidopp
Member

Yeah, from the first entry in the issue it looks like Sam was doing a manual version of kube-push. It would be great if we could test that scenario.

@yujuhong added the sig/cluster-lifecycle label on Jun 29, 2016
@yujuhong
Contributor

Yeah, from the first entry in the issue it looks like Sam was doing a manual version of kube-push. It would be great if we could test that scenario.

I agree that we should test all supported upgrade processes, but where can I find them in the documentation?

If in-place upgrade is supported officially, we should prioritize this issue.

@davidopp
Member

I think there's a spectrum--we don't have to say we "officially support" in-place upgrade (and definitely nobody is recommending to make kube-up.sh work in time for the 1.3 release), but it seems like it would be nice if we can ensure that upgrading kubelet doesn't cause the containers to restart.

OTOH, my understanding is that we qualify only one specific Docker version with each Kubelet version, so I guess we're expecting people to upgrade Kubelet and Docker together. Since a Docker restart kills all your containers (or does the latest version make this not happen anymore?), maybe this isn't something worth worrying about until a Docker upgrade can be done without killing containers.

@ghodss
Contributor Author

ghodss commented Jun 29, 2016

We haven't seen any specific issues reported with Kube 1.2 and Docker 1.10, so we've already upgraded our prod clusters to Docker 1.10 (which was a pain because of the killing of containers). Also, I would imagine that in the future, if a new version of Docker breaks kube, we'd hopefully do a point release so that kube is forward compatible with newer Docker versions.

And yes, Docker is moving towards not having to restart containers on upgrades.

@ghodss
Contributor Author

ghodss commented Jun 29, 2016

I think there's a spectrum--we don't have to say we "officially support" in-place upgrade (and definitely nobody is recommending to make kube-up.sh work in time for the 1.3 release), but it seems like it would be nice if we can ensure that upgrading kubelet doesn't cause the containers to restart.

That's all I'm asking for.

@yujuhong
Contributor

it would be nice if we can ensure that upgrading kubelet doesn't cause the containers to restart.

My point is that we already know there are issues (e.g., API changes) that may cause kubelet to restart the containers. AFAIK, nothing has changed since then. Unless this is prioritized, it's unlikely these issues will get resolved on their own. That's why "nice-to-have" may not mean much...

OTOH, my understanding is that we qualify only one specific Docker version with each Kubelet version, so I guess we're expecting people to upgrade Kubelet and Docker together. Since a Docker restart kills all your containers (or does the latest version make this not happen anymore?), maybe this isn't something worth worrying about until a Docker upgrade can be done without killing containers.

We only test one docker version exhaustively, but generally each release is compatible with 2-3 docker versions. The 1.3 release should be compatible with docker v1.9 to v1.11. Red Hat also has their own recommended docker version for their environment.

But yes, if the user has to upgrade docker, all containers will get restarted.

@yujuhong
Contributor

FWIW, I did a manual upgrade from v1.2 to v1.3 and did not notice any containers getting restarted.

  1. Create a 1.2 cluster
  2. Create some pause pods
  3. Run ./cluster/gce/upgrade.sh to upgrade the master node (i.e., creating a new master instance).
  4. Swap in the kubelet binary built from HEAD on the rest of the nodes.

Most of the default cluster addon pods got recreated to upgrade to newer versions.

@nkwangleiGIT
Contributor

Seeing the same issue when upgrading kubelet from v1.0.1 to v1.1.4: all containers are recreated after restarting kubelet. Is there any workaround we can use to prevent container recreation?
After all, it's not expected to see containers recreated (data in the container itself is all lost) after upgrading kubelet, thanks!

@yujuhong
Contributor

@nkwangleiGIT it's probably best for you to drain all the pods from the node before upgrading kubelet, to avoid any unnecessary container restarts. Upgrading kubelet in place isn't officially recommended. If data in your containers needs to be preserved across restarts, it would be better off stored in a persistent volume.

@nkwangleiGIT
Contributor

@yujuhong ok, I see, so we need to drain all pods from the node before an in-place kubelet upgrade.

BTW, is this the long-term design, or do we plan to make it better? I'd expect us to support in-place kubelet upgrades without recreating containers, thanks.

@davidopp
Member

The issue for fixing in-place upgrade is #17397 (the title may be a little misleading, as we may do it differently than kube-push.sh). IMO "not restarting containers" is something we should consider as a possible requirement, but until we know exactly how we will address the issue, it's probably too early to commit to it.

@fejta-bot

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 16, 2017
@yujuhong
Contributor

yujuhong commented Jan 3, 2018

Forgot to post a reason before closing the issue.

The container hash change needs to be addressed if in-place upgrade is to guarantee no container restarts. I think we can fold this into the upgrade issues (#6099).
