"kubecfg update pods" returns an error #1712
Comments
I think your interpretation is correct.
We had deferred updating a pod until a later date, because there are basically two different kinds of updates. There are updates that require that the container be restarted (updating the image, for example), and there are updates that you can do in place (increasing the amount of CPU). Right now, every update requires a pod restart, and if you are going to do that, then you may as well just create a new pod and then tear down the other pod. That said, there's no reason not to just do the update in place and let the kubelet restart the pod, so I'll file a PR to do that; the kubelet already does hashing and will do the appropriate thing.
As far as I understand it, the problem right now is that if I tear down a pod to make an update, I'm going to lose the "emptyDir" volume, which, if you look at my pod example, is currently part of the pod. In my case I want to keep the same volume and just upgrade one container; I don't want to lose the volume. Also, what if I have a pod with a few other containers but I only really want to update one of them? By tearing down the pod, aren't I going to lose all those containers? If that's the case, we kind of lose the usefulness of Docker's modularity, since it becomes an all-or-nothing situation. I expected pods to be more of a loose grouping or encapsulation of containers for the purposes of management and service exposure, not so tightly connected that they become one new entity where operations are always performed on all containers.
Yes, you're definitely correct. It's implemented this way on the kubelet; we just need to get the right data into the right places... I think the PR will be done this evening. --brendan
Oh ok, that's good to hear then :) Thanks for confirming that.
#1713 has the fix.
Are we going to support in-place updates and defer non-in-place ones? Also, increasing a container's CPU limit cannot always be treated as an in-place update. For example, if the scheduler has already overcommitted a node on CPU and the user wants to increase one container's CPU on that node, that might result in the pod being rescheduled.
I'm deferring in-place updates. Right now the kubelet hashes the container config, and if it changes, it restarts the container. So it restarts on any change. We'll use that behavior for now.
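[Editor's note: a minimal sketch of that restart-on-change behavior, assuming the kubelet keeps a hash of the serialized container spec for each running container. The types and field names here are illustrative, not the actual kubelet code.]

package main

import (
	"fmt"
	"hash/fnv"
)

// Container is a simplified stand-in for the API container spec; the real
// type has many more fields, all of which feed the hash.
type Container struct {
	Name   string
	Image  string
	CPU    int
	Memory int
}

// hashContainer hashes the full container config; any field change
// (image, command, limits, ...) yields a different hash.
func hashContainer(c Container) uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "%#v", c)
	return h.Sum64()
}

// needsRestart reports whether the recorded hash of the running container
// no longer matches the desired spec, i.e. the kubelet should kill and
// recreate it.
func needsRestart(runningHash uint64, desired Container) bool {
	return runningHash != hashContainer(desired)
}

func main() {
	running := Container{Name: "framework", Image: "dev-framework:v0.1.301", CPU: 1, Memory: 256}
	desired := Container{Name: "framework", Image: "dev-framework:v0.1.302", CPU: 1, Memory: 256}
	fmt.Println(needsRestart(hashContainer(running), desired)) // true: image changed, restart
}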
IMO: Make a new pod, delete the old one. If you need zero-latency restarts, you do that by starting up replicationControllers and turning the old one down while you turn the new one up (we'll automate this in the future). With the possible exception of metadata, pods should be immutable after they're scheduled.
The issue with losing one's emptyDir means that you're using it wrong. :) EmptyDirs should only be used for transient data. We need @erictune's durable data (or @brendandburns' PD) for the use case where you want your data to outlive your pod.
@lavalamp Really good point on the author's initial use case. In this case, no pod update is allowed at all. We should either remove that operation completely or implement it as we suggested (make a new pod and delete the old one), and also document it properly.
Clarification: the emptyDir volume actually applies to the Pod, not to an individual Container. In this case it works the same way as durable data or a PD. @brendandburns I am not sure why the Container is restarted in your test case. Supporting in-place update is a separate issue.
Simple pod updates should be simple - not supporting it is going out of our ... In-place (non-murderous) updates are a problem for later. Updating a pod and changing, e.g., a container image or command line will ...
@thockin Are you really sure changing the image is simple? Auto-scaling tools may have entirely different ideas about the resource requirements of the new image. Really, "update" is an action performed by a config system, built on top of the primitives we provide. I'm arguing that "update" is not a primitive operation we should provide.
There's a certain beauty to immutability, or at least controlled ... As a thought exercise, let's ban UPDATE entirely as a verb. We can add new ... Changing a container image or commandline flag means creating a new pod and ... At first blush, maybe it can work... But in order for it to work, we ... I would be willing to entertain the idea, but it needs to be fleshed out a ...
Offline discussion w/ @thockin: there are three things that "update" could mean.
We don't provide 3 yet, and 1 is important only at the kubelet level -- it's a subset of 2, which is what the apiserver needs to check and enforce on an update request. My opposition here is that I understood 3 when perhaps people are OK with 2. I think we need to clearly list the items that are in each category. 2 includes some things only knowable at runtime -- does this pod still fit where it is? If the answer to that is no, we should reject the update request, and it'd be up to a higher layer (the config system) to perform an intelligent update.
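[Editor's note: a toy sketch of that category-2 check ("does this pod still fit where it is?"), comparing only declared limits against node capacity. The names and the simple capacity model are assumptions, not the real scheduler logic.]

package main

import "fmt"

// fitsOnNode checks whether raising one pod's memory limit by delta still
// fits within the node's capacity, given the limits already reserved by
// every pod on that node. It looks only at limits, not at actual usage.
func fitsOnNode(nodeCapacity, reservedByAllPods, delta int64) bool {
	return reservedByAllPods+delta <= nodeCapacity
}

func main() {
	// Node with 4 GiB capacity; pods currently reserve 3.5 GiB in limits.
	capacity := int64(4 << 30)
	reserved := int64(3584 << 20)
	// Another 256 MiB fits; another 1 GiB does not, so that update
	// request would be rejected and left to a higher layer to handle.
	fmt.Println(fitsOnNode(capacity, reserved, 256<<20)) // true
	fmt.Println(fitsOnNode(capacity, reserved, 1<<30))   // false
}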
According to 2, it is more complicated than that. Say the user wants to increase the memory limit; the apiserver does its feasibility check, it passes, and the update request is sent to the node. But the node is actually overcommitted on memory due to a sudden increase in root usage. What should the node do? Accept the update, which might result in another container being OOM-killed? Reject the update at the node level by doing another feasibility check? Or live with the current kubelet implementation, killing the updated container since its config changed and restarting a new one with the new limit, which might still end in an overcommit situation? Maybe just wait for an upper layer to correct it later?
I think you do it and deal with the repercussions after the fact.
I described three choices above -- which one do you prefer? Your answer decides who has to deal with the repercussions in this situation. If the kubelet accepts such an update, another container in another pod, owned by another user, might be OOM-killed because of it. If the kubelet rejects the update at the node level by doing its own feasibility check, then the kubelet needs to be more intelligent, which diverges from our current design, and we might also end up with a cascading rescheduling problem. If the kubelet lives with the new implementation and waits for an upper layer to correct its decision, you might still end up killing another container during the race window anyway. Why not allow only a very limited set of in-place updates, document them explicitly, and combine 2 and 3 at the client API level so that users can never game the system? For autopilot / auto-resizing, we could provide a separate power-user API.
What attracted me to Docker is the ability to split up an application into independent modules and to update and swap out pieces without affecting the rest of the application. I think that if you make pods more and more tightly glued together, you are moving away from one of the main advantages of Docker; you end up turning pods into just another type of container (in that case, why even bother with pods? I'll just build my app all in one container). I don't know where you ultimately end up taking Kubernetes, but I think you should keep it flexible and capable of supporting various architecture designs, not just the one that's easiest to implement. Here is my must-have Kubernetes list:
I understand that some of these things might not be easy to implement, but I think they are essential to keeping Kubernetes versatile and actually ahead of the other tools.
Hey Frank, my PR handles the first case. We also recently merged in support for PD volumes on GCE (network storage), which handles both your second and third points. Of course, we also want to support network storage on platforms other than GCE, but someone needs to go and do that implementation. Hopefully I'll get my PR in today for the update behavior.
Thanks @brendanburns. Actually, having some kind of durable data that is tied to a group of loosely connected containers is essential for the project I am working on. If Kubernetes can't provide that, then it will not work for my case. I was originally going to code my own custom Docker orchestration tool until I found out about Kubernetes and its capabilities. I've looked at a bunch of other tools, but Kubernetes seemed the most promising. Hopefully it stays that way :)
@brendandburns I am using the new version after the #1713 merge, and now when I try to update the pod I get an "invalid value" error:
The only property that is different in the update is the 2nd container's "Image" value. Any idea? Here is the full JSON pod example:

{
"id": "app-instance-dev",
"kind": "Pod",
"apiVersion": "v1beta1",
"desiredState": {
"manifest": {
"version": "v1beta1",
"id": "app-instance-dev",
"containers": [{
"name": "framework",
"image": "registry.domain/user/dev-framework:v0.1.301",
"ports": [{
"containerPort": 80,
"hostPort": 80
}],
"volumeMounts": [{"name": "data", "mountPath":"/data"}],
"imagePullPolicy": "PullIfNotPresent",
"lifecycle" : {
"postStart" : {
"exec" : {
"command" : ["/opt/conf/poststart.sh"]
}
}
}
},
{
"name": "editor",
"image": "registry.domain/user/dev-editor:v0.1.101",
"ports": [{
"containerPort": 3131,
"hostPort": 3131
}],
"volumeMounts": [{"name": "data", "mountPath":"/data"}],
"imagePullPolicy": "PullIfNotPresent"
}],
"volumes": [{
"name": "data",
"source": {"emptyDir": {}}
}]
}
},
"labels": {
"name": "app-instance"
}
}
@brendandburns whitelist not behaving as expected?
I spent a little time debugging this error, and so far I think the issue is with the pod update validation:
// ValidatePodUpdate tests to see if the update is legal
func ValidatePodUpdate(newPod, oldPod *api.Pod) errs.ErrorList {
allErrs := errs.ErrorList{}
newpodCount := len(newPod.DesiredState.Manifest.Containers)
oldpodCount := len(oldPod.DesiredState.Manifest.Containers)
if newpodCount != oldpodCount {
allErrs = append(allErrs, errs.NewFieldInvalidMessage("DesiredState.Manifest.Containers", newPod.DesiredState.Manifest.Containers, fmt.Sprintf("Two pods do not have equal number of containers new pod has %v items and old has %v items", newpodCount, oldpodCount )))
return allErrs
}
pod := *newPod
pod.Labels = oldPod.Labels
pod.TypeMeta.ResourceVersion = oldPod.TypeMeta.ResourceVersion
// Tricky, we need to copy the container list so that we don't overwrite the update
var newContainers []api.Container
for ix, container := range pod.DesiredState.Manifest.Containers {
container.Image = oldPod.DesiredState.Manifest.Containers[ix].Image
newContainers = append(newContainers, container)
}
pod.DesiredState.Manifest.Containers = newContainers
if !reflect.DeepEqual(&pod, oldPod) {
allErrs = append(allErrs, errs.NewFieldInvalidMessage("DesiredState.Manifest.Containers", newPod.DesiredState.Manifest.Containers, fmt.Sprintf("Two pods are not equal, \nnew pod :\n %+v \n, \nold pod :\n %+v ", &pod, oldPod )))
}
return allErrs
}

I ended up adding additional logging and error details to the validation code so it prints out the pod data it is comparing:

kubernetes-master-0 instance # kubecfg -c framework-devtest.json update pods/app-instance-devtest
F1015 20:05:21.652444 18466 kubecfg.go:403] Got request error: Pod "app-instance-devtest" is invalid: DesiredState.Manifest.Containers: invalid value '[{framework registry.domain/user/dev-framework:v0.1.301 [] [{ 80 80 TCP }] [] 0 0 [{data false /data}] <nil> <nil> false PullIfNotPresent} {editor registry.domain/user/dev-editor:v0.1.102 [] [{ 3131 3131 TCP }] [] 0 0 [{data false /data}] <nil> <nil> false PullIfNotPresent}]'
### Two pods are not equal,
new pod :
&{TypeMeta:{Kind: ID:app-instance-devtest CreationTimestamp:0001-01-01 00:00:00 +0000 UTC SelfLink: ResourceVersion:1605354 APIVersion: Namespace:default UID: Annotations:map[]} Labels:map[name:app-instance] DesiredState:{Manifest:{Version:v1beta1 ID:app-instance-devtest UUID: Volumes:[{Name:data Source:0xc208144ac0}] Containers:[{Name:framework Image:registry.domain/user/dev-framework:v0.1.301 Command:[] WorkingDir: Ports:[{Name: HostPort:80 ContainerPort:80 Protocol:TCP HostIP:}] Env:[] Memory:0 CPU:0 VolumeMounts:[{Name:data ReadOnly:false MountPath:/data}] LivenessProbe:<nil> Lifecycle:<nil> Privileged:false ImagePullPolicy:PullIfNotPresent} {Name:editor Image:registry.domain/user/dev-editor:v0.1.102 Command:[] WorkingDir: Ports:[{Name: HostPort:3131 ContainerPort:3131 Protocol:TCP HostIP:}] Env:[] Memory:0 CPU:0 VolumeMounts:[{Name:data ReadOnly:false MountPath:/data}] LivenessProbe:<nil> Lifecycle:<nil> Privileged:false ImagePullPolicy:PullIfNotPresent}] RestartPolicy:{Always:0xce0358 OnFailure:<nil> Never:<nil>}} Status: Host:10.208.199.166 HostIP: PodIP: Info:map[]} CurrentState:{Manifest:{Version: ID: UUID: Volumes:[] Containers:[] RestartPolicy:{Always:<nil> OnFailure:<nil> Never:<nil>}} Status: Host: HostIP: PodIP: Info:map[]}}
,
old pod :
&{TypeMeta:{Kind: ID:app-instance-devtest CreationTimestamp:2014-10-15 15:05:50 +0000 UTC SelfLink: ResourceVersion:1605354 APIVersion: Namespace:default UID: Annotations:map[]} Labels:map[name:app-instance] DesiredState:{Manifest:{Version:v1beta1 ID:app-instance-devtest UUID:c049d533-547c-11e4-94e2-bc764e20a883 Volumes:[{Name:data Source:0xc2080f22a0}] Containers:[{Name:framework Image:registry.domain/user/dev-framework:v0.1.301 Command:[] WorkingDir: Ports:[{Name: HostPort:80 ContainerPort:80 Protocol:TCP HostIP:}] Env:[] Memory:0 CPU:0 VolumeMounts:[{Name:data ReadOnly:false MountPath:/data}] LivenessProbe:<nil> Lifecycle:<nil> Privileged:false ImagePullPolicy:PullIfNotPresent} {Name:editor Image:registry.domain/user/dev-editor:v0.1.102 Command:[] WorkingDir: Ports:[{Name: HostPort:3131 ContainerPort:3131 Protocol:TCP HostIP:}] Env:[] Memory:0 CPU:0 VolumeMounts:[{Name:data ReadOnly:false MountPath:/data}] LivenessProbe:<nil> Lifecycle:<nil> Privileged:false ImagePullPolicy:PullIfNotPresent}] RestartPolicy:{Always:0xce0358 OnFailure:<nil> Never:<nil>}} Status:Running Host:10.208.199.166 HostIP: PodIP: Info:map[]} CurrentState:{Manifest:{Version: ID: UUID: Volumes:[] Containers:[] RestartPolicy:{Always:<nil> OnFailure:<nil> Never:<nil>}} Status:Waiting Host: HostIP: PodIP: Info:map[]}}
###

So it seems it's trying to compare the old pod with the new pod, and since they don't match it errors out. Maybe only compare a few important properties instead? Or do the same for the outer parts of the pod by copying them over before comparing, like it was done for the containers?

var newContainers []api.Container
for ix, container := range pod.DesiredState.Manifest.Containers {
container.Image = oldPod.DesiredState.Manifest.Containers[ix].Image
newContainers = append(newContainers, container)
}
pod.DesiredState.Manifest.Containers = newContainers
It's supposed to whitelist fields that are allowed to change. I see from your new error messages that it somehow lost the creation timestamp and UUID, so it's correct not to hose your system by accepting it. This is a bit tricky -- should we fix it by having the client get the current object, update the non-meta fields, and then update? Or fix the server to accept these and fill in those fields? @brendandburns, thoughts on how to fix this? You can see why whitelisting was a good idea...
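[Editor's note: a rough sketch of what the "fix the server" option could look like, copying the fields a client can't be expected to round-trip from the stored object before the comparison. The stand-in type uses the field names visible in the error dump above, not the real api types.]

package main

import (
	"fmt"
	"time"
)

// PodMeta is a stand-in for the metadata fields seen in the error dump
// above (CreationTimestamp, ResourceVersion, UUID).
type PodMeta struct {
	CreationTimestamp time.Time
	ResourceVersion   uint64
	UUID              string
}

// fixUpImmutableMeta copies the fields a client routinely omits from the
// stored object into the incoming one, so their absence alone does not
// make the later DeepEqual comparison fail.
func fixUpImmutableMeta(incoming, stored *PodMeta) {
	incoming.CreationTimestamp = stored.CreationTimestamp
	incoming.ResourceVersion = stored.ResourceVersion
	incoming.UUID = stored.UUID
}

func main() {
	stored := PodMeta{CreationTimestamp: time.Now(), ResourceVersion: 1605354, UUID: "c049d533-..."}
	incoming := PodMeta{} // client sent only the spec, no metadata
	fixUpImmutableMeta(&incoming, &stored)
	fmt.Println(incoming.ResourceVersion) // 1605354
}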
Check out #1794, which has merged. It cleans up the validation a little by only looking at the container manifest, and also gives a working example of an update; you do need to copy the UUID across. Let me know if you still hit problems after you sync to head. More permanently, we need to use reflection to walk across the object, find fields that are different, and then run those through a whitelist.
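[Editor's note: a rough illustration of that reflection-plus-whitelist idea, not the merged implementation: walk the top-level fields of the two manifests and flag any non-whitelisted field that differs.]

package main

import (
	"fmt"
	"reflect"
)

// diffAgainstWhitelist compares the top-level fields of two structs of the
// same type and returns the names of fields that changed but are not
// whitelisted for update.
func diffAgainstWhitelist(newObj, oldObj interface{}, whitelist map[string]bool) []string {
	var illegal []string
	nv, ov := reflect.ValueOf(newObj), reflect.ValueOf(oldObj)
	for i := 0; i < nv.NumField(); i++ {
		name := nv.Type().Field(i).Name
		if whitelist[name] {
			continue
		}
		if !reflect.DeepEqual(nv.Field(i).Interface(), ov.Field(i).Interface()) {
			illegal = append(illegal, name)
		}
	}
	return illegal
}

// Manifest is a stand-in for the container manifest being validated.
type Manifest struct {
	ID         string
	Containers []string
	Volumes    []string
}

func main() {
	oldM := Manifest{ID: "app", Containers: []string{"image:v1"}, Volumes: []string{"data"}}
	newM := Manifest{ID: "app", Containers: []string{"image:v2"}, Volumes: []string{"data"}}
	// Only container changes are allowed; everything else must be identical.
	fmt.Println(diffAgainstWhitelist(newM, oldM, map[string]bool{"Containers": true})) // [] -> update accepted
}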
@brendanburns, OK, so I merged the latest head and pushed the changes to my cluster with the proper UUID. Now it doesn't error out on the update, but it also doesn't restart the Docker containers with the new images.

kubernetes-master-0 instance # kubecfg list pods
ID Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
app-instance-dev registry.domain/user/dev-framework:v0.1.302,registry.domain/user/dev-editor:v0.1.102 10.208.199.187/ name=app-instance Running

Docker:

kubernetes-minion-0 ~ # docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0a22c35031ae registry.domain/user/dev-framework:v0.1.301 "supervisord -n" 36 minutes ago Up 36 minutes k8s_framework.4097bc4c_app-instance-dev.etcd_d5e3892a-53da-11e4-94e2-bc764e20a883_1c098906
f388f2f68db9 registry.domain/user/dev-editor:v0.1.101 "supervisord -c /etc" 27 hours ago Up 27 hours k8s_editor.a73bb88e_app-instance-dev.etcd_d5e3892a-53da-11e4-94e2-bc764e20a883_74672843
4fee3e9d1f6d kubernetes/pause:latest "/pause" 27 hours ago Up 27 hours 0.0.0.0:80->80/tcp, 0.0.0.0:3131->3131/tcp k8s_net.a7d39cb1_app-instance-dev.etcd_d5e3892a-53da-11e4-94e2-bc764e20a883_598881ed

I've checked the logs of all of the services and didn't see anything unusual. Shouldn't it be restarting the containers that have different images?
@dchen1107 Does the container image make its way into the hash we use? I thought we hashed the whole thing -- I would have expected all the containers to restart.
In theory the hashing that we do should include the image. This is a corner of the code that isn't well tested in the current integration test setup. I will extend my e2e test to try to validate that image update works. Sorry for the problems!
Thanks, guys, for looking into this. Let me know if you need me to do any debugging or testing on this.
OK, I think I have solved this issue.

case UPDATE:
//TODO: implement updates of containers
"glog.Warningf(""Containers updated, not implemented [%s]"", kl.hostname)"
continue case UPDATE section wasn't enabled , it was just skipping and never running the pod syncing You can see the fix at my fork darkgaro@a336f3f What I have done is just added the same code as from SET case case SET:
glog.V(3).Infof("Containers changed [%s]", kl.hostname)
kl.pods = u.Pods
kl.pods = filterHostPortConflicts(kl.pods)
case UPDATE:
//TODO: implement updates of containers
kl.pods = u.Pods
kl.pods = filterHostPortConflicts(kl.pods)
glog.V(3).Infof("Containers updated, Trying to sync containers [%s]", kl.hostname)
//continue

I am not sure if the UPDATE case should do anything different from the SET case with the pods. I am assuming most of the logic is done inside handler.SyncPods(). If that's the case, then UPDATE can probably be made a fall-through to SET; for now I left it as a separate case. I've tested this fix live and the kubelet restarted the container as it was supposed to, so it looks like that was it.
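[Editor's note: if UPDATE really needs nothing beyond what SET already does, the two cases could share one body. A sketch of the fall-through variant mentioned above, assuming the cases can be reordered so that UPDATE sits immediately before SET; this is a fragment mirroring the snippet above, not the actual kubelet change.]

// Sketch only: Go's fallthrough continues into the next case lexically,
// so UPDATE must appear immediately before SET for this to work.
case UPDATE:
	// No UPDATE-specific handling yet; reuse the SET logic below.
	fallthrough
case SET:
	glog.V(3).Infof("Containers changed [%s]", kl.hostname)
	kl.pods = u.Pods
	kl.pods = filterHostPortConflicts(kl.pods)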
Thanks for debugging! I'll send a PR today to fix this (and try to expand the e2e tests to cover it). Brendan
No problem.
#1865 sent with the fix. Thanks!
tx
Fixed by #1865.
I am trying to update an existing pod with :
/opt/bin/kubecfg -c framework-dev.json update pods/app-instance-dev
but as soon as I run the above command I get an error:
F1010 01:49:29.804442 21400 kubecfg.go:395] Got request error: unimplemented!
Original pod:
New pod:
The only difference is that the new pod references a new image tag.
Does that mean this functionality is not implemented yet? If so, is there no way to update a pod? That seems like pretty essential functionality. Maybe I am missing something here?
Thanks