Better support for sidecar containers in batch jobs #25908
Comments
/sub |
Also, using a liveness probe as suggested here http://stackoverflow.com/questions/36208211/sidecar-containers-in-kubernetes-jobs doesn't work, since the pod will be considered failed and the overall job will not be considered successful. |
How about we declare a job success probe, so that the Job can probe it to detect success instead of waiting for the pod to return 0? |
Can a probe run against a container that has already exited? Another option is to designate certain exit codes as having special meaning: "success for the entire pod" or "failure for the entire pod". This would need to be on the Pod object, so that is a big API change.
|
@erictune Good point; we can't probe an exited container. Can we designate a particular container in the pod as the "completion" container, so that when that container exits we can say the job is completed? The sidecar containers tend to be long-lived, for things like log shipping and monitoring. |
Have you looked into point 3 of this doc, described in detail here, where you basically don't set
Personally, this looks to me more like an RS rather than a job, but that's my personal opinion and, most importantly, I don't know the full details of your setup. Generally, the following discussions, #17244 and #30243, touch on this topic as well. |
@soltysh the link you sent above, point 3 references pod completion and not container completion. |
The two containers can share an emptyDir; the first container can write an "I'm exiting now" message to a file, and the other can exit when it sees that message. |
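A minimal sketch of that signaling pattern (the container names, image, and paths here are just placeholders; the more elaborate Envoy workaround later in this thread uses the same idea):

apiVersion: v1
kind: Pod
metadata:
  name: job-pod-with-signal   # hypothetical; in practice this would be a Job's pod template
spec:
  restartPolicy: Never
  volumes:
  - name: shared
    emptyDir: {}
  containers:
  - name: main
    image: busybox
    command: ["/bin/sh", "-c"]
    args:
    - |
      # write the "I'm exiting now" marker no matter how the work ends
      trap "touch /shared/main-done" EXIT
      echo "doing the real work"   # placeholder for the actual job command
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: sidecar
    image: busybox
    command: ["/bin/sh", "-c"]
    args:
    - |
      # the real sidecar process would run here in the background;
      # exit cleanly once the main container signals it is done
      until [ -f /shared/main-done ]; do sleep 1; done
    volumeMounts:
    - name: shared
      mountPath: /shared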
@erictune I have a use case which I think falls into this bucket, and I am hoping you could guide me in the right direction, since there doesn't seem to be any officially recommended way to solve this problem. I am using the client-go library to code everything below.

So, I have a job that basically runs a tool in a one-container pod. As soon as the tool finishes running, it is supposed to produce a results file. I can't seem to capture this results file, because as soon as the tool finishes running the pod is deleted and I lose the results file. I was able to capture this results file if I used
But I understand that's not recommended and not ideal for production containers. So, I used

So, should I be solving my problem using the sidecar container pattern as well? Basically, do what you suggested above: start 2 containers in the pod whenever the job starts; 1 container runs the job and, as soon as the job gets done, drops a message that gets picked up by the other container, which then grabs the result file and stores it somewhere?

I fail to understand why we would need 2 containers in the first place. Why can't the job container do all this by itself? That is, finish the job, save the results file somewhere, and then access/read it and store it somewhere. |
@anshumanbh I'd suggest you:
|
@soltysh I don't want the file to be stored permanently. On every run, I just want to compare that result with the last result. So, the way I was thinking of doing this was committing to a github repository on every run and then doing a diff to see what changed. So, in order to do that, I just need to store the result temporarily somewhere so that I can access it to send it to Github. Make sense? |
@anshumanbh perfectly clear, and still that doesn't fall into the category of side-car container. All you want to achieve is currently doable with what jobs provide. |
@soltysh so, considering I want to go for option 3 from the list you suggested above, how would I go about implementing it? The problem I am facing is that as soon as the job finishes, the container exits and I lose the file. If I don't have the file, how do I upload it to a shared drive like S3/Google Drive/Dropbox? I can't modify the job's code to automatically upload it somewhere before it quits, so unfortunately I would have to first run the job and then save the file somewhere. |
If you can't modify the job's code, you need to wrap it in such a way that it can upload the file. If what you're working with is already an image, just extend it with the copying code. |
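For example, a sketch of that wrapping approach (assuming the image has been extended with the aws CLI; the tool path, output path, and bucket name are placeholders):

apiVersion: batch/v1
kind: Job
metadata:
  name: run-tool-and-upload   # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: tool
        # original tool image, extended with the aws CLI for the upload step
        image: example.com/my-tool-with-awscli:latest
        command: ["/bin/sh", "-c"]
        args:
        - |
          # run the unmodified tool, then copy its results file out before the container exits
          /opt/tool/run --output /tmp/results.json
          aws s3 cp /tmp/results.json s3://my-results-bucket/results.json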
@soltysh yes, that makes sense. I could do that. However, the next question I have is - suppose I need to run multiple jobs (think about it as running different tools) and none of these tools have the uploading part built in them. So, now, I would have to build that wrapper and extend each one of those tools with the uploading part. Is there a way I can just write the wrapper/extension once and use it for all the tools? Wouldn't the side car pattern fit in that case? |
Yeah, it could. Although I'd try the multiple-containers-inside-the-same-pod pattern; in other words, your pod runs the job container and, alongside it, an additional one waiting for the output and uploading it. Not sure how feasible this is, but you can give it a try already. |
Gentle ping -- sidecar awareness would make management of microservice proxies such as Envoy much more pleasant. Is there any progress to share?

The current state of things is that each container needs bundled tooling to coordinate lifetimes, which means we can't directly use upstream container images. It also significantly complicates the templates, as we have to inject extra argv and mount points.

An earlier suggestion was to designate some containers as a "completion" container. I would like to propose the opposite -- the ability to designate some containers as "sidecars". When the last non-sidecar container in a Pod terminates, the Pod should send a termination signal to the remaining sidecar containers.

Example config, where the envoy container is marked as a sidecar:

containers:
- name: main
image: gcr.io/some/image:latest
command: ["/my-batch-job/bin/main", "--config=/config/my-job-config.yaml"]
- name: envoy
image: lyft/envoy:latest
sidecar: true
command: ["/usr/local/bin/envoy", "--config-path=/my-batch-job/etc/envoy.json"] |
For reference, here's the bash madness I'm using to simulate desired sidecar behavior:

containers:
- name: main
image: gcr.io/some/image:latest
command: ["/bin/bash", "-c"]
args:
- |
trap "touch /tmp/pod/main-terminated" EXIT
/my-batch-job/bin/main --config=/config/my-job-config.yaml
volumeMounts:
- mountPath: /tmp/pod
name: tmp-pod
- name: envoy
image: gcr.io/our-envoy-plus-bash-image:latest
command: ["/bin/bash", "-c"]
args:
- |
/usr/local/bin/envoy --config-path=/my-batch-job/etc/envoy.json &
CHILD_PID=$!
(while true; do if [[ -f "/tmp/pod/main-terminated" ]]; then kill $CHILD_PID; fi; sleep 1; done) &
wait $CHILD_PID
if [[ -f "/tmp/pod/main-terminated" ]]; then exit 0; fi
volumeMounts:
- mountPath: /tmp/pod
name: tmp-pod
readOnly: true
volumes:
- name: tmp-pod
emptyDir: {} |
@jmillikin-stripe I like this idea, although I'm not sure whether it fits with the principle of not treating some containers in a Pod differently, or whether we should be introducing dependencies between them. I'll defer to @erictune for the final call. Although, have you checked #17244? Would that type of solution fit your use case? This is what @erictune mentioned a few comments before:
|
I think Kubernetes may need to be flexible about the principle of not treating containers differently. We (Stripe) don't want to retrofit third-party code such as Envoy to have Lamprey-style lifecycle hooks, and trying to adopt an Envelope-style exec inversion would be much more complex than letting Kubelet terminate specific sidecars.
I'm very strongly opposed to Kubernetes or Kubelet interpreting error codes at a finer granularity than "zero or non-zero". Borglet's use of exit code magic numbers was an unpleasant misfeature, and it would be much worse in Kubernetes where a particular container image could be either a "main" or "sidecar" in different Pods. |
Maybe additional lifecycle hooks would be sufficient to solve this? Could be:
This could also provide a means to define custom policies for restarting a container, or even for starting containers that are not started by default, to allow some daisy-chaining of containers (when container A finishes, then start container B). |
Also missing this. We run a job every 30 minutes that needs a VPN client for connectivity, but there seem to be a lot of use cases where this could be very useful (for example, anything that needs kubectl proxy). Currently, I am using

EDIT: in my use case, it would be completely sufficient to have some property in the job spec to mark a container as the terminating one, have the job monitor that one for its exit status, and kill the remaining ones. |
I just got hit by this issue when using the Cloud SQL Proxy, and I'm in disbelief that there isn't a way to tell a (Cron)Job which containers to take into account to consider it completed. I was expecting something like:

template:
spec:
targets:
- container1
- container2

But after reading the documentation: nothing. There's no way. Then I found this issue, along with some others about sidecars in Jobs, and again nothing, except yet another sidecar with kubexit or some manual process monitoring/killing. How is this frozen and not a priority instead? |
Note that AWS ECS has had this for a while, in the form of the "essential" container parameter in task definitions: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html |
@mvanholsteijn that's interesting. Something that I think has gone on with this issue is that the perfect has become the enemy of the good.

There are a lot of people who don't need anything more than to be able to designate a container (or n containers?) as "the important container" among multiple containers in a pod. If we could do just that, then jobs, for instance, could be considered complete when the designated container(s) exit.

But the issue has turned into defining container startup and shutdown order, etc., etc. That sounds like it would be a great feature, but my intuition says that's not what most of us following this issue need. Our needs are basic, and a simple boolean field for marking a container "essential" is really all we need. It would not likely interfere with future efforts to express container startup and shutdown order. |
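To make that concrete, the kind of marking being asked for might look roughly like this (purely hypothetical; essential is not a real field today, and it is just the inverse of the sidecar: true flag sketched earlier in the thread):

containers:
- name: main
  image: gcr.io/some/image:latest
  essential: true   # hypothetical: the pod (and the Job) is complete once all essential containers exit
- name: log-shipper
  image: example.com/log-shipper:latest   # placeholder sidecar; would be terminated once the essential containers finish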
Thank you for your response. How fast can you get that in?

I consider Kubernetes the father of the concept of running multiple containers together as a "pod". The simple fact that this concept breaks as soon as you run a job or cronjob, and has been left broken for almost 5 years, is beyond my comprehension. |
Please move discussions to the KEP.

Please see the latest updates to the KEP here:

Last update was here:

/close |
For an alpha feature, shouldn't the first attempt solve the Job conundrum? Most, if not all, jobs have one container per pod (before sidecars get inserted). That would be a huge leap toward figuring out how to solve it for other workloads too, in my modest opinion. |
Can we get an update on this, @dims? The initial KEP kubernetes/enhancements#753 that you linked to appears to have been closed with this message:
Also this comment on your other thread about the "last update" says:
I, like many others, just encountered this issue and spent a bunch of time trying to figure out whether there is a solution, and ultimately a workaround, and would love to have an update. Also, for what it's worth, I agree with @krancour's comment that a simple 90% solution is warranted at this point. |
@justinmchase and others proposed a simple solution here, kubernetes/enhancements#2872, based on the above discussions and input from @dims. It would be helpful to give feedback on the proposal. |
Would like to mention another workaround that I don't see called out here yet. Instead of using a shared volume to communicate between sidecar/keystone containers, you could use shareProcessNamespace and have the keystone container signal the sidecar's process directly. |
@arianf Can you share an example, even if pseudocode, of how to do this? Would it just be killing all processes other than your own? |
Here is an example; I'm using nginx as the sidecar because it's easy to run without any configuration. Note the caveat with this approach: you need to make sure the keystone container doesn't finish before the sidecar container has started up, or else it won't be able to kill the other process.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: shared-process-cronjob
spec:
schedule: "* * * * *"
concurrencyPolicy: Forbid
successfulJobsHistoryLimit: 2
failedJobsHistoryLimit: 2
jobTemplate:
spec:
template:
spec:
shareProcessNamespace: true
restartPolicy: Never
containers:
- name: keystone
image: busybox
command:
- /bin/sh
- -c
- >-
echo "Starting work at $(date)";
sleep 100;
echo "Ending work at $(date)";
pkill -SIGTERM nginx
- name: sidecar
image: nginx

You can see the pod was listed as "Completed":

$ k get pods
shared-process-cronjob-1667249160-zmwl7   0/2   Completed   0   2m10s

And nginx's exit code was 0:

$ k describe pods/shared-process-cronjob-1667249160-zmwl7 | grep -A10 "sidecar:"
sidecar:
Container ID: containerd://2853be81e63f8ef78440f11a315f8cffce6264abc54ed17a12de9deae5ba4b1f
Image: nginx
Image ID: docker.io/library/nginx@sha256:943c25b4b66b332184d5ba6bb18234273551593016c0e0ae906bab111548239f
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 31 Oct 2022 13:46:50 -0700
Finished: Mon, 31 Oct 2022 13:48:28 -0700 |
We use an in-pod supervisor (https://github.com/taylorchu/wait-for) to solve the sidecar container issue. It is simple and can be used even if you don't use k8s. |
@dims wrote on Feb 7, 2021:
If you read the thread, they just came to the conclusion "we need an essential container flag". But then it gets closed and discussions are redirected to kubernetes/enhancements#753, an issue that was itself already closed (on Oct 21, 2020)? |
Folks,

Please see kubernetes/enhancements#3761 (new KEP for sidecar containers).

Thanks, |
Consider a Job with two containers in it -- one which does the work and then terminates, and another which isn't designed to ever explicitly exit but provides some sort of supporting functionality like log or metric collection.
What options exist for doing something like this? What options should exist?
Currently the Job will keep running as long as the second container keeps running, which means that the user has to modify the second container in some way to detect when the first one is done so that it can cleanly exit as well.
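For illustration, a minimal sketch of such a Job (the names and images here are just placeholders): the work container finishes, but the pod, and therefore the Job, never completes because the log collector never exits.

apiVersion: batch/v1
kind: Job
metadata:
  name: work-with-sidecar   # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: work
        image: busybox
        # does the actual work and exits 0
        command: ["/bin/sh", "-c", "echo doing the work; sleep 30"]
      - name: log-collector
        image: busybox
        # supporting process that is not designed to ever exit
        command: ["/bin/sh", "-c", "while true; do echo collecting; sleep 5; done"]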
This question was asked on Stack Overflow a while ago with no better answer than to modify the second container to be more Kubernetes-aware, which isn't ideal. Another customer has recently brought this up to me as a pain point for them.
@kubernetes/goog-control-plane @erictune