Better support for sidecar containers in batch jobs #25908

Open
a-robinson opened this Issue May 19, 2016 · 82 comments

Comments

@a-robinson
Member

a-robinson commented May 19, 2016

Consider a Job with two containers in it -- one which does the work and then terminates, and another which isn't designed to ever explicitly exit but provides some sort of supporting functionality like log or metric collection.

What options exist for doing something like this? What options should exist?

Currently the Job will keep running as long as the second container keeps running, which means that the user has to modify the second container in some way to detect when the first one is done so that it can cleanly exit as well.

This question was asked on Stack Overflow a while ago with no better answer than to modify the second container to be more Kubernetes-aware, which isn't ideal. Another customer has recently brought this up to me as a pain point for them.

@kubernetes/goog-control-plane @erictune

@soltysh

Contributor

soltysh commented May 23, 2016

/sub

@erictune erictune added the sig/apps label Jul 7, 2016

@mingfang

mingfang commented Sep 22, 2016

Using a liveness probe, as suggested here http://stackoverflow.com/questions/36208211/sidecar-containers-in-kubernetes-jobs, doesn't work either, since the pod will be considered failed and the overall Job will not be considered successful.

@mingfang

mingfang commented Sep 22, 2016

How about declaring a job success probe, so that the Job can probe it to detect success instead of waiting for the pod to return 0?
Once the probe returns success, the pod can be terminated.
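Sketched as YAML, such an API might look like the following. Note that the `successProbe` field below does not exist in Kubernetes; it is purely an illustration of the idea, with placeholder names:

```yaml
# Hypothetical, NOT a real Kubernetes field: a per-Job "success probe"
# that the Job controller would poll instead of waiting for exit code 0.
apiVersion: batch/v1
kind: Job
metadata:
  name: worker
spec:
  successProbe:            # hypothetical field
    httpGet:
      path: /done
      port: 8080
    periodSeconds: 10
  template:
    spec:
      containers:
      - name: worker
        image: example.com/worker:latest
      restartPolicy: Never
```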

@erictune

Member

erictune commented Sep 22, 2016

Can a probe run against a container that has already exited, or would there be a race while it is being torn down?

Another option is to designate certain exit codes as having special meaning.

Both "success for the entire pod" and "failure for the entire pod" would be useful.

This would need to be on the Pod object, so it is a big API change.


@mingfang

mingfang commented Sep 23, 2016

@erictune Good point; we can't probe an exited container.

Can we designate a particular container in the pod as the "completion" container so that when that container exits we can say the job is completed?

Sidecar containers tend to be long-lived, for things like log shipping and monitoring.
We can force-terminate them once the job is completed.

@soltysh

Contributor

soltysh commented Sep 26, 2016

Can we designate a particular container in the pod as the "completion" container so that when that container exits we can say the job is completed?

Have you looked into point 3 of this doc, described in detail here, where you basically don't set .spec.completions, and the job is done as soon as the first container finishes with a 0 exit code?

Sidecar containers tend to be long-lived, for things like log shipping and monitoring.
We can force-terminate them once the job is completed.

Personally, these look to me more like an RS rather than a Job, but that's my personal opinion, and most importantly I don't know the full details of your setup.

More generally, the discussions in #17244 and #30243 touch on this topic as well.

@mingfang

mingfang commented Sep 26, 2016

@soltysh the link you sent above, point 3, references pod completion, not container completion.

@erictune erictune added the sig/node label Oct 6, 2016

@erictune

Member

erictune commented Oct 6, 2016

The two containers can share an emptyDir; the first container can write an "I'm exiting now" message to a file, and the other can exit when it sees that message.
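A minimal sketch of that pattern (image names, commands, and paths below are placeholders): the worker touches a file on a shared emptyDir when it finishes, and the sidecar polls for it:

```yaml
# Sketch only; images, commands, and paths are placeholders.
spec:
  volumes:
  - name: signal
    emptyDir: {}
  containers:
  - name: worker
    image: example.com/worker:latest
    command: ["sh", "-c", "do-the-work; touch /signal/done"]
    volumeMounts:
    - name: signal
      mountPath: /signal
  - name: sidecar
    image: example.com/log-shipper:latest
    # Run the sidecar process in the background, then exit 0 once the
    # worker has dropped the signal file.
    command: ["sh", "-c", "ship-logs & while [ ! -f /signal/done ]; do sleep 1; done"]
    volumeMounts:
    - name: signal
      mountPath: /signal
```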

@anshumanbh

anshumanbh commented Feb 10, 2017

@erictune I have a use case which I think falls into this bucket, and I am hoping you can guide me in the right direction, since there doesn't seem to be any officially recommended way to solve this problem.

I am using the client-go library to code everything below:

So, I have a job that basically runs a tool in a one-container pod. As soon as the tool finishes running, it is supposed to produce a results file. I can't seem to capture this results file, because as soon as the tool finishes running the pod is deleted and I lose the results file.

I was able to capture this results file when I used hostPath as a VolumeSource; since I am running minikube locally, the results file gets saved onto my workstation.

But I understand that's not recommended or ideal for production containers, so I used emptyDir as suggested above. But again, if I do that, I can't really capture the file because it gets deleted along with the pod itself.

So, should I be solving my problem using the sidecar container pattern as well?

Basically, do what you suggested above. Start 2 containers in the pod whenever the job starts. 1 container runs the job and as soon as the job gets done, drops a message that gets picked up by the other container which then grabs the result file and stores it somewhere?

I fail to understand why we would need 2 containers in the first place. Why can't the job container do all this by itself? That is, finish the job, save the results file somewhere, access it/read it and store it somewhere.

@soltysh

Contributor

soltysh commented Feb 14, 2017

@anshumanbh I'd suggest you:

  1. use persistent storage, and save the result file there
  2. use a hostPath mount, which is almost the same as 1 (and you've already tried it)
  3. upload the result file to a known remote location (S3, Google Drive, Dropbox), generally any kind of shared drive

@anshumanbh

anshumanbh commented Feb 14, 2017

@soltysh I don't want the file to be stored permanently. On every run, I just want to compare the result with the last result. The way I was thinking of doing this was committing to a GitHub repository on every run and then doing a diff to see what changed. So, in order to do that, I just need to store the result temporarily somewhere so that I can access it and send it to GitHub. Make sense?

@soltysh

Contributor

soltysh commented Feb 20, 2017

@anshumanbh Perfectly clear, and still that doesn't fall into the category of sidecar containers. Everything you want to achieve is currently doable with what Jobs provide.

@anshumanbh

anshumanbh commented Feb 22, 2017

@soltysh so considering I want to go for option 3 from the list you suggested above, how would I go about implementing it?

The problem I am facing is that as soon as the job finishes, the container exits and I lose the file. If I don't have the file, how do I upload it to a shared drive like S3/Google Drive/Dropbox? I can't modify the job's code to automatically upload it somewhere before it quits, so unfortunately I would have to first run the job and then save the file somewhere.

@soltysh

Contributor

soltysh commented Feb 23, 2017

If you can't modify the job's code, you need to wrap it in such a way that you're able to upload the file. If what you're working with is already an image, just extend it with the copying code.
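One generic way to do that wrapping, sketched as a small shell function; the result path and destination are placeholders, and the `cp` could equally be an `aws s3 cp` or `gsutil cp` for a remote store:

```shell
# Generic wrapper: run an arbitrary, unmodified job command, then copy the
# result file it produced to a shared destination before the container exits.
# Placeholder logic; swap `cp` for `aws s3 cp` / `gsutil cp` as needed.
run_with_upload() {
  result="$1"    # file the tool is expected to write
  dest="$2"      # destination directory (e.g. a mounted volume)
  shift 2
  "$@" || return 1                   # run the unmodified tool as-is
  cp "$result" "$dest/results-$(date +%s).json"
}
```

The image entrypoint would then be something like `run_with_upload /tmp/results.json /output /my-batch-job/bin/main --flag`, with no change to the tool itself, so the same wrapper can front every tool.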

@anshumanbh

anshumanbh commented Feb 23, 2017

@soltysh yes, that makes sense. I could do that. However, the next question I have is: suppose I need to run multiple jobs (think of them as different tools), and none of these tools have the uploading part built into them. I would then have to build that wrapper and extend each one of those tools with the uploading part. Is there a way I can write the wrapper/extension once and use it for all the tools?

Wouldn't the side car pattern fit in that case?

@soltysh

Contributor

soltysh commented Feb 23, 2017

Yeah, it could. Although I'd first try the multiple-containers-in-the-same-pod pattern; in other words, your pod runs the job container alongside an additional one that waits for the output and uploads it. Not sure how feasible this is, but you can already give it a try.

@jmillikin-stripe

Contributor

jmillikin-stripe commented Jun 14, 2017

Gentle ping -- sidecar awareness would make management of microservice proxies such as Envoy much more pleasant. Is there any progress to share?

The current state of things is that each container needs bundled tooling to coordinate lifetimes, which means we can't directly use upstream container images. It also significantly complicates the templates, as we have to inject extra argv and mount points.

An earlier suggestion was to designate some containers as a "completion" container. I would like to propose the opposite -- the ability to designate some containers as "sidecars". When the last non-sidecar container in a Pod terminates, the Pod should send TERM to the sidecars. This would be analogous to the "background thread" concept found in many threading libraries, e.g. Python's Thread.daemon.

Example config: when container main ends, the kubelet would kill envoy:

containers:
  - name: main
    image: gcr.io/some/image:latest
    command: ["/my-batch-job/bin/main", "--config=/config/my-job-config.yaml"]
  - name: envoy
    image: lyft/envoy:latest
    sidecar: true
    command: ["/usr/local/bin/envoy", "--config-path=/my-batch-job/etc/envoy.json"]
@jmillikin-stripe

Contributor

jmillikin-stripe commented Jun 14, 2017

For reference, here's the bash madness I'm using to simulate desired sidecar behavior:

containers:
  - name: main
    image: gcr.io/some/image:latest
    command: ["/bin/bash", "-c"]
    args:
      - |
        # Signal the sidecar via the shared volume when main exits,
        # for any reason.
        trap "touch /tmp/pod/main-terminated" EXIT
        /my-batch-job/bin/main --config=/config/my-job-config.yaml
    volumeMounts:
      - mountPath: /tmp/pod
        name: tmp-pod
  - name: envoy
    image: gcr.io/our-envoy-plus-bash-image:latest
    command: ["/bin/bash", "-c"]
    args:
      - |
        /usr/local/bin/envoy --config-path=/my-batch-job/etc/envoy.json &
        CHILD_PID=$!
        # Poll for the signal file; once it appears, stop envoy.
        (while true; do if [[ -f "/tmp/pod/main-terminated" ]]; then kill $CHILD_PID; break; fi; sleep 1; done) &
        wait $CHILD_PID
        # Exit 0 only if main terminated us; otherwise envoy died on its own.
        if [[ -f "/tmp/pod/main-terminated" ]]; then exit 0; fi
    volumeMounts:
      - mountPath: /tmp/pod
        name: tmp-pod
        readOnly: true
volumes:
  - name: tmp-pod
    emptyDir: {}
@soltysh

Contributor

soltysh commented Aug 2, 2017

I would like to propose the opposite -- the ability to designate some containers as "sidecars". When the last non-sidecar container in a Pod terminates, the Pod should send TERM to the sidecars.

@jmillikin-stripe I like this idea, although I'm not sure it follows the principle of not treating some containers in a Pod differently, or of not introducing dependencies between them. I'll defer to @erictune for the final call.

Although, have you checked #17244? Would this type of solution fit your use case? It's what @erictune mentioned a few comments earlier:

Another option is to designate certain exit codes as having special meaning.

@jmillikin-stripe

Contributor

jmillikin-stripe commented Aug 2, 2017

@jmillikin-stripe I like this idea, although I'm not sure it follows the principle of not treating some containers in a Pod differently, or of not introducing dependencies between them. I'll defer to @erictune for the final call.

I think Kubernetes may need to be flexible about the principle of not treating containers differently. We (Stripe) don't want to retrofit third-party code such as Envoy to have Lamprey-style lifecycle hooks, and trying to adopt an Envelope-style exec inversion would be much more complex than letting the Kubelet terminate specific sidecars.

Although, have you checked #17244? Would this type of solution fit your use case? It's what @erictune mentioned a few comments earlier:

Another option is to designate certain exit codes as having special meaning.

I'm very strongly opposed to Kubernetes or Kubelet interpreting error codes at a finer granularity than "zero or non-zero". Borglet's use of exit code magic numbers was an unpleasant misfeature, and it would be much worse in Kubernetes where a particular container image could be either a "main" or "sidecar" in different Pods.

@msperl

msperl commented Aug 5, 2017

Maybe additional lifecycle hooks would be sufficient to solve this?

These could be:

  • PostStop: with a means to trigger lifecycle events on other containers in the pod (i.e. trigger a stop)
  • PeerStopped: a signal that a "peer" container in the pod has died, possibly with the exit code as an argument

This could also provide a means to define custom restart policies for a container, or even to start containers that are not started by default, allowing some daisy-chaining of containers (when container A finishes, start container B).
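Sketched as hypothetical pod-spec fields below. Neither hook nor any of these field names exist in Kubernetes; this only illustrates the shape such an API might take:

```yaml
# Hypothetical lifecycle hooks; none of these fields exist in Kubernetes.
containers:
- name: main
  image: example.com/worker:latest
  lifecycle:
    postStop:                 # hypothetical: fires when this container exits
      podAction: stopPeers    # hypothetical: stop the other containers
- name: logger
  image: example.com/log-shipper:latest
  lifecycle:
    peerStopped:              # hypothetical: runs when a peer container dies
      exec:
        command: ["/bin/flush-and-exit"]
```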

@oxygen0211

oxygen0211 commented Sep 6, 2017

Also missing this. We run a job every 30 minutes that needs a VPN client for connectivity, and there seem to be a lot of use cases where this could be very useful (for example, anything that needs kubectl proxy). Currently I am using concurrencyPolicy: Replace in the CronJob spec as a workaround, but of course this only works if a) you can live without parallel job runs, and b) the job execution time is shorter than the scheduling interval.

EDIT: in my use case, it would be completely sufficient to have some property in the job spec to mark a container as the terminating one, and have the Job monitor that container's exit status and kill the remaining ones.
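For reference, the Replace workaround is a one-line change in the CronJob spec; a minimal sketch with placeholder names:

```yaml
# concurrencyPolicy: Replace kills the still-running previous job (including
# its lingering sidecar) when the next scheduled run starts.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: vpn-job
spec:
  schedule: "*/30 * * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: worker
            image: example.com/worker:latest
          - name: vpn-client
            image: example.com/vpn-client:latest
          restartPolicy: OnFailure
```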

@dims

Member

dims commented May 27, 2018

persistent-long-term-problem :(

@bgrant0607

Member

bgrant0607 commented Jun 5, 2018

@mcfedr

mcfedr commented Jul 11, 2018

Without meaning to complicate the matter further, it would also be useful to be able to run a "sidecar"-style container alongside initContainers.

My use case is similar to others here: I need to run the Cloud SQL Proxy at the same time as an initContainer that runs database migrations. Since initContainers run one at a time, I cannot see a way to do this, except to run the proxy as a Deployment+Service, but I expect there are other use cases (log management, etc.) where that wouldn't be a suitable workaround.

@mhuxtable

mhuxtable commented Jul 11, 2018

@mcfedr There's a reasonably active enhancement proposal which might appreciate that observation about init container behaviour. It's unclear to me whether that's in scope for this proposal or a related improvement, but I think it's sufficiently related to raise for consideration.

Potential implementation/compatibility problems notwithstanding, your ideal model would presumably be for sidecar init containers to run concurrently with non-sidecar init containers (which would continue to run sequentially, as now), and for the sidecars to terminate before the main-sequence containers start up?

@philicious

philicious commented Aug 2, 2018

For what it's worth, I would also like to express the need for ignoring sidecars that are still running, like Cloud SQL Proxy et al.

@stiko

stiko commented Aug 7, 2018

I managed to kill the cloudsql container after 30 seconds, since I know my script won't take that long. Here is my approach:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: schedule
spec:
  concurrencyPolicy: Forbid
  schedule: "*/10 * * * *"
  startingDeadlineSeconds: 40
  jobTemplate:
    spec:
      completions: 1
      template:
        spec:
          containers:
          - image: someimage
            name: imagename
            args:
            - php
            - /var/www/html/artisan
            - schedule:run
          - command: ["sh", "-c"]
            args:
            - /cloud_sql_proxy -instances=cloudsql_instance=tcp:3306 -credential_file=some_secret_file.json & pid=$! && (sleep 30 && kill -9 $pid 2>/dev/null)
            image: gcr.io/cloudsql-docker/gce-proxy:1.11
            imagePullPolicy: IfNotPresent
            name: cloudsql
            resources: {}
            volumeMounts:
            - mountPath: /secrets/cloudsql
              name: secretname
              readOnly: true
          restartPolicy: OnFailure
          volumes:
          - name: secretname
            secret:
              defaultMode: 420
              secretName: secretname

And it is working for me.
Do you see any drawbacks to this approach?

@kilianc

kilianc commented Aug 17, 2018

Since I think they are related and easily adaptable for CronJobs as well, here is my solution: GoogleCloudPlatform/cloudsql-proxy#128 (comment)

It is based on one of the workarounds posted here, but uses preStop because it's meant for Deployments. Trapping the sidecar would work wonderfully, though.
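For Deployments, the same signal-file trick can hang off a preStop hook instead of a shell trap; a sketch, reusing the path from the earlier bash workaround:

```yaml
# Sketch: on pod termination, the main container's preStop hook drops the
# signal file that the sidecar's watch loop checks for.
lifecycle:
  preStop:
    exec:
      command: ["touch", "/tmp/pod/main-terminated"]
```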

@celamb4

celamb4 commented Oct 26, 2018

Following this issue. Also using a cloud_sql_proxy container as a sidecar in a CronJob;
I used the timeout implementation by @stiko.

@cfontes

cfontes commented Nov 7, 2018

Just adding to the conversation: the solution proposed by @oxygen0211 of using Replace is a decent workaround for now. Be sure to check it out if you run into this issue like I did.

#25908 (comment)

@Joseph-Irving

Contributor

Joseph-Irving commented Nov 30, 2018

We have got this KEP provisionally approved: kubernetes/community#2148. We still have a few things we need to agree on, but hopefully it will get to a place where work can start on it relatively soon. Note that KEPs are moving to https://github.com/kubernetes/enhancements on the 30th, so if you want to follow along, it will be over there.

whynowy pushed a commit to whynowy/eventing-sources that referenced this issue Nov 30, 2018

Derek Wang
Implemented CronJob source
Used Deployment as receive adapter instead of CronJob because of
the batch job sidecar issue.

kubernetes/kubernetes#25908

If this issue is resolved in the future, we can switch to CronJob.

knative-prow-robot added a commit to knative/eventing-sources that referenced this issue Dec 5, 2018

Implemented CronJob source (#135)
* Implemented CronJob source

Used Deployment as receive adapter instead of CronJob because of
the batch job sidecar issue.

kubernetes/kubernetes#25908

If this issue is resolved in the future, we can switch to use CronJob.

* Fixup!

* Use DeepDerivative to compare spec changes and validate schedule string

* No need to run goroutine for cron

* Use CR unstructured client for cronjob source reconcile

* Remove unnecessary Addressable scheme register for cronjobsource
@janosroden

janosroden commented Dec 10, 2018

Until sidecar support arrives, you can use a Docker-level solution that can easily be removed later: https://gist.github.com/janosroden/78725e3f846763aa3a660a6b2116c7da

It uses a privileged container with a mounted Docker socket and standard Kubernetes labels to manage the containers in the job.
