Distributed CRON jobs in k8s #2156

Closed
jeefy opened this Issue Nov 4, 2014 · 26 comments

@jeefy
Member

jeefy commented Nov 4, 2014

Talked with @brendandburns and @jbeda about this briefly.

Being able to submit a job that fires in the k8s cluster periodically (like Chronos) would be a good feature to offer. Something that mirrors the replicationController (periodicController was @jbeda's off-the-cuff name).

The JSON would also be pretty similar (in my mind), mirroring the replicationController's, with the addition of a timing attribute and possibly a means to notify output.

Thoughts? (Typing this up quickly before I run to a bunch of meetings.)
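
To make the idea concrete, here is a purely hypothetical sketch of what such an object might have looked like (YAML for readability; the kind, the schedule field, and the notify field are invented for illustration, and the rest only loosely mimics the replicationController shape of the time):

    # Hypothetical sketch only -- this kind and these fields never shipped.
    kind: PeriodicController
    apiVersion: v1beta1
    id: nightly-reports
    schedule: "0 2 * * *"              # the proposed timing attribute (cron syntax)
    notify: "mailto:ops@example.com"   # a possible means to report output
    desiredState:
      podTemplate:                     # mirrors replicationController's pod template
        desiredState:
          manifest:
            version: v1beta1
            containers:
            - name: reports
              image: example/nightly-reports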

@brendandburns
Contributor

brendandburns commented Nov 4, 2014

Yes, I think we should do this. Let's implement it as a plugin to the api server.

Brendan

@bgrant0607
Member

bgrant0607 commented Nov 4, 2014

+1. We should finish #170 to make this easier.

@bgrant0607
Member

bgrant0607 commented Nov 4, 2014

@smarterclayton Have you thought about a cron-like controller in OpenShift? Almost every service at Google uses such a thing internally, such as for periodically regenerating serving data.

@smarterclayton
Contributor

smarterclayton commented Nov 4, 2014

I think we definitely assumed there had to be something like it. In OpenShift today we find 20% of apps opt to use cron (which runs inside the container, driven by a host-level cron process - a scheduled docker exec). I had originally assumed similarities between cluster-wide parallel docker exec at an imperative level (kubectl run command --on pod-labels) and a controller that would do the same on a period. I hadn't thought very far into the resource isolation aspects (i.e. do you want to do the same thing but with pods).

@smarterclayton
Contributor

smarterclayton commented Nov 6, 2014

Creating pods to spin off doesn't solve a lot of the common "in container" operations you might want to schedule (invoking DB stats collection, trimming logs, etc.). I'd like to see a discussion about that aspect.

@brendandburns
Contributor

brendandburns commented Nov 6, 2014

I started to poke at this. One big problem is that, despite the apiserver being modular, we rely heavily on IsAPIObject being implemented in the api package for all API objects (for encode/decode).

We need to either be OK with a bunch of non-core types in those files, or have some schema-based encode/decode option.

--brendan

@jeefy
Member

jeefy commented Nov 6, 2014

From a user perspective, I'd like to define both internal (docker exec?) and external (new pod) jobs through a single object. There's merit in allowing both, and I like the idea of defining "in container" jobs at the k8s level when you consider that people will be using third-party containers that they can't (or won't) embed cron jobs into.

Being able to say "I want usage stats dumped every hour from every pod labeled 'prod' and 'db' using 'docker exec ...'" as well as "I want to generate our nightly reports and email them out, spin up the reports pod" in the same breath would be ideal.

@smarterclayton
Contributor

smarterclayton commented Nov 6, 2014

> From a user perspective, I'd like to define both internal (docker exec?) and external (new pod) jobs through a single object.

I think the challenge is that they are different fundamental actions - one is the declarative "I want this pod to be created once every X hours", while the imperative "I want to run this command and collect the output and statistics on all of these pods" needs some object / container to mediate the action, gather the output, retry on failures, etc.

It would be worth considering whether we could offer a simple container that could do that action for you, so that you could define it as a pod to run (with potentially some API magic making it easier to do). That allows the pod to be the unit of execution, and the logs and success of the total action to be defined in the pod.

    Image: kubernetes/docker-executor
    Env:
      SCRIPT: "bunch of arbitrary script"
      CREDENTIALS: "my kube credentials"
      SELECTOR: "pod label selector"
    Command: /a/remoteExecutionBinary

The binary takes the SCRIPT env var and runs it against every matching pod:

    #!/bin/bash
    kubectl exec --token=$CREDENTIALS -l $SELECTOR -- /bin/bash -c "$SCRIPT"

The internal job can then become an external job, and the CLI can support the external job.

The challenge with internal job execution is the streaming of potentially very large sets of data (tar omcf - / | cat) across the wire from hundreds of pods. That requires some sort of proxy to mediate the execution of the job, buffer, etc. We haven't talked about it much yet.
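
For concreteness, a minimal sketch of that executor idea as an actual Pod manifest (written in current v1 YAML; kubernetes/docker-executor, /a/remoteExecutionBinary, and the env values are the illustrative placeholders from the comment above, not real artifacts):

    # Sketch only: the image, binary path, and env values are hypothetical.
    apiVersion: v1
    kind: Pod
    metadata:
      name: periodic-exec
    spec:
      restartPolicy: Never                  # run once per scheduled invocation
      containers:
      - name: executor
        image: kubernetes/docker-executor   # hypothetical executor image
        command: ["/a/remoteExecutionBinary"]
        env:
        - name: SCRIPT
          value: "pg_dump --schema-only mydb"   # arbitrary script to run in each target
        - name: SELECTOR
          value: "env=prod,role=db"             # label selector for the target pods
        - name: CREDENTIALS
          value: "my kube credentials"          # in practice, mounted from a Secret

A scheduling controller would then only need to create a pod like this on a timer; the pod itself carries the logs and exit status of the whole fan-out action.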

@smarterclayton
Contributor

smarterclayton commented Mar 6, 2015

@soltysh please link your proposal here once it's more complete

@soltysh
Contributor

soltysh commented Mar 6, 2015

@smarterclayton will do, as soon as I gather some more feedback around that.

In the meantime I have a question for @jeefy. Given what @smarterclayton proposed with the ability to run in-container commands (e.g. using #3763), are you convinced on having just one job type which will, similarly to RCs, specify which image to run and the schedule?

@xudifsd
Contributor

xudifsd commented Mar 12, 2015

Hi, I'm interested in implementing this as GSoC work, but it seems @soltysh is already doing this. I'm not sure if this is still a GSoC idea?

@soltysh
Contributor

soltysh commented Mar 12, 2015

@xudifsd a couple of folks and I are currently working on a proposal to drive the discussion on this topic. I can't tell you exactly when the implementation will start.

@bgrant0607
Member

bgrant0607 commented Mar 13, 2015

@soltysh Does that imply you plan to build it?

@xudifsd Sorry, it's a bit hard to tell what people will start working on.

@soltysh
Contributor

soltysh commented Mar 13, 2015

Definitely I'll be one of the implementers.

@smarterclayton
Contributor

smarterclayton commented Mar 21, 2015

http://queue.acm.org/detail.cfm?id=2745840 A little light reading for the weekend.

@bgrant0607
Member

bgrant0607 commented Mar 23, 2015

Thanks for the pointer. Yes, Stepan was TL of our internal cron service.

@bgrant0607
Member

bgrant0607 commented May 12, 2015

@davidopp
Member

davidopp commented Aug 22, 2015

#11980 is implementing a version of this.

@DrRibosome

DrRibosome commented Jun 2, 2016

Just wondering if this feature is still in the works.

@soltysh
Contributor

soltysh commented Jun 2, 2016

It's still being worked on; unfortunately it didn't meet the 1.3 deadline, so it'll slip into 1.4.

@eghobo

eghobo commented Jun 4, 2016

@soltysh: sorry to hear that this feature missed 1.3 ):. Do you have any ETA in mind? We have a big interest in this feature - how can we help?

@soltysh
Contributor

soltysh commented Jun 6, 2016

@eghobo we're waiting for 1.3 to land and additionally we need to figure out some multi-versioning problems. But the general idea is to have it as soon as possible.

@bgrant0607 bgrant0607 added area/batch and removed area/workflow labels Jul 12, 2016

untoldone added a commit to untoldone/kubernetes.github.io that referenced this issue Jul 16, 2016

@SEJeff
Contributor

SEJeff commented Jul 27, 2016

Not seeing a link to it in this issue, so to help the next person looking: this is fixed in git via #24970. It implements the scheduled job API, which seemingly uses a normal crontab-style API for job scheduling.

@soltysh
Contributor

soltysh commented Jul 27, 2016

@SEJeff not quite, that PR introduces only the API part. The remaining (not merged yet) parts are tracked in kubernetes/features#19.

@SEJeff
Contributor

SEJeff commented Jul 27, 2016

Gotcha, thanks

@erictune
Member

erictune commented Aug 26, 2016

ScheduledJob is alpha in 1.4.
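
For anyone finding this thread later: the alpha resource takes a standard cron expression plus a Job template, roughly of the shape sketched below (field names as described in the 1.4 docs; the resource was renamed CronJob in a later release, and the batch/v2alpha1 API group has to be enabled on the API server, e.g. via --runtime-config=batch/v2alpha1=true, for it to be available):

    # Approximate shape of the 1.4 alpha resource; check the release docs for
    # the exact fields in your version.
    apiVersion: batch/v2alpha1
    kind: ScheduledJob
    metadata:
      name: nightly-report
    spec:
      schedule: "0 2 * * *"            # standard crontab syntax
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
              - name: report
                image: example/nightly-reports   # placeholder image
                args: ["generate", "--email", "ops@example.com"]

Once the alpha group is enabled, it is created with the usual kubectl create -f scheduledjob.yaml.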
