Job controller proposal #11746
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g. "I signed it!") and we'll verify it.
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") If this message is too spammy, please complain to ixdy.
/sub
FWIW, I'd be curious as to whether you've thought about how much of what you need to implement for Jobs is specific to them, or whether it'd be useful to split out some of the functionality into a separate standalone object. That is, instead of Job -> Pod, perhaps Job -> {ForeverPod, RunToCompletionPod, Pet} -> Pod(s) would also make sense?
I think users care about which pods are doing which work units. For example, if there are repeated failures on the same work unit, users usually want to know about that so they can debug. Or they want to know which work units are taking the longest so they can change their sharding scheme. But the JobScheduler is oblivious to what work unit a pod does. Does this make it less useful to users? Or are those things handled by a different component? If the latter, I'm having trouble envisioning how it all fits together, so maybe you could present an example (pseudocode) of a map-reduce master that uses Job as a building block?
How would you envision running a map-reduce whose master starts a variable number of workers (e.g. based on the size of the input files or other heuristics)? A Job of size 1 that runs a master, and then the master starts the workers as a Job with a computed size, and the master does not exit with success until all the work is done? How are separate map, shuffle and reduce stages implemented? A Mapper Job followed by a Shuffler Job followed by a Reducer Job? Can I make those phases overlapping, while still respecting a total resource or pod count quota? What does that look like?
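A rough sketch of the master-of-size-1 pattern asked about above. Everything here is illustrative, not part of the proposal: `createJob`, `waitForCompletion`, and `countInputShards` are hypothetical stand-ins for real client calls and sizing heuristics.

```go
package main

import "fmt"

// jobSpecLite mirrors the two fields discussed in this proposal.
type jobSpecLite struct {
	Parallelism int // how many worker pods may run at once
	Completions int // how many successful pod runs are required
}

// createJob and waitForCompletion stand in for whatever client calls
// would be made against the apiserver; they are not proposed API.
func createJob(name string, spec jobSpecLite) error { return nil }
func waitForCompletion(name string) error           { return nil }

// countInputShards is a placeholder for the master's sizing
// heuristic, e.g. one completion per input file.
func countInputShards() int { return 42 }

func main() {
	// The master itself runs as a Job with Completions=1. It computes
	// the worker count from the input, creates a worker Job of that
	// size, and only exits successfully once all work is done, so the
	// outer Job of size 1 tracks the whole computation.
	shards := countInputShards()
	if err := createJob("mapper", jobSpecLite{Parallelism: 10, Completions: shards}); err != nil {
		panic(err)
	}
	if err := waitForCompletion("mapper"); err != nil {
		panic(err)
	}
	fmt.Println("all map work finished")
}
```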
@alex-mohr can you elaborate more on your Job -> {ForeverPod, RunToCompletionPod, Pet} -> Pod(s) suggestion? My understanding of the three objects you've mentioned between a Job and a Pod is that they're basically the same, IOW all of them represent a job that will be run to completion, and the actual time it runs depends entirely on the author of the Job. The only one that might be stepping out, imho, would be ForeverPod, which could be implemented with the current ReplicationController as well, but again that depends on the Job author, who might create a Job without any constraints which might end up running forever.
CLAs look good, thanks!
## Motivation
Jobs are needed for executing multi-pod computation to completion; a good example
here would be the ability to implement any type of batch oriented tasks or a MapReduce
or Hadoop style workload.
Please remove "or a MapReduce or Hadoop style workload"
@soltysh I was getting at whether Pod is the building block you want for Job, or whether there's an intermediate Thing (regardless of what we actually call it) that might be useful either independently or in other contexts. See the last para of #1624 (comment)
@alex-mohr
@erictune @alex-mohr my understanding of the Job is that it will be a supplement to a ReplicationController, which is responsible for running an app/task/whatever your image does forever, whereas a Job runs to completion. Obviously those runs will be represented by an intermediate object you'll actually be getting status from (in my proposal it's called the JobExec).
Additionally, I agree that adding the ability to assign a "virtual amount" of work per job execution would be valuable. I'll update my proposal after the weekend, if you don't mind.
}

// JobExec represents the current state of a single execution of a Job.
type JobExec struct {
This name shouldn't be abbreviated (so JobExecution). Task is also appropriate. Is there any opposition to using that?
Task might deserve to be a top-level API object in its own right.
We should also implement ActiveDeadlineSeconds on pods (attempts) if there is not already a way to do that.
Task is extremely overloaded in cluster management, but I think it's a good choice here.
Can you explain what you mean by making it a top-level object? As-is it can presumably be incorporated into other types, if that's what you were after?
I think we do already have ActiveDeadlineSeconds on pods: https://github.com/GoogleCloudPlatform/kubernetes/blob/affba42a0520ecf6bab040fb7971284ef9bf450a/pkg/kubelet/kubelet.go#L1358
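For reference, a minimal sketch of setting that existing field on a run-once pod, assuming the `pkg/api` types at the revision linked above; the pod name and image are made up.

```go
package main

import (
	"fmt"

	"github.com/GoogleCloudPlatform/kubernetes/pkg/api"
)

func main() {
	// Limit a run-once pod (one attempt) to 10 minutes of active
	// time; the kubelet kills the pod once the deadline passes,
	// per the kubelet code linked above.
	deadline := int64(600)
	pod := api.Pod{
		ObjectMeta: api.ObjectMeta{Name: "attempt-1"},
		Spec: api.PodSpec{
			ActiveDeadlineSeconds: &deadline,
			RestartPolicy:         api.RestartPolicyNever,
			Containers: []api.Container{
				{Name: "worker", Image: "example/worker"},
			},
		},
	}
	fmt.Println(pod.Name)
}
```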
By "top-level object", I mean it should have a dedicated REST path in the apiserver and storage path in etcd (like replication controller, node, pod, endpoint, see this), not be embedded in the JobStatus like it is in the current revision of this proposal. It also makes sense that Task would have it's own Spec and Status.
What would a task resource add compared to pods?
My intention was to replicate the behavior of ReplicationController with run-once pods in mind, which means I agree with Brian about not creating a top-level object for JobExecution. The only difference between a JobExecution and Pods is that the former groups a certain number of Pods, but that does not deserve its own object, imho.
I agree.
Other than the one suggestion, LGTM.
I'm sure there will be updates to this as you implement, so happy to merge this now.
@erictune let me change MaxParallelism to Parallelism as you suggested, and let's merge it.
Force-pushed from 1fcded7 to 01731a7
@erictune changed MaxParallelism to Parallelism; now it's ready for merge. Thank you!
Once we have this in, I'll update the ScheduledJob proposal (#11980) to match the API proposed here.
Force-pushed from 01731a7 to d978719
Fixed travis failure.
// job should be run with. Defaults to 1.
Completions *int

// Selector is a label query over pods that should match the pod count.
This comment is out of date.
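For context, a sketch of how the fields under review might sit together in the proposed spec; only the Completions and Selector comments come from this diff, the rest (Parallelism, Template, the package, and the Selector type) are assumptions.

```go
package expapi // placement is illustrative

import "github.com/GoogleCloudPlatform/kubernetes/pkg/api"

// JobSpec as assembled from the fragments reviewed above; not verbatim.
type JobSpec struct {
	// Parallelism is the maximum number of pods running at any given
	// time (renamed from MaxParallelism during this review).
	Parallelism *int

	// Completions is the number of successful pod completions the
	// job should be run with. Defaults to 1.
	Completions *int

	// Selector is a label query over pods that should match the pod
	// count, i.e. the pods this job counts as its own.
	Selector map[string]string

	// Template is the pod template from which the job creates pods.
	Template *api.PodTemplateSpec
}
```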
I don't think we need further iterations on the proposal at this point. It's pretty minimalistic. You might need to rebase in order to make shippable pass.
I'll update that comment. I just did a rebase to make travis happy; will look into shippable as well.
I don't see any difference in your shippable.yml from that at master/HEAD, so I just restarted Shippable in case it was a random failure.
Force-pushed from d978719 to 688f3da
I've updated the Selector comment; hopefully Shippable will like me more now.
Shippable failure is caused by a GitHub outage: https://status.github.com/messages
I kicked shippable to get a green status prior to merging.
@pmorie to answer your questions:
You'll still be searching for the particular condition JobSucceeded; if that value is True then you're all set.
Nope for both. This is the difference between phases and conditions. There are no direct phases you can observe a job to be in, IOW no state machine. Conditions, as stated in here, "...represent the latest available observations of an object's current state...". There's an issue regarding that topic I recommend reading.
Can you elaborate on that a bit? Do you mean something like prematurely killing a job if we know it won't reach the desired Completions?
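A sketch of what searching for that JobSucceeded condition might look like in client code; the condition type name is taken from the comment above, while the types below are minimal stand-ins for the proposed API, not real definitions.

```go
package main

import "fmt"

// Minimal stand-ins for the proposed types; field names follow the
// discussion above, not a finished API.
type JobCondition struct {
	Type   string // e.g. "JobSucceeded"
	Status string // "True", "False", or "Unknown"
}

type JobStatus struct {
	Conditions []JobCondition
}

type Job struct {
	Status JobStatus
}

// jobSucceeded reports whether the job carries a JobSucceeded
// condition observed as True. There is no phase/state machine to
// inspect; conditions are the only observations of job state.
func jobSucceeded(job *Job) bool {
	for _, c := range job.Status.Conditions {
		if c.Type == "JobSucceeded" && c.Status == "True" {
			return true
		}
	}
	return false
}

func main() {
	job := &Job{Status: JobStatus{Conditions: []JobCondition{
		{Type: "JobSucceeded", Status: "True"},
	}}}
	fmt.Println(jobSucceeded(job)) // true
}
```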
## Motivation

Jobs are needed for executing multi-pod computation to completion; a good example
here would be the ability to implement any type of batch oriented tasks.
should remove "any" b/c - workflow DAGs or graphs are not supported.
Job controller proposal
To continue discussion started in openshift/origin#3693
// cc @smarterclayton @timothysc @pmorie @bgrant0607 @davidopp @nwendling @derekwaynecarr