Swarm Jobs CLI #2226

Closed
dperny opened this issue Dec 13, 2019 · 8 comments

@dperny
Contributor

dperny commented Dec 13, 2019

Swarm jobs are coming, and the CLI for them needs to be designed.

Behavior

Currently, internally and from the API, Jobs are presented as a different Service mode. This is because they follow mostly the same rules as Services, with the key distinction that the Tasks spawned by a Job are designed to enter a terminal state of Complete and stop executing, instead of being restarted.

There are two job modes, ReplicatedJob and GlobalJob, which are analogous to Replicated and Global services.

A ReplicatedJob has two parameters: the maximum number of concurrent Tasks (MaxConcurrent) to execute (analogous to the Replicas parameter of a replicated Service) and the total number of desired completed Tasks (TotalCompletions). A running ReplicatedJob executes up to MaxConcurrent Tasks at the same time, and starts a new Task every time one is completed, until the sum of the Running and Completed tasks is equal to TotalCompletions.
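The rule above can be sketched as a small function. This is only an illustration of the arithmetic described in this issue, not the swarmkit orchestrator code; the function name and argument names are hypothetical.

```python
def tasks_to_start(max_concurrent: int, total_completions: int,
                   running: int, completed: int) -> int:
    """Return how many new Tasks a ReplicatedJob could start right now.

    Hypothetical helper: it only encodes the rule that at most
    max_concurrent Tasks run at once, and that Running + Completed
    never exceeds total_completions.
    """
    # Slots still free under the concurrency cap.
    free_slots = max_concurrent - running
    # Tasks not yet accounted for by a Running or Completed Task.
    remaining = total_completions - (running + completed)
    return max(0, min(free_slots, remaining))


# MaxConcurrent=5, TotalCompletions=10
print(tasks_to_start(5, 10, running=0, completed=0))  # 5
print(tasks_to_start(5, 10, running=4, completed=3))  # 1
print(tasks_to_start(5, 10, running=3, completed=7))  # 0
```

In the last call, Running plus Completed already equals TotalCompletions, so nothing new is started even though two concurrency slots are free.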

A GlobalJob, like a Global service, has no parameters and is primarily configured through its placement constraints. It executes one Task on each Node that matches the constraints, starting new Tasks only if a running Task fails (enters a terminal state other than Completed).
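As a rough sketch of that rule, the function below picks the Nodes that still need a Task. The state strings ("running", "complete", "failed") and the data layout are assumptions made for the example, not the real swarmkit types.

```python
def nodes_needing_task(eligible_nodes, task_states):
    """Return the eligible Nodes on which a GlobalJob should start a Task.

    eligible_nodes: Node IDs that match the placement constraints.
    task_states: dict mapping a Node ID to the states of the Tasks the
                 job has already placed there (hypothetical state names).
    """
    # A Node is satisfied while it has a Task that is running or complete.
    satisfied = {"running", "complete"}
    return [
        node for node in eligible_nodes
        if not any(state in satisfied for state in task_states.get(node, []))
    ]


states = {
    "node-a": ["complete"],           # done, nothing more to do
    "node-b": ["failed"],             # failed, needs a replacement Task
    "node-c": ["failed", "running"],  # replacement already running
}
print(nodes_needing_task(["node-a", "node-b", "node-c", "node-d"], states))
# ['node-b', 'node-d']
```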

Jobs cannot be updated while in progress, only interrupted and restarted. Any change to the ServiceSpec while a job is in progress will result in stopping all of that job's Tasks and starting a whole new set of Tasks with the new spec. This behavior can be changed in the future, but is present for now. Because jobs cannot be updated, flags specifying update parameters have no effect.

To re-execute a job that has already been executed, the user can pass --force, which will cause the job Spec to change and result in a re-execution of all tasks. Each execution of a Job, whether fresh or interrupting, increments a JobIteration version number. This number is present on Tasks belonging to a Job, and is used to differentiate which Tasks belong to the current run of the job, and which Tasks belong to previous runs.
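To illustrate how a JobIteration number separates runs, here is a minimal sketch. The Task class and its field names are hypothetical stand-ins for the example; only the idea of tagging each Task with the iteration it belongs to comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical, simplified Task record for illustration only.
    id: str
    job_iteration: int
    state: str


def tasks_of_current_run(tasks, current_iteration):
    """Keep only the Tasks that belong to the current run of the job."""
    return [t for t in tasks if t.job_iteration == current_iteration]


tasks = [
    Task("t1", job_iteration=0, state="complete"),  # first run
    Task("t2", job_iteration=0, state="shutdown"),  # interrupted by re-run
    Task("t3", job_iteration=1, state="running"),   # current run
]
print([t.id for t in tasks_of_current_run(tasks, current_iteration=1)])
# ['t3']
```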

For ReplicatedJobs, the --replicas flag can be repurposed to mean MaxConcurrent. Additionally, if no other options are specified, the value of TotalCompletions would be set equal to the value of --replicas, meaning all Tasks desired for the job would execute simultaneously. ReplicatedJobs would require an additional flag, perhaps --total or --total-completions, which could specify the desired TotalCompletions independent of the value of --replicas.
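The proposed defaulting can be sketched as follows. The function is hypothetical and only mirrors the proposal in this issue: --replicas supplies MaxConcurrent, and TotalCompletions falls back to the same value when the extra flag is not given.

```python
from typing import Optional, Tuple


def resolve_replicated_job(replicas: int,
                           total_completions: Optional[int] = None
                           ) -> Tuple[int, int]:
    """Map the proposed flag values to (MaxConcurrent, TotalCompletions)."""
    # --replicas is repurposed to mean MaxConcurrent.
    max_concurrent = replicas
    # Without an explicit total, every desired Task runs simultaneously.
    if total_completions is None:
        total_completions = max_concurrent
    return max_concurrent, total_completions


print(resolve_replicated_job(replicas=5))                        # (5, 5)
print(resolve_replicated_job(replicas=5, total_completions=10))  # (5, 10)
```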

When doing an ls of jobs, instead of displaying two numbers, Running over Desired Tasks, four numbers are needed to express the state of the job: the number of Running Tasks, the number of tasks desired to be Running, the number of Tasks Completed, and the number of Tasks desired to be completed. The resulting list table would look roughly like this:

ID            NAME     MODE            REPLICAS             IMAGE
c8wgl7q4ndfd  mybatch  replicated-job  5/5 (5/10 Complete)  somejobimage
dmu1ept4cxcf  redis    replicated      3/3                  redis:3.0.6
iwe3278osahj  mongo    global          7/7                  mongo:3.3
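A minimal sketch of how the REPLICAS cell in the table above could be rendered from the four numbers. The helper name and signature are hypothetical; the output format is copied from the example rows.

```python
def replicas_column(running, desired_running,
                    completed=None, desired_completed=None):
    """Format the REPLICAS cell for a service or, with completions, a job."""
    cell = f"{running}/{desired_running}"
    # Regular services have no completion counts, so show only two numbers.
    if completed is None or desired_completed is None:
        return cell
    return f"{cell} ({completed}/{desired_completed} Complete)"


print(replicas_column(3, 3))         # 3/3
print(replicas_column(5, 5, 5, 10))  # 5/5 (5/10 Complete)
```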

On the CLI side, there is a smallish UI decision to make about how to represent jobs. There are, as far as I can tell, three approaches:

First Option: a --job flag

A flag, --job, could be added, which specifies that the service being executed is a Job-type service. This flag could be used in conjunction with the existing --mode flag to specify any of the now 4 possible modes. So, to create a job, the full command would be something like docker service create --job --mode replicated myimage

Second Option: Separate Modes

Instead of adding a --job flag, we express jobs as they are: different Service modes. This would simply add replicated-job and global-job as possible values to --mode. The command here would be docker service create --mode replicated-job myimage
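The first two options differ only in spelling, since both resolve to one of the same four modes. The sketch below shows that equivalence; both function names are hypothetical and neither is taken from the docker CLI source.

```python
SERVICE_MODES = ("replicated", "global")
ALL_MODES = SERVICE_MODES + ("replicated-job", "global-job")


def mode_from_job_flag(mode: str, job: bool) -> str:
    """Option 1: combine --mode with a boolean --job flag."""
    if mode not in SERVICE_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{mode}-job" if job else mode


def mode_from_mode_flag(mode: str) -> str:
    """Option 2: --mode alone accepts all four values."""
    if mode not in ALL_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return mode


# Both spellings select the same mode.
print(mode_from_job_flag("replicated", job=True))  # replicated-job
print(mode_from_mode_flag("replicated-job"))       # replicated-job
```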

Third Option: A docker job Sub-command

This is perhaps the most radical option, but it has the opportunity to express the best UI. We would create a separate sub-command, importing most of the important bits from the existing docker service sub-command, and present jobs as a separate user interface object. docker service ls would only list Replicated and Global services, while docker job ls would only list ReplicatedJobs and GlobalJobs. Any of the flags that are not applicable to jobs would not be present on the docker job sub-command, and the total-completions flag would not be present on the docker service sub-command. The command here would be docker job create --mode replicated myimage. In this case, the ls table would look slightly different:

ID            NAME       MODE        EXECUTING  PROGRESS  IMAGE
c8wgl7q4ndfd  mybatch    replicated  5/5        5/10      somejobimage
dmu1ept4cxcf  systask    global      0/0        6/6       someuser/updater:3.0.6
iwe3278osahj  backup     global      3/3        3/6       someuser/backup:3.3

Absent any dissenting opinion, I will go with Option 1 and add a --job flag

@thaJeztah
Member

Need to give this some thought 🤗

docker service ls would only list Replicated and Global services, docker job ls would only list ReplicatedJobs and GlobalJobs.

One thing that could make this problematic (assuming behind the scenes they're all "services") is that it would not reflect "reality": docker job create foo would create a job. That job won't be visible in docker service ls, but docker service create foo would fail because a service with that name already exists.
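The concern can be reproduced with a toy model. This is a hypothetical sketch that assumes, as the comment does, that jobs and services share a single name space behind the scenes; it does not call any real Docker API.

```python
class SharedNamespace:
    """Toy model: jobs and services stored in one name -> mode map."""

    def __init__(self):
        self._by_name = {}

    def create(self, name, mode):
        # Names must be unique across jobs and services alike.
        if name in self._by_name:
            raise ValueError(f"name conflict: {name!r} already exists")
        self._by_name[name] = mode

    def service_ls(self):
        # Filtered view: hides anything running in a job mode.
        return sorted(n for n, m in self._by_name.items()
                      if not m.endswith("-job"))

    def job_ls(self):
        return sorted(n for n, m in self._by_name.items()
                      if m.endswith("-job"))


cluster = SharedNamespace()
cluster.create("foo", "replicated-job")  # like: docker job create foo
print(cluster.service_ls())              # [] (the job is hidden here)
try:
    cluster.create("foo", "replicated")  # like: docker service create foo
except ValueError as err:
    print(err)                           # name conflict: 'foo' already exists
```

The service listing is empty, yet creating a service named foo still fails, which is the mismatch described above.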

@thaJeztah
Member

So, ideally things would really be namespaced, but that's of course a much bigger change.

@thaJeztah
Member

/cc @docker/core-cli-maintainers for visibility

@dovahcrow

Hi, any progress on the jobs cli?

@silvin-lubecki
Contributor

Hello @dovahcrow, this issue should be closed, as the related PR #2262 has been merged.

@thaJeztah
Member

Yes, looks like this can be closed

@rdehouss

Do we know which release of docker includes this new feature?

@thaJeztah
Member

Will be in the upcoming 20.xx release
