wreck: need a way to list running,pending jobs #1456
Comments
grondo
added a commit
to garlick/flux-core
that referenced
this issue
Apr 24, 2018
Improve wreck.joblist with the following additions:

* Retrieve a list of "active" jobs first using the new `job.list` rpc available from the wreck/job module. Inactive jobs are appended to this active list only if the returned list of active jobs does not meet or exceed arg.max. Fixes flux-framework#1456
* Allow filtering jobs by state in wreck.joblist with a states table with two allowable members:
  - include: include *only* states where include[state] == true
  - exclude: exclude states in this table where exclude[state] == true

  These states are passed to the `job.list` rpc for the active job list and directly filtered on kvs job state for the jobs retrieved from the kvs.
* Add an active_only flag to wreck.joblist which returns immediately after retrieving the active job list from the job module. This effectively skips kvs traversal and all complete and failed jobs.
* Add a kvs_only flag to wreck.joblist which skips the retrieval of active jobs from the `job.list` rpc. This avoids an unnecessary rpc when it is known that no active jobs are required to be returned from the function. (Should be used in combination with exclude/include to restrict job states returned from kvs)
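The selection logic described in this commit message can be sketched roughly as follows. This is a Python model for illustration only, not the actual wreck Lua code: `active_jobs` stands in for the result of the `job.list` rpc, `kvs_jobs` for a kvs traversal, and the job dicts, the `state_allowed` helper, the de-duplication by id, and the truncation to the maximum are all assumptions.

```python
def state_allowed(state, states=None):
    """Apply the include/exclude filter from the states table.

    Assumption: a missing or empty include table means "no restriction".
    """
    states = states or {}
    include = states.get("include")
    exclude = states.get("exclude") or {}
    if include and not include.get(state, False):
        return False
    return not exclude.get(state, False)


def joblist(active_jobs, kvs_jobs, max_jobs, states=None,
            active_only=False, kvs_only=False):
    """Return up to max_jobs jobs, with active jobs listed first."""
    result = []
    if not kvs_only:
        # Stand-in for the `job.list` rpc to the job module.
        result = [j for j in active_jobs if state_allowed(j["state"], states)]
        # Skip the kvs entirely if asked to, or if we already have enough.
        if active_only or len(result) >= max_jobs:
            return result[:max_jobs]
    # Stand-in for the kvs traversal; skip jobs already seen as active.
    seen = {j["id"] for j in result}
    for job in kvs_jobs:
        if len(result) >= max_jobs:
            break
        if job["id"] not in seen and state_allowed(job["state"], states):
            result.append(job)
    return result
```

With `kvs_only=True` and an include table such as `{"include": {"complete": True}}`, this model never touches the active list, which mirrors the intent of avoiding an unnecessary rpc.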
garlick
pushed a commit
to garlick/flux-core
that referenced
this issue
Apr 25, 2018
garlick
pushed a commit
to garlick/flux-core
that referenced
this issue
Apr 26, 2018
The wreck prototype is not well suited for real-world use. It doesn't have a concept of a queue (so pending jobs can't be sorted in priority order), doesn't track which jobs are running besides the `lwj.state` kvs entry, and doesn't offer any kind of search functionality (like show all single-node jobs, or jobs by name, etc.).
This means that the job listing tool `flux-wreck ls` is very awkward, since it orders output by job id, which is pretty arbitrary in real-world use.
Like job listing utilities in other schedulers, we may need a way to list pending and running jobs in priority order (with running jobs having de facto infinite priority, so they are always listed first) to get much more mileage out of the wreck system in real-world use.
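As a rough illustration of the ordering asked for here, a listing could sort on a key that treats running jobs as having infinite priority. This is only a sketch: the job dicts, the `priority` field, and the `"pending"`/`"running"` state names are hypothetical, since wreck currently has no notion of priority.

```python
import math


def listing_key(job):
    """Sort key: running jobs first, then pending jobs by descending priority."""
    running = job["state"] == "running"
    # Running jobs get de facto infinite priority so they always sort first.
    priority = math.inf if running else job.get("priority", 0)
    # Negate so higher priority sorts earlier; job id breaks ties.
    return (-priority, job["id"])


def order_for_listing(jobs):
    """Keep only pending and running jobs, in listing order."""
    shown = [j for j in jobs if j["state"] in ("running", "pending")]
    return sorted(shown, key=listing_key)
```

Job id is used only as a tie-breaker here, so the output stays stable for jobs of equal priority.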
There are probably multiple ways to accomplish this, but reading all job states from the kvs is probably contraindicated, so we may want to set up some new soft link dirs like lwj-complete (because everything is solved with another layer of indirection).
We could also just keep a list of not-complete jobs in the `job` module. This might help that module ensure that `runrequest` jobs make it to `running` within some timeout. The wreck tools could query for this list.
If we wanted to sort pending jobs in priority order, then we'd probably have to query the scheduler.
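The idea of keeping not-complete jobs in the `job` module, and using that list to notice `runrequest` jobs that never reach `running`, might look something like the sketch below. It is a Python model under stated assumptions, not the module's actual design: the `ActiveJobs` class, its method names, the default timeout, and the explicit `now` argument are all hypothetical.

```python
class ActiveJobs:
    """In-memory table of jobs that have not yet completed or failed."""

    def __init__(self, run_timeout=30.0):
        self.run_timeout = run_timeout
        self.jobs = {}  # job id -> (state, time the state was entered)

    def update(self, jobid, state, now):
        """Record a state change; drop the job once it is no longer active."""
        if state in ("complete", "failed"):
            self.jobs.pop(jobid, None)
        else:
            self.jobs[jobid] = (state, now)

    def active_ids(self):
        """What a tool could get back when it queries for the active list."""
        return sorted(self.jobs)

    def stalled(self, now):
        """Jobs stuck in runrequest for longer than run_timeout."""
        return sorted(
            jobid for jobid, (state, since) in self.jobs.items()
            if state == "runrequest" and now - since > self.run_timeout
        )
```

Passing `now` explicitly keeps the sketch deterministic; a real module would presumably use its own timer or event loop instead.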
In the wreck-replacement, we'll have to have a good story for how to do this in a distributed manner. I think we already had the beginning of a design in the KVS schema doc, but I haven't looked back at that yet. The purpose of this issue would be to satisfy the minimum requirement for splash.