Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux-jobs: add --json option to dump output in JSON #4994

Closed
grondo opened this issue Mar 14, 2023 · 18 comments · Fixed by #5054
Closed

flux-jobs: add --json option to dump output in JSON #4994

grondo opened this issue Mar 14, 2023 · 18 comments · Fixed by #5054
Assignees

Comments

@grondo
Copy link
Contributor

grondo commented Mar 14, 2023

There seems to be a need for a command line utility that dumps all known information for a job (i.e. all fields available in flux-jobs) to JSON. Currently, we have the deprecated (and hidden) flux job list-ids, but that dumps the raw information returned from the job-list module and thus is not as useful as the processed information available in flux jobs via the JobInfo class.

Therefore, perhaps the easiest path forward would be to add a --json argument to flux jobs which ignored any format and just dumped a JSON representation of the matching JobInfo objects.

@vsoch
Copy link
Member

vsoch commented Mar 14, 2023

bump bump bump! 🙌

I have badly needed this like 8 times!! 😆

I'll also add a note I made in chat - the json dump might include more info than shown in the table. Specifically I'd like the status and state, and return code. Exceptions would be extra nice! A suggested prototype:

{
    "id": 361295786278912,
    "userid": 0,
    "urgency": 16,
    "priority": 16,
    "t_submit": 1668305924.5898263,
    "t_depend": 1668305924.5898263,
    "t_run": 1668305924.6036532,
    "state": "RUN",
    "name": "sleep",
    "ntasks": 1,
    "ncores": 1,
    "duration": 0.0,
    "nnodes": 1,
    "ranks": "3",
    "nodelist": "57d0f60fce2e",
    "expiration": 4821905924.0,
    "result": "",
    "returncode": "",
    "runtime": 0.4636096954345703,
    "waitstatus": "",
    "exception": {
        "occurred": "",
        "severity": "",
        "type": "",
        "note": ""
    }
}

@grondo grondo added this to the flux-core v0.49.0 milestone Mar 23, 2023
@chu11
Copy link
Member

chu11 commented Mar 27, 2023

The last time we discussed this, I think one of the major questions was how to deal with "non existent" fields. e.g. a job that is not yet running does not have a t_run field or a runtime. jobs that haven't failed don't have an exception. etc.

It seems from @vsoch response above, folks would prefer ALL fields and appropriate emptiness/defaults when data is not yet sensible.

If we're going to return ALL fields, possible crazy idea. Could json output just be a special formatting output, i.e. like --format=json. The format would be (excuse possible need to escape stuff) { "id": {id}, "name": "{name}", ...}

Pro: no need to change JobInfo class or refactor any logic in flux-jobs or add a --json option or anything.

Con: when new fields are added, have to update the format. Although, if we add a JobInfo.get_json() or whatever function, that may need to be updated anyways. And a JobInfo.get_json() may be useful to users of the JobInfo class.

@grondo
Copy link
Contributor Author

grondo commented Mar 27, 2023

I had that same thought and came to same conclusion as you, a JobInfo.to_json() or similar method would be more useful.

@grondo grondo self-assigned this Mar 30, 2023
@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

I have a prototype of this. Currently it emits a JSON object with an array of job dictionaries under a jobs, key, e.g.

{
  "jobs": [
    {
      "t_depend": 1680190618.4250562,
      "t_run": 0,
      "t_cleanup": 0,
      "t_inactive": 0,
      "duration": 0,
      "expiration": 0,
      "name": "hostname",
      "queue": "",
      "ntasks": 16,
      "ncores": 16,
      "nnodes": "",
      "priority": 16,
      "ranks": "",
      "nodelist": "",
      "success": "",
      "result": "",
      "waitstatus": "",
      "id": 453219713024,
      "t_submit": 1680190618.4123542,
      "t_remaining": 0,
      "state": "SCHED",
      "username": "grondo",
      "userid": 1000,
      "urgency": 16,
      "runtime": 0,
      "status": "SCHED",
      "returncode": "",
      "dependencies": [],
      "annotations": {},
      "exception": {
        "occurred": "",
        "severity": "",
        "type": "",
        "note": ""
      }
    },
    {
      "t_depend": 1680190612.6026108,
      "t_run": 1680190612.61606,
      "t_cleanup": 0,
      "t_inactive": 0,
      "duration": 0,
      "expiration": 0,
      "name": "sleep",
      "queue": "",
      "ntasks": 2,
      "ncores": 2,
      "nnodes": 1,
      "priority": 16,
      "ranks": "0",
      "nodelist": "asp",
      "success": "",
      "result": "",
      "waitstatus": "",
      "id": 355542761472,
      "t_submit": 1680190612.5900893,
      "t_remaining": 0,
      "state": "RUN",
      "username": "grondo",
      "userid": 1000,
      "urgency": 16,
      "runtime": 11.922030448913574,
      "status": "RUN",
      "returncode": "",
      "dependencies": [],
      "annotations": {
        "sched": {
          "resource_summary": "rank0/core[0-1]"
        }
      },
      "exception": {
        "occurred": "",
        "severity": "",
        "type": "",
        "note": ""
      }
    },
    {
      "t_depend": 1680190600.589544,
      "t_run": 1680190600.603245,
      "t_cleanup": 1680190600.6439605,
      "t_inactive": 1680190600.6472647,
      "duration": 0,
      "expiration": 0,
      "name": "hostname",
      "queue": "",
      "ntasks": 1,
      "ncores": 1,
      "nnodes": 1,
      "priority": 16,
      "ranks": "0",
      "nodelist": "asp",
      "success": true,
      "result": "COMPLETED",
      "waitstatus": 0,
      "id": 153998065664,
      "t_submit": 1680190600.5772986,
      "t_remaining": 0,
      "state": "INACTIVE",
      "username": "grondo",
      "userid": 1000,
      "urgency": 16,
      "runtime": 0.04071545600891113,
      "status": "COMPLETED",
      "returncode": 0,
      "dependencies": [],
      "annotations": {
        "sched": {
          "resource_summary": "rank0/core0"
        }
      },
      "exception": {
        "occurred": false,
        "severity": "",
        "type": "",
        "note": ""
      }
    }
  ]
}

If there are no jobs you still get {"jobs": []}.

Would it be better if, when only one job is requested with flux jobs --json JOBID, a single JSON object for that job was output? Seems like consistency would be better but I thought I'd ask.

Also, as noted by @chu11 above, this implementation just uses the defaults from JobInfo so a lot of "unset" attributes are empty strings. Maybe this is a good place to start?

@chu11
Copy link
Member

chu11 commented Mar 30, 2023

Just skimming the above, it occurred to me that the default return code and exception occurred are empty strings (ie “”). Should we pick a default that is the correct type? Granted it’s hard to come up with a good default for an integer return code and boolean. Maybe it can’t be helped.

Edit: oh and waitstatus

@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

Yeah, I'm not sure either. This approach is the least code. Replacing empty strings with something "valid" will be a bit fussy and will all have to go into the to_dict() function. The second easiest thing would just be to clear any empty values from the dict before returning it.

@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

Here's an active job with empty values removed from the result (just for comparison):

{
  "jobs": [
    {
      "t_depend": 1680193341.4998024,
      "t_run": 1680193341.513417,
      "t_cleanup": 0,
      "t_inactive": 0,
      "duration": 0,
      "expiration": 0,
      "name": "sleep",
      "ntasks": 1,
      "ncores": 1,
      "nnodes": 1,
      "priority": 16,
      "ranks": "0",
      "nodelist": "asp",
      "id": 46138837172224,
      "t_submit": 1680193341.486764,
      "t_remaining": 0,
      "state": "RUN",
      "username": "grondo",
      "userid": 1000,
      "urgency": 16,
      "runtime": 5.172987937927246,
      "status": "RUN",
      "dependencies": [],
      "annotations": {
        "sched": {
          "resource_summary": "rank0/core0"
        }
      },
      "exception": {
        "occurred": false
      }
    }
  ]
}

@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

Calling on @vsoch! We're in dire need of a strong opinion here!

@vsoch
Copy link
Member

vsoch commented Mar 30, 2023

I would typically keep the keys consistent so it’s easier to parse in languages like Go, but if you think it looks better without, those fields could be optional. Another pro is that likely the listings could be large, and if we hide the missing fields it would reduce the size a bit. So probably for that reason alone missing or unknown fields should be removed.

For the structure, I would expect the result without an ID to return a listing of jobs, and with an ID just one job. It’s not perfectly consistent, but it will be easier to parse. As long as the JOBID is a field we wouldn’t need it redundantly as a key too.

@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

Yeah, if it is ok to leave the unknown fields out, let's go with that. Not only for the reasons you mention, but also because sometimes it is not obvious what the unset value should even be.

I'll change the output when a single job is requested to be just one JSON object for the job. Will get to that later today and post a WIP PR. Thanks!

@chu11
Copy link
Member

chu11 commented Mar 30, 2023

The second easiest thing would just be to clear any empty values from the dict before returning it.

This gets complicated even more, b/c when t_run is 0, that's also "empty", we just defaulted it to 0 instead of empty string.

My gut feeling is to leave all of the "empty/default" stuff in. It just seems easier for the average user scripting/programming than having to do the average "does this field exist" check.

But @vsoch's point about less data is important. Dunno where the pro/con tradeoff lies.

Edit: also dependencies is an empty array in above, that's also "empty"

@vsoch
Copy link
Member

vsoch commented Mar 30, 2023

@chu11 I’m conflicted about it for the same reasons. It was the payload potential size that flipped me to err on the side of taking them out.

@chu11
Copy link
Member

chu11 commented Mar 30, 2023

Would it be hard/offensive to have an option for one way or the other? --json and --json-prune? (prune isn't good, just throwing that out there).

@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

yeah, we could remove the unset t_* attributes as well. Something like dependencies is more questionable since an empty array is meaningful in that context (i.e. dependencies is not unset it is set to an empty list)

Would it be hard/offensive to have an option for one way or the other? --json and --json-prune? (prune isn't good, just throwing that out there).

I say we pick one way for now, and if a use case comes up that requires fully populated JSON objects for jobs, we could add an option for that.

@chu11
Copy link
Member

chu11 commented Mar 30, 2023

sounds good. skimming the list, the only "doesnt make sense" ones that are 0 are runtime and expiration, which are populated when the job starts I think. I think duration=0 does make sense being empty, as that's a user input. (IIRC)

@chu11
Copy link
Member

chu11 commented Mar 30, 2023

I say we pick one way for now, and if a use case comes up that requires fully populated JSON objects for jobs, we could add an option for that.

Thinking about this for a bit, shouldn't it be the other way around? fully populated JSON objects is the less-efficient "easy" way, non-fully-populated is the optimized advanced way? thus the latter would get an option?

@grondo
Copy link
Contributor Author

grondo commented Mar 30, 2023

I don't know, it doesn't matter which way is technically easy vs advanced. We pick the way that is the "default" and then people are upset we add an option to change the default. It is so easy to work around the missing fields that I think nobody is ever going to ask for the option.

@vsoch
Copy link
Member

vsoch commented Apr 2, 2023

For others that stumble on this issue, right now I'm using:

$ flux jobs --no-header -o '{status}:{returncode}' ƒj4CgebBV

To get a return code (and you could change the -o parameter to get something else.

@mergify mergify bot closed this as completed in #5054 Apr 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants