Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding options to print all available fields for data/jobs/analytics #71

Merged
merged 1 commit into from Jan 2, 2020

Conversation

@brimoor
Copy link
Contributor

brimoor commented Dec 31, 2019

If the following info isn't enough for you:

$ voxel51 jobs list
id                                    name                 state     archived    upload_date              expiration_date
------------------------------------  -------------------  --------  ----------  -----------------------  -----------------------
898aa65e-aace-4899-85fa-991d69fe088b  onerous-potential-2  COMPLETE  False       2019-12-30 17:17:35 EST  2019-12-30 17:43:54 EST
a9051f54-31be-4844-bf4f-0c10c787755e  onerous-potential-1  COMPLETE  False       2019-12-30 17:17:34 EST  2019-12-30 17:43:54 EST

you can now do this:

$ voxel51 jobs list --all-fields
id                                    name                 state     archived    upload_date              analytic_id                           auto_start    compute_mode    start_date                completion_date           fail_date    failure_type    expiration_date
------------------------------------  -------------------  --------  ----------  -----------------------  ------------------------------------  ------------  --------------  ------------------------  ------------------------  -----------  --------------  -----------------------
898aa65e-aace-4899-85fa-991d69fe088b  onerous-potential-2  COMPLETE  False       2019-12-30 17:17:35 EST  48db3b59-0e08-4303-a855-26cdbbacb885  True          GPU             2019-12-30T22:22:16.000Z  2019-12-30T22:23:49.000Z               NONE            2019-12-30 17:43:54 EST
a9051f54-31be-4844-bf4f-0c10c787755e  onerous-potential-1  COMPLETE  False       2019-12-30 17:17:34 EST  48db3b59-0e08-4303-a855-26cdbbacb885  True          GPU             2019-12-30T22:22:17.000Z  2019-12-30T22:23:45.000Z               NONE            2019-12-30 17:43:54 EST
@brimoor brimoor added the enhancement label Dec 31, 2019
@brimoor brimoor self-assigned this Dec 31, 2019
"--all-versions", action="store_true",
help="whether to include all versions of analytics")
parser.add_argument(
"-a", "--all-fields", action="store_true",

This comment has been minimized.

Copy link
@lethosor

lethosor Jan 1, 2020

Contributor

You switched the behavior of -a here - not sure if that was intentional
(Personally, I prefer -a meaning "list everything" instead of "list all information about some things", but I guess that doesn't really apply to data/jobs as well)

This comment has been minimized.

Copy link
@brimoor

brimoor Jan 1, 2020

Author Contributor

Yeah I switched because I didn't want -a to mean --all-fields for data/jobs but --all-versions for analytics. And I didn't want to not provide a shorthand for --all-fields for data/jobs. So, this is where I wound up

"id", "name", "size", "type", "upload_date", "expiration_date"],
tablefmt=TABLE_FORMAT)
if show_all_fields:
table_str = tabulate(data, headers="keys", tablefmt=TABLE_FORMAT)

This comment has been minimized.

Copy link
@lethosor

lethosor Jan 1, 2020

Contributor

I'm getting a weird order of data fields here - with -a, I get

size type upload_date expiration_date id name encoding

without -a, for comparison:

id name size type upload_date expiration_date

The jobs/analytics tables seem to roughly match regardless of whether I'm using -a or not. I don't see any obvious differences in the code, so I'm guessing this is because the order of dictionary keys isn't defined. (I'm using Python 3.6, which keeps dict keys in insertion order, but the API doesn't necessarily do that, and older versions of Python don't either.)

I think using the order of each query type's supported fields is reasonable - it's not easy to access those through the query classes, but each query instance has a fields property that should be in the order that fields were added to the query, so you could order the columns with that.

This comment has been minimized.

Copy link
@brimoor

brimoor Jan 1, 2020

Author Contributor

With --all-fields, the table is printed with header=keys, which must iterate over the dict items whatever order one's Python distro does.

However, without --all-fields, I am explicitly specifying which keys to show, and in which order. I chose the order there intentionally based on semantics.

I don't think it's an issue that the order changes for --all-fields. "Show me everything" is a different mode of operation.

This comment has been minimized.

Copy link
@lethosor

lethosor Jan 1, 2020

Contributor

I chose the order there intentionally based on semantics.

Yeah, I see where you're doing that. My point is that this output (with ID + name near the end) isn't consistent and could be improved:

$ voxel51 data list -a
size     type             upload_date              expiration_date          id                                    name                                                 encoding
-------  ---------------  -----------------------  -----------------------  ------------------------------------  ---------------------------------------------------  ----------
5.1KB    image/png        2019-12-30 10:18:06 EST  2020-05-31 20:00:00 EDT  5218aed8-403a-4970-b73c-d10aaf440cb8  100-emoji.png                                        7bit
15.3KB   image/png        2019-12-20 16:41:16 EST  2021-02-12 19:00:00 EST  56e4cc5d-0bc4-423a-a6ee-4a57107ef0a2  docker.png                                           7bit
230.3KB  video/mp4        2019-11-19 12:34:09 EST  2022-12-15 12:20:00 EST  3cdfbd81-f180-49dc-89ac-8aeae27673c6  0.mp4                                                7bit
28.9MB   video/quicktime  2019-10-31 12:15:11 EDT  2022-08-15 12:15:11 EDT  bcb1ff43-6681-4ba5-83c1-e0fc01ba859b  200mb 30fps trimed.mov                               7bit
20.1MB   video/mp4        2019-10-01 12:20:35 EDT  2022-07-02 12:20:35 EDT  a0de20ab-0e11-4b9c-9698-42548d5321da  users_6c0a3326-31cc-4201-b758-a55e7d904bcc_data ...  7bit
135.6KB  video/mp4        2019-10-01 10:00:59 EDT  2022-07-01 10:01:00 EDT  ddac4f2a-4ad6-4887-a321-4270a2175e6c  short-cars.mp4                                       7bit

This comment has been minimized.

Copy link
@brimoor

brimoor Jan 2, 2020

Author Contributor

hardcoding headers to get a certain column ordering would create the possibility of missing fields/errors if a new supported field were added to the API. I like this as a simple way to ensure that everything is always shown.

Over time the --all-fields view will probably have so many fields that its not really readable anyway; I see this like a --debug flag, where its about getting all the info dumped in any format

This comment has been minimized.

Copy link
@lethosor

lethosor Jan 3, 2020

Contributor

That's why I was suggesting getting the fields from the query object, since that will contain all the fields that were retrieved, and in an order that makes sense (i.e. the order in query.py). But I'm not sure how easy that is to access here.

Copy link
Contributor

MikeJeffers left a comment

So many fields

@brimoor

This comment has been minimized.

Copy link
Contributor Author

brimoor commented Jan 2, 2020

Many fields. Much wow

@brimoor brimoor merged commit df44fe3 into develop Jan 2, 2020
@brimoor brimoor deleted the show-all-fields branch Jan 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.