Command-line interface #783

buckley-w-david · 2018-04-24T15:22:29Z

As discussed in #775 we're in agreement that the CLI can/should be changed.

I suggest an interface along this line:

Usage: data [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  download
  process
  run
  update
  upload

---

Usage: data download [OPTIONS]

Options:
  --help  Show this message and exit.

Equivilant to the current data.update --just-download


---

Usage: data update [OPTIONS] [SCAN_ARGS]...

Options:
  --scan [skip|download|here]
  --gather [skip|here]
  --help                       Show this message and exit.

Just does the scan and gather operations, without any processing.

---

Usage: data process [OPTIONS]

Options:
  --date TEXT
  --help       Show this message and exit.

Just does the processing operations, assumes update data is in place (does a sanity check before doing anything)

--

Usage: data upload [OPTIONS]

Options:
  --date TEXT
  --help       Show this message and exit.

Just uploads the data to s3, assumes update data is in place (does a sanity check before doing anything)

---

Usage: data run [OPTIONS] [SCAN_ARGS]...

Options:
  --date TEXT
  --scan [skip|download|here]
  --gather [skip|here]
  --upload
  --help                       Show this message and exit.

Almost equivalent to the current interface, will do scan, gather, processing, and uploading. It is just a composition of the other commands.

---

With that configuration the update command passes along SCAN_ARGS to the scan and gather sections of update, formatted so that it is equivalent to the current options dictionary that they receive now.
As it is there it would accept anything, but I would like that to eventually be replaced by a concrete list of options that it can pass along. Maybe we don't even have this intermediate step of passing along what we are given and just go right to that concrete list. Thoughts?

Let me know what you think, if anything needs changing.

The text was updated successfully, but these errors were encountered:

konklone · 2018-04-24T15:43:03Z

I think this is a great proposal! A couple small thoughts:

It seems like data is a little generic in this context. Maybe pulse as a command name instead?
I don't see a lot of value to whitelisting the options it can accept or making them concrete, unless it's a really lightweight process. I would like to avoid adding a lot of argument parsing/cleaning code.

buckley-w-david · 2018-04-24T15:48:33Z

pulse as a command name makes sense to me, it was just data since that's what the package is called right now.

On that subject for the same reason that package name can/should also change. Perhaps to the same name.

buckley-w-david · 2018-04-24T15:57:04Z

As a counter to your comment on whitelisting, as the code is the whitelisting is already taking place, just in the command generation process.

. . .
# Allow some options passed to python -m data.update to go
# through to domain-scan.
# Boolean flags.
for flag in ["cache", "serial", "lambda"]:
    value = options.get(flag)
    if value:
        full_command += ["--%s" % flag]
. . .

Those sections are the existing whitelist, we would just transfer that processing into one central place, and be able to pass concrete lists of arguments instead of these options dictionaries.

Now to play devils advocate to my own argument, I can see some value in making suboptions generic and passed around "as is" in the case of arguments meant to processing once it is refactored into a more general form with "processors" that may have their own arguments, but I think that the scan and gather process is different and will remain mostly static with arguments known at time of writing.

konklone · 2018-04-24T16:08:00Z

I'd love to get rid of that code sample you included too! ;) It's essentially there because those 3 flags are just pass-throughs to domain-scan, rather than Pulse-specific flags. Maybe there's some better way to effect that pass-through.

I'm not going to be strongly opinionated on this stuff, only that we avoid any significant argument parsing code. We recently worked on adding some argument parsing/whitelisting code to domain-scan itself, and the code ballooned substantially and created some instability for a time as we found regressions. I'd rather have things documented clearly and have the interface feel rational and unix-style, and am less excited about making the argument processing itself substantially more complex.

buckley-w-david · 2018-04-24T16:22:15Z

Argument parsing is of course no fun.
Have you heard of/used the click library? It's a really easy to use CLI library that I personally really like (and would be a proponent of using for this).

The (mostly) complete initial implementation of the above proposal using it is in a PR on our fork right now https://github.com/cds-snc/pulse/pull/3/files, take a look if you're curious.

konklone · 2018-04-24T17:12:03Z

click looks great, I'd be happy to give it a go.

gbinal · 2020-06-09T17:05:45Z

Thank you for your contributions to this project over the year. The website has been decommissioned, so I'm going to go ahead and close this issue.

buckley-w-david mentioned this issue Apr 25, 2018

New consolidated CLI #786

Open

gbinal closed this as completed Jun 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Command-line interface #783

Command-line interface #783

buckley-w-david commented Apr 24, 2018

konklone commented Apr 24, 2018

buckley-w-david commented Apr 24, 2018

buckley-w-david commented Apr 24, 2018

konklone commented Apr 24, 2018

buckley-w-david commented Apr 24, 2018

konklone commented Apr 24, 2018

gbinal commented Jun 9, 2020

Command-line interface #783

Command-line interface #783

Comments

buckley-w-david commented Apr 24, 2018

konklone commented Apr 24, 2018

buckley-w-david commented Apr 24, 2018

buckley-w-david commented Apr 24, 2018

konklone commented Apr 24, 2018

buckley-w-david commented Apr 24, 2018

konklone commented Apr 24, 2018

gbinal commented Jun 9, 2020