DM-25385 make a pipetask command based on click #58

n8pease · 2020-06-21T02:18:32Z

No description provided.

timj

Looks okay but I'd like @andy-slac to take a look.

timj · 2020-06-22T23:04:32Z

python/lsst/ctrl/mpexec/cli/opt/show.py

+
+class show_option:  # noqa: N801
+
+    defaultHelp = ("Dump various info to standard output. Possible items are: `config', "


I think this might be better as a """ quoted string.

@timj as far as I can tell doing it that way puts the newlines and whitespace (if each new line does not begin at column 0) in the value of the string (is there a way to prevent that?). Click does it's own text wrapping for help strings and depends on console width. The extra whitespace messes that up. I guess I could format the """ str for narrowest expected console, but I'm not sure that would be a real win over just doing it this way & letting the wookie win.

That's disappointing. The help text is very hard to read in that context. The alternative is to use the triple-quoted string and run it through a helper function that replaces every newline with a space.

ok, that and textwrap.dedent FTW

timj · 2020-06-22T23:07:38Z

python/lsst/ctrl/mpexec/cli/script/build.py

+                              _ACTION_ADD_INSTRUMENT)
+
+
+def build(config_file=[], config=[], delete=[], instrument=[], order_pipeline=None, pipeline_dot=None,


Python advises against using mutables as defaults because those mutables can be changed by this function and that would change the defaults to everyone. Since you are only using these for iteration you can use a tuple as a default value.

timj · 2020-06-22T23:33:35Z

python/lsst/ctrl/mpexec/cli/script/build.py

+        The pipeline instance that was built.
+    """
+
+    class Args:


This is cute. We stuff random properties into an empty class? Couldn't we use a simple dataclass here so that we can do some kind of checking?

I was wanting to avoid managing the opaque args data structure much/any. Of course it's the idiom that comes from the argparse impl, IMHO it should be replaced entirely by member vars to the function or a smaller bunch of classes that contain related data members. I was hoping to avoid making that big of a change right now. But, doing one or the other is probably the Right Thing to do. I'll ponder, it might be worth a discussion on Slack... 🤔

I didn't know any of the history (so this trick was used in the implementation by @andy-slac and you migrated it over?). At the very least a comment like you just made in the class description for Args would be useful.

the args object with a bunch of parser-defined properties is what you get from python's argparse module, e.g. the top example at https://docs.python.org/3/library/argparse.html. It's a common pattern (anti-pattern IMHO) to then pass around that object to your impl functions and let them pluck the members they need out of it.
I can at least add a comment 👍

We'll have to get rid of the argparse.Namespace thing at some point (that is the class that is currently passed around and it does the same thing as Args). I was hoping to get more input from other user about how pipetask API should look like and refactor things accordingly but I do not see any users that need that API, everyone seems to be happy with pipetask CLI 🙁

andy-slac

Some fixes needed, please check comments:

options/actions that build pipeline should be executed in the order in which they appear on command line
unit test for click should be separate from unit test for framework API
small bunch of docstring and help/metavar comments

andy-slac · 2020-06-23T16:45:32Z

bin.src/pipetask2

+
+from lsst.ctrl.mpexec.cli.pipetask import main
+
+if __name__ == '__main__':


Double quotes for all strings.

andy-slac · 2020-06-23T16:53:17Z

python/lsst/ctrl/mpexec/cli/opt/order_pipeline.py

+    def __call__(self, f):
+        return click.option("--order-pipeline",
+                            required=self.required,
+                            help=self.help)(f)


This is a boolean flag, should it have is_flag=True parameter?

andy-slac · 2020-06-23T16:56:05Z

python/lsst/ctrl/mpexec/cli/opt/pipeline.py

+    def __call__(self, f):
+        return click.option("-p", "--pipeline",
+                            required=self.required,
+                            help=self.help)(f)


metavar="PATH"? Or type=click.Path(...)?

andy-slac · 2020-06-23T16:57:58Z

python/lsst/ctrl/mpexec/cli/opt/pipeline_dot.py

+        return click.option("--pipeline-dot",
+                            help=self.help,
+                            required=self.required,
+                            type=click.Path(exists=True, writable=True))(f)


Does exists=True mean it has to exist? This is for output only, it does not need to exist, but will be overwritten if it exists.

it does mean that. I'll fix.

andy-slac · 2020-06-23T16:59:21Z

python/lsst/ctrl/mpexec/cli/opt/save_pipeline.py

+    def __call__(self, f):
+        return click.option("-s", "--save-pipeline",
+                            help=self.help,
+                            required=self.required)(f)


Same here, this is an output file, either give it metavar or type=Path?

andy-slac · 2020-06-23T17:47:26Z

python/lsst/ctrl/mpexec/cli/pipetask.py

+
+
+@click.command(cls=PipetaskCLI, context_settings=dict(help_option_names=["-h", "--help"]))
+@log_level_option()


I guess adding option here means that you have to do:

pipetask2 --log-level=DEBUG build ....

but cannot do

pipetask2 build --log-level=DEBUG ....

for current pipetask we intentionally define all options for each sub-command (so you can do latter but not former). Wt should keep the same behavior for pipetask2

I'm not sure I understand. It was a deliberate decision in butler to move the log level before the subcommand. butler and pipetask should agree. Why do we require that --log-level comes after? We can change our minds on this.

pipetask initially implemented common options to appear before the command and command-specific options after the command (much like git and other tools) but that caused confusion and I had to make it uniform. Announcement for this was here: https://community.lsst.org/t/pipetask-command-line-interface-changes/3923, I don't remember where complaints about initial command line were, on slack or some ticket.

We have to agree on how butler and pipetask works. @TallJimbo do you have a preference?

My understanding is that other tools with subcommands (certainly including git - at least the way I always use it) allow options that are not specific to the subcommand to come either before or after, and if we can do that with Click, there's no problem.

I believe argparse requires such options to always go before the subcommand, which is a UI disaster in this case: it requires the user to remember whether some option is specific to a subcommand, even though it's also used by (i.e. "specific to"), say, 4 of the 5 other subcommands, and hence it's really demanding that the user know the interface of all of the subcommands they aren't trying to call. Telling argparse that all options are specific to subcommands and hence must go after the subcommand was a workaround for this: it's at least not hard to remember "all options go after the subcommand".

Do we know if Click has the same behavior as argparse? If so, I think we have no choice but to do the same workaround, even if it means fighting Click, unless we can make the case that some options are so completely universal that it's easy to remember that they always must go before the subcommand. But I hope it doesn't come to that.

Is there a way to tell Click to allow an option in either place?

is there a way to tell Click to allow an option in either place?

Kind of: we can allow the command (pipetask) and subcommands (e.g. build) to both take an option. If those options need to be mutually exclusive then we have to hand roll that in implementation.

But, I had thought of --log-level and butler --help as being specific to the butler (or pipetask) command. butler --help definitely is, and we do support butler <subcommand> --help. As you'd expect, the later results in totally different output from the former. butler --help will prevent any further subcommands from running (it prints help and exits). The logging setting will affect logging from butler execution as well as execution of the subcommand it calls, it didn't seem like we needed subcommands to specify their own help level.

I don't think it makes sense for some subcommand options to go before the subcommand on the CLI and some to go after it. When there are common subcommands we define each in a shareable way, and it is very simple to add a shared option to the subcommand that accepts it, making it so that all options can go after the subcommand.

if every one else thinks it should be butler command --log-level then change it to that.

allow options that are not specific to the subcommand to come either before or after, and if we can do that with Click, there's no problem.

Click does not seem to want to support e.g. butler create ~/foo --log-level DEBUG in an automatic sort of way, without making --log-level an option of create. There might be a way to code around this but if possible I think we should not fight it.

FWIW, I'm definitely not proposing allowing subcommand-specific options before the subcommand (just the converse). And I agree --help currently works the way it should.

--log-level is one of the few things I might imagine being fine to require before the subcommand, as it could be sufficiently global. I'm not sure it actually is that global, and that's a question of whether logging control for task execution is really the same as logging control for butler (and really whether it's just one logging control for everything). If it is, fine - but if not we may need to actually split this up into e.g. --butler-log-level LEVEL or --task-log-level LEVEL or something more general like --log-level NAME=LEVEL, before we think about what goes before or after the subcommand.

andy-slac · 2020-06-23T18:37:09Z

python/lsst/ctrl/mpexec/cli/script/build.py

+        args.pipeline_actions.append(_ACTION_CONFIG(c))
+
+    for c in config_file:
+        args.pipeline_actions.append(_ACTION_CONFIG_FILE(c))


I think there is a big issue here. The -c and -C actions have to be executed in the order in which they appear on the command line, i.e.:

pipetask build -C override1.py -c param1=val1 -C override2.py -c param2=val2

is not the same as

pipetask build -c param1=val1 -c param2=val2 -C override1.py -C override2.py

I think you new parser implicitly re-orders those options to look like latter case, but we have to keep the relative order of -c/-C actions.

The same applies to other actions, i.e.:

pipetask build ... --delete task --task task

is not the same as

pipetask build ... --task task --delete task

I am pondering that this might be a serious click inhibitor...

I hope it should be possible to do same sort of smart trick in click, similar to what was done in argparse.

Or alternatively we could drop all these options that build pipelines and force everyone to do YAML. I do not know how many people are sill using -t/-c/-C but I remember there was a talk that YAML is our future.

Something for @natelust or @MichelleGower to comment on I think.

I can capture the command-line inputs to the command. It's not built into the api but I can subclass, the Click authors claim it's designed and supported to be extensible this way.

andy-slac · 2020-06-23T18:45:14Z

python/lsst/ctrl/mpexec/cli/script/build.py

+        The pipeline instance that was built.
+    """
+
+    class Args:


We'll have to get rid of the argparse.Namespace thing at some point (that is the class that is currently passed around and it does the same thing as Args). I was hoping to get more input from other user about how pipetask API should look like and refactor things accordingly but I do not see any users that need that API, everyone seems to be happy with pipetask CLI 🙁

andy-slac · 2020-06-23T18:47:28Z

python/lsst/ctrl/mpexec/cli/script/build.py

+    try:
+        pipeline = f.makePipeline(args)
+    except Exception as exc:
+        raise ClickException(f"Failed to build pipeline: {exc}")


raise ClickException(...) from exc is how modern Python wants it to be.

andy-slac · 2020-06-23T18:59:45Z

tests/test_cmdLineFwk.py

+        self.click = False
+        self._testMakePipeline()
+        self.click = True
+        self._testMakePipeline()


I'm not sure that I want click tests here. This unit tests is for the framework API and should not depend on command line parsing. It currently prepares parsed command line options itself, but it does not use existing parser for that. I do not want to mix things, it may well be that API refactoring will happen and it will look very different and may even move to some other common package together with test. In that sense testing click stuff should certainly be in this package but it's better to keep that separate from API testing even if it means some code duplication.

andy-slac · 2020-07-02T22:09:46Z

python/lsst/ctrl/mpexec/cli/cmd/commands.py

+    for pipelineAction in (task_option.optionKey, delete_option.optionKey, config_option.optionKey,
+                           config_file_option.optionKey, instrument_parameter.optionKey):
+        kwargs.pop(pipelineAction)
+    kwargs['pipeline_actions'] = makePipelineActions(ctx.obj)


This looks a bit hackish. Could you instead just define a callback for each option that updates a common list? Or maybe click already has a way to append values from multiple options to the same parameter? Then you could use the same approach as I did in CmdLineParser - use those _ACTION... as type converters for individual options?

The click api really doesn't support any of these approaches. I think the best way would be to pivot to chaining subcommands, and making each of the action options a type of subcommand. Discussing that in Slack now.

andy-slac · 2020-07-02T22:12:14Z

python/lsst/ctrl/mpexec/cli/utils.py

+            pipelineActions.append(_ACTION_CONFIG_FILE(args[i+1]))
+        elif args[i] in instrumentFlags:
+            pipelineActions.append(_ACTION_ADD_INSTRUMENT(args[i+1]))
+    return pipelineActions


Is args an unparsed command line? You may need a bit more care while parsing it this way, there may be complicated cases which simple iteration does not handle.

It's a tuple of command line options and values, not entirely "unparsed". It's checked by code that comes before it, once we've gotten here there shouldn't be anything wildly unexpected. There's unit test code in tests/test_cliUtils.py

I'll add a docstring tho, that should have been done.

Even if it's parsed it could still be potentially problematic in some weird corner cases. Suppose there is a very odd option which can accept any string value and someone passes string "-c" to that option:

pipeline build --foo -c -t TaskClass

That would probably break something.

this allows us to keep pipetask while rewriting it in click. eventually the exising pipetask command will be removed and pipetask2 will be renamed to pipetask.

n8pease force-pushed the tickets/DM-25385 branch from c67f6f6 to 443052a Compare June 21, 2020 02:19

timj approved these changes Jun 22, 2020

View reviewed changes

andy-slac requested changes Jun 23, 2020

View reviewed changes

n8pease force-pushed the tickets/DM-25385 branch 3 times, most recently from 21c08ac to 90f422b Compare June 30, 2020 18:42

n8pease force-pushed the tickets/DM-25385 branch from 90f422b to b20c040 Compare July 2, 2020 19:47

andy-slac reviewed Jul 2, 2020

View reviewed changes

n8pease force-pushed the tickets/DM-25385 branch from b20c040 to ed50a1c Compare July 7, 2020 18:53

n8pease added 5 commits July 13, 2020 13:27

add a pipetask command

1f16c81

add obs_base dependency

e20ebc4

add command, script, and optons

329f824

add pipetask2 command

fd73c17

this allows us to keep pipetask while rewriting it in click. eventually the exising pipetask command will be removed and pipetask2 will be renamed to pipetask.

add package-docs for the pipetask command

a8172c3

n8pease force-pushed the tickets/DM-25385 branch from ed50a1c to a8172c3 Compare July 13, 2020 20:27

change to LogCliRunner for CLI unit tests

a8152fe

n8pease merged commit e4df7f2 into master Jul 15, 2020

n8pease mentioned this pull request Aug 14, 2020

DM-25627 make qgraph and run subcommands for pipetask #65

Merged

andy-slac deleted the tickets/DM-25385 branch November 13, 2020 21:43

andy-slac restored the tickets/DM-25385 branch November 13, 2020 21:43

andy-slac deleted the tickets/DM-25385 branch November 13, 2020 21:43


		class show_option: # noqa: N801

		defaultHelp = ("Dump various info to standard output. Possible items are: `config', "

		_ACTION_ADD_INSTRUMENT)


		def build(config_file=[], config=[], delete=[], instrument=[], order_pipeline=None, pipeline_dot=None,


		from lsst.ctrl.mpexec.cli.pipetask import main

		if __name__ == '__main__':



		@click.command(cls=PipetaskCLI, context_settings=dict(help_option_names=["-h", "--help"]))
		@log_level_option()

DM-25385 make a pipetask command based on click #58

DM-25385 make a pipetask command based on click #58

Conversation

n8pease commented Jun 21, 2020

timj left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andy-slac left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

n8pease Jun 23, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

n8pease Jun 23, 2020 • edited

Choose a reason for hiding this comment

TallJimbo Jun 23, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

n8pease Jun 23, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

n8pease Jun 23, 2020 •

edited

n8pease Jun 23, 2020 •

edited

TallJimbo Jun 23, 2020 •

edited

n8pease Jun 23, 2020 •

edited