New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create dashed aliases automatically #329

Closed
svetlyak40wt opened this Issue Feb 14, 2016 · 19 comments

Comments

Projects
None yet
2 participants
@svetlyak40wt

svetlyak40wt commented Feb 14, 2016

Is it possible to create dashed aliases automatically?

Right now I've added this code to the end of the tasks file as a workaround:

    for item in locals().values():
        if hasattr(item, 'aliases'):
            item_name = item.__name__
            replaced = item_name.replace('_', '-')
            if replaced != item_name and replaced not in item.aliases:
                item.aliases += (replaced,)

But seems, it is a good candidate to be included into the invoke itself.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Feb 17, 2016

To clarify, you mean so that e.g.:

@task
def foo_my_bars():
    pass

is exposed on the CLI as:

$ inv foo_my_bars
# AND
$ inv foo-my-bars

?

If so, yes, I agree that'd be nice, and I actually thought we had issues open for this but I can't find them. Maybe because I'm conflating it with the existing automated underscores-to-dashes we do for task arguments - i.e. def foo(my_cool_arg): ends up generating --my-cool-arg - which is another reason to implement this, I suppose - consistency of a sort.

@svetlyak40wt

This comment has been minimized.

svetlyak40wt commented Feb 18, 2016

Yes, this is what I mean. I'll be glad to implement a patch for this feature.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 13, 2017

The more nontrivial task trees and names I write, the more I want this myself. Many, many instances of @task(name='the-thing-being-decorated') in some internal codebases! May take a stab at it sometime soon.

Shouldn't be too hard (famous last words.) Would be enabled by some config value and thus you'd put something like tasks: auto_dashed_names: true in invoke.yml or similar.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 14, 2017

Actually, need to explicitly jot this down, there's a few ways it could be done:

  • As requested by the OP, where it's a semi-hidden "alternate" method of invoking tasks; they might show up in --list as foo_my_bars, but the parser understands that foo-my-bars means the same thing.
  • The approach I had assumed coming back to this, where task names on the CLI are canonically one format or the other, and the parser and --list are on the same page - it's either dashes or underscores, not both.

Plus/minus breakdown:

  • The first approach means no need to opt in; also that users may use whichever style they prefer (choice.)
  • The downside with that option is that it's easy to confuse users who wonder why e.g. their local support documentation says "call inv foo-my-bars" but inv --list only shows foo_my_bars. (Yes, this will happen - trust me.)
    • Having a canonical name thus unifies representation and invocation and reduces ambiguity/confusion. The site operator gets to choose which representation/style is "house style" and that's how tasks are displayed and invoked.
    • And when it's an option, users may override it in their personal config files, so choice is still arguably preserved.
  • We could list both formats in --list, but that's also confusing ("are those the same tasks, or not??") not to mention noisy/messy.

So I think I'm still coming down on the side of "make it an explicit setting" and that's what I will implement for now, tho I am open to arguments as always.

@svetlyak40wt

This comment has been minimized.

svetlyak40wt commented Jul 14, 2017

Most unix command-line tools accept long options as dashed, not underscored.

Why not make it the only way how they will be named in next "major" release of invoke?

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 14, 2017

Remember that we're discussing task names here, not options! Options have been forcibly dashed for quite a while now :) (so e.g. def mytask(some_long_opt) becomes --some-long-opt.)

Re: making the backwards incompatible change, I might, we'll see. I think just having it be consistent but configurable is nice, though, since in this case the task name thing is aesthetics/subjective. There isn't a real POSIX convention for "subcommand names" like there is for options/flags.

@svetlyak40wt

This comment has been minimized.

svetlyak40wt commented Jul 15, 2017

Oh, sorry.

bitprophet added a commit that referenced this issue Jul 20, 2017

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 24, 2017

While implementing: realizing that we should probably apply the same transformation to collection names as well, for consistency's sake.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 24, 2017

There's also some annoying practical bits to be considered / recorded:

  • There are a bunch of areas this impacts: --list, --complete, --help, as well as parsing.
  • At the moment things are split up by responsibility, meaning that most of the code surrounding tasks and collections (and thus their names) has no access to configuration data (aside from the mostly-irrelevant collection-level config stuff, where we only really care about the config data already loaded from config files.)
  • We could effect this by going low-level and having tasks themselves change their name on instantiation, which would then 'bubble up' and avoid having to modify the parser, display, etc code. However, we'd then need to grant that lower level code access to the config system. Should be possible but feels somewhat gross.
    • Counterpoint: the more we configure things, the more all parts of the system will want access to the early loaded config, so it may eventually be unavoidable.
  • We could instead effect it by staying high-level and performing a transformation in the parser and CLI modules, a blanket s/_/-/g, which avoids the above problem of transmitting config info down the stack.
    • However: this would also require the inverse transformation in a few spots, because the internals of Collection and Task only "speak" the underscored version of names.
    • It also just means the changes are spread more widely - parsing, display, help, completion, etc, as above - and thus it's more work anytime the behavior updates, and more fragile / more prone to missing spots and thus introducing bugs/inconsistencies.

I am leaning towards 'transmit Config downwards, then have the expected name format bubble upwards' as it enables a lot more stuff in the future & should be less code, which is a nice combination. Transformative mappings (the other option above) feel awkward to me.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 24, 2017

Another minor thing: what if someone configures auto_dashes=True but then has e.g. @task(name='something_underscored')? Should their local, specific request for an underscored name win out over a global default preference for dashes? This also applies to aliases.

For now I think I'd say "no", partly because when we're able to reference a Config, we can't easily tell if a given Task's name was determined via a decorator or not (while it's unusual, someone could have constructed their collection out of handmade Task objects). Plus the only reason not to say "no" is if we wanted to allow someone the ability to say "yes, I know all my tasks are dashed by default, but I really want one of them to be inconsistent"...which as noted above is what we want to avoid - inconsistency. It's not user-friendly.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 24, 2017

Overview of how the flow & object relationships work right now, re: passing a Config down the stack:

  • Collections are created either by the user (via constructor or .from_module, within collection files themselves, i.e. nested namespace trees) or by Loader, via the CLI
    • This presents an issue, how do we handle sub-namespaces, when we can't possibly give them the Config object? Seems the lowest we can go is the root namespace.
  • Program.load_collection is what uses the Loader and gets the resulting root Collection back
  • Task-level parsing receives the result of Program.collection.to_contexts (generates ParserContexts)
    • Means references to colls/tasks within the task tree would still use underscored names, but since we haven't really solved the task-calling-from-task problem yet, probably puntable.
  • Collection.task_names is used for --list (Program.list_tasks), and is called recursively
  • Program.print_task_help implements --help <task> and interrogates the parser contexts, not the underlying tasks, so if we update Collection.to_contexts it should make this 'just work'.
  • --complete results in calling invoke.complete.complete with Program.collection, and calls Collection.to_contexts internally, so once again, Collection.to_contexts becomes a foci
    • Though if we want it to accept a config kwarg, that config needs passing into complete() as well as being used in Program
    • Between that and the need to tweak Collection.task_names, it may make sense for this to be implemented via handing a Config into Collection while it's loaded. Even if the guts of Collection.task_names/Collection.to_contexts end up doing translation, that still pushes it down a level or two and would still be more manageable than doing all of it up in Program/Parser/etc?
@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 24, 2017

Also, thinking this ought to default to on...I'd want it to do that by 1.0 and if we're gonna break backwards compat it might as well be now.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 25, 2017

Another minor wrinkle, been writing this assuming we could load all 3 non-runtime configs early in the process, but the per-project config actually requires knowing where the collection was loaded from, as it's defined as living in the same dir as the collection. (A subtle distinction, since for 99% of users, that and "the user's invoking CWD" are the same thing.)

Re: #310, it's moot, since the point of that "get the config early" need is configuring collection loading itself.

Re: this ticket, I could see some folks wanting to configure stuff like auto-dashes via project config, which would necessitate even more change (e.g. adding an earlier "loaded from user's CWD regardless of where collections are looked for and/or discovered in" fifth config file level). For now, esp given I suspect most folks will like auto-dashes being default, I think having folks use system or user conf files to turn off auto-dashes, is fine?

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 25, 2017

Started looking at implementing the actual underpinnings for all this (the "instantiate, and then later, update, Config" behavior, now that I've tinkered with most of the outerlying doc, test, etc bits). Found that we already kinda have it, in the sense of Config.post_init and Config(defer_post_init=True).

These were added to aid in cloning configs, since at that time, the behavior was "do everything in __init__" and this meant you couldn't easily copy a Config (or, for example, turn a Config into a subclass, as in Fabric 2) without loading on-disk config files multiple times. (Perhaps not the worst crime, but still inefficient, and pollutes debug logs, etc...)

At the moment, post_init is simply the trigger for load_files, which performs all actual file loading (system, user, project, and runtime):

  • __init__ prepares the data structures and ancillary attributes for those config levels (things that come in via __init__ kwargs like system_prefix, i.e., configuring how to load the various files)
  • __init__ also sets a specific _found series of attributes telling the object whether or not to try loading each file - None for "no loading's been tried yet", True for "loaded!" and False for "attempted to load and failed / could not find".
  • load_files (actually _load_file) honors and/or sets those markers, providing idempotency.
  • post_init calls load_files (and then merge to update the central, merged config with any new data.)

Thus there are two main flows right now:

  • Normal/default instantiation has defer_post_init==False, calls post_init at the end of __init__, which calls load_files, which loads all config files and sets _*_found non-None. Object is fully set up and further use is assumed to be normal user behavior (tracked in modifications and deletions levels.)
  • Cloning instantiation has defer_post_init==True, does not call post_init right away, ends up with a usable-but-lacking-files Config; then the rest of clone takes care of copying attributes to the clone and finally calling post_init on it.

The key difference here is that in the cloning situation, it wants to avoid all file loads; we only want to avoid some file loads. So simply leaning on defer_post_init + post_init within Executor (which is where the CLI creates its Config) won't entirely help.

Additionally, we don't care just about files, we also want to be able to defer things like setting the overrides config level, which until now has been a stock constructor kwarg. (Granted, setting it post-init is quite simple, if "private".)

A TODO is next, because I think I see the light at the end of the tunnel.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 25, 2017

I think the above all boils down to ripping off the rest of the band-aid and making the remaining private API bits public:

  • Wrap _load_file in tiny per-level wrappers which take no args, and call merge automatically (as the latter should be quite fast given it's in-memory and configs aren't usually larger than human-scale; should we find users with very large configs we can always add a lazy aspect to merge later)
  • Create similar tiny wrappers for the other non-file-based levels (basically just defaults and overrides since collection and env levels already got public methods due to needing to be split out previously.) - not really worthwhile atm
  • Note in their docs that they are effectively idempotent & may be called multiple times with no ill effects (as long as the caller is sure they truly don't need the previous data for that level.)
  • Update the previous private calls with those public ones
  • (Probably) replace load_files with use of the more granular methods, it's probably only used in that one spot.
  • (the crux) Update Program to use defer_post_init and the new public methods (instead of post_init, which will never be called outside of clone; maybe rename those bits?) to enable the functionality desired in these tickets
  • Finish tying off that implementation in case other nasty details jump out...
  • Update the changelog to note the details in API change, if necessary
  • Finish updating Fabric 2 to account.

bitprophet added a commit that referenced this issue Jul 26, 2017

bitprophet added a commit that referenced this issue Jul 26, 2017

bitprophet added a commit that referenced this issue Jul 26, 2017

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 26, 2017

Tweaked the above some:

  • Did away with post_init/defer_post_init, replacing with lazy=True and leverage of the new load methods
  • Updated __init__ so that even when non-lazy, it only loads user/system conf files, as those levels don't care about anything else. It no longer tries loading the project-level config, for example, or the runtime one.
  • Thus we don't actually even need lazy in Program to get what we want, we just need the new public loader methods.
    • I think; Fabric 2's loading of runtime ssh config files may complicate this. Poking...
@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 27, 2017

Actually looking at the details of giving the early, partially loaded Config to the Collection now. One issue is that Collection.configuration is an existing API member that has nothing to do with this version of the config (it's what eventually informs the "collection level" in the config itself, back up at the Program level.) So having both configuration and config around would be quite confusing, even just for developers (vs API consumers.)

One solution is to simply hand in only the specific data required, i.e. Collection(auto_dashes=True), though this doesn't scale if Collection ends up needing to check a lot of config options. Probably puntable though.

@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 27, 2017

Have this mostly working but ran into a fun set of corner cases:

  • With the current design, it's not possible to 'trickle down' this (or any other) kind of setting to subcollections: by the time the top-level namespace is given a subcollection, that subcollection has already done all of its bookkeeping re: assigned names, aliases, etc.
    • Ditto for its subcollections, and so on - the top level namespace is the very last one to do anything.
  • That's not a huge problem because we can simply rub our transformation method on all names as we're emitting them, regardless of whether they're ours or belong to a subcollection. Cool! All set.
  • Except...what if someone wants to disable auto-dashes? Because the functionality is phrased as "if setting is true, perform the substitution", turning it off only turns it off for the topmost collection; subcollections will continue to perform the substitution automatically.
  • Obvious, if non-ideal, solution is to perform the inverse transformation in this case. That will be problematic for anyone who wants to suppress auto-dashes but still wants to name a few tasks explicitly with dashes...but I'm so far down this stupid tangent I don't care.
@bitprophet

This comment has been minimized.

Member

bitprophet commented Jul 27, 2017

Jeez. I think this is done enough now...waiting on Travis to verify (tho since this stuff is all very "pure Python" and doesn't touch the OS, not worried.)

Far more work than I expected, but that's programming for you. Turns out when you have a lot of task namespace related features it means the guts of the code are a lot more complex than you remember...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment