Easy execution of tasks from within Python itself #170

Open
bitprophet opened this issue Aug 20, 2014 · 29 comments · May be fixed by #613

Comments

@bitprophet
Member

Until now the focus has been mostly on CLI execution, but calling programmatically has always been on the table - couldn't find a good ticket so this must be it :) (#112 is sort of a feature request for this but it's more of a support thread, I'll roll it into here instead.)

Currently there are no great options for calling a task from another task:

  • The "main" way to execute presently is Executor(<Collection obj>).execute('name'), which requires one to have access to the right Collection instance.
    • Especially if that Collection is the one the task is itself a part of, this is basically impossible with the current API.
  • Executing a task that lives in an imported collection is easier, one can simply do that_collection['taskname'](args), however:
    • it's still a little fugly (if not a ton)
    • more importantly, it has annoying edge cases when using contextualized tasks - the context objects are different from each other so you can't just hand yours into the other task (though this feels like it should be allowed!)

Rambling thoughts about how to smooth over the above problems:

  • Having some focus Collection when executing is still a must at some point - Fabric's execute('name') can't work here because it relies on implicit global state.
  • The connection between a task's inner block and the Collection it lives in, must be via the Context handed in (or some other argument, but a single forced argument is plenty...).
    • Feels like more unification of (or smoothing over of access to) Context and Collection might help here.
    • At the very least we need a way to easily call a task with the "right" collection-as-configuration tree - similar to what the CLI mechanisms do now. Which may mean shoving more of that code into Collection or Context.
  • The previous bullet is more fuel for removing non-contextualized tasks (Consider making contextualization the default #114)
    • However, the non-contextualized route just means you can't easily call contextualized tasks from non-contextualized ones (w/o making your own Context - which is doable). You can always just treat everything like a regular module-of-functions and call other nearby task functions straight up.

Brainstorm of use cases/scenarios:

  • Call one top level task from another
  • Call a subtask from a top level task
  • Call a top level task from a subtask (this doesn't actually make any sense, by definition a subtask has no idea that it IS a subtask, at best it can be expected to know about its siblings)
  • Call a subtask from another subtask in the same module (though ideally this is exactly the same as the first bullet)
  • Call a top level task as a library call (eg in the shell)
  • Call a subtask as a library call
  • (Potentially a new ticket) Call one or more top level tasks, handing them the results of the parser - basically, control over the existing hardcoded-ish CLI module.
@sophacles

Not exactly sure if this is 100% related, but - I have a handful of task collections bundled by functionality. I prefer not to do inv -c ${long_path_here} --args.... So I created a setup.py script, with a bunch of entry point script directives: build, prod, dev, etc. which all point to the various function names in this file https://gist.github.com/sophacles/c17ce33a14d582dfa268.

It would be nice to have a way of doing this without the hackery.

@bitprophet
Member Author

@sophacles Is there a reason the namespacing features don't cut it for that use case? The entire point of that API is to make it pretty easy to throw around Python modules/Collection objects, import them from wherever, etc; then make them into a task tree.

There's more exploration to do in that space yet (eg selecting a specific subtree at runtime - which would also help your use case if your problem is you don't want to expose the whole tree at once) but it works reasonably well already.

@sophacles

Really it comes down to environment building. The exact gist I linked is slightly truncated. I have like 10 utilities built out of invoke, each of which has something like 5-20 targets (some overlapping at the deepest layers of the tree, but each utility has a specific cognitive task focus -- build, test, deploy generic, deploy specific environments, debug tools, demo setup, and so on).

I could probably train myself always to do:

inv -c /path/to/source/tree/utils/$TASK ....

But I kind of like having them:

  1. as a script available in my $PATH
  2. as tools named cognitively, just like when I'm using busybox... I just link (e.g.) ls and cp rather than calling busybox ls ... or busybox cp file1 file2
  3. in a consistent place (depending on what I'm doing, the source tree where the task files live can vary. It's in one place on my dev machine, and different places in deployment depending on what OS, nfs details, and so on. We're still working out our "one true deployment strategy" so this changes as stuff comes up).

Also noteworthy, several of my targets are most commonly called from places in the system that are not the directory where invoke lives. For example it's nice having the build scripts called from the cwd when I'm debugging. Another example is the deploy scripts callable from wherever I happen to be in the fs when I figure out my error. (In the latter case, I'm often developing my tasks in an editor on my mac, with the system I'm testing deploys on in a virtualbox that has the source tree mounted, or on a remote vm with the source tree mounted in both places via NFS).

@bitprophet
Member Author

Somehow I neglected to click your gist link before. Durr. I see what you're doing there.

I think that's closely related to but not 100% the same as this ticket, insofar as I was intending this to be "I have some stuff at the Python level and want to call a task as a function or similar" and yours is "I want to hijack or duplicate the normal parser -> load collection -> execute task(s), process".

Definitely agree that your use case should be supported (making the CLI guts reusable has been a huge weak spot in Fab 1 and I want to avoid that mistake here). Reckon we could make a spinoff ticket, link your gist, and then I can shit out some thoughts based on existing API and what needed changes might be. For now I'll just make sure it's a separate use case up top.

@sophacles

OK - made a ticket for the functionality I was describing at #173

@sophacles

As for the rest of this ticket - I know invoke is intended to be part of fab2, so perhaps (unless they exist and I missed them) an explanation of how fab2 is intended to utilize invoke would be a good starting point. That way there are practical considerations and actual use cases to generalize from. It certainly helps avoid "architecture astronauting". It could also help avoid some bad cases of fab2 relying on invoke relying on fab2.

@bitprophet
Member Author

Yes, one of the reasons I have been poking Invoke lately is I'm attempting to get an actual Fab 2 ready for people to hammer on. And as you implied, every time I need to take one step forwards with Fab 2 work, it results in me needing to come back here and add or change stuff :) so it's a very good way to force this sort of reusability!

@bitprophet bitprophet added this to the 0.10 milestone Sep 10, 2014
@bitprophet
Member Author

Been having another discussion with @sophacles on IRC about this (or at least, strongly related things).

His core use case in that discussion is (sanitized a tiny bit):

  • Task module uwsgi with some configuration defaults used by task uwsgi.endpoint
  • Other task module deploy containing task deploy.service, wherein deploy.service wants to call uwsgi.endpoint internally
  • That internal call to uwsgi.endpoint needs to supply a context appropriate as if uwsgi.endpoint had been called on the CLI top level - i.e. it must reflect CLI flags, config files, etc. This isn't really under discussion but it's part of this overall ticket's use case.
  • What is under discussion is what the rest of the configuration should look like when uwsgi.endpoint is called - should it contain any values defined within the deploy namespace?

Originally, I was assuming the best approach here is the "pure" approach: if task A calls task B, task B gets a context 100% identical to if it had been called as a top level task. If task A wants to change B's behavior it should just use regular function arguments.

Erich's use case seems to be that he wants to call task B with a "hybrid" context containing elements from task A's namespace, so that the A namespace can change some of those default settings affecting B's behavior. I.e. say the B module defines uwsgi.root as "/", and the A module wants any calls it makes to tasks in B, to act like uwsgi.root was "/otherthing". In this setup, A would have {'uwsgi': {'root': '/otherthing'}} in its configuration dict, and its calls to B would merge that with B's defaults, overriding them.
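The difference between the "pure" and "hybrid" contexts can be sketched in plain Python. This is only an illustration of the merge semantics under discussion, not Invoke's config code: `merge_config` is a hypothetical helper, and the `workers` key is made up to show that untouched defaults survive the merge.

```python
import copy


def merge_config(defaults, overrides):
    """Return a new dict: `defaults` with `overrides` merged on top, recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge one level deeper instead of replacing.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Defaults owned by module B (the uwsgi tasks); "workers" is a made-up extra key.
b_defaults = {"uwsgi": {"root": "/", "workers": 4}}
# Configuration carried by module A (the deploy tasks).
a_config = {"uwsgi": {"root": "/otherthing"}}

# "Pure": B sees exactly what it would see as a top level task.
pure_context = merge_config(b_defaults, {})
# "Hybrid": A's settings override B's defaults for calls made from A.
hybrid_context = merge_config(b_defaults, a_config)

print(pure_context["uwsgi"]["root"])       # /
print(hybrid_context["uwsgi"]["root"])     # /otherthing
print(hybrid_context["uwsgi"]["workers"])  # 4
```

Because the helper copies instead of mutating, B's own defaults stay untouched for any later "pure" call.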

I'm not entirely for or against this at the moment but it is not what I'd originally assumed to be the sensible default.

@sophacles

I deleted my previous two comments as I was still not explaining it sufficiently for my brain to shut up about it. I think this actually explains it correctly:

First off, I view the context as similar to the OS's environment[1]. In an OS, it is up to the spawning process (the caller) to set up the environment in which the child process (callee) runs. In the OS, if the callee manipulates the environment, and in turn spawns another child (callee_1), then callee_1 is beholden to whatever manipulations callee set up, even if they differ from caller's settings.

Each child process should do something like this:

if [ -z "$VARNAME" ] ; then
   export VARNAME="defaultval"
fi
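For readers who would rather see the same semantics from Python, here is a minimal sketch using the standard `subprocess` module. The child program and the `run_child` helper are assumptions made for illustration. One difference from the shell snippet: `setdefault` only fills in a variable that is unset, whereas `-z` also treats an empty value as missing.

```python
import os
import subprocess
import sys

# Child program: apply a default only if the variable is unset, then report it.
CHILD = (
    "import os\n"
    "os.environ.setdefault('VARNAME', 'defaultval')\n"
    "print(os.environ['VARNAME'])\n"
)


def run_child(extra_env):
    """Spawn the child with the caller's environment plus `extra_env`."""
    env = dict(os.environ)
    env.pop("VARNAME", None)  # start from a known state
    env.update(extra_env)     # the caller decides what the child sees
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


unset_result = run_child({})
set_result = run_child({"VARNAME": "from_caller"})
print(unset_result)  # defaultval
print(set_result)    # from_caller
```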

By analogy, the context resulting from a task doing manipulation is the context a called task (a callee) should hold. It will need to do this (or its logical equivalent):

if "varname" not in ctx:
    ctx.update({'varname':'defaultval'})

Second, is the role of Collection in all this. There is all the stuff about collections building containers, including setting container defaults. Using the OS analogy, I view it as similar to creating a wrapper script[2]:

# wrap all the things
if [ -z "$VAR1" ] ; then
    VAR1="default_val1"
fi

if [ -z "$VAR2" ] ; then
    VAR2="default_val2"
fi

# export so the wrapped command actually sees the values
export VAR1 VAR2
exec "$@"

When called from the command line, inv collection_foo.task_bar works (conceptually) the same as sh wrapper.sh wrappee. That is to say, if there wasn't already context setting the relevant variables (by say a config file loaded by inv), then the defaults are put into the context.

None of this is of course the "controversial" stuff... wrappers in wrappers, or collections in collections, are the same, and everyone agrees that is good :).

So, going on to the thing discussed in IRC, and outlined quite well by @bitprophet above, I need to set up a couple definitions:

  • bare task - the result of calling a task function via normal import/call.
  • collection task - the result of calling a task via its collection

In code that is:

from some_other_taskfile import a_task, a_collection

# in this case, a_task is being called as a bare task
@task
def call_bare(ctx):
    a_task(ctx)

# in this case a_task is being called as a collection task
@task
def call_collection(ctx):
    a_collection['a_task'](ctx)

In the OS/shell analogy, let's say we have wrapper.sh as defined above, and wrappee.sh defined as:

# This is trivial I know, but I didn't want to get lost in a complicated shell script
echo $VAR1 > $VAR2

If I just call sh wrapper.sh sh wrappee.sh, I end up with a file called default_val2 and its contents are default_val1\n.

Now say I choose to build another tool, taking advantage of wrappee.sh. If I consider wrappee the same as a bare task, I need to make a caller.sh like so:

export VAR1='caller_text'

if [ -z "$VAR2" ] ; then
    export VAR2="default_val2"
fi

exec sh wrappee.sh

If I don't define VAR2 in caller.sh, the redirect in wrappee.sh fails, because the variable is undefined. This is expected, but it is generally bad programming: the default declaration is repeated and now needs to be updated in two different places, for different use cases of some core component. I'd much prefer to do this for caller.sh:

export VAR1='caller_text'

exec sh wrapper.sh sh wrappee.sh

and make use of the tools and defaults I've already built. This second caller.sh is analogous to a collection task in my mind.

N.B. - although my examples above are in shell, they really illustrate operating system semantics.

Finally: there is an argument that stuff above should just be kwargs to the task, however, I tend to look at my tasks as useful independent units (possibly unified in a collection via a combination of: grouping by concept a la directories, and a combination of sane defaults, ala a wrapper script). Just like with programs, there are some things that are useful as command line arguments, and some things that make sense as part of the environment (usually a decision about what child processes or groups of processes will do).

The tl;dr of this, is that:

collection_foo['task_bar'](context)

should act like:

$ VAR=DEFINITION sh wrapper_foo.sh sh script_bar.sh
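To make the tl;dr concrete, here is a small plain-Python model of that behaviour. `MiniCollection` and `task_bar` are hypothetical stand-ins written for this sketch; they are not Invoke's `Collection` or task objects, and the context is reduced to a flat dict.

```python
class MiniCollection:
    """Hypothetical stand-in for a task collection; not Invoke's Collection."""

    def __init__(self, defaults, **tasks):
        self.defaults = defaults
        self.tasks = tasks

    def __getitem__(self, name):
        task = self.tasks[name]

        def wrapped(context, *args, **kwargs):
            # Like wrapper.sh: fill in defaults only for keys the caller left unset.
            effective = {**self.defaults, **context}
            return task(effective, *args, **kwargs)

        return wrapped


def task_bar(context):
    # Like wrappee.sh: relies on both keys being present.
    return "{} > {}".format(context["var1"], context["var2"])


collection_foo = MiniCollection(
    {"var1": "default_val1", "var2": "default_val2"},
    task_bar=task_bar,
)

caller_context = {"var1": "caller_text"}

# Collection task: the caller's value wins, the missing one is defaulted.
via_collection = collection_foo["task_bar"](caller_context)
print(via_collection)  # caller_text > default_val2

# Bare task: no defaults are applied, so the missing key is an error.
try:
    task_bar(caller_context)
    bare_call_failed = False
except KeyError:
    bare_call_failed = True
print(bare_call_failed)  # True
```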

[1] I am deriving this influence from the fact that invoke is fundamentally a way to write better shell scripts. Better being subjective of course, but the goal is to use a real programming language to handle all the stuff that is icky in shell - such as building parameter lists, path manipulation, having command arguments, and making decisions (if, loop, et al) - before actually spawning some other command.

[2] or using run-parts, or using functions or sourcing other scripts, but they work out very similarly

@sophacles

In a similar vein, using terminology from above:

  • calling a task, whether as a collection task or as a bare task, should definitely run its pre/post tasks
    • pre/post are defined on the task, at task definition time, so they logically are "part" of the task
    • this has some cross over into other issues :(
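A sketch of what "pre/post are part of the task" could look like if they were bound at definition time. The `mini_task` decorator is hypothetical and only models the behaviour proposed in this comment; it makes no claim about how Invoke handles pre/post tasks.

```python
import functools


def mini_task(pre=(), post=()):
    """Hypothetical decorator: binds pre/post tasks at definition time."""
    def decorate(func):
        @functools.wraps(func)
        def run(context, *args, **kwargs):
            for before in pre:
                before(context)
            result = func(context, *args, **kwargs)
            for after in post:
                after(context)
            return result
        return run
    return decorate


@mini_task()
def clean(context):
    context["log"].append("clean")


@mini_task()
def notify(context):
    context["log"].append("notify")


@mini_task(pre=[clean], post=[notify])
def build(context):
    context["log"].append("build")


context = {"log": []}
build(context)         # a plain function call, no executor involved
print(context["log"])  # ['clean', 'build', 'notify']
```

With this shape, the pre/post tasks run no matter how `build` is reached, which is the property the comment asks for.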

@bitprophet
Member Author

Some semi disjointed replies after rereading most of the above:

  • In rereading my most recent older comment, when coming to this fresh I actually disagree with old-me. If task A invokes other-task B, the context task B gets should be related in some form to the context task A holds. It is "the current execution context" after all.
    • This is especially true if we presume calling tasks from other tasks involves manual passing of the literal Context object, which depends on the API that falls out of this. (See below for more on the tweak-then-pass-on idea.)
  • The config overhaul branch is landing soon. It currently only has one "in-code" config source - the "old" collection-driven stuff - so it doesn't actually change this scenario a lot yet.
    • However, it fleshed out the config API a bunch, and the final Config object can currently be tweaked at runtime before handing to a callee - which gives the caller absolute control over the context/config the callee sees (regardless of what the original Collections had configured.)
    • We could make this explicit if necessary, i.e. adding another 'post-overrides' config level/source. Not sure if that's overkill or not (gets real hairy in an infinite-recursion-depth scenario) but it would help bookkeeping.
  • Anyway, so I agree overall with @sophacles' latest comments; I feel like the "bare" call scenario isn't actually useful and I'm honestly unsure why I was pushing for it.
  • This topic does cross heavily into "constructing meta-tasks from multiple sub-tasks" (aka "something stronger than just pre/post tasks") territory, which I was actually just discussing with @offbyone recently.

@jhermann

NOT having read the long thread, I'm still throwing #223 in the ring. 😄

@frol

frol commented Aug 26, 2015

Just in case somebody needs a workaround to execute tasks from other tasks right now, I implemented a really simple way to do so in my tasks by passing a root namespace and a helper function into context:

from invoke import ctask, Collection
from invoke.executor import Executor

# Define tasks
# ============

@ctask
def test_task1(context, arg1):
    print("TASK1")

@ctask
def test_task2(context):
    print("TASK2 BEGIN")
    context.invoke_execute(
         context,
         'test_task1',
         arg1='value1'
    )
    print("TASK2 END")


# Define and configure root namespace
# ===================================

# NOTE: `namespace` or `ns` name is required!
namespace = Collection(
    test_task1,
    test_task2,
    # put tasks or nested collections here
)

def invoke_execute(context, command_name, **kwargs):
    """
    Helper function to make invoke-tasks execution easier.
    """
    results = Executor(namespace, config=context.config).execute((command_name, kwargs))
    target_task = context.root_namespace[command_name]
    return results[target_task]

namespace.configure({
    'root_namespace': namespace,
    'invoke_execute': invoke_execute,
})

@bitprophet bitprophet modified the milestones: 0.11, 1.0 Sep 7, 2015
@pdonorio

Sorry for bothering. This issue is 3 years old, I was wondering if we've reached better ways now to call tasks from other tasks. Thanks!

@bitprophet
Member Author

No, but it remains at or near the top of the priority list; it's definitely something that needs solving before 1.0.

@TimotheeJeannin

Any updates on this?

@pdonorio

it's definitely something that needs solving before 1.0.

@bitprophet I see that 1.0 was released :)

@muhammedabad

muhammedabad commented Jun 4, 2018

With the release of 1.0, is the ability to call tasks from within tasks (similar to v1's execute) available?

EDIT: If the answer to the original question is yes, please also advise if the current task context object can be passed through to other called tasks.

@bitprophet
Member Author

Re: @muhammedabad's 2nd question, see #261 - it needs work still!

Re: this ticket and 1.0: hey, running these projects isn't a cakewalk! ;) I judged it was better to get this & related projects above-board and on semver, than to continually punt until it was perfect.

My gut says we can definitely implement this in a backwards compatible manner, since it's likely to end up implemented as a new method on Context (plus related changes to things like Executor). So it should appear in some 1.x feature release!

@SamuelMarks

Any updates?

I've written a set of configuration management tools around Fabric (and Apache Libcloud).

But can't upgrade it to be Python 2.7-3+ compatible until I upgrade Fabric. But that requires execute at a minimum.

Roadmap?

@jmsuzuki

Any updates?

I've written a set of configuration management tools around Fabric (and Apache Libcloud).

But can't upgrade it to be Python 2.7-3+ compatible until I upgrade Fabric. But that requires execute at a minimum.

Roadmap?

I'm in the same boat. I need execute to migrate from python 2.7 to 3+

@rectalogic

Just in case somebody needs a workaround to execute tasks from other tasks right now, I implemented a really simple way to do so in my tasks by passing a root namespace and a helper function into context:

To get this to work with fabric, and honor the executed task's host list, I had to from fabric.main import program and pass program.core to the Executor:

from invoke import Collection
from fabric import task
from fabric.executor import Executor
from fabric.main import program

# Define tasks
# ============


@task(hosts=["localhost"])
def test_task1(context, arg1):
    print("TASK1", context)


@task
def test_task2(context):
    print("TASK2 BEGIN", context)
    context.invoke_execute(
         context,
         'test_task1',
         arg1='value1'
    )
    print("TASK2 END")


# Define and configure root namespace
# ===================================

# NOTE: `namespace` or `ns` name is required!
namespace = Collection(
    test_task1,
    test_task2,
    # put tasks or nested collections here
)


def invoke_execute(context, command_name, **kwargs):
    """
    Helper function to make invoke-tasks execution easier.
    """
    results = Executor(namespace, config=context.config, core=program.core).execute((command_name, kwargs))
    target_task = context.root_namespace[command_name]
    return results[target_task]


namespace.configure({
    'root_namespace': namespace,
    'invoke_execute': invoke_execute,
})

$ fab test-task2
('TASK2 BEGIN', <Context: <Config: {'root_namespace': <Collection None: test-task1, test-task2>, 'tasks': {'search_root': None, 'collection_name': 'fabfile', 'dedupe': True, 'auto_dash_names': True}, 'run': {'shell': '/bin/bash', 'hide': None, 'pty': False, 'encoding': None, 'in_stream': None, 'replace_env': True, 'echo': False, 'warn': False, 'echo_stdin': None, 'watchers': [], 'env': {}, 'out_stream': None, 'err_stream': None, 'fallback': True}, 'timeouts': {'connect': None}, 'sudo': {'password': None, 'prompt': '[sudo] password: ', 'user': None}, 'inline_ssh_env': False, 'port': 22, 'load_ssh_configs': True, 'user': 'cureatr', 'ssh_config_path': None, 'invoke_execute': <function invoke_execute at 0x7f1993b41f50>, 'connect_kwargs': {}, 'forward_agent': False, 'gateway': None, 'runners': {'remote': <class 'fabric.runners.Remote'>, 'local': <class 'invoke.runners.Local'>}}>>)
('TASK1', <Connection host=localhost>)
TASK2 END

rectalogic added a commit to rectalogic/invoke that referenced this issue Jan 16, 2019
@breisig

breisig commented Sep 18, 2019

Any update on this?

@SamuelMarks

I've switched to the Python 3 compatible fab-classic in the meantime.

jsharpe added a commit to zenotech/MyCluster that referenced this issue Jan 8, 2020
@SamuelMarks

Another year has gone by… any update?

@christian-intra2net

Also greatly missing this feature. I have some subtasks that I would like to either call directly from the command line or from other tasks.

@jcw

jcw commented Nov 7, 2021

Would this feature also help with the following scenario I'm after?

  • an inv live task launches the fswatch utility to track file changes
  • it then continuously watches for changes by responding to output lines
  • on a change, launch another task, which recompiles and runs the target app
  • rinse and repeat, IOW my goal here is to use invoke as a live-coding mechanism

So the question is whether invoke can be made to repeatedly re-launch a task.
Or is there some other way? (with apologies if I'm mis-reading the gist of this issue)
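Independent of whether this ticket would cover it, the loop described above can be sketched in plain Python today. This is an assumption-laden illustration: `relaunch_on_output` and `rebuild` are hypothetical names, and a short child process stands in for fswatch so the sketch runs anywhere.

```python
import subprocess
import sys


def relaunch_on_output(watch_cmd, on_change):
    """Run `watch_cmd` and call `on_change(line)` for each line it prints."""
    count = 0
    with subprocess.Popen(watch_cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            on_change(line.rstrip("\n"))
            count += 1
    return count


# Stand-in for `fswatch <dir>`: a child process that prints two fake paths.
fake_watcher = [sys.executable, "-c", "print('src/a.c'); print('src/b.c')"]

rebuilt = []


def rebuild(changed_path):
    # In a real tasks file, the recompile-and-run step would go here.
    rebuilt.append(changed_path)


events = relaunch_on_output(fake_watcher, rebuild)
print(events, rebuilt)  # 2 ['src/a.c', 'src/b.c']
```

With a real watcher the loop only ends when the watcher process exits, so it would normally run until interrupted.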

@SamuelMarks

@jcw - I think you want https://github.com/gorakhargosh/watchdog

@SamuelMarks

Another year has gone by… any update?
