Make execution model more robust/flexible #21

Closed
bitprophet opened this Issue Aug 19, 2011 · 19 comments

Projects

None yet

1 participant

@bitprophet
Member

Description

  • Right now, relatively simple, calls each function one by one, and for each
    function, runs on each host one by one, and the host list may be different
    per-function.
  • What may make more sense is to specify host list "first", then for each
    host, you call each function in turn (in which case the order of functions
    may matter more than it does now). This would mean that the logic for host
    discovery changes a decent amount.
  • Do we want to allow this to be switched up dynamically? I.e. allow user
    to specify a "mode" (like in the old Fabric) to determine which of those
    two algorithms is used?
  • How do these decisions affect what decorators/etc can be applied? I.e.
    @hosts doesn't make any sense in the latter scenario because there is only
    one global host list per session. (But isn't that the more sensible
    solution? Is it ever useful to execute a fab command line and have the host
    list to change during the session?)
  • See mikeias' fork for one potential
    approach: preserves dynamic checking of host/role lists as attached to
    tasks via decorators, but thus requires one to call a function with the
    task name as a string, instead of simply calling the function itself. May
    be worth the tradeoff, and is definitely food for thought.
  • Also work in the concept of dependencies. Right now users have to explicitly call the other functions, which results in minor DRY violations in some cases at least.
    • Argument against dependencies would be that it could introduce magic, i.e. calling a function that is hooked into by other functions is no longer a straight "just calls this one function only".
    • Could also be a lot more complex to set up, depending on how the stuff above is approached (i.e. if the above is done without sacrificing a lot of simplicity, this almost definitely would)

As a possible stopgap (or maaaybe long term) simply refactor existing connection/execution stuff so it can be exposed to users, i.e. something encapsulating "call function X on Y host list". Which is essentially the same as mikeias' fork (2nd to last bullet point above) --but with the caveat that this method would not be the only or preferred way to call functions, instead just an additional tool for when the default exec model doesn't fit your needs.


Originally submitted by Jeff Forcier (bitprophet) on 2009-07-21 at 03:02pm EDT

Relations

  • Related to #76: Use decorator to define tasks
  • Related to #218: Update get(), env.all_hosts to work for library users
  • Related to #243: Consider not using set() when merging hosts/roles
  • Related to #266: Externally loaded roledefs
  • Duplicated by #286: Use call instead of init for classes that define one
  • Related to #297: Object-oriented hosts/roles/collections
  • Related to #391: Allow tasks to generate new tasks (for other hosts) in the same session
  • Related to #19: Implement parallelism/thread-safety
  • Related to #20: Rework output mechanisms
@bitprophet bitprophet was assigned Aug 19, 2011
@bitprophet
Member

Jeff Forcier (bitprophet) posted:


One potential idea: have @hosts not just add the host list attribute, but actually return a function that runs the inner function on the given host list.

I.e. right now, when people want to (for some reason) run a sub-task on a given host list, they have to manually do for host in [desired host list]: with settings(host_string=host): subtask(). That could be easily stuck inside the decorated function instead.

However, the tough part is having this mesh with non-sub tasks, because then you have the fab main loop AND the outer wrapping function both trying to do that loop. Would need to detect how the function is being used, which might be messy.


on 2009-11-09 at 07:53am EST

@bitprophet
Member

Andy McCurdy (andymccurdy) posted:


IMHO, Fabric's main loop dealing with hosts and/or roles is what complicates things. One way to provide more flexibility would be to have Fabric's main call a run_task() function, passing a reference to the task about to be run. run_task() would simply inspect the task's .roles and .hosts attrs and the environment to build the correct host list, favoring anything from the command line first, then anything defined on the task. Once a suitable host list is built, run_task would iterate over each host, calling the task.

With this approach, tasks that wish to call subtasks could instead call run_task() passing the task function.

run_task would most likely need to act like a context manager, noting any settings in the environment that it changes, then reverting back to the previous settings once the task execution completes.

I'm working on prototyping this in my github fork now.


on 2009-11-20 at 02:06pm EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


Hey Andy, finally got around to reviewing your prototype. This general idea is nearly identical to what mikeias did in his fork in June. I guess you didn't look at his stuff? I think the only real difference (aside from names and argument order) is that yours works off a newer version of the main codebase since you did it more recently, and your subroutine expects the function object itself to be passed in versus the function name.

FWIW I like your approach better as it's slightly more direct -- though either way, the approach of having to do execute(subroutine) instead of just subroutine() is still something I wish was avoidable. Especially once one gets into having to specify args/kwargs tuple/dictionaries (so it's then execute(subroutine, (a, b, c), {'d': 'foo', 'e': 'bar'}) instead of subroutine(a, b, c, d='foo', e='bar')).

But, I don't think it really is avoidable; the decorator idea I had in that earlier comment has its own problems and it does not really solve the whole problem either.

Where to go from here:

  • Settle on a name. I think execute feels better than run_task, it keeps better with the other operations which are mostly single-word actions. Do we have any other good candidates? call perhaps?
  • Possibly extend this idea to allow override (or append?) of host/role list via arguments -- some use cases need this (i.e. dynamic host/role lists generated at runtime, in such a manner that callable/lazy roles don't quite do the job).
    • Perhaps something like execute(subroutine, hosts=['what', 'ever'], overwrite=True), versus execute(subroutine) would simply do the usual merging of global and decorator lists, and execute(subroutine, hosts=['what', 'ever']) would append those two hosts to the global + decorator lists.
    • Yes, this looks a lot like main_run_task, so we may just merge the two together and rename kwargs somewhat.
  • Close this ticket, and spin off the dependencies problem into its own ticket (meta-tickets like this are an extremely bad habit of mine -- targeted, "the problem is X and it can be solved by Y" tickets are much, much better.)

on 2009-12-13 at 06:37pm EST

@bitprophet
Member

Andy McCurdy (andymccurdy) posted:


Thanks for taking a look at this Jeff. There is one other idea I had, but initially dismissed it would have taken things in a much different direction. We could make each task be a subclass of a Task, a common base class. The only requirement on a Task would be for the user to implement a run() method. We'd then implement __call__(*args, **kwargs) in the base Task class. Inside of __call__, the host list and other options would be determined, then __call__ would call run(), forwarding along any arguments.

A subtask would be executed like: MyTask(arg1, arg2, kwarg3='three'), where MyTask is a class object.

The other thing I like about this approach is that it's extremely clear what are tasks and what are helper functions -- ultimately providing a convenient task registry.


on 2009-12-13 at 07:51pm EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


That's an interesting idea, and I have been wanting to eventually take Fabric from a mostly subroutine/function oriented structure to a more object-oriented one (though that had been mostly for internals and not so much for fabfiles themselves.)

It's not perfect, largely because it makes things significantly more complex for the users/fabfiles:

  • now it's class definition + methods instead of just functions
    • that's an extra LOC or two per task definition
    • plus everything interesting is now indented an extra level
  • subtasks would actually need to instantiate and THEN be __call__ed, unless I'm missing something -- so that's another extra LOC for every subroutine call
    • which is, if anything, even worse than just moving from subroutine() to execute(subroutine).

However, it's definitely the case that moving to classes and __call__ offers greater flexibility, and could also make it easier to implement a few other related features, if it's not already a requirement for them:

  • the currently-sorta-implemented "call this task only once, ever" behavior, @runs_once
  • and the not-yet-implemented dependency resolution feature that's part of this ticket right now.

I'll have to think about it a bit more, though (as you apparently are) I'm definitely still leaning towards the execute/run_task approach for the time being.


on 2009-12-13 at 11:51pm EST

@bitprophet
Member

Andy McCurdy (andymccurdy) posted:


We could make BaseTask's __init__ perform the setup and call self.run(*args, **kwargs) too. That'd avoid a LoC as you could just just say MyTask().


on 2009-12-14 at 02:56pm EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


That's true, I guess. Of course then (nitpick alert!) we either have task names start looking like CamelCase or we ditch PEP-8 to some degree, neither of which I really like.

At any rate, original point stands -- classes are an interesting avenue to consider but not where I think I want this to go in the near term.


on 2009-12-14 at 03:08pm EST

@bitprophet
Member

Andy McCurdy (andymccurdy) posted:


Upon further reflection, I think I prefer the class-based approach. My original hesitation was mostly in the interest of preserving backwards-compatibility.

I believe the class-based approach encourages more pluggable tasks that can ultimately be redistributed, much like Capistrano's recipes. Class-based tasks aren't required to make that happen, but I do think they are ultimately more elegant than a similarly-functional module with a handful of functions defined in it.

If we did choose to go down the class-based route, there's at least one easy way to be "almost backwards-compatible". A @task decorator could be introduced that simply wraps a Task class around a task function. Existing users would only need to add the decorator to their current task functions. This also solves the (IMHO) problem that every callable currently gets registered as a task, which I believe should be less magic in favor of being more explicit.


on 2009-12-14 at 03:16pm EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


Copy-n-paste of IRC discussion between myself and Andy follows below (wrapped to not break layout.) No real decisions were reached, just some back and forth about pros/cons.

It's still clear that classes are powerful and can be used to implement a number of different features (dependencies, calling subtasks, etc) but I remain unconvinced that it's the right way to go for the short term future.

However, Andy will be prototyping some of his ideas so I can see what it looks/feels like, since I do think this bears serious investigation for the future.

15:17 < andymccurdy> bitprophet: we're playing redmine tag.
15:19 < bitprophet> I noticed
15:20 < bitprophet> otoh it's sometimes good for that sort of back and
forth to be recorded (the ticket for getting this channel logged
notwithstanding...)
15:20 < andymccurdy> heh, yup.
15:21 < bitprophet> just read your last comment
15:21 < bitprophet> are you familiar with #76?
http://code.fabfile.org/issues/show/76
15:21 < andymccurdy> so do you dislike class-based tasks more out of the
backwards-incompatability issues or the extra-LoC that they'd introduce?
15:21 < andymccurdy> i'm not -- looking now
15:22 < bitprophet> ok. just noting it since it's related to the point you
raised re: task declaration
15:22 < andymccurdy> i ended up writing a @notatask decorator in my
fabfile.
15:22 < andymccurdy> that just injects the function in fabric's internal
function list.
15:22 < bitprophet> my dislike is, I guess, more the latter than the
former. backwards compat is easy enough to retain by implementing both
behaviors for a version or two, usually.
15:22 < bitprophet> ha
15:23 < bitprophet> clever
15:23 < bitprophet> but yea, I just (so far) have enjoyed fabric's spartan
fabfile feel. "just functions." moving to classes kinda kills that pretty
harshly.
15:23 < andymccurdy> My hope is that ultimately users would see less LoC
because they'd get to re-use Tasks more.. potentially that others have
written.
15:24 < bitprophet> Well, there are other avenues for reusing tasks, both
my (eventually faster) pulling mine and others' code into contrib, as well
as stuff like http://code.fabfile.org/issues/show/99
15:25 < bitprophet> I don't quite see how classes vs functions makes reuse
truly _that_ much easier
15:26 < bitprophet> either way it usually comes down to "from third_party
import some_name" + "some_name()"
15:26 < andymccurdy> well, I could see having a set of tasks around SCM.
one for git, one for svn, etc.  sure, you could make a module and name the
functions the same in each, but *ugh*
15:26 < bitprophet> ha
15:27 < bitprophet> maybe this is because I'm not fully focused right now
(@work) but how would classes make that better?
15:28 < andymccurdy> if i wanted to do some extra special thing right after
I checkout out some code, I could simple subclass the git task and add my
logic to it's checkout method.
15:29 < bitprophet> ahh, so this is really about dependencies, then?
15:29 < andymccurdy> sure, that's one piece of it.  *everyone* feels the
need to have different production and staging setups.
15:30 < bitprophet> environments, another angle. hm
15:30 < bitprophet> it's still clear that classes open up a whole new world
of possibilities via subclassing/overriding/etc
15:30 < andymccurdy> i think that's what capistrano does fairly well at.
they provide a bunch of out-of-the-box recipes that get you 90% there, then
you assemble them together to meet your needs.
15:32 < bitprophet> for sure. they're not using classes either, though, and
offhand my assumption was that our eventual implementations of
(environments, dependencies, etc) would at least partly resemble their
function based approach (in Python, probably decorators as well)
15:33 < andymccurdy> hrm, (looking at the git class now in cap) I guess
it's not a task by itself.
15:33 < bitprophet> honestly the main things I balk at are the thought of
doing "fab FooTheBar" instead of "fab foo_the_bar" (call me caseist) and
the extra indents/LOC.
15:34 < bitprophet> Yea, that stuff is deploy strategies, which is class
based
15:34 < bitprophet> but it's not quite the same thing as we're talking
about here. (Sorta-kinda, though. deploy strats *are* a major part of cap
use.)
15:35 < bitprophet> (also, re: my balking: no, those are not enormous
unsurmountable concerns -- I'm just picky.)
15:35 < andymccurdy> the actual task name would be easy to go from
camel-case to lower-with-underscores doing task registration.
15:35 < bitprophet> certainly. it requires the extra step, but that's
possibly unavoidable anyway, and does bring us back to a non-camelcased
dialect when referring to tasks
15:36 < bitprophet> so if you have the time/energy, please feel free to
experiemnt again but along these lines instead, I'd be interested to see
what fabfiles start to look like with class tasks.
15:36 < andymccurdy> the LoC/indents, ya I somewhat agree -- although let
users choose by just using the @task decorator on their simple function.
15:36 < andymccurdy> sure
15:36 < bitprophet> I do still think that the refactoring you did already
is the way to go for the short term, though.
15:37 < andymccurdy> i should be able to have a class-based branch done
after lunch
15:37 < bitprophet> and, yea, @task, though that's sort-of another issue
(though they're all intermingled of course.)
15:37 < andymccurdy> yup, totally.
15:37 < bitprophet> I really need to get back up to speed with backporting
1.0 features and knocking off smaller tickets so I can push out an 0.9.1
etc.
15:38 < bitprophet> so I can start focusing on this sort of fun
experimental stuff :)
15:38 < bitprophet> (plus the other big issues -- I really want to give
Paramiko agent forwarding capabilities...)
15:38 < andymccurdy> and don't forget about threaded task execution, hah!
15:39 < bitprophet> bah, who needs threads?!
15:39 < bitprophet> (yes, that too. all the big stuff.)
15:39 < andymccurdy> i'm gonna grab some food.  i'll make another branch
afterwards with the class-based stuff.
15:39 < bitprophet> (I'm hoping that moving to non-threaded IO, i.e. the
greenlets/coroutines/etc approach, will make that a lot easier to
accomplish. will have to see.)
15:40 < bitprophet> ok, cool. thanks for taking the time to experiment like
this, it's appreciated!
15:40 < andymccurdy> ya that sounds like a good idea.

on 2009-12-14 at 06:54pm EST

@bitprophet
Member

Andy McCurdy (andymccurdy) posted:


http://github.com/andymccurdy/fabric/tree/class-based-tasks

Initial stab at class-based tasks. Tests pass. Probably want to accept more named parameters on in Task.init, like hosts, roles, etc which could alter the environment prior to running run(), then restoring it before leaving.

Simple python functions can still be full fledged tasks by simply adding the @task decorator to them. There's an ordering issue with @task and @roles|@hosts where if hosts or roles gets applied first, task will not carry them over to the class. This won't be difficult to fix, but wanted to get something for you to look at sooner.


on 2009-12-17 at 02:06pm EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


Popping this in the warts list, more and more folks are assuming that Python-called subtasks will preserve their own specific host list, and it definitely feels like a major wart.

In the meantime point people to http://lists.gnu.org/archive/html/fab-user/2010-02/msg00001.html which has a semi-solution already in it.


on 2010-02-08 at 10:51am EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


Another approach using decorators (haven't looked but I assume it's similar to the approach I outlined at the top of the ticket) can be found in this mailing list post by Dan Pope.

Right now I favor the more straightforward/clean function approach, but both should be considered as decorators could preserve the ability to call subtasks without a wrapper.


on 2011-02-03 at 10:02am EST

@bitprophet
Member

Jeff Forcier (bitprophet) posted:


Think this will go well with parallelism as a 1.2 staple.


on 2011-03-10 at 12:13pm EST

@bitprophet
Member

Hacking on this now as an excellent pairing with the pretty-much-done work from #19. See 21-execution-enhancements, though hopefully this will get done quick enough that it'll merge into master RSN.

@bitprophet bitprophet added a commit that referenced this issue Sep 29, 2011
@bitprophet bitprophet First stab at doc-driven dev for execute().
Re #21.

Next up:
* Make execute() actually implement what is stated in the docs
* Quite possibly update that to wrap everything in Task, then move
  interesting bits into Task.run or w/e
* Probably add more examples to the docs?
6d0910a
@bitprophet
Member

So I forgot GH doesn't display the entire commit message...copy/paste:

Next up:

  • Make execute() actually implement what is stated in the docs
  • Quite possibly as factored-out instance methods in Task, then move interesting bits into Task.run or w/e
  • Probably add more examples to the docs?

EDIT: Actually, we can't use run, because it's already part of Task's published API, so subclasses are defining it already. So we'll need another function, implemented in Task, which does things with self.run.

Perhaps Task.execute, though that might be confused with execute-the-function.

Don't spend too much time fretting about this -- we can change the API in 2.0. This and Task should be considered stopgap measures so people can get stuff done, not super long term design decisions. Hopefully.

@bitprophet bitprophet added a commit that referenced this issue Oct 4, 2011
@bitprophet bitprophet execute() now honors hosts kwarg
Re #21
46e76a7
@bitprophet bitprophet added a commit that referenced this issue Oct 4, 2011
@bitprophet bitprophet Move get_hosts and deps into tasks.py
New tests not needed at this time, host/role/exclude all tested
against get_hosts() already.

Re #21
44f6bad
@bitprophet bitprophet added a commit that referenced this issue Oct 5, 2011
@bitprophet bitprophet Safely execute task with its own connection env vars
Specifically:

* Replace side-effect-using `interpret_host_string` with `to_dict`/`from_dict` (for use with `settings`)
* Cleanly set up an env dict for use when calling the actual task
* Update tests appropriately

Re #21
757a0ae
@bitprophet
Member

About halfway, maybe moreso, through the initial transition to putting everything inside execute (after which I plan to cut it up further and probably do the "wrap all tasks in a Task" thing, making some of the functions methods if applicable.)

Currently stuck on some stupid Fudge thing where somehow my test_tasks.py tests have the last one called bleed its expected-call Fake object, into one of the test_utils.py tests, so its implicit verify call barfs (about the task test!)

I have some work in a side branch re #203 about trying Fudge 1.x in case this is a bug in Fudge 0.9 (since I can't see any obvious fault in my code and this is now the 2nd time I've run into this) so I may revisit that after all instead of bashing my head against the innards of a dependency lib.

The yaks, they don't stop from getting hairier.

@bitprophet
Member

Figured it out; @with_fakes only clears recorded calls, not expectations of calls as I was assuming. The only reason this did not result in tons of horrific expectation bleed is that FabTest did clear_expectations() in setUp. So the last FabTest case to run will bleed its expectations out.

I "fixed" this by adding clear_expectations to tearDown too. Think I should re-examine those two methods, and the use of @with_fakes:

  • I should really only need clear_expectations in tearDown
  • I could also probably stick verify in tearDown and remove all use of @with_fakes
  • Make sure any code using Fudge is inside FabTest.

EDIT: That didn't work so well, generates a number of additional verification fails even in places where it should've worked fine. Clearly I still don't quite grok the innards of how the Fudge stuff works. Not a huge deal, I quickly reverse-tested some random tests to make sure that things still work with the double-clear_expectations, and they do, so no harm no foul.

@bitprophet
Member

At this point everything inside the main execution section of main.py is in tasks.py and execute. Works in some basic serial and parallel tests. Next up is to do some reorganization and more vigorous testing (and possibly write more tests too, though I already wrote a number of new ones.)

@bitprophet bitprophet added a commit that referenced this issue Oct 7, 2011
@bitprophet bitprophet Add changelog entry re #21 8c32fba
@bitprophet
Member

This is merged into master now.

@bitprophet bitprophet closed this Oct 7, 2011
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment