Task dependencies / before/after hooks #452
Comments
|
Our development team has migrated our search engine content build cycle from Ant to Fabric. We love Fabric, but miss Ant's dependency-based tasks. (That's pretty much the only thing we miss.) If we were to build some data for deployment to the search cluster for indexing, running it currently looks something like this:
fab build production deploy
The script downloads data from a feed and transforms it (build) for loading into the search engine (deploy) as part of our production cluster (production). We'd love it if we could just say:
fab production deploy
and have the build task triggered automatically to check whether the feed has been updated and the content needs to be rebuilt. Our scripts are smart enough to do delta builds when small parts of the feed change, but Fabric lacks an elegant way to trigger other tasks as prerequisites.
The loop is an interesting suggestion. I'll ponder an approach to imagine what a solution would look like for us and whether it's generalizable. I suspect a decorator could accomplish the task. A rough pass (needs a better name) might be:
@task
def extract():
    # extract datasets from data feed
    pass
@chain(extract)
@task
def transform():
    # transform extracted datasets into documents for later indexing
    pass
@chain(transform)
@task
def build():
    # using this approach, the other tasks do the heavy lifting, the only
    # code needed in build is the pre-deploy steps
    pass
@chain(build)
@task
def deploy():
    # some deployment code in here
    pass
So, if we call deploy, it will call build. build shouldn't do anything unless needed, so I think this does make the code more complicated. Hrm. This makes me think of a pipeline routing or flow-based programming issue. Maybe a better approach is to provide a chain or pipeline decorator that accepts multiple tasks for a default task:
@pipeline(extract, transform, load)
@task(default=True)
def all():
    # do entire cycle
    pass
I'll mull this over and see if I can think of an elegant approach without adding unwarranted complexity. |
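For illustration only, here is a minimal sketch of how a `chain`-style decorator like the one proposed above could behave. It assumes plain Python functions standing in for Fabric tasks; `chain` is the name floated in the comment, not an existing Fabric API, and the `calls` list exists only to show execution order.

```python
import functools

def chain(*prerequisites):
    """Hypothetical decorator: run each prerequisite before the wrapped task."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for prerequisite in prerequisites:
                prerequisite()  # prerequisites take no arguments in this sketch
            return func(*args, **kwargs)
        return wrapper
    return decorator

calls = []  # records execution order, for demonstration only

def extract():
    calls.append("extract")

@chain(extract)
def transform():
    calls.append("transform")

@chain(transform)
def build():
    calls.append("build")

@chain(build)
def deploy():
    calls.append("deploy")
```

Calling `deploy()` here runs `extract`, `transform`, `build` and `deploy` in that order. The sketch does no bookkeeping, so a prerequisite shared by two tasks would run once per task that names it.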
|
Maybe it's just a different kinda task, a |
|
@sartian -- thanks for the comments. I haven't sat down to mull over my personal API preferences (tho I did just update the description to be in line with my current thoughts re: necessity of this feature overall) but it might end up as simple as just a decorator taking a
In my memory, this is usually implemented in one of two ways: the subtask saying "I should be called before/after tasks A, B, ..., Z", or (probably better/more intuitive, if sometimes requiring more LOC) the "main" tasks saying "Before calling me, call tasks X, Y, ...". Offhand, I think this might work best as another set of kwargs to
So to transform your example:
@task
def extract():
    # extract datasets from data feed
    pass
@task(requires=[extract])
def transform():
    # transform extracted datasets into documents for later indexing
    pass
@task(requires=[transform])
def build():
    # using this approach, the other tasks do the heavy lifting, the only
    # code needed in build is the pre-deploy steps
    pass
@task(requires=[build])
def deploy():
    # some deployment code in here
    pass
### example part 2
@task(default=True, requires=[extract, transform, load])
def all():
    # do entire cycle
    pass
Possibly doing the usual "I can take a single non-iterable value too" option so base-cases like the above could drop the wrapping square-brackets. More thoughts/concerns:
|
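A rough sketch of the `requires` idea, including the "single non-iterable value" shortcut, might look like the following. Everything here is a stand-in: `task` is a toy decorator, not Fabric's, and `run` is a hypothetical resolver that assumes each task should run at most once per invocation and that the dependency graph has no cycles.

```python
def task(func=None, requires=()):
    """Hypothetical stand-in for a task decorator accepting ``requires``."""
    # Accept a single callable as well as an iterable of callables.
    if callable(requires):
        requires = [requires]
    requires = list(requires)

    def decorator(f):
        f.requires = requires
        return f

    # Support both bare ``@task`` and ``@task(requires=...)``.
    return decorator(func) if func is not None else decorator


def run(target, done=None):
    """Run ``target`` after its requirements; return names in execution order."""
    done = [] if done is None else done
    for dependency in target.requires:
        run(dependency, done)
    if target not in done:  # assumption: each task runs at most once
        target()
        done.append(target)
    return [t.__name__ for t in done]


log = []  # records execution order, for demonstration only

@task
def extract():
    log.append("extract")

@task(requires=extract)  # single value, no square brackets
def transform():
    log.append("transform")

@task(requires=[extract, transform])
def build():
    log.append("build")
```

With these definitions, `run(build)` executes `extract` once even though both `build` and `transform` name it. A real implementation would also need cycle detection, which this sketch omits.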
|
More questions about how this should/would work (thanks @sjmh):
|
|
based on #628
Your critique re: decorator vs kwarg is valid (again #628). I had originally implemented that with kwargs and only moved to the decorator because I didn't like how it looked:
But I do think you are right, for consistency's sake, that it should be available as a kwarg. Would you be opposed to having the @Depends decorator basically be a shortcut, effectively providing 2 ways to do the same thing? The decorator would just set the attr on the Task class. I can also appreciate not wanting to have 2 ways to handle the same problem. Lemme know. |
|
Yea, I'm torn on having both: it's convenient but also bloats the API (which is never a great argument re: a single addition, but add up a bunch of "more than one way to do it"...). Another worry is how to handle somebody giving both options on a single task (unlikely but possible):
What should the dependency list be here? If
For now, especially because I'm going to revisit this in Invoke soon and that'll replace this implementation for Fab 2 (eventually...and it's entirely possible I'll make the same decision made here), I'll let you decide what you think is best. If you go with both, I think the error message is the safest route to take -- any automatic merging/overriding feels like it'd be confusing to somebody, especially given how decorator application order isn't super intuitive. |
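To make the "error message" option concrete, here is a small sketch under stated assumptions: `depends` and `task` are toy decorators written for this example (neither is the code from #628), and dependencies are stored in a `requires` attribute on the function. Whichever decorator is applied second refuses to merge with the first.

```python
def _set_requires(func, deps):
    """Attach ``deps`` to ``func``, refusing to merge with an earlier declaration."""
    if deps and getattr(func, "requires", None):
        raise ValueError(
            "%s declares dependencies twice; use either requires= or @depends"
            % func.__name__
        )
    if deps or not hasattr(func, "requires"):
        func.requires = list(deps)
    return func


def depends(*deps):
    """Hypothetical decorator form."""
    return lambda func: _set_requires(func, deps)


def task(requires=()):
    """Hypothetical kwarg form of the same feature."""
    return lambda func: _set_requires(func, requires)


def a():
    pass

def b():
    pass

@task()
@depends(a)
def fine():
    pass  # only one source of dependencies, so no conflict


def declare_conflicting_task():
    """Defining a task with both forms raises ValueError at decoration time."""
    @depends(a)
    @task(requires=[b])
    def conflicted():
        pass
```

Because the check lives in a shared helper, the error is raised regardless of which decorator sits on top, which sidesteps the decorator-ordering confusion mentioned above.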
|
I was thinking of implementing after/before hooks, and in my first search I found this issue. Any resolution? |
|
@gilgamezh Invoke has been worked on a lot recently and is going to be out soon, with some experimental APIs for this kind of thing. (In fact I was just discussing it with another user on IRC yesterday.) Please keep an eye on the Twitter/mailing list/etc as we'll announce once Invoke has a 0.9.x release and is ready for feedback. Thanks! |
|
Great! thanks. |
|
Was this included in any recent release / dev branch? |
|
@andresriancho Kinda-sorta, Invoke has it, but Fab 1.x doesn't. Fab 2 will be based on Invoke (or Invoke can be used by itself if you're not using Fab's SSH kit). See the roadmap for details. |
|
Great, nice to see that 2.x is on its way 👍 |
|
Do we have a hook to execute a function just once before iterating over all the hosts and executing a task on them? I wanna print out some summary when a task is called, like "I am going to execute this on etc. Is there a way to do this right now? |
|
This is implemented in Invoke already (happened last year-ish) so it'll definitely be available for Fabric 2.0 users. |
|
@bitprophet should this be working on the v2 branch right now? I can use |
|
@wardi Think that's a bug/TODO, just confirmed it on my end, suspect it has to do with Fabric 2 having a custom executor subclass. Should be relatively easily fixed, will see if I can bang it out real quick. |
|
Ah yea - seeing the code in question (& its comments) reminds me why this was deferred: it's a more complex question in Fabric than in regular Invoke, because of host parameterization. Ye olde "task
In Fabric, if
So right now, Fabric 2 decided to not even try answering that question and ignores pre/post. There's a bunch of existing commentary about this in pyinvoke/invoke#461 and I even have a branch open with unfinished work towards a call-graph, pure-dependency-based setup. But barring completion of that, I'll have to decide on the most-applicable interpretation of the above problem as it applies to |
|
FWIW, if I had to just "do it well enough" today, I'd go for the "pre-tasks once, main task N times across hosts, post-tasks once" version of the scenario, since the other scenario can be effected well enough by users calling their pre/post tasks manually within the main task's body. I'm curious if that's what you (@wardi) would have expected? Or were you even using
Of note, this also brings pyinvoke/invoke#261 into play (especially as pertaining to the hosts-list factor, since it's not even technically part of a task's signature but is a sort of metadata). Complicated! |
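The "pre-tasks once, main task N times across hosts, post-tasks once" ordering can be sketched in a few lines. This is not Fabric 2's executor: `execute` below is a hypothetical helper, hosts are plain strings, and pre/post tasks take no arguments.

```python
def execute(main, hosts, pre=(), post=()):
    """Run pre-tasks once, ``main`` once per host, then post-tasks once."""
    for hook in pre:
        hook()
    results = {}
    for host in hosts:
        results[host] = main(host)  # main task is parameterized by host
    for hook in post:
        hook()
    return results


log = []  # records execution order, for demonstration only

def announce():
    log.append("pre")

def cleanup():
    log.append("post")

def deploy(host):
    log.append("deploy@" + host)
    return "ok"
```

Calling `execute(deploy, ["web1", "web2"], pre=[announce], post=[cleanup])` runs `announce` once, `deploy` for each host in turn, and `cleanup` once at the end. The other interpretation (pre/post per host) would move the two hook loops inside the host loop.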
|
@bitprophet thanks! Ignoring pre/post seems to make sense. Refuse to guess and all that. I've gone the simple route and just call the pre function and implement my own check based on a context variable to see if it has already been run in the same session. |
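A guard along the lines described above could be sketched as follows. The details are assumptions: a plain dict stands in for the session context, and `run_once`, `prepare`, `thing1` and `thing2` are hypothetical names chosen for this example.

```python
def run_once(context, func):
    """Call ``func(context)`` unless this context has already recorded it."""
    already_run = context.setdefault("_already_run", set())
    if func.__name__ in already_run:
        return False
    func(context)
    already_run.add(func.__name__)
    return True


counter = {"prepare": 0}  # counts calls, for demonstration only

def prepare(context):
    counter["prepare"] += 1
    context["prepared"] = True

def thing1(context):
    run_once(context, prepare)
    # ... main work for thing1 would go here ...

def thing2(context):
    run_once(context, prepare)
    # ... main work for thing2 would go here ...
```

Running `thing1` and then `thing2` against the same context calls `prepare` only once; a fresh context would trigger it again.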
|
@wardi I've actually just pushed a change to the v2 branch that implements what I said I would do, above, re:
It's still subject to the other problems at the Invoke level re: pre/post not getting all the args/kwargs from the main task, and so forth, but at least now the behavior is functionally identical instead of just not existing. (This also applies to "running
Let me know how that works for you, if you've time to pull it down and check it out! |
|
not exactly, I was hoping that I could use
@task
def this_first(c): ...
@task(this_first)
def thing1(c): ...
@task(this_first)
def thing2(c): ...
to do
My specific "this_first" uses are "have the user enter their ssh password to set up c.config" and "pull all local git repos before they are deployed to various targets (in procedures given by thing1 and thing2)". But as I said, I'm fine with implementing this in plain python instead of using pre/post if this isn't a common use case. |
|
Ah yea, that's part of the other outstanding issues re: real dependency graph tracking. The current setup is old and very literal/naive. So, keep an eye on pyinvoke/invoke#461! |
1.3 features execute() which allows for meta-tasks which call other tasks and honors their individual @hosts, etc settings. This solves many of the more complicated multi-task invocation problems people have run into.
However, there are some arguments in favor of having a flipped-around version of a meta-task, namely specifying that Task A should always execute Task B before or after it runs. (Versus creating a Task C consisting solely of execute(Task B); execute(Task A).) This is commonly referred to as "dependencies" or a "call chain", as seen in Rake/Capistrano/etc.
One such argument is just that "good" refactored design of tasks means you may often want to call 'subtasks' stand-alone, so "just put your pre-requisite in a meta-task" doesn't help. E.g. setting up an "environment" (in the staging/production sense) before executing.
I was -0 on this before because technically you can solve this solely with execute, but in more complex cases that starts to require a "runner" task that takes the real task to run as an argument. At that point, decorating a number of subtasks with "requires X to be run beforehand" is more Pythonic.
Closely related: the idea of having a queue instead of a loop in fabric.main.main, as mentioned in #391. In such a system, dependency decorators might act by simply loading the referenced items into the queue before/after the decorated task.
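As a sketch of the queue idea only (not how fabric.main.main works), the following assumes a hypothetical `hooks` decorator that records before/after tasks, and a toy `main` that expands each requested task into the queue and then drains it.

```python
from collections import deque

def hooks(before=(), after=()):
    """Hypothetical decorator: note tasks to enqueue around the decorated one."""
    def decorator(func):
        func.before, func.after = list(before), list(after)
        return func
    return decorator


def expand(task):
    """Return ``task`` surrounded by its before/after tasks, recursively."""
    ordered = []
    for dependency in getattr(task, "before", []):
        ordered.extend(expand(dependency))
    ordered.append(task)
    for dependency in getattr(task, "after", []):
        ordered.extend(expand(dependency))
    return ordered


def main(requested):
    """Drain a queue of tasks instead of looping over the requested list."""
    queue = deque()
    for task in requested:
        queue.extend(expand(task))
    executed = []
    while queue:
        task = queue.popleft()
        task()
        executed.append(task.__name__)
    return executed


def build():
    pass

def notify():
    pass

@hooks(before=[build], after=[notify])
def deploy():
    pass
```

Here `main([deploy])` executes `build`, `deploy`, then `notify`. The sketch neither deduplicates repeated tasks nor detects cycles; both would need a decision in a real design.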