
execute() on different host does not change back to original host when done #568

Closed
jsdalton opened this Issue February 24, 2012 · 18 comments

7 participants

Jim Dalton Jeff Forcier Antti Kaihola Rory Geoghegan Corry Haines Mailgun Team Ruslan Lutsenko
Jim Dalton

Here's a quick demonstration of the issue:

fabfile.py:

from fabric.api import *

@task
@hosts('localhost')
def test():
    run("echo hi")
    execute(test_other_host)
    run("echo hi")

@task
@hosts('otherhost')
def test_other_host():
    run("echo howdy")

The output is:

$ fab test
[localhost] Executing task 'test'
[localhost] run: echo hi
[localhost] out: hi

[otherhost] Executing task 'test_other_host'
[otherhost] run: echo howdy
[otherhost] out: howdy

[otherhost] run: echo hi
[otherhost] out: hi


Done.
Disconnecting from otherhost... done.
Disconnecting from localhost... done.

The problem is that when test_other_host() is done executing, the remaining tasks in test() are executing under that host's context, not the original context it was decorated with. This appears to happen regardless of how the hosts are specified (e.g. via the command line, explicitly as an arg to execute(), etc).

I'm running the latest version of Fabric:

$ fab --version
Fabric 1.4.0
ssh (library) 1.7.13

So I'm not sure whether it's always been like this or whether this is a regression.

Jeff Forcier
Owner

This is, currently, on purpose -- execute() is used internally for calling all CLI-mentioned tasks, and tasks have historically been able to mutate env for the benefit of later tasks (e.g. setting env.hosts dynamically.)

Unfortunately, as you found, this results in unintuitive behavior when used by hand, because execute()'s own mutations to env (env.host_string and so forth) then "bleed" out after it completes its run.


This has come up a number of times lately so clearly we need to address it. Offhand I think a quick solution would be to change its behavior to default to "cleaning up" env at the end, with a kwarg toggle controlling that, so the internal call can still behave normally.
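In pseudo-Python, that quick solution might look something like this -- a sketch only, not Fabric's actual internals, with env stood in for by a plain dict and the kwarg name (clean_revert) purely illustrative:

```python
# Hypothetical sketch of the "clean up env by default, kwarg to opt out"
# idea. The internal CLI call path would pass clean_revert=False to keep
# the historical "task mutations persist" behavior.

env = {"host_string": "localhost", "parallel": False}

def execute(task, *args, **kwargs):
    clean_revert = kwargs.pop("clean_revert", True)
    snapshot = env.copy()          # remember env as it was on entry
    try:
        return task(*args, **kwargs)
    finally:
        if clean_revert:           # default: undo everything
            env.clear()
            env.update(snapshot)

def switch_host():
    env["host_string"] = "otherhost"

execute(switch_host)
assert env["host_string"] == "localhost"   # mutation cleaned up

execute(switch_host, clean_revert=False)
assert env["host_string"] == "otherhost"   # toggle lets it bleed through
```

The downside, as discussed below, is that a blanket revert also throws away mutations the task made on purpose.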

Antti Kaihola

I ran into the same issue with this code:

@task
def update_appserver():
    execute(remove_appserver_from_load_balancer, env.host, host='loadbalancer')
    sudo('service apache2 stop')
    # ... et cetera ...

def app1():
    env.hosts.append('app-1')

$ fab app1 update_appserver

The service apache2 stop command is being executed on loadbalancer instead of app-1.

Rory Geoghegan

I would argue that jsdalton's example is a feature, not a bug, because if you run a task that modifies the hosts, it should modify the hosts.

However, I don't think the behaviour akaihola found is acceptable. If one specifies a host for one command, it should only stand for that command.

Jeff Forcier
Owner

Yes, there's no one clear solution: a blanket "execute can't modify env" breaks the existing "tasks can prep env for later tasks" functionality, and the current behavior results in too many edge cases and frustrating, hard-to-troubleshoot state bleed (a new one popped up today, #592, where parallel tasks cause all tasks after them to run in parallel).

Possible solutions:

  • Revert to a default cleanup/revert of env at the end of execute, but add another execute kwarg saying "allow this execution to modify env vars X, Y and Z".
    • Helps with literal execute calls when the user knows they want to modify env
    • Doesn't help with "default" calls (i.e. internal use of execute to call CLI-given tasks) since there is no good way to indicate that "env vars OK to modify" list
      • Except maybe another decorator, which is kind of shitty.
  • Have execute revert only the values it itself modifies in env. So the task itself may still mutate env (the use case we are trying to preserve) but the bookkeeping execute performs to get stuff done (which is what is causing all the error reports re: changing host, changing parallel flag etc) gets properly cleaned up.
    • Don't see any downsides to this as long as it's technically possible to implement.
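A sketch of that second bullet, assuming it is implementable -- names and the key list here are illustrative, not Fabric's actual internals, and env is again a plain dict:

```python
# execute() reverts only the env keys it touches for its own bookkeeping
# (host_string, parallel, ...), so a task's deliberate env mutations still
# persist for later tasks.

env = {"host_string": "localhost", "parallel": False, "hosts": []}
BOOKKEEPING_KEYS = ("host_string", "parallel")

def execute(task, host=None):
    saved = {k: env[k] for k in BOOKKEEPING_KEYS}
    if host is not None:
        env["host_string"] = host      # execute's own bookkeeping
    try:
        return task()
    finally:
        env.update(saved)              # undo only execute's changes

def prep():
    env["hosts"].append("app-1")       # task-level mutation, preserved

execute(prep, host="loadbalancer")
assert env["host_string"] == "localhost"   # bookkeeping reverted
assert env["hosts"] == ["app-1"]           # task's change survives
```

This would fix both the host-bleed reports above and the parallel-flag bleed in #592, without breaking tasks that set env.hosts for later tasks.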
Corry Haines

I think the "clone env" kwarg would be pretty helpful. This problem actually toasted our production systems as the serial deploy went completely parallel when we upgraded fabric (see #592)

Jeff Forcier
Owner

You weren't able to test it on a staging environment first? :( We try hard to make things work well, but bugs do happen.

Re: solution: I'm not sure why I listed the kwarg idea first, as it doesn't fully solve the problem. I think the second bullet point would, and I'll be trying that out ASAP since this general issue is affecting a lot of folks in various ways. Keep an eye here for updates -- I'd love it if you could try out any fixes (preferably not on production, though...)

Corry Haines

At the time we did not really have tests for this part of deployment. That will probably change :)

And my last comment was actually in response to an earlier comment of yours. The second bullet does look better.

I would be happy to put some testing in on the change, however it works.

Jeff Forcier bitprophet referenced this issue from a commit March 20, 2012
Jeff Forcier Changelog re #568 95f67ea
Jeff Forcier
Owner

@tabletcorry @rgeoghegan @akaihola @jsdalton If you could all give the latest 1.4 branch or the latest master a shot, and see if it fixes this for you, it'd be appreciated. It appears to work for me, in the following contrived example:

from fabric.api import *
import time

@parallel
def run_parallel():
    run("uptime")

def run_serial():
    run("uname -a")

def go():
    env.hosts = ['about', 'five', 'different', 'servers', 'here']
    execute(run_parallel)
    print "Entering serial execution"
    time.sleep(5)
    execute(run_serial)

Running that fabfile on a 1.4.0 checkout exhibits the bug (run_serial runs in parallel); running it on the 1.4 branch works as intended (run_serial runs in serial.)

There are tests for this in the test suite now as well, but as usual real world testing is invaluable.

Jim Dalton

I ran the demonstration of the issue that I had whipped up on my local machine (OS X 10.7) and it works great now. I'll upgrade to the next point version when it's released and I'll let you know if any issues arise in real world use.

Mailgun Team

The problem with execute() remains for @parallel decorator.
In the example below, execute() will try to run pull() task in parallel mode even though @parallel is applied only to prepare().

@task
def deploy():
    @parallel
    def prepare():
        pass

    @task
    def pull():
        pass

    execute(prepare)
    execute(pull)

Maybe this should be a separate ticket. I'll create a new one.

Jeff Forcier
Owner

@mailgun Two things: 1) Nesting functions like that is...pretty strange and isn't really supported. What happens when you break them all out to the top level in your file? 2) Can you run the example code I provided and confirm whether or not it works for you? Finally, 3) which exact version of Fabric / which Git SHA are you using for these tests?

Hm, that's 3 things, I was never good at math :(

Mailgun Team

Jeff,

The code above is the most amazing way to use Fabric, if you ask me! Each of those inner tasks has its own list of hosts, some run in parallel and some are serial, and all are neatly encapsulated by the parent task, visible to fab -l. This works very well in v1.3.

Your sample from above:

  • Does not work on v1.4 installed using "pip install Fabric==1.4"
  • Works on 1.3
  • Works if I git-clone your Github repo and do "python setup.py develop" on it.

So it appears you fixed it? :)

Jeff Forcier
Owner

Yes, that's what I was trying to say above: this is fixed in Git but has not been released yet, because I wanted some of you to test it out first ;) Thanks for clarifying and testing it!

Corry Haines

The change in 1.4 fixes my contrived example (in #592) and looks to fix the issues that I had.

Ruslan Lutsenko

Hi, I have a weird problem with Fabric 1.4.0 that seems related to this thread. I need to execute different types of subtasks inside a single Fabric task: some need to run in serial mode, others in parallel. When I execute a parallel task after a serial one, Fabric gets stuck in a limbo state. Can someone point me to a known issue for this, or tell me if I'm doing something wrong? Otherwise I'll open a new issue. Here's an example of the code I use: http://pastie.org/private/4p6n4301c1hscwgjm2zva and here's the debug output I get when running fab --show debug --fabfile=test.py main: http://pastie.org/private/xth5kr5cm3jpx3yzc2waa

_HOSTS = [ "friendslowslave4", "friendslowslave5" ] 

@task
def showHostnameSerial():
    cmd = "hostname"
    if run(cmd).failed:
        return True
    return False

@task
@parallel
def showHostnameParallel():
    cmd = "hostname"
    if sudo(cmd).failed:
        return True
    return False

@task
def main(): 
    results = execute(showHostnameSerial, hosts=_HOSTS)
    print results
    results = execute(showHostnameParallel, hosts=_HOSTS)
    print results
Ruslan Lutsenko

Tested on Fabric 1.5a (93e426b); same story:

[friendslowslave4] Executing task 'showHostnameParallel'
job queue appended friendslowslave4.
[friendslowslave5] Executing task 'showHostnameParallel'
job queue appended friendslowslave5.
job queue closed.
Job queue starting.
Popping 'friendslowslave5' off the queue and starting it
Popping 'friendslowslave4' off the queue and starting it

Jeff Forcier
Owner

Hi @lruslan -- if you experienced this in the released 1.4.0 then it probably isn't related to this issue directly. I have seen reports of parallel mode hanging, it's unclear if it's related to running serial tasks beforehand like you're claiming. Will move this to a new ticket.

Jeff Forcier bitprophet closed this March 30, 2012
Jeff Forcier
Owner

@lruslan - please see #600


I'm closing this now since there've been some confirming positive reports re: the fix.
