Investigate "piping" data directly from subprocess to remote process #289

Open
bitprophet opened this issue Aug 19, 2011 · 5 comments

@bitprophet
Member

Description

In other words, a Fabric version of:

$ tar czvf - /path | ssh hostname "cat > file.tgz"

Perhaps API'd as:

pipe("tar czvf - /path", "cat > file.tgz")

or similar. (It would be cute if we could be clever and have local and run/sudo handle this themselves, e.g. local('whatevs') | run('remote whatevs'), but that's probably more magic than we ought to use.)

The point is to support cases where somebody needs to stream data to the remote end instead of storing up a local file, transferring it, and then doing something with it on the remote end.
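
For comparison, the shell pipeline above can be reproduced outside Fabric with plain subprocess. This is only a sketch of the idea, not a proposed Fabric API: pipe_commands is a hypothetical helper name, and it assumes a local ssh binary does the transport.

```python
import subprocess


def pipe_commands(producer_argv, consumer_argv):
    """Stream the producer's stdout straight into the consumer's stdin.

    Returns (producer_exit_code, consumer_exit_code).
    """
    producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_argv, stdin=producer.stdout)
    # Drop the parent's copy of the pipe's read end so that only the consumer
    # holds it; the producer then gets SIGPIPE if the consumer exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc
```

A call mirroring the shell example would look like pipe_commands(["tar", "czf", "-", "/path"], ["ssh", "hostname", "cat > file.tgz"]). No intermediate file is written on the local side.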

Assuming subprocess allows us to obtain data chunk by chunk (which I'm 99.9% sure it does), and we can solve the existing problem of how to print to the user + capture at the same time, we can probably hook it up to the remote stdin in the same manner as is done for interactivity.
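
A minimal sketch of the chunk-by-chunk part, assuming the remote side is represented by two callables: send (for example a paramiko channel's sendall) and close (for example its shutdown_write). The function name stream_subprocess and both parameter names are made up for illustration, and the sketch does not attempt the print-and-capture problem mentioned above.

```python
import subprocess


def stream_subprocess(argv, send, close, chunk_size=8192):
    """Run argv and forward its stdout to send() in chunks, then call close().

    Returns the subprocess exit code.
    """
    # bufsize=0 gives an unbuffered pipe object, so read() returns as soon
    # as some data is available instead of waiting for a full chunk.
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:  # b"" means the child closed its stdout (EOF)
                break
            send(chunk)
    finally:
        proc.stdout.close()
        close()  # signal EOF to the receiving end
    return proc.wait()
```

The explicit close() call matters: without it the receiving process never sees end-of-input and keeps waiting.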

There may be snags like having to force shell=False (though I think a shell will still pass its stdin to the child process, so maybe not necessary?) but overall it ought to work.


Originally submitted by Jeff Forcier (bitprophet) on 2011-02-18 at 04:58pm EST

@ghost ghost assigned bitprophet Aug 19, 2011
@miracle2k

I'm interested in this. The reverse is possible by capturing stdout - this isn't streaming, but you can get to the data. Of course, run() doesn't provide a way to give a stdin stream, so to send some data to mysql I have to go through a file.

@guettli

guettli commented Oct 30, 2013

I'll leave this ticket closed. However, this ticket was not about a new feature. It was about documentation. It is OK for me if this is not supported, but that should be documented.

@omribahumi

@bitprophet I have an open file object for a tarball (created in Python code) that I would like to pipe to run('tar xv -').
I tried replacing stdin with my file, but the problem is that EOF isn't handled properly.
A simple example that shows this behaviour:

import os
import sys
import tempfile
from contextlib import contextmanager

from fabric.api import run, task

@contextmanager
def replace_stdin(new_stdin):
    # Temporarily point the process-level stdin descriptor at new_stdin.
    STDIN = sys.stdin.fileno()
    old_stdin = os.dup(STDIN)
    os.dup2(new_stdin.fileno(), STDIN)
    try:
        yield
    finally:
        # Restore the original stdin even if the body raises.
        os.dup2(old_stdin, STDIN)
        os.close(old_stdin)

@task
def pipe():
    tmp = tempfile.TemporaryFile()
    tmp.write(b'hello')
    tmp.seek(0)

    with replace_stdin(tmp):
        run('cat -')

The code above just hangs, as the EOF isn't forwarded to the remote end. local('cat -') works as expected.

  1. I noticed that input_loop() in fabric/io.py doesn't handle EOF. When EOF arrives (I didn't check the Windows implementation), byte is '' but the remote stdin isn't closed.
  2. Why not add a stdin= parameter to run() so that the replace_stdin() hack isn't necessary?

Thanks!
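
The EOF condition in point 1 can be observed locally on POSIX with an ordinary pipe. This is only an illustrative sketch, unrelated to Fabric's own code; drain_fd is a hypothetical helper name.

```python
import os
import select


def drain_fd(fd, timeout=0.1, chunk_size=8192):
    """Read fd until EOF using select() and os.read(); return all bytes."""
    chunks = []
    while True:
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            continue  # nothing to read yet, wait again
        data = os.read(fd, chunk_size)
        if not data:
            # select() said "readable" but the read is empty: that is EOF.
            # A forwarding loop would close the remote stdin at this point.
            break
        chunks.append(data)
    return b"".join(chunks)
```

At EOF the descriptor stays permanently readable, so a loop that ignores the empty read keeps spinning without ever telling the other side that input has ended.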

@omribahumi

Ok, so replacing input_loop in fabric.io with:

def input_loop(chan, using_pty):
    while not chan.exit_status_ready():
        if win32:
            have_char = msvcrt.kbhit()
        else:
            r, w, x = select([sys.stdin], [], [], 0.0)
            have_char = (r and r[0] == sys.stdin)
        if have_char and chan.input_enabled:
            # Send all local stdin to remote end's stdin
            byte = msvcrt.getch() if win32 else sys.stdin.read(1)
            if byte:
                # print 'sending', repr(byte)
                chan.sendall(byte)
            else:
                chan.shutdown_write()
                return
            # Optionally echo locally, if needed.
            if not using_pty and env.echo_stdin:
                # Not using fastprint() here -- it prints as 'user'
                # output level, don't want it to be accidentally hidden
                sys.stdout.write(byte)
                sys.stdout.flush()
        time.sleep(ssh.io_sleep)

This partially solves it. The transfer still takes a long while, since we call time.sleep(ssh.io_sleep) on every iteration.
I'm not sure how this is solvable on Windows, but on POSIX you can replace the sleep with a timeout on the select call, which avoids busy-waiting. Also, instead of reading one byte at a time from stdin, we can read a bigger chunk. os.read(sys.stdin.fileno(), 8192) returns as soon as at least one byte is available, whereas the buffered sys.stdin.read(8192) may wait for the full 8192 bytes or EOF.

Something like this (it needs import os at the top of fabric/io.py):

def input_loop(chan, using_pty):
    while not chan.exit_status_ready():
        if win32:
            have_char = msvcrt.kbhit()
        else:
            # On POSIX the select timeout doubles as the sleep.
            r, w, x = select([sys.stdin], [], [], ssh.io_sleep)
            have_char = (r and r[0] == sys.stdin)
        if have_char and chan.input_enabled:
            # Send all local stdin to remote end's stdin
            if win32:
                buffer = msvcrt.getch()
            else:
                # Returns whatever is available, up to 8192 bytes
                buffer = os.read(sys.stdin.fileno(), 8192)
            if buffer:
                # print 'sending', repr(buffer)
                chan.sendall(buffer)
            else:
                # Empty read means EOF: close the remote stdin and stop
                chan.shutdown_write()
                return
            # Optionally echo locally, if needed.
            if not using_pty and env.echo_stdin:
                # Not using fastprint() here -- it prints as 'user'
                # output level, don't want it to be accidentally hidden
                sys.stdout.write(buffer)
                sys.stdout.flush()
        if win32:
            time.sleep(ssh.io_sleep)

I would also suggest splitting the Windows and POSIX implementations into separate functions.
As written, this code is very hard to read 👎

P.S.: If someone with the same problem stumbles upon this, my code for untarring is:

with settings(echo_stdin=False):
    with replace_stdin(tarball_fd):
        run('tar xv', pty=False, shell=False)

@saherahwal

Regarding the input_loop code, I hit this issue today:

def input_loop(chan, using_pty):
    while not chan.exit_status_ready():
        if win32:
            have_char = msvcrt.kbhit()
        else:
            r, w, x = select([sys.stdin], [], [], ssh.io_sleep)
            have_char = (r and r[0] == sys.stdin)
        if have_char and chan.input_enabled:
            # Send all local stdin to remote end's stdin
            buffer = msvcrt.getch() if win32 else sys.stdin.read(8192)
            if buffer:
                # print 'sending', repr(byte)
                chan.sendall(buffer)
            else:
                chan.shutdown_write()
                return
            # Optionally echo locally, if needed.
            if not using_pty and env.echo_stdin:
                # Not using fastprint() here -- it prints as 'user'
                # output level, don't want it to be accidentally hidden
                sys.stdout.write(buffer)
                sys.stdout.flush()
        if win32:
            time.sleep(ssh.io_sleep)

  File "c:\program files (x86)\microsoft visual studio\shared\python36_64\lib\site-packages\fabric\thread_handling.py", line 13, in wrapper
    callable(*args, **kwargs)
  File "c:\program files (x86)\microsoft visual studio\shared\python36_64\lib\site-packages\fabric\io.py", line 262, in input_loop
    sys.stdout.write(byte)
TypeError: write() argument must be str, not bytes

Shouldn't that line be sys.stdout.write(buffer.decode('utf-8')) on Python 3?
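
A hedged sketch of what such an echo step could look like on Python 3. echo_chunk is a hypothetical helper, not Fabric code, and UTF-8 is an assumption about the data being echoed.

```python
import sys


def echo_chunk(chunk, stream=None, encoding="utf-8"):
    """Write chunk to a text stream, decoding it first if it is bytes."""
    if stream is None:
        stream = sys.stdout
    if isinstance(chunk, bytes):
        # errors="replace" avoids a UnicodeDecodeError when a multi-byte
        # character is split across two chunks or the data is binary.
        chunk = chunk.decode(encoding, errors="replace")
    stream.write(chunk)
    stream.flush()
```

A plain decode('utf-8') can raise on binary input such as a tarball, which is why this sketch substitutes replacement characters instead. Disabling the echo, as in the echo_stdin=False snippet earlier in the thread, sidesteps the problem entirely.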
