Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting terminal size of a sub process #657

Closed
xuoguoto opened this issue May 15, 2024 · 28 comments
Closed

Setting terminal size of a sub process #657

xuoguoto opened this issue May 15, 2024 · 28 comments

Comments

@xuoguoto
Copy link

Hello all,

This is a follow up to an earlier question about asyncssh and cmd2 at #648 .

I have an ssh server with following code:

import asyncio, asyncssh, subprocess, sys
import logging, os

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

class MySFTPServer(asyncssh.SFTPServer):
    def __init__(self, chan: asyncssh.SSHServerChannel):
        root = '/tmp/'
        os.makedirs(root, exist_ok=True)
        super().__init__(chan, chroot=root)

async def handle_client(process: asyncssh.SSHServerProcess) -> None:
    bc_proc = subprocess.Popen('./script.sh', shell=True, stdin=subprocess.PIPE,
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    await process.redirect(stdin=bc_proc.stdin, stdout=bc_proc.stdout,
                           stderr=bc_proc.stderr)
    await process.stdout.drain()
    process.exit(0)

async def start_server() -> None:
    await asyncssh.listen('127.0.0.1', 8022, server_host_keys=['ssh_host_key'],
                          authorized_client_keys='ssh_user_ca',line_editor=False,
                          process_factory=handle_client, sftp_factory=MySFTPServer,
                          allow_scp=True)

loop = asyncio.get_event_loop()

try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))

loop.run_forever()

This invokes a shell script which is as follows:

#!/bin/sh                                                                                                                               
/bin/script -q -c ./cmd01.py /tmp/transscript.log
exit 0

This invokes a simple cmd2 app

#!/usr/bin/env python                                                                                                                   
"""A simple cmd2 application."""
import cmd2, os


class FirstApp(cmd2.Cmd):
    """A simple cmd2 application."""


if __name__ == '__main__':
    import sys
    print (os.get_terminal_size())
    c = FirstApp()
    sys.exit(c.cmdloop())

If you notice, in the python script I am printing the terminal size using print (os.get_terminal_size()). If I run script.sh in bash I get the following output:

$ ./script.sh 
os.terminal_size(columns=136, lines=43)
(Cmd) help

Documented commands (use 'help -v' for verbose/'help <topic>' for details):
===========================================================================
alias  help     macro  run_pyscript  set    shortcuts
edit   history  quit   run_script    shell

(Cmd)

Where I can see that the columns and lines are printed correctly.

Now when I ssh into the asyncssh program, I get the following output:

$ ssh -p 8022  user@localhost 
os.terminal_size(columns=0, lines=0)
(Cmd) help

Documented commands (use 'help -v' for verbose/'help <topic>' for details):
===========================================================================
alias  help     macro  run_pyscript  set    shortcuts
edit   history  quit   run_script    shell

(Cmd) 
Connection to localhost closed.

The columns and lines are 0.

If I see the debug logs of asyncssh:

DEBUG:asyncio:Using selector: EpollSelector
INFO:asyncssh:Creating SSH listener on 127.0.0.1, port 8022
INFO:asyncssh:[conn=0] Accepted SSH client connection
INFO:asyncssh:[conn=0]   Local address: 127.0.0.1, port 8022
INFO:asyncssh:[conn=0]   Peer address: 127.0.0.1, port 38922
<snip>
INFO:asyncssh:[conn=0] Auth for user user succeeded
DEBUG:asyncssh:[conn=0, chan=0] Set write buffer limits: low-water=16384, high-water=65536
INFO:asyncssh:[conn=0, chan=0] New SSH session requested
DEBUG:asyncssh:[conn=0, chan=0]   Terminal type: screen.xterm-256color
DEBUG:asyncssh:[conn=0, chan=0]   Terminal size: 136x43
INFO:asyncssh:[conn=0, chan=0]   PTY created
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_ADDRESS=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_NAME=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_MONETARY=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LANG=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_PAPER=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_IDENTIFICATION=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_TELEPHONE=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_MEASUREMENT=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_TIME=en_IN
DEBUG:asyncssh:[conn=0, chan=0]   Env: LC_NUMERIC=en_IN
INFO:asyncssh:[conn=0, chan=0]   Interactive shell requested
INFO:asyncssh:[conn=0, chan=0] Sending exit status 0
INFO:asyncssh:[conn=0, chan=0] Closing channel
INFO:asyncssh:[conn=0, chan=0] Received channel close
DEBUG:asyncssh:[conn=0] Received disconnect: disconnected by user (11)
INFO:asyncssh:[conn=0, chan=0] Channel closed

I can see that the server is getting correct terminal size as 136x43.

But the client side terminal size not correctly set. I think the pty is running in an 80x24 terminal, as, if I run vim in the cmd, the screen size is seen as 80x24.

How can I set the terminal size of the script to correct terminal size?

@pyhedgehog
Copy link

pyhedgehog commented May 16, 2024

Problem is here:

    bc_proc = subprocess.Popen('./script.sh', shell=True, stdin=subprocess.PIPE,
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)

You are starting your script with stdin/stdout redirected to newly created pipes (not a tty).
You can get same effect in shell:

$ cat|./script.sh 2>&1|cat

I think you want:

    bc_proc = subprocess.Popen('./script.sh', shell=True, stdin=process.stdin,
                               stdout=bprocess.stdout, stderr=process.stderr)

@ronf
Copy link
Owner

ronf commented May 17, 2024

Redirecting the original stdin/stdout/stderr might be an option if you don't want your application to capture or modify any of the input or output. Also, if you do this, you'll probably only be able to get a single command to run out of this, or it will require some special handling to safely do your own I/O on these streams when the upstream script exits, since it will try and close those streams.

You could also use the "pty" module in Python to open up your own pty to use for the upstream script, and then funnel data between the SSH session and that upstream PTY. I haven't ever tried this, so I'm not sure if there are any issues with that, but it would give you total control over the input & output.

@xuoguoto
Copy link
Author

I have tried with pty module also:

#!/usr/bin/env python                                                                                                                           
import os
import pty

shell = "./cmd01.py"

def read(fd):
    data = os.read(fd, 1024)
    return data

pty.spawn(shell, read)

This also works, and I am able to execute commands continuously. My only problem is that I have no means to set the size of the pty to match the size of window I am working on

@xuoguoto
Copy link
Author

Problem is here:

    bc_proc = subprocess.Popen('./script.sh', shell=True, stdin=subprocess.PIPE,
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)

You are starting your script with stdin/stdout redirected to newly created pipes (not a tty). You can get same effect in shell:

In this case I am able to work correctly (send commands and receive their reply multiple times) with the remote cmd program via ssh. As I said in previous message, my only problem is that the pty size is fixed and do not change with the window size of my ssh client.

I can see that the asyncssh does get correct window size from the client, but I have no means to pass that to underlying pty.

@ronf
Copy link
Owner

ronf commented May 17, 2024

When you create your pty with pty.openpty(), you'll get back a pair of file descriptors. The first is the master side of the pty, which you would want to redirect the SSH connection to read & write to and the second is the slave side of the pty which you would want to redirect the Popen to redirect to.

To set the terminal size, you'd need to do an ioctl of TIOCSWINSZ on the master side of the pty before you began doing the redirection, passing in the terminal size you get from AsyncSSH. The code would look something like:

    fcntl.ioctl(pty_master, termios.TIOCSWINSZ, struct.pack('hhhh', width, height, pixwidth, pix height))

If you need to handle window size changes, AsyncSSH provides a way to get an exception when the terminal is resized, and you could redo your ioctl() to pass that through to the pty when that happens.

@ronf
Copy link
Owner

ronf commented May 18, 2024

Here's a more complete example:

async def handle_connection(process):
    local_pty, local_tty = os.openpty()

    fcntl.ioctl(local_pty, termios.TIOCSWINSZ,
                struct.pack('hhhh', *process.term_size))

    local_proc = subprocess.Popen('stty -a', shell=True,
                                  stdin=local_tty, stdout=local_tty)
    os.close(local_tty)

    await process.redirect(stdin=local_pty, stdout=os.dup(local_pty))
    await process.stdout.drain()
    process.exit(0)

In this version I used os.openpty() instead of pty.openpty(), since they're basically equivalent and this can save the need for an extra import.

Note that I didn't do anything with stderr here on either the remote subprocess or on the SSH channel redirect. This is because PTYs only ever communicate through stdin & stdout. There is no separate stderr stream when a PTY is allocated.

Also, note there's no explicit close of local_pty. This is handled automatically by the SSH process redirection. In fact, for it to work right, I had to add a call to os.dup() on the local_pty since the redirects will attempt to close it twice (on both stdin and stdout).

For local_tty, though, the subprocess will close the copy of the fd in the subprocess when it exits but it won't close it in the parent, so that has to be done explicitly. This can be done right after the Popen creates the subprocess, as the copy of the fd in the subprocess will exist at that point and will stay open until the subprocess exits. Closing the parent fd allows us to get an EOF notification when the subprocess exits (waking up the call to drain).

@xuoguoto
Copy link
Author

xuoguoto commented May 24, 2024

Thanks for the detailed reply, made some changes and the initial terminal size is drawn as expected.

In the example height and width were interchanged, that was corrected.

fcntl.ioctl(pty_master, termios.TIOCSWINSZ, struct.pack('hhhh', width, height, pixwidth, pix height))

for ioctl height should be first and width second, like

width, height, pixWidth, pixHeight = process.term_size
fcntl.ioctl(local_tty, termios.TIOCSWINSZ, struct.pack('hhhh', height, width, pixHeight, pixWidth))

Also the pty and tty returned from os.openpty() were interchanged, for subprocess and ioctl tty was used instead of pty
for redirect pty was used instead of tty.

here is the complete program:

import asyncio, asyncssh, sys, subprocess, os, fcntl, struct, termios, logging

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

async def handle_client(process: asyncssh.SSHServerProcess) -> None:

    local_pty, local_tty = os.openpty()
    width, height, pixWidth, pixHeight = process.term_size
    fcntl.ioctl(local_tty, termios.TIOCSWINSZ,
                struct.pack('hhhh', height, width, pixHeight, pixWidth))

    local_proc = subprocess.Popen('./script.sh', shell=True,
                                  stdin=local_pty, stdout=local_pty)
    os.close(local_pty)

    await process.redirect(stdin=local_tty, stdout=os.dup(local_tty))
    await process.stdout.drain()
    process.exit(0)


async def start_server() -> None:
    await asyncssh.listen('127.0.0.1', 8022, server_host_keys=['ssh_host_key'],
                          authorized_client_keys='ssh_user_ca',line_editor=False,
                          process_factory=handle_client)

loop = asyncio.get_event_loop()

try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))
try:
    loop.run_forever()
except KeyboardInterrupt as exc:
    sys.exit('Program stopped')

But when the terminal size changes, resizing is not working. Could not figure out how to raise the exception when the terminal is resized in the code above and then catch it to resize the terminal.

@ronf
Copy link
Owner

ronf commented May 24, 2024

You're absolutely right about needing to swap width & height in the ioctl, as the first two arguments in the ioctl are rows and columns. Good catch on that. However, you shouldn't swap pixWidth and pixHeight, as those are x first and then y. So, these lines should look something like:

    width, height, pixWidth, pixHeight = process.term_size

    fcntl.ioctl(local_pty, termios.TIOCSWINSZ,
                struct.pack('hhhh', height, width, pixWidth, pixHeight))

As for local_tty & local_pty, you want the local process to be reading & writing on the TTY, and AsyncSSH to be feeding that TTY data via the PTY. So, while it may appear to work in either direction, there may be minor differences between the "master" and "slave" sides of a PTY which may cause problems if you swap them. If you put back the associations with just the width & height swapped, does that work for you?

As for handling resizing, that will probably be somewhat tricky to do when you are taking advantage of redirects to feed the data through. AsyncSSH will properly pass through window size changes when the target of a redirect is another SSHProcess, but it doesn't currently try to do this in other cases. It also doesn't pass through "break" indications or signals in the case where you are redirecting to something other than SSHProcess.

It might be possible to check if the redirect target is a local TTY and use ioctl to pass changes through when it is. This would need to be done from a write_exception() method in PipeWriter. It would probably require keeping the raw file descriptor around (at least when it is a tty), but that should be doable. I'll experiment with this and get back to you.

@xuoguoto
Copy link
Author

If you put back the associations with just the width & height swapped, does that work for you?

Without swapping PTY and TTY the client wasn't disconnecting properly. The only way to disconnect the client was to stop the server without the swap, that's why they were swapped.

I'll experiment with this and get back to you.

Thanks! Will wait for your update.

@ronf
Copy link
Owner

ronf commented May 24, 2024

Ok - I've got a prototype for forwarding window size changes and break from an SSH connection to a local TTY:

diff --git a/asyncssh/process.py b/asyncssh/process.py
index 7c7b5a9..a6b46b5 100644
--- a/asyncssh/process.py
+++ b/asyncssh/process.py
@@ -29,6 +29,7 @@ import os
 from pathlib import PurePath
 import socket
 import stat
+import sys
 from types import TracebackType
 from typing import Any, AnyStr, Awaitable, Callable, Dict, Generic, IO
 from typing import Iterable, List, Mapping, Optional, Set, TextIO
@@ -51,6 +52,11 @@ from .stream import SSHReader, SSHWriter, SSHStreamSession
 from .stream import SSHClientStreamSession, SSHServerStreamSession
 from .stream import SFTPServerFactory

+if sys.platform != 'win32': # pragma: no branch
+    import fcntl
+    import struct
+    import termios
+

 _AnyStrContra = TypeVar('_AnyStrContra', bytes, str, contravariant=True)

@@ -434,6 +440,17 @@ class _PipeWriter(_UnicodeWriter[AnyStr], asyncio.BaseProtocol):
         assert self._transport is not None
         self._transport.write(self.encode(data))

+    def write_exception(self, exc: Exception) -> None:
+        fd = self._transport.get_extra_info('pipe').fileno()
+
+        if sys.platform != 'win32' and os.isatty(fd):
+            if isinstance(exc, TerminalSizeChanged):
+                fcntl.ioctl(fd, termios.TIOCSWINSZ,
+                            struct.pack('hhhh', exc.height, exc.width,
+                                        exc.pixwidth, exc.pixheight))
+            elif isinstance(exc, BreakReceived): # pragma: no branch
+                termios.tcsendbreak(fd, exc.msec)
+
     def write_eof(self) -> None:
         """Write EOF to the pipe"""

Passing through signals would be trickier, as that can't be done though a pipe. It would only be possible if a local subprocess was provided as the target of a redirect so that AsyncSSH had access to the pid. So, for now, I've left that out.

Passing through break seems to work, but it doesn't appear to actually do anything when sent to a shell, even when BRKINT is set (which should cause a break to turn into an interrupt, like ^C). Given that, I may leave this out and only handle the window size change. Let me know if it works for you.

One other thing I noticed that is that you probably should be redirecting both stdout and stderr in the local subprocess call to the TTY. Otherwise, stdout will continue to go to the stdout of the AsyncSSH server process. I ended up doing the following for this:

    local_proc = subprocess.Popen('stty -a', shell=True, stdin=local_tty,
                                  stdout=local_tty, stderr=local_tty)

Note that this only applies to the local_proc and not to the stdout/stderr on the AsyncSSH side (since making that a PTY merges together stdout & stderr). Also, it's important to make sure the os.dup() call is there. Otherwise, you can get errors when cleaning things up.

Once I added the call to os.close(local_tty), things started to exit properly for me.

@ronf
Copy link
Owner

ronf commented May 28, 2024

A somewhat more comprehensive version of the above is now available in the "develop" branch as commit 1bbd845. This version will automatically set the initial terminal size provided by the client when the SSHServerProcess redirects stdin to a local TTY, in addition to passing through any size updates after that.

Here's an updated example client. Note that there's no direct terminal size code here any more, as AsyncSSH is now taking care of that.

import asyncio
import asyncssh
import os
import subprocess
import sys

async def handle_connection(process):
    local_pty, local_tty = os.openpty()

    local_proc = subprocess.Popen('while true; do stty size; sleep 5; done',
                                  shell=True, stdin=local_tty,
                                  stdout=local_tty, stderr=local_tty)
    os.close(local_tty)

    await process.redirect(stdin=local_pty, stdout=os.dup(local_pty))

    await process.stdout.drain()
    process.exit(0)

async def start_server():
    await asyncssh.listen('', 8022, server_host_keys=['ssh_host_key'],
                          authorized_client_keys='ssh_user_ca',
                          process_factory=handle_connection)

loop = asyncio.new_event_loop()

try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))

loop.run_forever()

@ronf
Copy link
Owner

ronf commented May 28, 2024

Actually, you may want to hold off a bit on trying this. I'm seeing some unit test failures on Linux and Windows that I didn't see on macOS. I'll let you know when this is resolved.

@ronf
Copy link
Owner

ronf commented May 28, 2024

Ok - commit 43bcd2d should address this.

@xuoguoto
Copy link
Author

Thanks, I was checking with the old code and was facing some difficulty. Will check this out and let you know.

@ronf
Copy link
Owner

ronf commented May 28, 2024

The issue I found (and the associated fix) only affected unit test code, so it shouldn't have prevented your tests or the above example code from running, though I'm thinking about hardening the code a bit to ignore errors in trying to copy the window size information to a TTY which had its slave side closed. On macOS, this doesn't cause a problem, but on Linux an error is raised in the fcntl.ioctl() call, and there could be a race between a client sending a window size update and the remote process exiting and closing the TTY.

I also discovered that asyncio on Windows doesn't allow pipes created with os.pipe() to be passed into loop.create_read_pipe() or loop.create_write_pipe(), so my non-TTY unit test was failing on Windows.

@xuoguoto
Copy link
Author

Hi,
A quick update on the testing so far:

I've run program with asyncssh development branch and the initial terminal size is working as expected without setting ioctl, but the resizing isn't working as expected when a program such as vim is running as subprocess. I am not sure if this is the expected behavior or some additional steps needs to be done.

When the program runs inside tmux width changes (by changing the tmux panel width) are effected immediately but for height changes to be effected, one width change has to be performed after height change. Sometimes the screen needs to be cleared with ctrl + L to view the text properly.

When the client is running in a standalone terminal window resizing has no effect on the client but the resizing message can be seen on the server.

Also I'm still having trouble without switching pty and tty as the client can't be disconnected even when the subprocess is complete, only way to disconnect the client is to stop the server. Switching tty and pty will disconnect the client when the subprocess is complete/exited.

Also development branch asyncssh produces following warning when program is stopped

sys:1: RuntimeWarning: coroutine 'Event.wait' was never awaited                              
RuntimeWarning: Enable tracemalloc to get the object allocation traceback 

All the testing is done in a Ubuntu 22.04 machine. I am happy to run any additional tests or provide addition info if required.

@ronf
Copy link
Owner

ronf commented May 29, 2024

Regarding the Event.wait message, that can happen if you exit the Python interpreter without giving the various cleanup functions a chance to run. In this case, I think the problem might be that the local proc continues to run even after the client closes the SSH connection. So, there's still an instanced of handle_connection() running that gets destroyed without cleanup when the server is stopped. If the server-side process exited when it saw EOF on stdin, I don't think you'd see this problem.

Also, it looks like my example code is missing a call to await process.wait_closed() at the end of handle_connection(), or putting the entire body of that function in an async with process: block. That may be needed as well to avoid an event object there from not being properly awaited when handle_connection() exits.

As for the rest of what you've described, most of it sounds out of AsyncSSH's control. All it can do is to do the appropriate ioctl() to tell the pseudo-TTY that the size has changed. Everything else is up to the application running on that TTY.

Regarding switching pty & tty, do you have a minimal example where you can reproduce this problem? If I change the example above to run a command which exits, the SSH connection from the client is closed when the command finishes.

@xuoguoto
Copy link
Author

Also I'm still having trouble without switching pty and tty as the client can't be disconnected even when the subprocess is complete, only way to disconnect the client is to stop the server. Switching tty and pty will disconnect the client when the subprocess is complete/exited.

Regarding switching pty & tty, do you have a minimal example where you can reproduce this problem? If I change the example above to run a command which exits, the SSH connection from the client is closed when the command finishes.

I am attaching a minimal example where this problem is illustrated. Also a short video of the experience when running this.
Screen cast of running these two programs: Screencast.webm

I am pasting the programs here as gh do not allow me to attach these files.

cmd01.py

#!/usr/bin/env python
"""A simple cmd2 application."""
import cmd2, os

class FirstApp(cmd2.Cmd):
    """A simple cmd2 application."""

if __name__ == '__main__':
    import sys
    c = FirstApp()
    sys.exit(c.cmdloop())

test.py

import asyncio, sys, subprocess, os, fcntl, struct, termios, logging, asyncssh

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

async def handle_client(process: asyncssh.SSHServerProcess) -> None:

    local_pty, local_tty = os.openpty()
    # width, height, pixWidth, pixHeight = process.term_size
    # fcntl.ioctl(local_pty, termios.TIOCSWINSZ,
    #             struct.pack('hhhh', height, width, pixWidth, pixHeight))

    local_proc = subprocess.Popen('./cmd01.py', shell=True,
                                  stdin=local_tty, stdout=local_tty, stderr=local_tty)
    os.close(local_tty)

    await process.redirect(stdin=local_pty, stdout=os.dup(local_pty))

    await process.stdout.drain()
    await process.wait_closed()
    process.exit(0)


async def start_server() -> None:
    await asyncssh.listen('127.0.0.1', 8022, server_host_keys=['ssh_host_key'],
                          authorized_client_keys='ssh_user_ca',line_editor=False,
                          process_factory=handle_client)

loop = asyncio.get_event_loop()

try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))
try:
    loop.run_forever()
except KeyboardInterrupt as exc:
    sys.exit('Program stopped')

@ronf
Copy link
Owner

ronf commented May 30, 2024

Thanks. I was able to reproduce the issue here, but the problem is that your await process.wait_closed() is in the wrong place. You need to call process.exit(0) BEFORE you wait for the process to finish closing. When I made that change and typed "quit" into the Cmd prompt, it correctly exited the SSH client.

There are other variations of this as well, like waiting for local_proc to finish, and then potentially reading the exit code and providing that in process.exit(). In that case, you wouldn't need the await process.stdout.drain(). However, if you just move the wait_closed() before the exit(), does that fix the problem for you?

@xuoguoto
Copy link
Author

I had made this change, but there seems to be one more issue. The await process.stdout.drain() is never completed so call to process.exit(0) is never reached. I had put a log line immediately after await process.stdout.drain() and that line is never printed. I am not sure if this is some platform specefic issue as it seems to be running fine at your end.

@ronf
Copy link
Owner

ronf commented May 30, 2024

Is stdin still not returning from drain() even when you type "exit" into the Cmd shell? Once all current output is read and EOF is received on a particular data channel, the redirection should be disabled and then drain() should return. What does your debug output look like after entering "exit"?

@xuoguoto
Copy link
Author

xuoguoto commented May 31, 2024

Just to recap, I am testing two variations of the same scenario, one called test-with-pty.py and other test-with-tty.py. Difference between them is as follows:

--- test-with-pty.py    2024-05-31 11:38:43.351902272 +0530
+++ test-with-tty.py    2024-05-31 11:38:44.907893967 +0530
@@ -8,11 +8,11 @@
 
     local_pty, local_tty = os.openpty()
     
-    local_proc = subprocess.Popen('./cmd01.py', shell=True,
-                                  stdin=local_tty, stdout=local_tty, stderr=local_tty)
-    os.close(local_tty)
+    local_proc = subprocess.Popen('./script.sh', shell=True,
+                                  stdin=local_pty, stdout=local_pty, stderr=local_pty)
+    os.close(local_pty)
 
-    await process.redirect(stdin=local_pty, stdout=os.dup(local_pty))
+    await process.redirect(stdin=local_tty, stdout=os.dup(local_tty))
     await process.stdout.drain()
     process.exit(0)
     await process.wait_closed()

test-with-pty.py is is as per documentation with redirect(stdin=local_pty, stdout=os.dup(local_pty)) while test-with-tty.py is pty and tty switched, ie redirect(stdin=local_tty, stdout=os.dup(local_tty)).

Case 1. In test-with-pty I can directly open the cmd01.py resizing is working as expected, but the exit is not registered. I have attached the screen capture of how this goes.

Screencast.from.05-30-2024.07.03.00.PM.webm

Case 2. In test-with-tty if I directly open the cmd01.py (or another program like bc) the client program (cmd2 in this case) goes into a loop. I have attached the screen capture.

Screencast.from.05-30-2024.07.10.41.PM.webm

Case 3. In test-with-tty if use another script to run cmd01.py things are much better, It is disconnecting correctly and the only issue is that when height is changed, its getting updated only when width is changed. I have attached the video of this also.

Screencast.from.05-30-2024.06.57.44.PM.webm

So this is the current status. I am testing this on python version : 3.10.12 on Ubuntu 22.04.

What does your debug output look like after entering "exit"?

This scenario happens in Case 1. After exit is pressed, there are no logs in the sever side. It just waits as you can see in the screen cast. Is there any additional debugging that can be done (for eg tcpdump) that will help? Also please note that it works as expected in Case 3.

script.sh.txt
test-with-tty.py.txt
test-with-pty.py.txt

@ronf
Copy link
Owner

ronf commented May 31, 2024

Thanks for the additional info. I re-ran my test here on Ubuntu (previous runs were on macOS), and I was able to reproduce the hang in my original code (similar to your case 1). As you said, it seems to hang on await process.stdout.drain(), even after the Cmd shell exits. In fact, it remains hung there even after the SSH client disconnects, though there is debug output in that case showing that the server did process the connection close. I haven't yet had a chance to dig into why drain is hanging, but the OS difference appears to explain why you were seeing it and I wasn't. I'll take a closer look at this tonight.

On a separate note, your "case 3" screen capture doesn't appear to be doing anything. Was that the capture you intended to post?

@ronf
Copy link
Owner

ronf commented Jun 1, 2024

I did some more investigation tonight.

On macOS, closing the slave TTY causes the master (PTY) to receive an EOF. However, it appears that on Linux, the master gets back a connection_lost with an EIO error, without sending EOF first, and the connection_lost code path doesn't properly wake up readers that were blocked waiting for the redirect to complete.

I'll need some time to go through the code and find the best place to unblock waiters blocked on drain when a redirected pipe is being cleaned up. This difference in error handling would explain what we've been seeing, though, and I don't think it'll be too hard to fix.

@xuoguoto
Copy link
Author

xuoguoto commented Jun 1, 2024

Thank you for your time and effort. I will wait for your investigation to be complete.

@ronf
Copy link
Owner

ronf commented Jun 1, 2024

Ok - I have a proposed fix you can try:

diff --git a/asyncssh/process.py b/asyncssh/process.py
index bc3496d..3510e33 100644
--- a/asyncssh/process.py
+++ b/asyncssh/process.py
@@ -365,6 +365,12 @@ class _PipeReader(_UnicodeReader[AnyStr], asyncio.BaseProtocol):

         self._transport = cast(asyncio.ReadTransport, transport)

+    def connection_lost(self, exc: Optional[Exception]) -> None:
+        """Handle closing of the pipe"""
+
+        self._process.feed_close(self._datatype)
+        self.close()
+
     def data_received(self, data: bytes) -> None:
         """Forward data from the pipe"""

@@ -1048,6 +1054,12 @@ class SSHProcess(SSHStreamSession, Generic[AnyStr]):
         self._readers[datatype].close()
         self.clear_reader(datatype)

+    def feed_close(self, datatype: DataType) -> None:
+        """Feed pipe close to the channel"""
+
+        if datatype in self._readers:
+            self.feed_eof(datatype)
+
     def feed_recv_buf(self, datatype: DataType,
                       writer: _WriterProtocol[AnyStr]) -> None:
         """Feed current receive buffer to a newly set writer"""

This change looks for cases where connection_lost() is called on a PipeReader without calling eof_received() first, and triggers the code which would have happened in eof_received() in that case. Among other things, this will disable the redirection and wake up any tasks blocked on drain().

This seems to solve the problem on Ubuntu, and also continues to work on macOS (where eof_received() is called).

@xuoguoto
Copy link
Author

xuoguoto commented Jun 4, 2024

Thanks, tried this out and it seems to fix the disconnection issue without swapping pty and tty. Thanks a lot for your patience and help!

@ronf
Copy link
Owner

ronf commented Jun 4, 2024

Thanks for confirming, and for all your help tracking this down! This change is now in the "develop" branch as commit 5159542.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants