Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.14.1+ fails to rekey with OpenSSH <= 6 #662

Closed
eyalgolan1337 opened this issue Jun 2, 2024 · 7 comments
Closed

2.14.1+ fails to rekey with OpenSSH <= 6 #662

eyalgolan1337 opened this issue Jun 2, 2024 · 7 comments

Comments

@eyalgolan1337
Copy link

When connecting to a server running OpenSSH <= 6 (tested against 5.3p1, 6.6.1p1, and 7.2p2), the connection will fail after attempting rekeying (whether triggered by the server or client).

Steps to reproduce:

server.Dockerfile:

FROM docker.io/ubuntu:14.04

RUN apt-get update && apt-get install -y openssh-server

RUN echo "root:root" | chpasswd

RUN sed -i 's/^#\?PermitRootLogin.*$/PermitRootLogin yes/' /etc/ssh/sshd_config

RUN mkdir /var/run/sshd

CMD /usr/sbin/sshd -E /root/sshd.log && watch "du -sh /root/big.data ; tail /root/sshd.log"

client.Dockerfile:

RUN pip3 install asyncssh==2.14.1

RUN cat <<'EOF' >/root/test.py
import asyncio
import asyncssh

async def main():
  async with asyncssh.connect(host='127.0.0.1',  
                              username='root',
                              password='root',
                              known_hosts=None,
                              compression_algs=None) as conn:
    async with conn.start_sftp_client() as sftp:
      await sftp.put('/root/big.data', '/root/big.data')
      
asyncio.run(main())
EOF

# Create a 2GB file and run the test script
CMD dd if=/dev/zero of=/root/big.data bs=1G count=2 && python3 /root/test.py

Build and start the server:

docker build -t asyncssh-bug-server -f server.Dockerfile .
docker run --rm -ti --name asyncssh-bug-server asyncssh-bug-server

Build and start the client:

docker build -t asyncssh-bug-client -f client.Dockerfile .
docker run --rm -ti --network container:asyncssh-bug-server asyncssh-bug-client

Server output:

Every 2.0s: du -sh /root/big.data ; tail /root/sshd.log

1021M   /root/big.data
Server listening on 0.0.0.0 port 22.
Server listening on :: port 22.
Accepted password for root from 127.0.0.1 port 59640 ssh2
Disconnecting: Protocol error: expected packet type 21, got 94

Client output:

...
File "/usr/local/lib/python3.12/site-packages/asyncssh/sftp.py", line 770, in run_task
    await self._dst.write(data, offset)
  File "/usr/local/lib/python3.12/site-packages/asyncssh/sftp.py", line 3198, in write
    await self._handler.write(self._handle, offset, data_bytes)
  File "/usr/local/lib/python3.12/site-packages/asyncssh/sftp.py", line 2680, in write
    return cast(int, await self._make_request(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/asyncssh/sftp.py", line 2447, in _make_request
    resptype, resp = await waiter
                     ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/asyncssh/sftp.py", line 2356, in recv_packets
    packet = await self.recv_packet()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/asyncssh/sftp.py", line 2345, in recv_packet
    pktlen = await self._reader.readexactly(4)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/asyncssh/stream.py", line 242, in readexactly
    return await self._session.read(n, self._datatype, exact=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/asyncssh/stream.py", line 547, in read
    raise exc
BrokenPipeError

As you can see, after 1GB of data is sent, the server expects an SSH_MSG_NEWKEYS packet (21) but gets an SSH_MSG_CHANNEL_DATA packet (94) instead.

This only occurs with OpenSSH < 7 and asyncssh>=2.14.1.

@ronf
Copy link
Owner

ronf commented Jun 2, 2024

These are very old versions of OpenSSH, so the problem may be on the OpenSSH side and not an issue in AsyncSSH. I can look into whether I can make a change to better interoperate with older OpenSSH versions, but depending on the problem this may not be possible. As far as I know, nothing has changed at the SSH protocol level where older versions would expect anything different to be sent here.

If could also be AsyncSSH is doing something wrong around allowing data packets to continue to flow even after renegotiation has been started, but if that were the case I would expect it to still be an issue even against newer versions of OpenSSH.

Can you be more specific about exactly which OpenSSH version fixes this issue? You say you tested 7.2p2 -- did that version work? Also, 6.6.1p1 doesn't look like a proper OpenSSH version number. Did you mean 6.6p1 here, or was that some kind of distribution-specific version number and not the upstream OpenSSH version? Could you try releases between 6.6 and 7.2 to see where it exactly started/stopped working?

Also, can you get a level 2 debug log from AsyncSSH, showing exactly which packets and sent and received when the problem occurs? Do you see any sign of it starting the new key exchange before the failure?

@eyalgolan1337
Copy link
Author

These are very old versions of OpenSSH, so the problem may be on the OpenSSH side and not an issue in AsyncSSH.

That's what I initially assumed, but the problem seems to have been introduced in asyncssh 2.14.1. Running with 2.14.0 works fine.

Can you be more specific about exactly which OpenSSH version fixes this issue? You say you tested 7.2p2 -- did that version work?

Yes, sorry for the confusing wording. 7.2p2 does work.

Also, 6.6.1p1 doesn't look like a proper OpenSSH version number. Did you mean 6.6p1 here, or was that some kind of distribution-specific version number and not the upstream OpenSSH version?

root@a77d6983f8e6:/# ssh -V
OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.13, OpenSSL 1.0.1f 6 Jan 2014

This is what I found in the Debian patch log:

...
The patch fixes the bug and makes OpenSSH identify itself as 6.6.1 so as
to distinguish itself from the incorrect versions so the compatibility
code to disable the affected KEX isn't activated.

RedHat also appears to be using the same version scheme, but I couldn't find anything on the official OpenSSH site/repo.

Could you try releases between 6.6 and 7.2 to see where it exactly started/stopped working?

6.7p1 does not work, 6.8p1 does.

Also, can you get a level 2 debug log from AsyncSSH, showing exactly which packets and sent and received when the problem occurs? Do you see any sign of it starting the new key exchange before the failure?

DEBUG:asyncssh:[conn=0, chan=0] Sending 16413 data bytes
DEBUG:asyncssh.sftp:[conn=0, chan=0] Sending write for 16384 bytes at offset 2141126656 in handle 00000000
DEBUG:asyncssh:[conn=0, chan=0] Sending 16413 data bytes
DEBUG:asyncssh.sftp:[conn=0, chan=0] Sending write for 16384 bytes at offset 2141143040 in handle 00000000
DEBUG:asyncssh:[conn=0, chan=0] Sending 16413 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Writing from session paused
DEBUG:asyncssh.sftp:[conn=0, chan=0] Sending write for 16384 bytes at offset 2141159424 in handle 00000000
DEBUG:asyncssh:[conn=0, chan=0] Sending 16413 data bytes
INFO:asyncssh:[conn=0] Connection failure: 
INFO:asyncssh:[conn=0, chan=0] Closing channel due to connection close
INFO:asyncssh:[conn=0, chan=0] Channel closed: 
INFO:asyncssh.sftp:[conn=0, chan=0] SFTP client exited: 
INFO:asyncssh:[conn=0, chan=0] Closing channel
DEBUG:asyncssh.sftp:[conn=0, chan=0] Sending close for handle 00000000
INFO:asyncssh:[conn=0] Closing connection
INFO:asyncssh:[conn=0] Sending disconnect: Disconnected by application (11)

Aside from the Writing from session paused, I don't see anything that indicates an attempt to rekey.

Interestingly, when debug logging is enabled, it takes longer to fail (more data is written before the connection fails).

I changed the RekeyLimit on the server to 512MB, and with logging enabled the connection fails after either 1.5 or 2GB. Could asyncssh be sending too quickly and missing/waiting too long to respond to rekey requests?

asyncssh.log.gz

Summary:

asyncssh \ OpenSSH <=6.7p1 >=6.8p1
<=2.14.0 okay okay
>=2.14.1 broken okay

@eyalgolan1337
Copy link
Author

After testing with specific commits, I found that the issue was introduced in 83e43f5.

@ronf
Copy link
Owner

ronf commented Jun 5, 2024

Thanks for the additional info. This commit was a precursor to fixes related to the Terrapin attack. It added more strict checks on the acceptance of certain messages to avoid injection attacks. What's odd here, though, is that it seems like it is OpenSSH that's complaining about unexpected messages, not AsyncSSH, and nothing really should have changed with this commit regarding what AsyncSSH is sending out.

Looking at the diff between OpenSSH 6.7p1 and 6.8p1, it looks like maybe support for server-based rekeying might have only been added in 6.8p1. Could that be why 6.7 is working? Perhaps the server is never triggering a rekeying in that case?

I must admit that I don't know if OpenSSH has any rules about how quickly a peer must respond to a rekeying action. If we're talking about GB worth of data here, though, that doesn't seem likely to be the issue. AsyncSSH's default window size is 2 MB, and I think OpenSSH is similar. So, it wouldn't be able to queue up more than that amount to send before putting back-pressure on SFTP, and that should give it a chance to see and react to any re-keying requests which might be interspersed with window updates from OpenSSH.

Looking at the logs, the outgoing NEWKEYS message should have been sent just after logging "Compression alg: none" for "Server to client", and the incoming NEWKEYS message is what triggers the "Completed key exchange" message seen in the logs just after that. So, the rekeying appears to be completing.

My best guess at this point is that perhaps AsyncSSH is letting some DATA messages to be sent out in between the time that it sends KEXINIT and NEWKEYS.

From a code inspection, it looks like there is a call to send_deferred_packets() in send_newkeys(), but I no longer see any code in send_packet() which holds back on sending other packet types once a key exchange is in progress. In fact, looking at 83e43f5, it looks like the following code was removed in 2.14.1:

@@ -1596,9 +1612,7 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):
             self._send_kexinit()
             self._kexinit_sent = True

-        if (((pkttype in {MSG_SERVICE_REQUEST, MSG_SERVICE_ACCEPT} or
-              pkttype > MSG_KEX_LAST) and not self._kex_complete) or
-                (pkttype == MSG_USERAUTH_BANNER and
+        if ((pkttype == MSG_USERAUTH_BANNER and
                  not (self._auth_in_progress or self._auth_complete)) or
                 (pkttype > MSG_USERAUTH_LAST and not self._auth_complete)):
             self._deferred_packets.append((pkttype, args))

I think the removal of this code may explain it. Could you try adding that back in and seeing if that fixes the issue?

@eyalgolan1337
Copy link
Author

Adding it back appears to fix the problem!

@ronf
Copy link
Owner

ronf commented Jun 5, 2024

Great, thanks!

There are some unit test failures when I add this back in that I'll need to investigate, but I think this is definitely the issue, and this code should not have been removed in that earlier commit.

Thanks for the report, and for helping to track this down! I'll let you know here when I have a final fix for this.

@ronf
Copy link
Owner

ronf commented Jun 6, 2024

Fixing the unit tests was straightforward. This fix is now available in the "develop" branch as commit c3dc869. Thanks for all your help!

@ronf ronf closed this as completed Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants