FS#3759 - Idle ssh Connection exits with: client_loop: send disconnect: Broken pipe #8776

Open
openwrt-bot opened this issue Apr 28, 2021 · 8 comments
Labels
flyspray release/21.02

Comments

@openwrt-bot openwrt-bot commented Apr 28, 2021

farmergreg:

=====Summary=====

  • Device problem occurs on: x86-64 based router
  • Affected Software version: 21.02.0-rc1
    • This bug does NOT happen when running 19.07.7

=====Steps to reproduce=====

SSH from one computer to another machine that is on the internet

machine 1 <----> openwrt 21.02.0-rc1 <---- internet ----> machine 2

Run this command on machine 1: watch 'sleep 3000;date'

After some time has passed (wait about 30 minutes, maybe a little less):

  1. the ssh tunnel will exit when you press enter in the terminal.
  2. The error message will be: "client_loop: send disconnect: Broken pipe"
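
To put a number on "some time has passed", the steps above could be wrapped in a small timing helper. This is only a sketch: `measure_lifetime` is a hypothetical helper name, and `user@machine2` in the usage comment is a placeholder, not a host from this report.

```shell
#!/bin/sh
# Hypothetical helper: run a command and report its exit status and how
# many seconds it survived, so dropped sessions can be compared across
# firmware versions.
measure_lifetime() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
}

# Usage (placeholder host, run from machine 1):
#   measure_lifetime ssh user@machine2 'sleep 3000; date'
```

A session that dies early would then show a non-zero exit status together with an elapsed time well short of the 3000 second sleep.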

@openwrt-bot openwrt-bot commented Apr 28, 2021

farmergreg:

Update: It turns out that my simple script above (watch 'sleep 3000;date') also exits on 19.07.7 with the "client_loop: send disconnect: Broken pipe" error.

I have another script (a long-running backup job) that doesn't exit with the error when running 19.07.7, but does exit with the error when running the rc version. That script runs a backup (typically 2 - 3 minutes), sleeps until the next hour starts, and then runs again. So I'm not completely sure this is an OpenWrt bug, but there is certainly a change in behavior between the rc and the 19.07.x releases.

@openwrt-bot openwrt-bot commented May 2, 2021

farmergreg:

Update:
It seems that wireguard in 21.02.0-rc1 might be part of the cause.

SSH without wireguard is working fine for me when using the release candidate.

SSH over a wireguard tunnel is where I start seeing the "client_loop: send disconnect: Broken pipe" error messages.

Network Diagram:

machine1 <-- wireguard --> OpenWRT A <--> internet <--> OpenWRT B <---> wireguard <---> machine2

The wireguard VPN is a site-to-site link. Both routers involved are running OpenWrt. Both have ports open so that either router can initiate the wireguard tunnel.

I am able to quickly and easily switch between 19.07.7 and 21.02.0-rc1 on OpenWRT router "A". 19.07.7 works great and ssh connections appear to stay alive indefinitely. 21.02.0-rc1 has the "client_loop: send disconnect: Broken pipe" problem and ssh connections typically exit early/unexpectedly.

@openwrt-bot openwrt-bot commented Jun 19, 2021

farmergreg:

This is present in all rc releases thus far. I installed 21.02.0-rc3 today, and an idle ssh connection made over wireguard exits after a little while with "client_loop: send disconnect: Broken pipe".

This does not happen in 19.07.7.

@openwrt-bot openwrt-bot commented Aug 3, 2021

farmergreg:

Update: This may be related to #3373. I turned off "Software flow offloading" on my router. With this change, the ssh connection that used to exit early seems to stay connected (it's been connected for several hours now). This connection is being made over IPv4 (not IPv6 like in #3373).

@openwrt-bot openwrt-bot commented Aug 5, 2021

farmergreg:

I can confirm that this is still an issue in rc4.

The wireguard tunnel in question has IPv4 and IPv6 addresses locally; however, the ssh connection is made using IPv4 to a VM on the remote side that only has IPv4 enabled.

Turning off software flow offloading is the workaround for now.
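
For anyone who prefers the command line over the LuCI checkbox, the workaround might look like the sketch below. It assumes the checkbox maps to the `flow_offloading` option in the `defaults` section of `/etc/config/firewall`; confirm that with `uci show firewall` before relying on it. The function name `disable_sw_flow_offloading` is made up for this example.

```shell
#!/bin/sh
# Sketch only: turn off software flow offloading via uci.
# Assumption: the setting lives in firewall.@defaults[0].flow_offloading.
disable_sw_flow_offloading() {
    if ! command -v uci >/dev/null 2>&1; then
        echo "uci not found: run this on the OpenWrt router" >&2
        return 1
    fi
    uci set 'firewall.@defaults[0].flow_offloading=0'
    uci commit firewall
    # Reload the firewall so the new setting takes effect.
    /etc/init.d/firewall restart
}

# Usage (on the router):
#   disable_sw_flow_offloading
```

Existing connections may need to be re-established after the firewall restarts.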

@openwrt-bot openwrt-bot commented Aug 6, 2021

patrakov:

This is the same as https://forum.openwrt.org/t/software-flow-offloading-and-conntrack-timeouts/74588 and effectively confirmed as a WONTFIX in that thread.
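
To check whether an affected ssh flow is actually offloaded and what conntrack timeout it currently carries, something like the following could be run on the router. It is a rough sketch that assumes the usual `/proc/net/nf_conntrack` line layout; on some kernels offloaded entries omit the timeout column, so the script prints `n/a` in that case. `conntrack_timeouts` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the timeout column and offload status of tracked TCP flows
# to a given destination port (default 22).
# Assumed line layout: l3proto l3num l4proto l4num [timeout] [state] tuples... flags
conntrack_timeouts() {
    port=${1:-22}
    file=${2:-/proc/net/nf_conntrack}
    awk -v p="dport=$port" '
        $3 == "tcp" {
            for (i = 1; i <= NF; i++) {
                if ($i == p) {
                    # The timeout column is numeric when present.
                    t = ($5 ~ /^[0-9]+$/) ? $5 : "n/a"
                    f = ($0 ~ /\[OFFLOAD\]/) ? "offloaded" : "not-offloaded"
                    print t, f
                    break
                }
            }
        }' "$file"
}

# Usage (on the router):
#   conntrack_timeouts 22
```

Running it a few times while the ssh session sits idle should show whether the entry's timeout is counting down or the flow is marked as offloaded.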

@openwrt-bot openwrt-bot commented Aug 6, 2021

buraktamturk:

I am having the same issue with no wireguard setup (just a pppoe connection). Disabling software flow offloading as a workaround fixed the issue.

@aparcar aparcar added the release/21.02 label Feb 22, 2022

@Shine- Shine- commented Mar 2, 2022

The attached patch should fix it for 21.02 (kernel 5.4).

"Should", as I was never knowingly affected by this issue, I just stumbled over it while being affected by FS#3373 (now #8239).

The patch effectively reverts kernel commit e97d940 and additionally picks all necessary changes from kernel commit 4592ee7f (which fixed the issue in kernel 5.15) while skipping stuff that kernel 5.4 doesn't have yet (i.e. sysctl-configurable offload timeouts).

remove-timeouts-for-flow-offload-pickup.zip
