continues login crashes AsusWRT #103230

Closed
danielskowronski opened this issue Nov 2, 2023 · 9 comments

@danielskowronski

danielskowronski commented Nov 2, 2023

The problem

This is a reopen of #75056.

Out of the box, the official AsusWRT integration slowly but steadily crashes a router running AsusWRT Merlin by opening SSH connections and failing to close them, ultimately filling up the entire RAM.

On my router (which has 1GB of RAM), each such connection leaves 3704K of garbage, and the last time it crashed there were over 4500 instances of dropbear.

It appears that if there's an issue during connection, then all channels will be closed, but not the connection itself. It's also the reason why Dropbear is unable to treat those connections as inactive and auto-close them.
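To illustrate the kind of cleanup that appears to be missing, here is a minimal sketch. It does not use the real library: `FakeSSHConnection` and `run_once` are hypothetical names invented for this example, and only the `close()` / `wait_closed()` pair mirrors methods that asyncssh connections expose. The point is that when a command fails, the whole connection is torn down rather than just the channel.

```python
import asyncio
from typing import Optional


class FakeSSHConnection:
    """Hypothetical stand-in for an SSH client connection (not a real library class)."""

    def __init__(self) -> None:
        self.closed = False

    async def run(self, command: str) -> str:
        # Simulate a failure in the middle of a session.
        raise ConnectionError(f"channel failed while running {command!r}")

    def close(self) -> None:
        self.closed = True

    async def wait_closed(self) -> None:
        await asyncio.sleep(0)


async def run_once(conn: FakeSSHConnection, command: str) -> Optional[str]:
    """Run one command; on failure, close the connection itself, not just the channel."""
    try:
        return await conn.run(command)
    except OSError:  # ConnectionError is a subclass of OSError
        # If only the channel goes away, the server side keeps a process
        # alive for the still-open connection.
        conn.close()
        await conn.wait_closed()
        return None


async def demo() -> bool:
    conn = FakeSSHConnection()
    result = await run_once(conn, "uptime")
    return result is None and conn.closed
```

Whether the real fix belongs in the integration or in the underlying library is a separate question; the sketch only shows the pattern.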


In this state, this official integration is effectively a router-killer. If there's no chance for quick solutions, I think there should be at least a warning added.

What version of Home Assistant Core has the issue?

core-2023.11.0

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

AsusWRT

Link to integration documentation on our website

https://www.home-assistant.io/integrations/asuswrt/

Diagnostics information

diag.json

Example YAML snippet

No response

Anything in the logs that might be useful for us?

The AsusWRT Merlin version seems to be irrelevant; however, I can confirm that it affects 3004.388.4 on RT-AX88U Pro. Other reports exist on the official FW forum (e.g. this one).

ha.log contains logs from HA with debug logs enabled for AsusWRT

netstat.log is a dump of established connections from HomeAssistant to AsusWRT, taken seconds after the logs were obtained

router.log contains syslog from AsusWRT

From there, it's easy to track ssh connection states. For example:

  • "good" one is conn=1 with client port 59802 and PID on router 13733
  • "bad" one is conn=2 with client port 59812 and PID on router 13737
  • another "bad" one is conn=17 with client port 55166 and PID on router 22381

Additional information

No response

@home-assistant

home-assistant bot commented Nov 2, 2023

Hey there @kennedyshead, @ollo69, mind taking a look at this issue as it has been labeled with an integration (asuswrt) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of asuswrt can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign asuswrt Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


asuswrt documentation
asuswrt source
(message by IssueLinks)

@ollo69
Contributor

ollo69 commented Nov 2, 2023

I'm working on PR #95720, which could fix this problem by implementing the HTTP protocol; unfortunately, this is taking a lot of time.

@danielskowronski
Author

So the SSH connector is abandoned? At first glance, it looks like a missing connection close in asyncssh - I couldn't find anything on the Dropbear side.

@ollo69
Contributor

ollo69 commented Nov 2, 2023

I have not seen any maintenance of the SSH library for a long time, but you can try opening an issue in aioasuswrt and see if someone takes it on.

@danielskowronski
Author

It looks like the issue is partially present on all levels.

First, AsusWRT Merlin has a misleading "Idle timeout" option under the SSH server config. It actually sets the nvram variable shell_timeout, which is only used to construct the forced environment variable TMOUT. That variable is read by interactive shells, but in the situation we have here there is no interactive session, only an SSH connection with no channels. This can be verified by checking the dropbear parameters: they are only -p ADDR:22 -a, with -I <idle_timeout> omitted, so the idle timeout defaults to 0, i.e. never. It looks like this behaviour is inherited from the upstream vanilla Asus code (#define DEFAULT_IDLE_TIMEOUT 0 in the Dropbear source code, and nothing else sets -I).
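As a rough way to check this on other setups, the small helper below parses a dropbear command line (for example, copied from the router's process list) and reports the value given to -I. The function name and the sample address are assumptions made for this sketch; only the -p, -a and -I flags come from the observation above.

```python
import shlex


def idle_timeout_from_args(cmdline: str) -> int:
    """Return the value passed to dropbear's -I flag, or 0 if the flag is absent."""
    args = shlex.split(cmdline)
    for i, arg in enumerate(args):
        if arg == "-I" and i + 1 < len(args):
            return int(args[i + 1])
    # No -I on the command line: the compiled-in default of 0 applies.
    return 0


# Sample command lines; the address is a placeholder, not taken from the logs.
without_flag = "dropbear -p 192.168.1.1:22 -a"
with_flag = "dropbear -p 192.168.1.1:22 -a -I 300"
```

A result of 0 for the running process would match what is described above: idle connections are never dropped by the server.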

Second, there's a connection leak in aioasuswrt, at least here - https://github.com/kennedyshead/aioasuswrt/blob/master/aioasuswrt/connection.py#L45 - and it seems that async_connect should be wrapped in asyncio.Lock(), just like kennedyshead/aioasuswrt#68
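A minimal sketch of the locking idea, under stated assumptions: `GuardedConnection` and `fake_connect` are hypothetical names for illustration and are not aioasuswrt's actual classes. It shows how an asyncio.Lock() around the connect step lets concurrent callers share one connection instead of each opening their own.

```python
import asyncio


class GuardedConnection:
    """Hypothetical wrapper; names are illustrative, not aioasuswrt's API."""

    def __init__(self, connect_factory):
        # connect_factory: coroutine function that opens one connection.
        self._connect_factory = connect_factory
        self._lock = asyncio.Lock()
        self._client = None

    async def async_connect(self):
        # Serialize connection attempts so that concurrent callers reuse
        # the first connection instead of racing to open several.
        async with self._lock:
            if self._client is None:
                self._client = await self._connect_factory()
            return self._client


async def demo() -> int:
    opened = 0

    async def fake_connect():
        nonlocal opened
        opened += 1
        await asyncio.sleep(0.01)  # give the other tasks a chance to race
        return object()

    conn = GuardedConnection(fake_connect)
    await asyncio.gather(*(conn.async_connect() for _ in range(5)))
    return opened  # number of connections actually opened
```

Without the lock, all five tasks in `demo` would see `_client is None` before the first connect finishes, and each would open its own connection.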

Finally, the layout of this module is not helping - explicit disconnect is available, but implemented only for Telnet mode - https://github.com/home-assistant/core/blob/dev/homeassistant/components/asuswrt/bridge.py#L164

I've been testing my theory with manual edits to aioasuswrt and it looks like it's the root problem we should tackle. I'll try to make the necessary PR there, and it shouldn't require any change in this module :)

@issue-triage-workflows

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@danielskowronski
Author

/no further activity occurs as all affected users have their routers crashed

@github-actions github-actions bot removed the stale label Jan 31, 2024
@ollo69
Contributor

ollo69 commented Jan 31, 2024

@danielskowronski,

did you try reconfiguring the integration to use the HTTPS protocol, or does that solution not work for you?

@issue-triage-workflows

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@issue-triage-workflows issue-triage-workflows bot closed this as not planned May 7, 2024