
Feature request: keepalive or auto-reconnect #663

Open
spidercensus opened this issue Feb 9, 2017 · 4 comments

Comments

@spidercensus
Contributor

I'm working to implement Salt proxy minions to Juniper devices in a customer network where idle-timeouts are configured for 5 minutes. This means that if we don't send something into the session before the 5 minutes are up, the session dies and a new proxy minion must be built.

I have tried adding the keepalive directive to my ssh_config file, but this had no effect.

```
Host *
    ServerAliveInterval
```

Based on the above, it seems like transport layer keepalive is not going to solve this problem. The Netconf spec currently does not contain a keepalive operation, and the IETF mailing list seems to have agreed not to create one: https://www.ietf.org/mail-archive/web/netconf/current/msg08888.html

I see three options for solving this problem:

  1. No changes to the Netconf repo; keepalives implemented entirely in the application.
  2. Implement a keepalive Device parameter and a keepalive thread which executes some RPC on a set interval. I don't like this option because it will fill up device logs and induce CPU churn.
  3. Add an auto_reconnect parameter to Device. This would allow a check to be performed before every RPC is executed to make sure that the underlying SSH transport is still connected and functioning. If it is not, call open() to bring the transport back up before running the RPC.

I think the third option is the easiest to implement. Would you consider a merge if I created this?
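To make the proposal concrete, here is a minimal sketch of the option-3 behavior. The `Device` class below is a stand-in for illustration only, not the real `jnpr.junos.Device`; the `auto_reconnect` parameter and the reconnect check inside `execute()` are the proposed additions.

```python
# Illustrative sketch of option 3 (auto_reconnect). This Device is a
# minimal stand-in, not the actual PyEZ implementation.
class Device:
    def __init__(self, auto_reconnect=False):
        self.auto_reconnect = auto_reconnect
        self.connected = False
        self.opens = 0  # count of open() calls, for demonstration

    def open(self):
        # Stand-in for bringing up the NETCONF-over-SSH transport.
        self.connected = True
        self.opens += 1

    def execute(self, rpc):
        # Proposed behavior: before each RPC, verify the transport and
        # transparently reopen it if the session has been torn down.
        if self.auto_reconnect and not self.connected:
            self.open()
        if not self.connected:
            raise ConnectionError("not connected")
        return "<rpc-reply/>"
```

With `auto_reconnect=True`, an idle-timeout that kills the session between RPCs is repaired transparently on the next `execute()` call instead of raising an error back to the caller.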

@vnitinv
Contributor

vnitinv commented Feb 9, 2017

@spidercensus I think such a check (and the consequent action) should be taken care of by the user's code.
Right now the dev.connected value is static; I am planning to make it a property, so that it returns the current state of the connection. Using this value, users can take action as per their needs.
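As a rough sketch of that suggestion, `connected` could be turned from a flag set once at `open()` into a property that probes the live transport. Names here are illustrative, not the actual PyEZ internals:

```python
# Sketch of `connected` as a dynamic property (illustrative only).
class Device:
    def __init__(self):
        self._conn = None  # stand-in for the NETCONF/SSH session object

    def open(self):
        self._conn = object()

    @property
    def connected(self):
        # Report the *current* state rather than a stale flag.
        return self._conn is not None and self._is_transport_alive()

    def _is_transport_alive(self):
        # Real code would probe the SSH transport (e.g. check the
        # underlying channel); this stub reports True while a session
        # object exists.
        return True
```

A caller could then do `if not dev.connected: dev.open()` before each RPC, which is the user-side pattern being suggested here.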

@spidercensus
Contributor Author

I agree that Device.connected needs to become a property. After my discussion with Stacy, I was going to raise another request for that.

There is value in building this feature into the execute() and cli() methods on demand. Otherwise, frameworks such as Salt will be forced to check the connection state before every RPC request, which incurs greater overhead than an internal check performed only when the flag is turned on.

@spidercensus
Contributor Author

spidercensus commented Feb 9, 2017

Pull #664

stacywsmith added a commit to stacywsmith/py-junos-eznc that referenced this issue Feb 14, 2017
…ONF over SSH sessions.

Without SSH keepalives, a NAT or stateful firewall along the network
path between the PyEZ host and the target Junos device may time out
an inactive TCP flow and cause the NETCONF over SSH session to hang.
Sending SSH keepalives avoids this situation. The default value is
30 seconds. Setting this parameter to a value of 0 disables SSH
keepalives.

Note: This is a different situation than Issue Juniper#663 in which the
target Junos device is timing out the NETCONF over SSH session
due to a configured idle-timeout on the system login class.
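For illustration of the mechanism the commit describes, a periodic keepalive sender can be sketched as a background thread that fires a no-op at a fixed interval. This is purely illustrative and not the py-junos-eznc code (which operates at the SSH protocol layer); `send` stands in for whatever no-op the transport emits:

```python
import threading

# Illustrative keepalive loop: a daemon thread invokes `send` every
# `interval` seconds until `stop_event` is set.
def start_keepalive(send, interval, stop_event):
    def _loop():
        # Event.wait() returns False on timeout (keep going) and True
        # once the event is set (stop).
        while not stop_event.wait(interval):
            send()
    t = threading.Thread(target=_loop, daemon=True)
    t.start()
    return t
```

An interval of 0 would simply never be started, matching the "0 disables keepalives" convention described above.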
@spidercensus
Contributor Author

This is implemented in pull #669
