-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ssh keepalives to ssh communicator #19339
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @spion06,
Thanks for the contribution!
While adding keepalive messages might be useful in some circumstances, I'm wondering if it's a knob we need to expose to users at all. The provisioner are not long-lived connections, so the tuning of the frequency isn't really of much use. I think we might just want a sane default (less than 30s, since that's a common network timeout setting elsewhere) and always have it enabled.
Hold off on rewriting the PR though, until we have a plan for implementation.
// Start a goroutine that executes every KeepAliveInterval to keep the session alive | ||
// will return when the session no longer responds to requests | ||
go func() { | ||
if c.config.KeepAliveInterval < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will panic below with a timeout of 0
, and should probably be limited to a reasonable minimum like 1 second, so we don't flood the connection with 1ns
keepalives.
@@ -51,6 +54,9 @@ type connectionInfo struct { | |||
ScriptPath string `mapstructure:"script_path"` | |||
TimeoutVal time.Duration `mapstructure:"-"` | |||
|
|||
KeepAliveInterval string `mapstructure:"keep_alive_interval"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't added to config validation, so it can't actually be used from a configuration.
Closing in favor of #20437, which removes the config option and only starts one keepalive routine per connection. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
This helps resolve an issue in stigged environments where the server-side ClientAliveInterval is set to a lower value (such as 300) and ClientAliveCount is set to 0. During the ssh provisioner if there is a long pause without data the server-side will disconnect causing terraform to fail. This would resolve that issue by sending session-level keepalive requests. The overhead of the ssh keepalive is minimal and shouldn't impact performance of anything.
This should resolve #18517 based off a cursory search of open issues.