Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ssh keepalives to ssh communicator #19339

Closed
wants to merge 2 commits into from
Closed

Add ssh keepalives to ssh communicator #19339

wants to merge 2 commits into from

Conversation

spion06
Copy link

@spion06 spion06 commented Nov 9, 2018

This helps resolve an issue in stigged environments where the server-side ClientAliveInterval is set to a lower value (such as 300) and ClientAliveCount is set to 0. During the ssh provisioner if there is a long pause without data the server-side will disconnect causing terraform to fail. This would resolve that issue by sending session-level keepalive requests. The overhead of the ssh keepalive is minimal and shouldn't impact performance of anything.

This should resolve #18517 based off a cursory search of open issues.

Copy link
Member

@jbardin jbardin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @spion06,

Thanks for the contribution!

While adding keepalive messages might be useful in some circumstances, I'm wondering if it's a knob we need to expose to users at all. The provisioner are not long-lived connections, so the tuning of the frequency isn't really of much use. I think we might just want a sane default (less than 30s, since that's a common network timeout setting elsewhere) and always have it enabled.

Hold off on rewriting the PR though, until we have a plan for implementation.

// Start a goroutine that executes every KeepAliveInterval to keep the session alive
// will return when the session no longer responds to requests
go func() {
if c.config.KeepAliveInterval < 0 {
Copy link
Member

@jbardin jbardin Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic below with a timeout of 0, and should probably be limited to a reasonable minimum like 1 second, so we don't flood the connection with 1ns keepalives.

@@ -51,6 +54,9 @@ type connectionInfo struct {
ScriptPath string `mapstructure:"script_path"`
TimeoutVal time.Duration `mapstructure:"-"`

KeepAliveInterval string `mapstructure:"keep_alive_interval"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't added to config validation, so it can't actually be used from a configuration.

@jbardin
Copy link
Member

jbardin commented Feb 22, 2019

Closing in favor of #20437, which removes the config option and only starts one keepalive routine per connection.

@jbardin jbardin closed this Feb 22, 2019
@ghost
Copy link

ghost commented Mar 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Mar 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remote Exec Failing Even After Successful Execution of the Script
2 participants