x/crypto/ssh: Dial hangs in kexLoop indefinitely - ignoring ClientConfig.Timeout #51926
Labels
NeedsInvestigation
Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
CC @FiloSottile
I have noticed that this seems to happen only on subsequent or concurrent connections. I have never been able to reproduce it on the first connection, so this may be related to #27140.
pjbgf added a commit to pjbgf/source-controller that referenced this issue Mar 25, 2022
The underlying SSH connections are kept open and are reused across several SSH sessions. This is due to upstream issues in which concurrent/parallel SSH connections may lead to instability. golang/go#51926 golang/go#27140 Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
pjbgf added a commit to pjbgf/source-controller that referenced this issue Mar 28, 2022
The underlying SSH connections are kept open and are reused across several SSH sessions. This is due to upstream issues in which concurrent/parallel SSH connections may lead to instability. golang/go#51926 golang/go#27140 Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
Ensuring that the session's StdoutPipe is serviced quickly seems to resolve the problem, as mentioned in the crypto/ssh comments.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes, as this is library related using version:
I can confirm the issue also happens with previous versions:
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
The application implements a Go SSH transport that hangs indefinitely at ssh.Dial every so often.
The current timeout is set to 30 seconds, which ssh.Dial does not uphold (https://github.com/fluxcd/source-controller/blob/main/pkg/git/libgit2/managed/ssh.go#L251-L255 https://github.com/fluxcd/source-controller/blob/main/pkg/git/libgit2/managed/init.go#L30).
This is a low-concurrency application (2-4 parallel workers) that creates multiple SSH connections to execute simple git operations.
The ssh.Dial call uses the ssh.ClientConfig as below:
Actual code can be seen at:
https://github.com/fluxcd/source-controller/blob/main/pkg/git/libgit2/managed/ssh.go#L166
What did you expect to see?
The ssh.Dial operation should return an error if it takes longer than the pre-configured timeout.
What did you see instead?
The goroutine hangs indefinitely.
pprof shows the culprit being: