Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssh: Recreate connection on retries in setupProxy #309

Merged
merged 1 commit into from
Jan 16, 2024

Conversation

cfergeau
Copy link
Contributor

The previous fix was not working as expected, as the ssh go code will
close the underlying connection when there's a failure.
This was causing the retries for CreateBastion() to fail, as after the
first failure it would try to use a closed connection.

This commit recreates the connection each time before calling
CreateBastion() to fix this. This also simplifies the code.

Copy link
Member

@jakecorrenti jakecorrenti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The previous fix was not working as expected, as the ssh go code will
close the underlying connection when there's a failure.
This was causing the retries for CreateBastion() to fail, as after the
first failure it would try to use a closed connection.

This commit recreates the connection each time before calling
CreateBastion() to fix this. This also simplifies the code.

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
@gbraad
Copy link
Member

gbraad commented Jan 15, 2024

@cfergeau guess this needs to be merged now.

@praveenkumar
Copy link
Contributor

CI is failing with following ( tried retest twice and both the time same). Not sure if this is because this PR or something else.

dns
/Users/runner/work/gvisor-tap-vsock/gvisor-tap-vsock/test/basic_test.go:49
  should resolve LDAP SRV record for google.com [It]
  /Users/runner/work/gvisor-tap-vsock/gvisor-tap-vsock/test/basic_test.go:72

  Unexpected error:
      <*exec.ExitError | 0xc0002861a0>: 

@cfergeau
Copy link
Contributor Author

The CI failure seems related to time="2024-01-15T15:06:07Z" level=info msg="Detected Kernel panic, retrying..." which is very unexpected. This test passed today in other PRs https://github.com/containers/gvisor-tap-vsock/actions/runs/7524944350/job/20490954791
I don't see how this PR could trigger a panic in the guest kernel, wondering if this could be a bug in the gh actions image, as the failing test uses a macos-12 with version: 20240105.1 while the working one uses 20240105.3

@baude
Copy link
Member

baude commented Jan 15, 2024

code LGTM

@baude
Copy link
Member

baude commented Jan 15, 2024

i re-ran the tests, assume it is a flake ... dealer's choice

/hold
/lgtm

@cfergeau
Copy link
Contributor Author

Yeah, also ran the tests on a macos13 x86_64 machine, and I could not reproduce the issue.

@cfergeau
Copy link
Contributor Author

Since this PR has the 'hold' label, I'll revert the changes for a try, and see if CI behaves better.

@cfergeau
Copy link
Contributor Author

Ok, even git master is currently failing, even if it was green yesterday. Let's ignore the red CI for now, I'll take a closer look with a dummy "debug CI" PR

Copy link
Contributor

openshift-ci bot commented Jan 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfergeau, jakecorrenti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cfergeau
Copy link
Contributor Author

/unhold

@cfergeau
Copy link
Contributor Author

And with one more try it's green, go figure!

@cfergeau
Copy link
Contributor Author

cfergeau commented Jan 16, 2024

Trying to readd /lgtm, but not sure I'm allowed to on my own PRs
/lgtm

Copy link
Contributor

openshift-ci bot commented Jan 16, 2024

@cfergeau: you cannot LGTM your own PR.

In response to this:

Trying to readd /lgtm, but not sure I'm allowed to
/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@benoitf
Copy link

benoitf commented Jan 16, 2024

/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants