Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssh-agent: Increase timeout before we explicitly close connection #3631

Merged

Conversation

flouthoc
Copy link
Collaborator

@flouthoc flouthoc commented Nov 17, 2021

There are cases where remote will close connection by itself with a message
make sure we give connection enough time instead of closing explictly
early.

Future improvement: Relay output and perform close instead of relying on ServeAgent to flush
buffer by closing connection.

Output after this:

STEP 4/4: RUN --mount=type=ssh,required=true ssh git@github.com
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added the ECDSA host key for IP address '<>' to the list of known hosts.
agent key .. returned incorrect signature type
Hi flouthoc! You've successfully authenticated, but GitHub does not provide shell access.

[NO TESTS NEEDED]
There is no convenient way to test this in CI.

Closes: #3587

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 17, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flouthoc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

There are cases where remote will close connection by itself with a message
make sure we give connection enough time instead of closing explictly
early.

Future improvement: Relay output and perform close instead of relying on  `ServeAgent` to flush
buffer by closing connection.

[NO NEW TESTS NEEDED]

Signed-off-by: Aditya Rajan <arajan@redhat.com>
@flouthoc
Copy link
Collaborator Author

Patch verified here: #3587 (comment)

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

@vrothberg @ashley-cui PTAL

go func() {
time.Sleep(500 * time.Millisecond)
time.Sleep(2000 * time.Millisecond)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with the value, but just wonder if we could instead get this value from a config file.

Copy link
Collaborator Author

@flouthoc flouthoc Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomSweeneyRedHat Sure we could do that but I am confused if this should be visible to end-user and anybody would ever want to configure this.

Ideally we should remove timeout and add buffer forwarding mechanism which should close connection but that seemed like a bigger change to me and would like to refactor bunch of things in sshagent after discussing with maintainers in a separate thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No to config, since users will have no idea what to set it. None of us have any idea either, since it will probably vary under load.

I say increase the hack for now, and then consider fixing this long term in the future.
LGTM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concur with Dan. At least I would have absolutely no feeling for what a good value for that would be, and also be worried why I have to make that decision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it would not be a config setting I'd heavily document. More of a tool for support if the issue popped up in the future, or we found out that 2,000 was too much and was causing some other unanticipated error. I think Dan and Valentin being unsure about a valid value is why I would like to make it configurable. But hopefully, this will fix this, and it will be a non-issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that if this happens again in the field, then we should put in the work to build a complete solution.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Nov 18, 2021
@openshift-merge-robot openshift-merge-robot merged commit 87364fa into containers:main Nov 18, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
5 participants