cmd/openshift-install/gather: Recognize "connection refused" #2810

Closed

wking
Member

@wking wking commented Dec 13, 2019

Before this commit, bootstrap machines that failed to come up would look like:

level=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.ci-op-6266tp8r-77109.origin-ci-int-aws.dev.rhcloud.com:6443..."
level=error msg="Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.ci-op-6266tp8r-77109.origin-ci-int-aws.dev.rhcloud.com:6443/apis/config.openshift.io/v1/clusteroperators: dial tcp 3.221.214.197:6443: connect: connection refused"
level=info msg="Pulling debug logs from the bootstrap machine"
level=error msg="Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: dial tcp 3.84.188.207:22: connect: connection refused"
level=fatal msg="Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded"

With this commit, that last error will look like:

level=error msg="Attempted to gather debug logs after installation failure: failed to connect to the bootstrap machine: dial tcp 3.84.188.207:22: connect: connection refused"

without the unrelated (to this failure mode) distraction about SSH keys.

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Dec 13, 2019
@wking wking force-pushed the gather-ssh-recognize-connection-refused branch from 1589774 to 3733c59 Compare December 13, 2019 00:06
Before this commit, bootstrap machines that failed to come up would
look like [1]:

  level=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.ci-op-6266tp8r-77109.origin-ci-int-aws.dev.rhcloud.com:6443..."
  level=error msg="Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.ci-op-6266tp8r-77109.origin-ci-int-aws.dev.rhcloud.com:6443/apis/config.openshift.io/v1/clusteroperators: dial tcp 3.221.214.197:6443: connect: connection refused"
  level=info msg="Pulling debug logs from the bootstrap machine"
  level=error msg="Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: dial tcp 3.84.188.207:22: connect: connection refused"
  level=fatal msg="Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded"

With this commit, that last error will look like:

  level=error msg="Attempted to gather debug logs after installation failure: failed to connect to the bootstrap machine: dial tcp 3.84.188.207:22: connect: connection refused"

without the unrelated (to this failure mode) distraction about SSH
keys.

[1]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/12076
return errors.Wrap(err, "failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key")
} else if err != nil {
if err != nil {
if errno, ok := err.(syscall.Errno); ok && errno == syscall.ECONNREFUSED {
Contributor

Hmm, shouldn't this check for https://golang.org/pkg/net/#OpError instead? I think OpError.Err is what contains the syscall error.

Member Author

I'll just wait for Go 1.13 and the Unwrap business.

@jstuever
Contributor

/uncc @jstuever
Feel free to add me back if/when ready to move forward with this one.

@openshift-ci-robot openshift-ci-robot removed the request for review from jstuever January 10, 2020 23:27
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 31, 2020
@openshift-ci-robot
Contributor

@wking: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Contributor

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/yaml-lint | 684d92c | link | /test yaml-lint |
| ci/prow/shellcheck | 684d92c | link | /test shellcheck |
| ci/prow/tf-lint | 684d92c | link | /test tf-lint |
| ci/prow/e2e-aws-upgrade | 684d92c | link | /test e2e-aws-upgrade |
| ci/prow/govet | 684d92c | link | /test govet |
| ci/prow/golint | 684d92c | link | /test golint |
| ci/prow/images | 684d92c | link | /test images |
| ci/prow/unit | 684d92c | link | /test unit |
| ci/prow/gofmt | 684d92c | link | /test gofmt |
| ci/prow/verify-vendor | 684d92c | link | /test verify-vendor |


@abhinavdahiya abhinavdahiya added version/4.6 Tracking changes that should end up in 4.6 release and removed version/4.5 labels May 8, 2020
@abhinavdahiya
Contributor

replaced by #3615

/close


@wking wking deleted the gather-ssh-recognize-connection-refused branch June 2, 2020 03:04
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. version/4.6 Tracking changes that should end up in 4.6 release