Bug 1990140: add connection with timeout in TBR accessibility check to expedite 'disconnected' mode #384
@gabemontero: An error was encountered searching for bug 1990140 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details. Full error message.
could not unmarshal response body: invalid character '<' looking for beginning of value
Please contact an administrator to resolve this issue, then request a bug refresh with In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
pkg/stub/handler.go
Outdated
}
defer connWithTimeout.Close()
// still do the tls form of connect (which does not have the handy timeout form of dial) to confirm
// ssl handshake is OK
tlsConf := &tls.Config{}
conn, err := tls.Dial("tcp", "registry.redhat.io:443", tlsConf)
I think you can use tls.Client(connWithTimeout, tlsConf)
(maybe with minor tweaks: https://cs.opensource.google/go/go/+/refs/tags/go1.16.7:src/crypto/tls/tls.go;l=154-164;drc=refs%2Ftags%2Fgo1.16.7)
yep I'll add that - thanks @dmage
after an iteration to address the "either ServerName or InsecureSkipVerify must be specified in the tls.Config" error,
I have it working
pushing the update momentarily
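For readers following the thread, here is a minimal sketch of the approach discussed above: dial with `net.DialTimeout`, then wrap that same connection with `tls.Client` rather than opening a second connection via `tls.Dial`. The helper name `checkTLSWithTimeout` and the deadline on the handshake are assumptions made for illustration; this is not the code that was pushed to the PR.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"time"
)

// checkTLSWithTimeout is a hypothetical helper (not the PR's actual code).
// It dials with a bounded timeout, then runs the TLS handshake over the
// same connection instead of opening a second one with tls.Dial.
func checkTLSWithTimeout(addr, serverName string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("tcp dial: %w", err)
	}
	defer conn.Close()

	// Assumption: bound the handshake too, since DialTimeout only covers
	// the TCP connect.
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return fmt.Errorf("set deadline: %w", err)
	}

	// tls.Client needs ServerName (or InsecureSkipVerify) set in the config,
	// otherwise the handshake fails with the error quoted in the thread.
	tlsConn := tls.Client(conn, &tls.Config{ServerName: serverName})
	if err := tlsConn.Handshake(); err != nil {
		return fmt.Errorf("tls handshake: %w", err)
	}
	return nil
}

func main() {
	err := checkTLSWithTimeout("registry.redhat.io:443", "registry.redhat.io", 15*time.Second)
	if err != nil {
		fmt.Println("registry not reachable:", err)
		return
	}
	fmt.Println("registry reachable")
}
```

Setting `ServerName` keeps certificate verification on, which is usually preferable to `InsecureSkipVerify` when the goal is to confirm that the handshake is OK.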
update pushed @dmage thanks
ea3e195 to 1788e2f
…isconnected' mode
1788e2f to 36d24c2
unrelated flake in e2e-aws that is already noted in sippy (fails ~20% of the time)
/test e2e-aws
lgtm, just one question from me
// we have seen cases in the field with disconnected cluster where the default connection timeout can be
// very long (15 minutes in one case); so we do an initial non-tls connection where we can specify a quicker
// timeout to filter out that scenario and default to tbr inaccessible / Removed in an expedient fashion
connWithTimeout, err := net.DialTimeout("tcp", "registry.redhat.io:443", 15*time.Second)
15 is still a best-guess number that might change in the future. Is this something we can make configurable, so the customer can change it if it does not work for them?
good question @dperaza4dustbit
config changes where we have to change the config object and add a new field are expensive / costly
admittedly a "gut feeling", but for what we are dealing with here, it would be better to just have a hard coded value that is sufficient
at most, maybe add an environment variable on the deployment that could be read
minimally, I would be agreeable to running, say, the e2e-aws-operator test suite repeatedly (maybe a dozen times) to get a warmer fuzzy
/test e2e-aws-operator
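To make the environment variable option mentioned above concrete, here is a rough sketch of reading the timeout from the deployment's environment and falling back to the hard-coded 15 seconds. The variable name `TBR_DIAL_TIMEOUT` and the helper `dialTimeout` are made up for illustration; nothing in this thread says the operator reads such a variable.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultDialTimeout mirrors the hard-coded value discussed in the review.
const defaultDialTimeout = 15 * time.Second

// dialTimeout is a hypothetical helper: it parses a duration string such as
// "30s" and returns def when the string is empty, invalid, or not positive.
func dialTimeout(raw string, def time.Duration) time.Duration {
	if raw == "" {
		return def
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return def
	}
	return d
}

func main() {
	// TBR_DIAL_TIMEOUT is an assumed name, not an existing setting.
	timeout := dialTimeout(os.Getenv("TBR_DIAL_TIMEOUT"), defaultDialTimeout)
	fmt.Println("using dial timeout:", timeout)
}
```

Falling back to the default on bad input keeps a mistyped value from disabling the check, and it avoids adding a new field to the config object.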
OK, understood @gabemontero, maybe take me through the process of making a config change so I can get a feeling for the price there.
yep let's add that to Thursday's call, along with the cross referencing CI flakes with sippy that I mentioned in your openshift/origin PR
/bugzilla refresh
@gabemontero: This pull request references Bugzilla bug 1990140, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/skip
/retest
/retest
several operators below samples were degraded in e2e-aws-upgrade
/retest
/test e2e-aws-operator
/retest
/test e2e-aws-operator
/retest
2 similar comments
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dperaza4dustbit, gabemontero
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest-required
Please review the full test history for this PR and help us cut down flakes.
/retest
1 similar comment
/retest-required
Please review the full test history for this PR and help us cut down flakes.
8 similar comments
@gabemontero: The following test failed, say
Full PR test history. Your PR dashboard.
/retest-required
Please review the full test history for this PR and help us cut down flakes.
1 similar comment
@gabemontero: All pull requests linked via external trackers have merged: Bugzilla bug 1990140 has been moved to the MODIFIED state. In response to this:
/assign @dperaza4dustbit