test: Use longer timeout for ginkgo DNS lookups #12062
FQDN tests have been failing due to DNS lookups from ginkgo not succeeding in 30 seconds. Use the longer HelperTimeout (4 minutes) instead. Also split the two lookups into two separate WithTimeout() invocations, so that we do not need to repeat the 1st if the 2nd fails. Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
test-me-please
Do we really expect a DNS request to take more than 30 seconds? Maybe we could add retries instead? What's the risk of letting the tests proceed if the domains could not be resolved? We already have the IPs above and they don't seem that unstable. Would we risk missing actual test failures if we issue a warning instead of an error when the preparatory DNS request fails?
@pchaigno Based on the test run of this PR |
addrs, lookupErr = net.LookupHost("vagrant-cache.ci.cilium.io")
if lookupErr != nil {
lookupErr = fmt.Errorf("error looking up vagrant-cache.ci.cilium.io: %s", lookupErr)
addrs, err2 := net.LookupHost("vagrant-cache.ci.cilium.io")
Why not use google.com or something similar with high availability guarantees?
I don't think changing the domain name we request will have any impact on the success of DNS requests. We already use Google's DNS, if I recall correctly, to resolve domain names.
The problem is likely Packet's connection to the outside, so any DNS resolver will have the same issues.
Hence a longer timeout may help to carry over intermittent connectivity issues.
That DNS resolution relies on getaddrinfo(3) under the hood. I don't think getaddrinfo(3) implements any retry. I couldn't find any retry mechanism at the Golang layer either.
In case of intermittent connectivity issues, the request packet won't stay around; it will be dropped. So I don't think waiting longer for an answer is going to solve this.
How many attempts are actually made? The Go net library doesn't document how long the |
@pchaigno @joestringer The default call interval of |
Right, I keep forgetting that WithTimeout implements retries :/
LGTM!
This assumes that |
Good question! According to Go docs |
FQDN tests have been failing due to DNS lookups from ginkgo not
succeeding in 30 seconds. Use the longer HelperTimeout (4 minutes)
instead.
Also split the two lookups into two separate WithTimeout()
invocations, so that we do not need to repeat the 1st if the 2nd
fails.
Related: #11848
Related: #10538
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>