-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Give the ping context a timeout. #1163
Conversation
Debugging why requests to an insecure registry seem to hang, I noticed that there are 3+ requests to the https endpoint before it falls back to http (at that point my logs truncate because the test was over). The time between the request being issued and completing in the roundtripper was ~30s each. This change gives each ping a generous 5s timeout to complete each request.
Codecov Report
@@ Coverage Diff @@
## main #1163 +/- ##
=======================================
Coverage 75.22% 75.22%
=======================================
Files 108 108
Lines 7850 7852 +2
=======================================
+ Hits 5905 5907 +2
Misses 1379 1379
Partials 566 566
Continue to review full report at Codecov.
|
I'm testing another mechanism for this, where I pass a custom http.Transport (based on the default transport) with a shorter dial timeout:
If this works, we may want to consider something like this as a better default transport for GGCR since the experience with a repeated dial timeout on the https ping retried 5(!!!) times is miserable. I haven't confirmed, but I believe I'm hitting the same thing publishing stuff with cosign as well and I'm sure all the other tools will be affected (ko, buildpacks, kaniko). I'll report back as to whether this actually works, and if so we can discuss which change we'd rather have. |
The custom transport works. Given our built-in retries, I'd be comfortable moving the direction of the transport, but I'm open to other options as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok to just merge this until we can find a better solution.
This reverts commit 080751a.
This reverts commit 080751a.
This reverts commit 080751a.
* Revert "Give the ping context a timeout. (#1163)" This reverts commit 080751a. * Define a new `remote.DefaultTransport`. Previously we used `http.DefaultTransport` (on which this is based), but this uses a default dial timeout of 30s, which when we (by default) wrap things in 5x retries can lead to ~150s delays simply pinging an http-based registry. If folks encounter issues with this via the library they can restore the current behavior with: ```go remote.WithTransport(http.DefaultTransport) ``` If folks encounter issues with this via `crane` they can restore the current behavior with: ``` --dial-timeout 30s ``` * Switch to an impossible domain name * Back out the crane flag
Debugging why requests to an insecure registry seem to hang, I noticed that there are 3+ requests to the https endpoint before it falls back to http (at that point my logs truncate because the test was over). The time between the request being issued and completing in the roundtripper was ~30s each.
This change gives each ping a generous 5s timeout to complete each request.