-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
T22555 Retry downloads on flaky networks #120
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This introduces no functional changes, but will make some upcoming refactoring a little easier. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1599 Approved by: jlebon
This introduces no functional changes, but does make the code a little cleaner. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1599 Approved by: jlebon
Rename from `fdata` to `fetch_data` to clarify things and make it consistent with other similar functionality in the file. This introduces no functional changes. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1599 Approved by: jlebon
This introduces no functional changes, but will make upcoming support for retrying downloads easier to add. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1599 Approved by: jlebon
This introduces no functional changes, but will make upcoming support for retrying downloads easier to add. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1599 Approved by: jlebon
This introduces no functional changes, but will make upcoming support for retrying downloads easier to add. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1599 Approved by: jlebon
Allow network requests to be re-queued if they failed with a transient error, such as a socket timeout. Retry each request up to a limit (default: 5), and only then fail the entire pull and propagate the error to the caller. Add a new ostree_repo_pull_with_options() option, n-network-retries, to control the number of retries (including setting it back to the old default of 0, if the caller wants). Currently, retries are not supported for FetchDeltaSuperData requests, as they are not queued. Once they are queued, adding support for retries should be trivial. A FIXME comment has been left for this. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1594 Approved by: jlebon
This allows the retry code in ostree-repo-pull.c to recover from (for example) timeouts at the libsoup layer in the stack, as well as from the GSocket layer in the stack. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1594 Approved by: jlebon
This is exactly like the --random-500s option, except that it will cause error 408 (request timeout) to be returned, rather than error 500 (internal server error). This will be used in a following commit to test pull behaviour when timeouts occur. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1594 Approved by: jlebon
Extend test-pull-repeated.sh to test error 408 as well as error 500, to ensure that the new retry-on-network-timeout code in ostree-repo-pull.c correctly retries. Rather than the 200 iterations needed for the error 500 tests, only do 5 iterations. The pull code internally does 5 retries (by default), which means a full iteration count of 25. That seems to be sufficient to make the tests reliably pass, in my testing — we can always bump it up to 200 / 5 = 40 in future if needed (to put it in parity with the error 500 tests). Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1594 Approved by: jlebon
Various of the counters already have assertions like this; add some more for total paranoia. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1594 Approved by: jlebon
Use the same G_IO_ERROR_* values for HTTP status codes in both fetchers. The libsoup fetcher still handles a few more internal error codes than the libcurl one; this could be built on in future. Signed-off-by: Philip Withnall <withnall@endlessm.com> Closes: #1594 Approved by: jlebon
Just like all the other requests made for delta parts and objects by the pull code, use a queue for delta superblocks. Currently this doesn’t do any prioritisation or retries after transient failures, but it could do in future. This means that delta superblocks are now subject to the parallel request limit in the fetcher, which was a problem highlighted here: ostreedev/ostree#1453 (comment). Signed-off-by: Philip Withnall <withnall@endlessm.com>
Use the recently introduced architecture for retrying network requests on transient failure to do the same for delta superblock requests, now that they’re queued. Signed-off-by: Philip Withnall <withnall@endlessm.com>
I've confirmed that these commits match those of the upstream PRs. Since this has all been reviewed upstream, I'm going to go ahead and accept the PR so we can get started with testing. If anyone else feels like reviewing more in-depth, go for it, and we can clean up later if any further changes needed. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a straight backport of upstream’s: