-
Notifications
You must be signed in to change notification settings - Fork 501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry RPM cloning on http/502 error (bad gateway) #7373
Conversation
for _, line := range strings.Split(stderr, "\n") { | ||
if strings.Contains(line, "Error: 502 when downloading") { | ||
logger.Log.Warn("Encountered possibly intermittent HTTP 502 error.") | ||
retriable = true | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- Do we only want to deal with 502s or all 5XX errors? Since we're here, it might be worth handling all server-side issues.
- What do you think about dropping the
for
loop in favour of a regular expression?
for _, line := range strings.Split(stderr, "\n") { | |
if strings.Contains(line, "Error: 502 when downloading") { | |
logger.Log.Warn("Encountered possibly intermittent HTTP 502 error.") | |
retriable = true | |
} | |
} | |
serverErrorMatch := serverErrorsRegex.FindStringSubmatch(stderr) | |
if len(serverErrorMatch) > errorCodeIndex { | |
logger.Log.Warn("Encountered possibly intermittent HTTP %d error.", serverErrorMatch[errorCodeIndex]) | |
retriable = true | |
} |
The regex and the index could be defined as global, private variables (making it global forces a one-time compilation of the regex during program's start-up):
// In global context:
var (
(...)
serverErrorsRegex = regexp.MustCompile(`(?m)Error: (5\d{2}) when downloading`)
errorCodeIndex = 1
...
toolkit/tools/internal/packagerepo/repocloner/rpmrepocloner/rpmrepocloner.go
Show resolved
Hide resolved
// | ||
// *NOTE* These values are copied from downloader/downloader.go; they need not be the same but seemed like a | ||
// good enough starting point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment here just for the sake of the draft? It looks like something that we may not want in the final version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd actually appreciate suggestions of how best to share these -- or perspectives on whether they should be? (Should we create a wrapper with these defaults?)
Perhaps you or @dmcilvaney have thoughts on that?
downloadErr, retriable := tdnfDownload(finalArgs...) | ||
if downloadErr != nil { | ||
if retriable { | ||
logger.Log.Warnf("Attempt %d/%d: Failed to clone packages", retryNum, downloadRetryAttempts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd change that to a debug entry. These messages tend to generate noise, confuse people who don't know why they are generated (which is most users) and are red herrings when we debug issues.
logger.Log.Warnf("Attempt %d/%d: Failed to clone packages", retryNum, downloadRetryAttempts) | |
logger.Log.Debugf("Attempt %d/%d: Failed to clone packages", retryNum, downloadRetryAttempts) |
toolkit/tools/internal/packagerepo/repocloner/rpmrepocloner/rpmrepocloner.go
Show resolved
Hide resolved
toolkit/tools/internal/packagerepo/repocloner/rpmrepocloner/rpmrepocloner.go
Show resolved
Hide resolved
if strings.HasPrefix(trimmedLine, toyboxConflictsPrefix) { | ||
logger.Log.Warn("Ignoring known toybox conflict") | ||
err = nil | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've seen a question about toybox
and if we still need this check. It looks like toybox
is not present in 2.0, so I think we can remove this logic completely.
Moving the PR to #7445. |
Merge Checklist
All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)
*-static
subpackages, etc.) have had theirRelease
tag incremented../cgmanifest.json
,./toolkit/scripts/toolchain/cgmanifest.json
,.github/workflows/cgmanifest.json
)./SPECS/LICENSES-AND-NOTICES/data/licenses.json
,./SPECS/LICENSES-AND-NOTICES/LICENSES-MAP.md
,./SPECS/LICENSES-AND-NOTICES/LICENSE-EXCEPTIONS.PHOTON
)*.signatures.json
filessudo make go-tidy-all
andsudo make go-test-coverage
passSummary
We continue to intermittently see HTTP 502 errors (Bad Gateway) when tdnf download'ing packages from packages.microsoft.com, usually in build pipelines. This PR introduces granular retry logic for that specific failure case. (We can't retry on any HTTP error, as 404 is an expected error that shows up in normal execution.)
Change Log
Update the rpmrepcloner to inspect stderr on tdnf download failure. If the specific known error string indicating a 502 error is found, then we retry with exponential backoff.
Does this affect the toolchain?
No
Test Methodology
(Testing in progress)