Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry artifact download when response stream is truncated #652

Closed

Conversation

hashtagchris
Copy link
Contributor

@hashtagchris hashtagchris commented Nov 26, 2020

Retry the artifact download when gunzip fails or the response stream is truncated.

Closes actions/download-artifact#53 and actions/download-artifact#55

@@ -263,26 +263,47 @@ export class DownloadHttpClient {
if (isGzip) {
const gunzip = zlib.createGunzip()
response.message
.on('error', error => {
Copy link
Contributor Author

@hashtagchris hashtagchris Nov 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing on('error', ...) callback below isn't called when gunzip fails, or response.message produces an error. pipe doesn't propagate errors across streams, so you have to monitor each. Without the new blocks the error is unhandled, and the new test times out.

I'm closing the downstream stream (hah!) on error. From https://nodejs.org/dist/latest-v14.x/docs/api/stream.html#stream_readable_pipe_destination_options:

One important caveat is that if the Readable stream emits an error during processing, the Writable destination is not closed automatically. If an error occurs, it will be necessary to manually close each stream in order to prevent memory leaks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little bit unclear about when to retry here. If the download failed because the socket is opened too long (for downloading large files), would retry help?

Copy link
Contributor Author

@hashtagchris hashtagchris Nov 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. In short, I'm hoping retries will help. 😄

In the case of smaller files, I think there are likely transient http disconnects (timeouts?) that will benefit from retries.

For larger files (actions/download-artifact#55), is there a technical reason we'd be able to upload large files we can't download? Do we only chunk on upload? Should we add a max size check to upload-artifact if download-artifact is unlikely to succeed?

One option to add retries now, and then revisit this code if customers continue to report issues, say with larger files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do chunked upload but I don't believe we do chunked download.

and then revisit this code if customers still report issues with larger files.

This is what I am afraid of, since in actions/download-artifact#55 the description says there is a 100% failure rate, I am afraid the retry will be in vain, and we will incur more minutes per build from those retries.

@hashtagchris hashtagchris force-pushed the hashtagchris-artifact-flaky-response-stream branch from 2be0cdb to 7ade2bb Compare November 30, 2020 16:24
Base automatically changed from hashtagchris-test-pipeResponseToFile to main November 30, 2020 17:12
@hashtagchris
Copy link
Contributor Author

#661 supersedes this PR. It includes the commits in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Z_BUF_ERROR
2 participants