-
Notifications
You must be signed in to change notification settings - Fork 874
DownloadWithResponse with multipartdownload #4136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9e63b12 to
fa59362
Compare
fa59362 to
3178a91
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a new DownloadWithResponseAsync API for the AWS S3 SDK that enables multipart downloads with concurrent part downloads for improved performance. The implementation follows the S3 Express Protocol (SEP) specifications for atomic file operations and composite checksum handling.
Key Changes:
- Adds new
DownloadWithResponseAsyncmethods to TransferUtility that return response metadata - Implements multipart download infrastructure with concurrent part downloads
- Adds atomic file handling using temporary files with
.s3tmp.{uniqueId}pattern - Includes comprehensive unit and integration tests
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| TransferUtility.async.cs | Added DownloadWithResponseAsync methods for async multipart downloads |
| TransferUtility.sync.cs | Added synchronous DownloadWithResponse wrapper methods |
| MultipartDownloadCommand.cs | Core orchestration logic for multipart download execution |
| MultipartDownloadCommand.async.cs | Async implementation of multipart download with response mapping |
| FilePartDataHandler.cs | Handles concurrent file writes at specific offsets for downloaded parts |
| FileDownloadConfiguration.cs | Configuration class for file-based multipart downloads |
| AtomicFileHandler.cs | Manages atomic file operations with temporary files |
| MultipartDownloadCoordinator.cs | Added PrepareAsync call and null checks for discovery methods |
| IPartDataHandler.cs | Added PrepareAsync interface method |
| BufferedPartDataHandler.cs | Added no-op PrepareAsync implementation |
| MultipartDownloadTestHelpers.cs | Added helper methods for test file management and verification |
| MultipartDownloadCommandTests.cs | Unit tests for MultipartDownloadCommand |
| FilePartDataHandlerTests.cs | Unit tests for FilePartDataHandler |
| FileDownloadConfigurationTests.cs | Unit tests for FileDownloadConfiguration |
| AtomicFileHandlerTests.cs | Unit tests for AtomicFileHandler |
| TransferUtilityDownloadWithResponseTests.cs | Integration tests for end-to-end download functionality |
| 9d07dc1e-d82d-4f94-8700-c7b57f872043.json | Change log configuration for the new API |
sdk/test/Services/S3/IntegrationTests/TransferUtilityDownloadWithResponseTests.cs
Show resolved
Hide resolved
sdk/test/Services/S3/IntegrationTests/TransferUtilityDownloadWithResponseTests.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/AtomicFileHandler.cs
Outdated
Show resolved
Hide resolved
3178a91 to
87db7ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 6 comments.
sdk/src/Services/S3/Custom/Transfer/Internal/FilePartDataHandler.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/_async/MultipartDownloadCommand.async.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/AtomicFileHandler.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/_bcl+netstandard/TransferUtility.sync.cs
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/_bcl+netstandard/TransferUtility.sync.cs
Show resolved
Hide resolved
437adfe to
6bb7a18
Compare
fad732f to
0b3ed24
Compare
fd1aba4 to
7269f56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 8 comments.
sdk/src/Services/S3/Custom/Transfer/Internal/AtomicFileHandler.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/MultipartDownloadManager.cs
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/MultipartDownloadManager.cs
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/FileDownloadConfiguration.cs
Show resolved
Hide resolved
85641af to
7b371d0
Compare
faca805 to
42d259c
Compare
| Logger.DebugFormat("FilePartDataHandler: Opening file for writing at offset {0} with BufferSize={1}", | ||
| offset, _config.BufferSize); | ||
|
|
||
| // Open file with FileShare.Write to allow concurrent writes from other threads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ive created sdk/test/Services/S3/UnitTests/Custom/FilePartDataHandlerConcurrencyTests.cs which validates this is safe (i.e each part creating its own filestream object and writing to a different offset at the same time)
| // - Progress tracking with events | ||
| // - Size validation (ContentLength vs bytes read) | ||
| // - Buffered reading with proper chunk sizes | ||
| await response.WriteResponseStreamAsync( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this to use the same helper function i made reading response stream. I retested again and i see same performance results
Total bytes per run: 5,368,709,120
Run:1 Secs:2.494888 Gb/s:17.215071
Run:2 Secs:1.263313 Gb/s:33.997642
Run:3 Secs:1.120677 Gb/s:38.324749
Run:4 Secs:1.062679 Gb/s:40.416394
Run:5 Secs:1.032120 Gb/s:41.613045
Run:6 Secs:1.117588 Gb/s:38.430699
Run:7 Secs:1.087574 Gb/s:39.491276
Run:8 Secs:1.088066 Gb/s:39.473393
Run:9 Secs:1.120819 Gb/s:38.319901
Run:10 Secs:1.113135 Gb/s:38.584438
| // Set ContentRange to represent the entire object: bytes 0-(ContentLength-1)/ContentLength | ||
| response.ContentRange = $"bytes 0-{discoveryResult.ObjectSize - 1}/{discoveryResult.ObjectSize}"; | ||
| // S3 returns null for 0-byte objects, so we match that behavior | ||
| if (discoveryResult.ObjectSize == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a integration test for multi part download to file discovered this issue, so adding here to be consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.
42d259c to
e6a92d7
Compare
| /// <param name="bufferSize">Buffer size for reading/writing operations</param> | ||
| /// <param name="cancellationToken">Cancellation token</param> | ||
| /// <param name="validateSize">Whether to validate copied bytes match ContentLength</param> | ||
| internal async System.Threading.Tasks.Task WriteResponseStreamAsync( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ive just added this as a helper function for writing to a response stream instead of a file so i can also use the same logic in BufferedPartDataHandler
52ae28f to
407aeb6
Compare
| } | ||
|
|
||
| // Try up to 100 times to create unique file atomically | ||
| for (int attempt = 0; attempt < 100; attempt++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if the for loop is really needed since in reality random should produce a unique id. we can remove this if we think its overkill
| // SEP Part GET Step 2: "send the request and wait for the response in a non-blocking fashion" | ||
| var firstPartResponse = await _s3Client.GetObjectAsync(firstPartRequest, cancellationToken).ConfigureAwait(false); | ||
|
|
||
| if (firstPartResponse == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there was some null pointer exception discovered here in a unit test so added this check
normj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple minor nit comments that I don't need to rereview for.
| #if NETSTANDARD | ||
| Stream stream = this.ResponseStream; | ||
| #else | ||
| Stream stream = new BufferedStream(this.ResponseStream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does .NET Framework need to wrap the ResponseStream in a a BufferedStream but not .NET Core? If there is a good reason add a comment why it is being done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont have the answer for this but i kept the logic because it was there before, which is why i didnt add a comment because i dont know 100%
| /// Generates a cryptographically secure unique identifier of specified length. | ||
| /// Uses base32 encoding to avoid filesystem-problematic characters. | ||
| /// </summary> | ||
| private string GenerateUniqueId(int length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Name GenerateRandomId since you can't guarantee the id being unique and probably don't say it is unique in the comment block above. You have the other block of code calling this to generate a tmp file and retrying up to 100 times till you find a unique value.
407aeb6 to
7587349
Compare
Description
Add DownloadWithResponse api which returns the response metadata to the user as well as performing multi part download behind the scenes. This api will download to a file.
Progress tracking is not included in this PR. that will be included in the next pr #4139
Motivation and Context
#3806
Testing
DownloadWithResponseAsync api (includes multipart)
vs old DownloadAsync api (non multipart)
Types of changes
Checklist
License