-
Notifications
You must be signed in to change notification settings - Fork 874
Optimize Part Stream for Multi part Download #4162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4e0bd93 to
34dad4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes the S3 multipart download manager to reduce memory usage by streaming responses directly to the consumer when parts arrive in sequential order, while falling back to buffering for out-of-order parts. The optimization is transparent to callers and maintains backward compatibility.
Key Changes:
- Introduced
StreamingDataSourceclass to stream GetObjectResponse directly without buffering - Modified
BufferedPartDataHandlerto intelligently choose between streaming (in-order parts) and buffering (out-of-order parts) - Added
IPartBufferManager.AddBufferAsync(IPartDataSource)overload to support both streaming and buffered data sources - Updated response disposal logic to handle ownership transfer for streaming vs immediate disposal for buffering
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| generator/.DevConfigs/9d07dc1e-d82d-4f94-8700-c7b57f872045.json | Dev config with patch version bump and changelog (contains spelling error) |
| sdk/src/Services/S3/Custom/Transfer/Internal/StreamingDataSource.cs | New class implementing IPartDataSource for direct streaming from GetObjectResponse without buffering |
| sdk/src/Services/S3/Custom/Transfer/Internal/BufferedPartDataHandler.cs | Enhanced to decide between streaming vs buffering based on part arrival order |
| sdk/src/Services/S3/Custom/Transfer/Internal/IPartBufferManager.cs | Added AddBufferAsync(IPartDataSource) method to support both streaming and buffered sources |
| sdk/src/Services/S3/Custom/Transfer/Internal/PartBufferManager.cs | Implemented new AddBufferAsync overload to handle IPartDataSource types |
| sdk/src/Services/S3/Custom/Transfer/Internal/MultipartDownloadManager.cs | Updated response disposal logic to account for ownership transfer in streaming path |
| sdk/test/Services/S3/UnitTests/Custom/StreamingDataSourceTests.cs | Comprehensive unit tests for new StreamingDataSource class (708 lines) |
| sdk/test/Services/S3/UnitTests/Custom/BufferedPartDataHandlerTests.cs | Updated tests to cover streaming vs buffering decision logic and mixed scenarios |
| sdk/test/Services/S3/UnitTests/Custom/PartBufferManagerTests.cs | Added integration tests for StreamingDataSource with PartBufferManager |
generator/.DevConfigs/9d07dc1e-d82d-4f94-8700-c7b57f872045.json
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/MultipartDownloadManager.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/BufferedPartDataHandler.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/BufferedPartDataHandler.cs
Outdated
Show resolved
Hide resolved
e75f10a to
6a42d97
Compare
6a42d97 to
28dbce5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
sdk/src/Services/S3/Custom/Transfer/Internal/BufferedPartDataHandler.cs
Outdated
Show resolved
Hide resolved
generator/.DevConfigs/9d07dc1e-d82d-4f94-8700-c7b57f872045.json
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/BufferedPartDataHandler.cs
Outdated
Show resolved
Hide resolved
28dbce5 to
6e40599
Compare
| /// </list> | ||
| /// <para><strong>Response Ownership:</strong></para> | ||
| /// <para> | ||
| /// This method takes ownership of the response and is responsible for disposing it in ALL cases, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason for this is because previously MultipartDownloadManager disposed of the response in the finally block. but with this change of adding StreamingDataSource, if we kept the dispose in the MultipartDownloadManager, the response would've been dispoed by the time StreamingDataSource has to read it back to the user. Therefore we need to have StreamingDataSource keep ownership of the response
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
12b92d1 to
e6e17e3
Compare
| { | ||
| discoveryResult.InitialResponse.WriteObjectProgressEvent -= wrappedCallback; | ||
| // Always detach the event handler to prevent memory leak | ||
| // This runs whether ProcessPartAsync succeeds or throws |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not related to the change at all, but one potential issue i found anyway, so fixing it here
e6e17e3 to
432b08c
Compare
| // This is critical because GetObjectResponse holds unmanaged resources that | ||
| // won't be cleaned up by GC - must be explicitly disposed to return HTTP | ||
| // connection to the pool and close network streams | ||
| _discoveryResult?.InitialResponse?.Dispose(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated to the optimizing stream for multi part download, but i discovered this potential resource leak while working on this
a522b17 to
135eacc
Compare
432b08c to
d268c0f
Compare
| { | ||
| Logger.DebugFormat("BufferedPartDataHandler: [Part {0}] Starting to buffer part from response stream - ContentLength={1}", | ||
| partNumber, response.ContentLength); | ||
| if (partNumber == _partBufferManager.NextExpectedPartNumber) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this "felt" kind of weird putting this logic here but i felt it was fine for now
| "serviceName": "S3", | ||
| "type": "patch", | ||
| "changeLogMessages": [ | ||
| "Optimized multipart download manager to stream responses directly where applicable." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this an optimization for a feature we haven't yet released? Maybe we don't need a changelog in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah you are right i can remove this dev config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it
135eacc to
105d5c6
Compare
remove dev config
493f981 to
6d72f9a
Compare
Previously in #4130, we would buffer every part, including the first part. This was not optimal because the first part can technically be streamed directly to the user because it will always be in order. Similarly, there is another case where if a part finishes downloading and is the next expected part, we can stream that right to the user instead of buffering.
This change adds support so that in those cases the part is streamed directly to the user instead of buffering in memory.
Changes
Motivation and Context
#3806
Testing
Re-ran performance tests and got similar results
Types of changes
Checklist
License