Skip to content

Conversation

@GarrettBeatty
Copy link
Contributor

@GarrettBeatty GarrettBeatty commented Nov 18, 2025

This PR adds progress tracking for multi part downloads.

Description

  1. We add the initiated, failed, completed, and in progress tracking

Motivation and Context

#3806

Testing

  1. Unit Tests
  2. Integration Tests

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project
  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have read the README document
  • I have added tests to cover my changes
  • All new and existing tests passed

License

  • I confirm that this pull request can be released under the Apache 2 license

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds progress tracking and lifecycle events (initiated, completed, failed) for multipart downloads to files in the S3 Transfer Utility. The implementation aims to provide parity with existing single-part download progress tracking by introducing three new events and integrating progress callbacks into the multipart download flow.

  • New Events: Adds DownloadInitiatedEvent, DownloadCompletedEvent, and DownloadFailedEvent to TransferUtilityDownloadRequest with corresponding event args classes
  • Progress Integration: Modifies MultipartDownloadCoordinator to accept and attach progress callbacks to individual part downloads
  • Comprehensive Tests: Adds three integration tests covering progress events, lifecycle events, and failure scenarios

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
sdk/test/NetStandard/IntegrationTests/IntegrationTests/S3/TransferUtilityTests.cs Adds three integration tests: multipart download progress tracking, initiated/completed events, and failed event handling
sdk/src/Services/S3/Custom/Transfer/Internal/_async/MultipartDownloadCommand.async.cs Fires initiated, completed, and failed events at appropriate points in the download lifecycle
sdk/src/Services/S3/Custom/Transfer/Internal/MultipartDownloadCoordinator.cs Adds progress callback parameter and attaches it to GetObjectResponse for each part download
sdk/src/Services/S3/Custom/Transfer/Internal/MultipartDownloadCommand.cs Implements progress tracking with thread-safe byte counting and event helper methods
sdk/src/Services/S3/Custom/Transfer/Internal/IDownloadCoordinator.cs Extends interface signature to accept optional progress callback parameter

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/multifile branch 6 times, most recently from 85641af to 7b371d0 Compare November 19, 2025 22:59
@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/progresstrackmpd2 branch from 77417be to e479ed2 Compare November 20, 2025 15:59
Copilot finished reviewing on behalf of GarrettBeatty November 20, 2025 16:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/progresstrackmpd2 branch from 7575f5b to 999dd4e Compare November 20, 2025 17:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment on lines +159 to 182
// Attach progress callback to Part 1's response if provided
if (wrappedCallback != null)
{
discoveryResult.InitialResponse.WriteObjectProgressEvent += wrappedCallback;
}

// Process Part 1 from InitialResponse (applies to both single-part and multipart)
Logger.DebugFormat("MultipartDownloadManager: Buffering Part 1 from discovery response");
await _dataHandler.ProcessPartAsync(1, discoveryResult.InitialResponse, cancellationToken).ConfigureAwait(false);
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event handler attached to discoveryResult.InitialResponse.WriteObjectProgressEvent on line 162 is never detached, which could lead to memory leaks if the response object is retained. The response is disposed in a finally block (line 335), but event handlers can prevent proper garbage collection.

Consider detaching the handler after processing Part 1:

// Process Part 1 from InitialResponse
Logger.DebugFormat("MultipartDownloadManager: Buffering Part 1 from discovery response");
await _dataHandler.ProcessPartAsync(1, discoveryResult.InitialResponse, cancellationToken).ConfigureAwait(false);

// Detach the event handler after Part 1 is processed
if (wrappedCallback != null)
{
    discoveryResult.InitialResponse.WriteObjectProgressEvent -= wrappedCallback;
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have updated it to detach it.

@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/progresstrackmpd2 branch 2 times, most recently from b5181d4 to e4bbada Compare November 21, 2025 22:51
@GarrettBeatty GarrettBeatty marked this pull request as ready for review November 21, 2025 22:51
@GarrettBeatty GarrettBeatty marked this pull request as draft November 24, 2025 16:57
@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/progresstrackmpd2 branch from e4bbada to 9ea98d0 Compare November 24, 2025 17:03
@GarrettBeatty GarrettBeatty marked this pull request as ready for review November 24, 2025 17:03
@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/progresstrackmpd2 branch from 9ea98d0 to f901ac8 Compare November 25, 2025 14:52
// Delegate data handling to the handler
await _dataHandler.ProcessPartAsync(partNumber, response, cancellationToken).ConfigureAwait(false);

Logger.DebugFormat("MultipartDownloadManager: [Part {0}] Buffering completed successfully", partNumber);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if i also need to detatch the event object here similar to https://github.com/aws/aws-sdk-net/pull/4139/files#r2551150489? Will add on next revision if so

long transferredBytes = Interlocked.Add(ref _totalTransferredBytes, e.IncrementTransferred);

// Use atomic CompareExchange to ensure only first thread fires completion
bool isComplete = false;
Copy link
Contributor Author

@GarrettBeatty GarrettBeatty Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this completion check, i was getting this in a download directory test later on.

e.g with just this code

 private void DownloadPartProgressEventCallback(object sender, WriteObjectProgressArgs e)
        {
        long transferredBytes = Interlocked.Add(ref _totalTransferredBytes, e.IncrementTransferred);
        bool isComplete = transferredBytes >= _totalObjectSize;
        
        var aggregatedArgs = CreateProgressArgs(e.IncrementTransferred, transferredBytes, isComplete);
        _userProgressCallback?.Invoke(this, aggregatedArgs);
        }

i get

WSSDK.IntegrationTests.S3.NetFramework test net472 failed with 1 error(s) (81.8s)
    C:\dev\repos\aws-sdk-net\sdk\test\Services\S3\IntegrationTests\TransferUtilityDownloadDirectoryWithResponseTests.cs(379): error TESTERROR:
      DownloadDirectoryWithResponse_NestedDirectories_PreservesStructure (3s 782ms): Error Message: Assert.AreEqual
       failed. Expected:<3>. Actual:<4>.
      Stack Trace:
         at AWSSDK_DotNet.IntegrationTests.Tests.S3.TransferUtilityDownloadDirectoryWithResponseTests.<DownloadDire
      ctoryWithResponse_NestedDirectories_PreservesStructure>d__13.MoveNext() in C:\dev\repos\aws-sdk-net\sdk\test\
      Services\S3\IntegrationTests\TransferUtilityDownloadDirectoryWithResponseTests.cs:line 379
      --- End of stack trace from previous location where exception was thrown ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
         at Microsoft.VisualStudio.TestPlatform.MSTestAdapter.PlatformServices.ThreadOperations.ExecuteWithAbortSaf
      ety(Action action)

Test summary: total: 10, failed: 1, succeeded: 9, skipped: 0, duration: 81.7s
Build failed with 1 error(s) in 84.2s

And this test was flaky. We can see that it was expected 3 (expected downloaded) but asserting 4 (actual downloaded). And this was not happening all of the time.

in download Directory the objectsDownloaded is populated when the task is complete on Progress event handler. https://github.com/aws/aws-sdk-net/pull/4141/files#diff-057d511c91cbedc245b30df19fa6c39b2ccb9d9a3047140ad6a999c0ba83e9faR79.

Copy link
Contributor Author

@GarrettBeatty GarrettBeatty Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so upon looking at the code, the reason for the double fire is

30MB file, 3 parts of 10MB each, all finishing at the same time:

Part 1: Interlocked.Add(10MB) → transferredBytes = 10MB
Part 2: Interlocked.Add(10MB) → transferredBytes = 20MB
Part 3: Interlocked.Add(10MB) → transferredBytes = 30MB ✓ sees 30 >= 30

[All three parts now fire their final completion events with increment=0]

Part 1: Interlocked.Add(0) → transferredBytes = 30MB ✓ sees 30 >= 30
Part 2: Interlocked.Add(0) → transferredBytes = 30MB ✓ sees 30 >= 30
Part 3: [already checked at 30MB]

Result: THREE completion events fire, all with isComplete=true!

So having the interlocked in our code handles the case where we only need to do one final completion event across all parts

@GarrettBeatty GarrettBeatty marked this pull request as draft November 25, 2025 16:17
@GarrettBeatty GarrettBeatty marked this pull request as ready for review November 25, 2025 16:31
/// <param name="progressCallback">Optional callback for progress tracking events.</param>
/// <returns>A task that completes when all downloads finish or an error occurs.</returns>
Task StartDownloadsAsync(DownloadDiscoveryResult discoveryResult, CancellationToken cancellationToken);
Task StartDownloadsAsync(DownloadDiscoveryResult discoveryResult, CancellationToken cancellationToken, EventHandler<WriteObjectProgressArgs> progressCallback = null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the strong convention that CancellationToken should be the last argument and I would avoid putting a default on progressCallback because that can cause the caller to forget to pass along the callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the order in 7a4a1e1

/// Uses the last known transferred bytes from progress tracking.
/// </summary>
/// <param name="totalBytes">Total file size if known, otherwise -1</param>
private void FireTransferFailedEvent(long totalBytes = -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious how this plays with the callback that @philasmar is creating for failed requests. We shouldn't have 2 events for the same thing.

Copy link
Contributor Author

@GarrettBeatty GarrettBeatty Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so these are separate. here is the difference.

  1. DownloadFailedEvent only applies to single file downloads. This will fire one time when a file fails to download.
  2. In phils code, add failure policy to download directory #4151, this is for Directory operations. His ObjectDownloadFailedEvent fires whenever a file in the directory fails to download (so it can be multiple times if there are multiple files in the directory). When a user downloads a directory, my DownloadFailedEvent will not be executed (because we are not linking it/setting it in the DirectoryDownload). The reason these two things are also not linked together is because phils logic has exception handling logic which cant be linked to mine

So in summary, only one of these events will fire for a given operation

@GarrettBeatty GarrettBeatty requested a review from normj November 26, 2025 21:06
@GarrettBeatty GarrettBeatty marked this pull request as draft November 26, 2025 23:36
@GarrettBeatty
Copy link
Contributor Author

GarrettBeatty commented Nov 26, 2025

.Need to make updates for the initiated, completed, and failed events to not fire in the baground. edit: that is done in https://github.com/aws/aws-sdk-net/pull/4170/files

Base automatically changed from gcbeatty/multifile to feature/transfermanager November 28, 2025 15:05
@GarrettBeatty GarrettBeatty force-pushed the gcbeatty/progresstrackmpd2 branch from 7a4a1e1 to 2c7b2c9 Compare November 28, 2025 15:07
@GarrettBeatty GarrettBeatty marked this pull request as ready for review November 28, 2025 15:08
@GarrettBeatty GarrettBeatty merged commit 18ad664 into feature/transfermanager Dec 1, 2025
1 check passed
@GarrettBeatty GarrettBeatty deleted the gcbeatty/progresstrackmpd2 branch December 1, 2025 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants