@GarrettBeatty commented Dec 2, 2025

Description

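Fixes a concurrency bug in the previous code, where the semaphore was released as soon as each download task was queued rather than after the download completed: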

// Creates semaphore with the right limit
fileOperationThrottler = DownloadFilesConcurrently ?
    new SemaphoreSlim(this._config.ConcurrentServiceRequests) :  // e.g., 4
    new SemaphoreSlim(1);

foreach (S3Object s3o in objs) // Say we have 100 files
{
    await fileOperationThrottler.WaitAsync(); // Wait for slot (max 4 waiting)
    
    try
    {
        var task = _failurePolicy.ExecuteAsync(/* download logic */);
        pendingTasks.Add(task); // Queue the task
    }
    finally
    {
        fileOperationThrottler.Release(); // ⚠️ RELEASE IMMEDIATELY!
    }
}

// ⚠️ ALL 100 DOWNLOADS RUN SIMULTANEOUSLY!
await TaskHelpers.WhenAllOrFirstExceptionAsync(pendingTasks, cancellationToken);

This was accidentally introduced in #4151.

I also refactored the download directory code to make it easier to read.
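For comparison, here is a minimal sketch of one common way to keep the throttle in effect: acquire the slot before starting a download and release it only in a `finally` that runs when that download finishes. This is an illustration under stated assumptions, not the code in this PR; `ThrottledDownloadSketch`, `RunAsync`, and `DownloadOneAsync` are hypothetical names, and `Task.Delay` stands in for the real download.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDownloadSketch
{
    // Runs fileCount simulated downloads with at most maxConcurrent in flight.
    // Returns the peak number of downloads observed running at the same time.
    public static async Task<int> RunAsync(int fileCount, int maxConcurrent)
    {
        using var throttler = new SemaphoreSlim(maxConcurrent);
        var pendingTasks = new List<Task>();
        var gate = new object();
        int running = 0, peak = 0;

        for (int i = 0; i < fileCount; i++)
        {
            // Wait for a free slot BEFORE starting the next download.
            await throttler.WaitAsync();
            pendingTasks.Add(DownloadOneAsync());
        }

        await Task.WhenAll(pendingTasks);
        return peak;

        async Task DownloadOneAsync()
        {
            try
            {
                lock (gate) { running++; peak = Math.Max(peak, running); }
                await Task.Delay(10); // hypothetical stand-in for the download
                lock (gate) { running--; }
            }
            finally
            {
                // The slot is held for the whole download and released only here.
                throttler.Release();
            }
        }
    }
}
```

Calling `ThrottledDownloadSketch.RunAsync(100, 4)` never has more than four simulated downloads in flight, because each slot stays held until its download completes.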

Motivation and Context

#3806

Testing

  1. Added unit tests that pass on the development branch but failed on the feature/transfermanager branch before this change. With the fix applied, they pass.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project
  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have read the README document
  • I have added tests to cover my changes
  • All new and existing tests passed

License

  • I confirm that this pull request can be released under the Apache 2 license

long _transferredBytes;
string _currentFile;

internal DownloadDirectoryCommand(IAmazonS3 s3Client, TransferUtilityDownloadDirectoryRequest request)

Contributor Author

Removing unused constructors.

}

- if (File.Exists(this._request.S3Directory))
+ if (File.Exists(this._request.LocalDirectory))

Contributor Author

I think this was a bug: we were checking whether the S3 directory existed on the local machine instead of the local directory.

Contributor Author

Do we need a dedicated dev config because of this line? @philasmar @normj

Member

Since we are fixing a bug unrelated to the feature, we should have a change log entry for it.

I'm curious whether this should have been Directory.Exists instead of File.Exists. Also, was the validation effectively broken before, so that users have been working successfully when the directory already exists? This fixes a bug, but it might introduce a breaking change that is not acceptable.
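To illustrate the File.Exists versus Directory.Exists question, here is a small sketch, assuming a path that points at an existing directory. `ExistsSketch` and `Probe` are hypothetical names used only for this example.

```csharp
using System;
using System.IO;

public static class ExistsSketch
{
    // Reports how the two BCL checks classify the same path.
    public static (bool AsFile, bool AsDirectory) Probe(string path)
        => (File.Exists(path), Directory.Exists(path));
}
```

For a path that is an existing directory, `Probe` returns `(false, true)`, because File.Exists returns false when the path refers to a directory. A File.Exists check therefore only fires when a regular file already occupies that path.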

Contributor Author

In that case I'll remove this line change from this PR, and we can update it later if we want.

Contributor Author

I've removed the change from this PR for now, since I didn't want to spend more time checking and testing that scenario. I will recheck and create a new PR if required.

/// Actual (broken): 5 concurrent downloads (all files download simultaneously)
/// </summary>
[TestMethod]
public async Task ExecuteAsync_ConcurrentServiceRequests_RespectsLimit()

Contributor Author

These were failing before this fix.

@GarrettBeatty changed the title from "Fix download directory concurrency issue" to "Fix download directory concurrency issue and refactor" Dec 2, 2025
@GarrettBeatty marked this pull request as ready for review December 2, 2025 23:59
Member

@normj left a comment

The main point of the PR looks good. Not sure if it is safe to make the unrelated bug fix given it changes existing behavior.

@GarrettBeatty requested a review from normj December 3, 2025 00:26
Member

@normj left a comment

Approved, but we should continue the discussion about what to do with the bug in a separate thread/PR.

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a critical concurrency control bug in the S3 Transfer Utility's DownloadDirectoryCommand where the semaphore was being released immediately after acquiring it, causing all files to download simultaneously instead of respecting the configured concurrency limits. The fix refactors the download logic to use a task pool pattern that properly manages concurrency throughout the download lifecycle.

Key Changes:

  • Fixed semaphore release timing to occur after download completion rather than immediately after acquiring
  • Refactored download logic into smaller, focused methods for better readability and maintainability
  • Introduced ForEachWithConcurrencyAsync helper to properly implement the task pool pattern
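As a rough illustration of the task pool pattern named above, the sketch below keeps at most `maxConcurrency` tasks in flight and starts the next item only after one of them finishes. It is an assumption-laden example, not the SDK's implementation: the class name, signature, and parameter names are hypothetical and may differ from the helper added to TaskHelpers.cs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencySketch
{
    // Hypothetical signature; processes items with at most maxConcurrency running at once.
    public static async Task ForEachWithConcurrencyAsync<T>(
        IEnumerable<T> items,
        int maxConcurrency,
        Func<T, CancellationToken, Task> body,
        CancellationToken cancellationToken = default)
    {
        var pending = new List<Task>();

        foreach (var item in items)
        {
            cancellationToken.ThrowIfCancellationRequested();

            if (pending.Count >= maxConcurrency)
            {
                // Pool is full: wait for any running task to finish before starting another.
                Task finished = await Task.WhenAny(pending).ConfigureAwait(false);
                pending.Remove(finished);
                await finished.ConfigureAwait(false); // rethrows if that task failed
            }

            pending.Add(body(item, cancellationToken));
        }

        // Drain whatever is still running.
        await Task.WhenAll(pending).ConfigureAwait(false);
    }
}
```

In this sketch the first failure is rethrown as soon as the failed task is observed, and tasks still in flight at that point are not awaited. A production helper would likely need a deliberate policy for that case.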

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sdk/src/Services/S3/Custom/Transfer/Internal/_bcl+netstandard/DownloadDirectoryCommand.cs | Refactored ExecuteAsync into smaller methods using task pool pattern; added logging; fixed concurrency control |
| sdk/src/Services/S3/Custom/Transfer/Internal/DownloadDirectoryCommand.cs | Consolidated constructors to require TransferUtilityConfig parameter |
| sdk/src/Services/S3/Custom/Transfer/Internal/TaskHelpers.cs | Added ForEachWithConcurrencyAsync method for task pool pattern implementation; added debug logging |
| sdk/test/Services/S3/UnitTests/Custom/DownloadDirectoryCommandTests.cs | Updated test constructor calls to include config parameter; added concurrency control tests |
| sdk/test/Services/S3/UnitTests/Custom/FailurePolicyTests.cs | Added comprehensive tests for path validation, sequential mode, and failure handling scenarios |

Comment on lines 35 to +40

public bool DownloadFilesConcurrently { get; set; }

private Logger Logger
{
get { return Logger.GetLogger(typeof(DownloadDirectoryCommand)); }

Copilot AI Dec 3, 2025

Recursive property definition: the Logger property calls Logger.GetLogger() which references itself. This should be _logger field or use a different pattern.

Suggested change
- public bool DownloadFilesConcurrently { get; set; }
- private Logger Logger
- {
- get { return Logger.GetLogger(typeof(DownloadDirectoryCommand)); }
+ private readonly Logger _logger = Logger.GetLogger(typeof(DownloadDirectoryCommand));
+ public bool DownloadFilesConcurrently { get; set; }
+ private Logger Logger
+ {
+ get { return _logger; }

Contributor Author

@GarrettBeatty Dec 3, 2025

I will fix this before merging.

Comment on lines +30 to +33
private static Logger Logger
{
get { return Logger.GetLogger(typeof(TaskHelpers)); }
}

Copilot AI Dec 3, 2025

Recursive property definition: the Logger property calls Logger.GetLogger() which references itself. This should use a static field or a different pattern.

@GarrettBeatty merged commit 6a31b25 into feature/transfermanager Dec 3, 2025
1 check passed
@GarrettBeatty deleted the gcbeatty/downloadfix branch December 3, 2025 16:11