Skip to content

Feat S3 Transfer Manager v2 UploadDirectory #3110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 28, 2025
Merged

Conversation

wty-Bryant
Copy link
Contributor

Implement v2 s3 transfer manager's UploadDirectory api bound to single union client which mimics normal service client's initialization and api call. User could now upload a directory's files recursively/non-recursively to S3 with customized filtering and key naming in a single operation call.

Test: This PR passed unit tests for UploadDirectory which covers use cases with different config and folder structure to confirm this key-prefix uploader could find valid file location, generate correct key and pass upload requests to S3 concurrently. It also passed integration test which uploads source directory containing different sizes files to S3 via new directory uploader, then downloads objects back using S3 client and checks their content consistency

@wty-Bryant wty-Bryant requested a review from a team as a code owner June 9, 2025 17:48
@wty-Bryant wty-Bryant force-pushed the feat-keyprefix-upload branch from 775db9a to 7f51d9c Compare June 9, 2025 22:13
if err != nil {
return []string{}, err
}
subFiles, err := f.Readdir(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the docs on Readdir say

Most clients are better served by the more efficient ReadDir method.

link

However, there may be a reason why we're using this instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only need sub file name, change to more efficient one


func (u *directoryUploader) uploadDirectory(ctx context.Context) (*UploadDirectoryOutput, error) {
u.init()
ch := make(chan fileChunk)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're doing file chunking, so I would call this just fileChan or something similar.

break
}
if u.getErr() != nil {
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about the continue here. I think the idea is to continue uploading files if there's an error with the current file, but I don't see a place where we clear an error once it has been set, so wouldn't this always return non-nil once it has been set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed with break since an undrained channel will still be gc once closed and not referred by any struct

Delimiter string
KeyPrefix string
ExpectFilesUploaded int
ExpectFilesFailed int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have any test that uses this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed for now

@wty-Bryant wty-Bryant force-pushed the feat-keyprefix-upload branch from 7f51d9c to 8d95b62 Compare June 19, 2025 03:34
}
fileInfo, err := os.Lstat(absPath)
if err != nil {
u.setErr(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this errors be presented to the users? I'm thinking if we should have something more user friendly, like error stating file X: (actual error). If we only present the raw error it may be hard to identify what actually went wrong

_, err := u.c.PutObject(ctx, input)
if err != nil {
u.setErr(err)
u.incrFilesFailed(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still not clear to me what the filesFailed field is supposed to be, since now we call setErr and on the next iteration we say if u.getErr() != nil { break }, so this value is always either 0 (no failure) of 1 (we failed right after the first one failed).

If we're adding some failure policy later on, I'd rather add this field when that happens and remove it on this iteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, removed it for now, I can always add it back once failure policy is introduced

@wty-Bryant wty-Bryant force-pushed the feat-keyprefix-upload branch from 6987c35 to 8d7ae73 Compare June 26, 2025 21:37
@wty-Bryant wty-Bryant force-pushed the feat-keyprefix-upload branch from 8d7ae73 to 61d1967 Compare June 28, 2025 02:55
@wty-Bryant wty-Bryant merged commit 5f2cf1b into main Jun 28, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants