
[v2] Bug fix and optimization in s3transfer's download #9345

Status: Draft. Wants to merge 48 commits into base: v2.
Conversation

@aemous (Contributor) commented Mar 5, 2025

Issue #, if available:

Context

This PR aims to address a bug that occurs when an IncompleteReadError is encountered while downloading a multipart AWS S3 object to a non-seekable stream (e.g. stdout). In this case, all bytes of the object with offset larger than the offset of the incomplete read get queued in a buffer, and the CLI terminates before the buffer is flushed.

The results are that (1) the object is not fully written to the stream, and (2) the CLI's memory usage grows linearly with the number of object bytes whose offset is greater than that of the incomplete read. In practice this means most of the object may be held in memory, no matter how large the object is.

The scope of the changes in this PR to s3transfer will only impact users downloading multi-part AWS S3 objects to a non-seekable stream (e.g. stdout).

Description of changes

s3transfer

  • Fix bug that occurs when an IncompleteReadError is encountered while downloading a multipart AWS S3 object to a non-seekable stream (e.g. stdout).
  • Changed DeferQueue so that it overwrites a pending write at a given offset with whichever write request carries more data. When the request that hit the incomplete read is retried, the retry overwrites the incomplete data in the queue, assuming the retry contains more data (e.g. the full part).
  • Changed DeferQueue so that if it has already dequeued an incomplete part, it will queue the subset of the bytes in the retry that were not previously dequeued.
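The two behaviors above can be sketched as follows. This is a minimal illustration, not the actual s3transfer `DeferQueue`: the `request_writes(offset, data)` method name mirrors s3transfer's API, but the internals and the simplified return value (a list of byte strings rather than write requests) are assumptions, and it assumes writes at the same offset are alternate versions of the same part rather than partially overlapping ranges.

```python
import heapq


class DeferQueue:
    """Sketch of a defer queue that releases out-of-order writes to a
    non-seekable stream in offset order. Illustrates the two PR changes:
    a retried write with more data replaces a shorter pending write at
    the same offset, and a retry overlapping already-released bytes
    queues only the not-yet-released tail."""

    def __init__(self):
        self._writes = {}           # offset -> pending data
        self._pending_offsets = []  # min-heap of offsets not yet released
        self._next_offset = 0       # next offset the stream expects

    def request_writes(self, offset, data):
        """Queue data; return the writes now releasable, in offset order."""
        if offset + len(data) <= self._next_offset:
            return []  # every byte here was already released
        if offset < self._next_offset:
            # An incomplete version of this part was already partially
            # released; keep only the bytes not previously dequeued.
            data = data[self._next_offset - offset:]
            offset = self._next_offset
        if offset in self._writes:
            # Same offset queued twice (e.g. a retried part): keep
            # whichever write carries the most data.
            if len(data) <= len(self._writes[offset]):
                return []
            self._writes[offset] = data
        else:
            self._writes[offset] = data
            heapq.heappush(self._pending_offsets, offset)
        released = []
        while (self._pending_offsets
               and self._pending_offsets[0] == self._next_offset):
            chunk = self._writes.pop(heapq.heappop(self._pending_offsets))
            released.append(chunk)
            self._next_offset += len(chunk)
        return released
```

For example, if part 1 is released incomplete (say 2 of 5 bytes), a retry of the full 5-byte part queues only the trailing 3 bytes, which then unblocks any later parts already waiting in the queue.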

Performance Benchmark Scripts

  • Added two new benchmark definitions to benchmarks.json: one simulates a standard multi-part download of a 10GB file (1,192 parts), and the other simulates the same download except an IncompleteReadError occurs (with retry) on the 32nd part.
  • Updated the benchmark harness to support loading mocked HTTP response bodies from files. Updated the performance scripts README with the updated definition schema.
  • Added support for a --debug-dir <path> argument to make benchmark-definition debugging easier. When this path is specified, all output of the child (benchmarked) processes is written to files in this directory.
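For illustration only, a benchmark definition that loads a mocked HTTP response body from a file might look roughly like the fragment below. Every field name here is an assumption; the real schema is documented in the performance scripts README referenced above, not here.

```json
{
  "name": "s3-multipart-download-incomplete-read",
  "command": ["s3", "cp", "s3://example-bucket/example-10gb-object", "-"],
  "responses": [
    {"status_code": 200, "body_file": "responses/part-body.bin"}
  ]
}
```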

Description of tests

Correctness

  • Added 2 new unit tests to cover the 2 main modifications of DeferQueue. Also updated names/documentation of existing related tests.
  • Verified that the new benchmark that simulates an incomplete read successfully writes the entire object to stdout by piping the output to a file and checking its size.
    • Before the changes to DeferQueue contained in this PR, I verified that the full object does NOT successfully get written.
  • Ran and passed all existing tests (unit, functional, integration, etc.)
  • Manually used this CLI build to successfully download a 10GB object from S3 to stdout using the default concurrency options.

Performance Testing

  • Ran the performance tests before and after the changes to DeferQueue contained in this PR and observed significant improvements in memory usage and execution time. The results are summarized below.

Summary

[Charts omitted: max_mem_issue (peak memory comparison), execution_time_mem_issue (execution-time comparison)]

Raw Data

Each cell in the table below is the average value of the metric taken over 5 iterations.

| Metric | 2.24.14 (Happy Path) | 2.24.14 (IncompleteReadError) | 2.24.14 w/ DeferQueue improvements (Happy Path) | 2.24.14 w/ DeferQueue improvements (IncompleteReadError) |
| --- | --- | --- | --- | --- |
| Max Memory (GB) | 0.7246 | 5.242 | 0.207 | 0.20836 |
| p95 Memory (GB) | 0.7101 | 5.025 | 0.207 | 0.19397 |
| Max CPU (%) | 19.72 | 11.82 | 18.42 | 18.42 |
| p95 CPU (%) | 1.5 | 1.5 | 2.5 | 2.5 |
| Total Time (s) | 179.4 | 9.848 | 111.7 | 110.18 |

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

aemous and others added 30 commits February 20, 2025 15:13
* Bump PyInstaller to 6.11.1
This includes pinning `macholib` because of this change in `PyInstaller`
that makes `macholib` a build dependency.

pyinstaller/pyinstaller#5883

---------

Co-authored-by: Github Actions <>
Co-authored-by: Steve Yoo <hssyoo@amazon.com>
Co-authored-by: Alex Shovlin <shovlia@amazon.com>
@aemous aemous requested review from aws-sdk-python-automation and a team March 5, 2025 16:24
@aemous aemous marked this pull request as ready for review March 5, 2025 16:24
@kdaily kdaily removed the request for review from aws-sdk-python-automation March 5, 2025 16:26
@aemous aemous marked this pull request as draft March 5, 2025 16:42
@ashovlin (Member) left a comment

Is it worth running this new benchmark continuously?

Mostly asking for ergonomics, since GitHub won't even show the file diff now. If we don't want to run it as part of the new suite, wondering if it'd be better to have a separate benchmarks-s3-advanced.json (or whatever naming) separately.

@aemous (Contributor, Author) commented Mar 10, 2025

> Is it worth running this new benchmark continuously?
>
> Mostly asking for ergonomics, since GitHub won't even show the file diff now. If we don't want to run it as part of the new suite, wondering if it'd be better to have a separate benchmarks-s3-advanced.json (or whatever naming) separately.

@ashovlin Agreed, that would be a good idea; I iterated on this feedback in the latest revision.

6 participants