Simple & fast multi-threaded S3 download tool.
- Simple to use.
- Multi-threaded, allowing you to download multiple objects concurrently.
- Quickly download a subset of objects under a prefix without listing all objects first.
- Object listing occurs in a seperate thread and downloads start as soon as the first object key is returned while the object listing completes in the background.
- Filter list of objects using regular expressions.
- Uses standard Boto3 AWS SDK and standard AWS credential locations.
- List only mode if you just want to see what would be downloaded.
Why use S3Fetch?
Benchmarking shows (see below) that S3Fetch can finish downloading 428 objects from a bucket containing 12,204,097 objects in 8 seconds while other tools have not started downloading a single object after 60 minutes.
Downloading 428 objects under the
fake-prod-data/2020-10-17 prefix from a bucket containing a total of 12,204,097 objects.
With 100 threads
s3fetch s3://fake-test-bucket/fake-prod-data/2020-10-17 --threads 100 8.259 seconds
s4cmd get s3://fake-test-bucket/fake-prod-data/2020-10-17* --num-threads 100 Timed out while listing objects after 60min.
With 8 threads
s3fetch s3://fake-test-bucket/fake-prod-data/2020-10-17 --threads 8 29.140 seconds
time s4cmd get s3://fake-test-bucket/fake-prod-data/2020-10-17* --num-threads 8 Timed out while listing objects after 60min.
- Python >= 3.7
- AWS credentials in one of the standard locations
S3Fetch is available on PyPi and can be installed via one of the following methods.
Ensure you have pipx installed, then:
pipx install s3fetch
pip3 install s3fetch
Usage: s3fetch [OPTIONS] S3_URI Easily download objects from an S3 bucket. Example: s3fetch s3://my-test-bucket/birthday-photos/2020-01-01 The above will download all S3 objects located under the `birthday- photos/2020-01-01` prefix. You can download all objects in a bucket by using `s3fetch s3://my-test- bucket/` Options: --region TEXT Bucket region. Defaults to 'us-east-1'. -d, --debug Enable debug output. --download-dir TEXT Download directory. Defaults to current directory. -r, --regex TEXT Filter list of available objects by regex. -t, --threads INTEGER Number of threads to use. Defaults to core count. --dry-run, --list-only List objects only, do not download. --delimiter TEXT Specify the "directory" delimiter. Defaults to '/'. -q, --quiet Don't print to stdout. --version Print out version information. --help Show this message and exit.
Download using 100 threads into
~/Downloads/tmp, only downloading objects that end in
$ s3fetch s3://my-test-bucket --download-dir ~/Downloads/tmp/ --threads 100 --regex '\.dmg$' test-1.dmg...done test-2.dmg...done test-3.dmg...done test-4.dmg...done test-5.dmg...done
Download all objects from a bucket
Download objects with a specific prefix
Download all objects that strt with
Download objects to a specific directory
Download objects to the
s3fetch s3://my-test-bucket/ --download-dir ~/Downloads
Download multiple objects concurrently
Download 100 objects concurrently.
s3fetch s3://my-test-bucket/ --threads 100
Filter objects using regular expressions
Download objects ending in
s3fetch s3://my-test-bucket/ --regex '\.dmg$'