Fix for very large files on S3 #892

Merged: bennybp merged 2 commits into main from s3_fix on Feb 6, 2025

bennybp commented on Feb 6, 2025

Description

S3 limits a multipart upload to 10,000 parts. I used a part size of 10 MiB, which limits the file size to about 100 GiB (10,000 × 10 MiB ≈ 97.7 GiB). We do have datasets that will be bigger than that.

This changes the part size to 256 MiB, which raises the maximum to about 2.44 TiB (10,000 × 256 MiB).

(Inside that function, we do not know the final file size, since it is meant to work with BinaryIO streams, so the part size has to be fixed up front rather than computed per file.)
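
For reference, the arithmetic behind those limits can be checked with a small sketch. This is not the code changed in this PR; `max_upload_size` and the constants are hypothetical names used only for illustration, and the only fact assumed is the 10,000-part limit on S3 multipart uploads.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# S3 multipart uploads accept at most 10,000 parts.
S3_MAX_PARTS = 10_000


def max_upload_size(part_size: int, max_parts: int = S3_MAX_PARTS) -> int:
    """Largest stream (in bytes) that fits when every part is `part_size` bytes.

    Hypothetical helper for illustration; not part of the project's code.
    """
    return part_size * max_parts


old_limit = max_upload_size(10 * MIB)    # part size before this change
new_limit = max_upload_size(256 * MIB)   # part size after this change

print(f"10 MiB parts:  {old_limit / GIB:.1f} GiB")
print(f"256 MiB parts: {new_limit / TIB:.2f} TiB")
```

Because the stream length is unknown when the upload starts, a fixed part size is the simplest choice; the trade-off is that each part is larger, which may increase memory use per buffered part.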

Status

  • Code base linted
  • Ready to go

bennybp merged commit 8f463e8 into main on Feb 6, 2025
bennybp deleted the s3_fix branch on February 6, 2025 16:01