support creating files bigger than 50GB #20
Comments
Is this supported now? I'm still stuck at the 10000-part limit with a 70GB file.
Not yet. You can try manually increasing the part size from 5MB to something larger.
Thank you, but goofys does not provide any option to increase the part size. Do I have to edit the source code?
I lifted the limit to 100GB for now. The proper fix will come later.
Do we have any option to set the part size manually yet?
Implementing larger part sizes (preferably automatically, based on the file size) would be highly appreciated here as well; we are storing genomics files up to 500GB.
The difficulty is that when you write a new file we don't know how large it's going to be. Will do some sort of staggered part sizes as mentioned in #20 (comment).
this is in preparation for using different mpu part sizes so we can write larger files refs #20
@jindov @schelhorn could either of you give the latest revision a try? You will need at least 125MB of free memory since goofys buffers each part in memory. This is not release quality just yet; I removed some gating code, which means it is possible (although not likely) for goofys to flush thousands of parts concurrently, which is probably not what we want.
With the last fix I successfully wrote 900GB:
The test was done on a hi1.4xlarge, which has 10GigE. goofys now supports writing files up to 5MB * 1000 + 25MB * 1000 + 125MB * 8000 = 1030GB.
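For reference, here is a minimal sketch (not goofys's actual code; the tier boundaries are simply the numbers quoted above) of how a staggered part-size table yields that limit:

```go
package main

import "fmt"

// Assumed tiering, matching the numbers quoted above: parts 1-1000 are 5MB,
// parts 1001-2000 are 25MB, and the remaining 8000 parts (S3 caps a
// multipart upload at 10000 parts) are 125MB each.
func partSizeMB(partNum int) int {
	switch {
	case partNum <= 1000:
		return 5
	case partNum <= 2000:
		return 25
	default:
		return 125
	}
}

func main() {
	totalMB := 0
	for part := 1; part <= 10000; part++ {
		totalMB += partSizeMB(part)
	}
	// 5MB*1000 + 25MB*1000 + 125MB*8000 = 1,030,000MB, i.e. roughly 1030GB
	fmt.Printf("max file size: %d MB (~%d GB)\n", totalMB, totalMB/1000)
}
```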
Right now each MPU part is fixed at 5MB, and since S3 has a 10000-part maximum we cannot create files bigger than 50GB. We can automatically adjust the part size (i.e. first 100 parts at 5MB, then 50MB, etc.) to support bigger files.
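As an illustration only (the tiers beyond the ones named in this issue are not specified here), even the example table above already pushes the limit well past 50GB:

```go
package main

import "fmt"

// Illustrative tiers from the issue text: first 100 parts at 5MB,
// the rest at 50MB. Any further tiers ("etc.") are left open here.
func proposedPartSizeMB(partNum int) int {
	if partNum <= 100 {
		return 5
	}
	return 50
}

func main() {
	totalMB := 0
	for part := 1; part <= 10000; part++ { // S3 MPU part limit
		totalMB += proposedPartSizeMB(part)
	}
	// 100*5MB + 9900*50MB = 495,500MB, i.e. roughly 495GB
	fmt.Printf("reachable file size: ~%d GB\n", totalMB/1000)
}
```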