
Support storing multiple files in the same backend object ("fragments") #52

Closed
Nikratio opened this issue Dec 28, 2018 · 14 comments

@Nikratio
Collaborator

[migrated from BitBucket]

Storing lots of small files is very inefficient, since every file requires its own block.

We should add support for fragments, so that multiple files can be stored in the same block.

With the new bucket interface, we should be able to implement this relatively easily:

  • Upload workers get a list of cache entries; new blocks may be coalesced into a single object
  • CommitThread() and expire() only hand work to worker threads once they have a reasonably big chunk of data ready
  • We keep an object until the reference count of every block it contains is zero
  • Therefore, blocks may continue to exist with refcount=0 and can possibly be reused
  • s3qladm may need a "cleanup" function to get rid of these blocks
  • When downloading an object, the db can be used to determine which blocks in the object belong to files (and should be added to the cache) and which can be discarded
  • The minimum size of cache entries passed to workers could be adjusted dynamically based on upload bandwidth, latency, and compression ratio of previous uploads
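The coalescing step described above could look roughly like this. This is a minimal sketch only: `CacheEntry`, `backend.store()`, and the `s3ql_data_N` key naming are illustrative stand-ins, not the actual S3QL classes or API.

```python
import io

# Hypothetical stand-in for an S3QL cache entry (not the real class).
class CacheEntry:
    def __init__(self, block_id, data):
        self.block_id = block_id
        self.data = data

def coalesce_and_upload(entries, backend, min_obj_size=1024 * 1024):
    """Pack cache entries into backend objects of >= min_obj_size bytes.

    Returns {block_id: (object_key, offset, length)} so the database can
    record which byte range of which object holds each block.
    """
    layout = {}
    buf = io.BytesIO()
    members = []
    obj_no = 0

    def flush():
        nonlocal obj_no, buf, members
        if not members:
            return
        key = 's3ql_data_%d' % obj_no
        backend.store(key, buf.getvalue())
        off = 0
        for entry in members:
            layout[entry.block_id] = (key, off, len(entry.data))
            off += len(entry.data)
        obj_no += 1
        buf = io.BytesIO()
        members = []

    for entry in entries:
        buf.write(entry.data)
        members.append(entry)
        if buf.tell() >= min_obj_size:
            flush()
    flush()  # upload the remainder, even if below the size threshold
    return layout
```

The returned layout is what lets a later download decide which byte ranges of the object correspond to live blocks and which can be discarded.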
@szepeviktor
Collaborator

Please add an option to enable/disable fragments.

@Nikratio
Collaborator Author

Another option would be to use range downloads to download only the fragment that is needed at the time.
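As a sketch of the range-download idea: the `RangeBackend` class below is a hypothetical in-memory stand-in, but the semantics mirror how S3-style backends do it, via an HTTP `Range: bytes=<first>-<last>` request header where both ends are inclusive.

```python
class RangeBackend:
    """Toy in-memory backend illustrating HTTP-style range reads.

    Real S3-style backends express this as the request header
    'Range: bytes=<first>-<last>', with both ends inclusive.
    """

    def __init__(self, objects):
        self.objects = objects  # key -> bytes

    def get_range(self, key, first, last):
        # Inclusive slice, mirroring HTTP range semantics.
        return self.objects[key][first:last + 1]

def read_block(backend, key, offset, length):
    """Fetch only the fragment holding one block, not the whole object."""
    return backend.get_range(key, offset, offset + length - 1)
```

With the (key, offset, length) of each block recorded in the database, only the needed fragment ever crosses the network.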

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

Google Storage supports batched object uploads and downloads. This would give us all the advantages of fragments without any of the drawbacks. Need to check if S3 has something similar and, if not, if this is reason enough to stick with the old plan...

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

Note that if we drop the plan to implement fragments we'd also be able to simplify the metadata schema and drop one table completely.

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

S3 doesn't support batched operations.

But maybe the latency issue can be addressed by decoupling the number of parallel uploads from the number of upload threads (which can't be very high, because it determines the number of concurrent compression and encryption operations).
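A minimal sketch of that decoupling, assuming two separate thread pools (the pool sizes and the `backend.store()` call are illustrative assumptions, not the real S3QL worker code):

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

# The small pool bounds the number of concurrent (CPU-bound) compressions;
# the large pool keeps many (latency-bound) uploads in flight at once.
compress_pool = ThreadPoolExecutor(max_workers=2)
upload_pool = ThreadPoolExecutor(max_workers=20)

def upload_block(backend, key, data):
    def job():
        # At most 2 compressions run at any moment...
        compressed = compress_pool.submit(zlib.compress, data).result()
        # ...but up to 20 uploads can be waiting on the network.
        backend.store(key, compressed)
    return upload_pool.submit(job)
```

This way per-request latency is hidden by many in-flight connections without multiplying the CPU cost of compression and encryption.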

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

I've decided not to implement this. Revisiting the pros and cons, it is not worth it.

Upload speed is better increased by using batched uploads or more concurrent connections (i.e., in the backend layer). For download of many small files, we should get much better results by implementing some sort of readahead than by hoping that files happen to be in the same fragment.

Thus I opened #63 and #62 instead (I won't create a bug for read-ahead unless someone actually plans to work on it).

@Nikratio Nikratio closed this as completed Jan 4, 2019
@Nikratio Nikratio changed the title Support fragments Support storing multiple files in the same backend object ("fragments") Jan 4, 2019
@segator

segator commented Jan 4, 2019

Some providers ban you based on the number of requests. If we have millions of little files, this is a problem because putting or getting the files takes a lot of time.
I think it is a good idea to have fragments; other filesystems have implemented them and work pretty fast!

@szepeviktor
Collaborator

E.g. when backing up servers, I put /etc into a tar archive so that it is only one file.

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

Could you provide some examples of providers where this is a problem, examples of some "other FS" that do this, and elaborate on what "pretty fast" means in this context? Otherwise I remain unconvinced :-).

@segator

segator commented Jan 4, 2019

For example, GDrive and Backblaze ban you when you make too many requests in a fixed time window. So if you upload 100,000 1 kB files, it takes a long time because of the banning; of course the S3QL retry logic can handle it by waiting and retrying, but it costs a lot of time and you get a soft ban.

ProxyFS uses the fragment concept: https://github.com/swiftstack/ProxyFS.

Sorry, "pretty fast" is not very descriptive 🥇
I mean that for 100K little files it is better to upload 50 fragments than 100K objects. Also, after an S3QL crash you need to run fsck, and fsck is slow depending on the provider. For example, GDrive only lets you list 1000 objects per request, and it takes time to return all of them; in my case I have a filesystem with 900K objects and fsck takes more than 1 h.
With fragments this could easily be reduced to half or less.

Too many requests are a problem because of the latency and the providers' soft bans.

Anyway @Nikratio, thank you for your amazing work!! I love s3ql.

@szepeviktor
Collaborator

"in case of s3ql crash you need to pass fsck, fsck is also slow"

I second that.

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

Neither GDrive nor Backblaze are currently supported by S3QL, so I don't think that rate limits on their end should influence this decision.

The fsck time is an interesting point - but I do not fully understand the problem. With 900k objects you'd need 900 separate requests. Even when assuming a (pretty high) 0.5 seconds round-trip latency, that's only 7.5 minutes. To need 60 minutes, a single request would have to take 4 seconds - that seems too high. Could you file a separate issue about this? Please include the backend that you are using.
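The arithmetic in the paragraph above, written out as a back-of-the-envelope check (numbers taken directly from this comment):

```python
# 900K objects, listed 1000 objects per request.
objects = 900_000
objects_per_listing = 1_000

requests = objects // objects_per_listing          # 900 listing requests
minutes_at_half_second_rtt = requests * 0.5 / 60   # total time at 0.5 s RTT
secs_per_request_for_one_hour = 3600 / requests    # RTT needed to reach 1 h

print(requests, minutes_at_half_second_rtt, secs_per_request_for_one_hour)
```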

If the time needed for bucket listings really is a problem, I can also think of some other solutions (e.g. issuing multiple listing requests in parallel); we don't need to introduce fragments just for that.

@Nikratio
Collaborator Author

Nikratio commented Jan 4, 2019

Is there documentation for ProxyFS somewhere? Looking at the README file, it sounds to me as if ProxyFS is actually mapping files to objects 1:1.

@segator

segator commented Jan 4, 2019

"Neither GDrive nor Backblaze are currently supported by S3QL, so I don't think that rate limits on their end should influence this decision"

I hope GDrive will be supported soon.

GDrive object listing is slow; it can take 2-10 s per request.
Parallelizing is not possible, at least with GDrive: you get a pagination token to go to the next page, so I cannot jump around in the list, I have to iterate in order.
About ProxyFS: I talked with one of the developers on their Slack to learn more about how it works, but I eventually saw that it has exactly the same problem as S3QL: only a single mount at a time is possible.
