Add additional dir level to blobs/ #39

Closed
GoogleCodeExporter opened this issue Mar 24, 2015 · 2 comments

@GoogleCodeExporter

As it stands, the blobs/ dir is subdivided into 256 folders. For use cases 
involving very large datasets (i.e. 1M+ files), having directories with 3000+ 
files in them gets unwieldy and can affect performance. What are your thoughts 
on allowing an upgrade path to /blobs/12/34/1234567890abcdef ?

This would allow for virtually any size dataset (two levels of subdir nesting 
is what you often see in the URLs of file and image hosts that store files by 
hash). If you want to allow backward compatibility, you could specify a "repo 
version" property either in the main repo dir or in the session file.
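
To make the proposal concrete, here is a minimal sketch of how the blob path could depend on such a "repo version". It is not boar's actual code: `blob_path`, `LEVELS_BY_REPO_VERSION`, and the version numbers are hypothetical names chosen for illustration, and it assumes the directory names are taken from the leading hex characters of the blob id, as in the /blobs/12/34/1234567890abcdef example.

```python
import os

# Hypothetical mapping from a "repo version" number to directory nesting depth.
# Version 1 stands for the existing one-level layout, version 2 for the proposal.
LEVELS_BY_REPO_VERSION = {1: 1, 2: 2}


def blob_path(repo_root, blob_id, repo_version=1):
    """Return where a blob would live for the given (hypothetical) repo version."""
    levels = LEVELS_BY_REPO_VERSION[repo_version]
    if len(blob_id) <= 2 * levels:
        raise ValueError("blob id too short for %d directory levels" % levels)
    # Each level uses the next two hex characters of the blob id.
    shards = [blob_id[2 * i:2 * i + 2] for i in range(levels)]
    return os.path.join(repo_root, "blobs", *shards, blob_id)
```

With this sketch, `blob_path("repo", "1234567890abcdef", repo_version=2)` yields the nested `blobs/12/34/...` location, while the default keeps the one-level `blobs/12/...` location, so an old repository would keep working until it is upgraded.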

Anyway, really liking boar. I need to brush up on python a bit, but I'd like to 
submit patches sooner rather than later, and not just endless issues/requests :P

Original issue reported on code.google.com by cryptob...@gmail.com on 21 Dec 2011 at 5:57

@GoogleCodeExporter

I agree that a repository can quickly become large enough to make it 
inconvenient to browse around the blobs manually. But then again, hundreds of 
thousands of files named something like "650ab14dd0caba8f71d2db9b4a3abb90" 
aren't very user friendly to begin with. I'm more concerned with the performance 
part. Do you have any numbers or examples backing up the performance problem 
claim? Boar performs these operations often:

* Checking the existence of a blob
* Opening a blob for reading
* Listing all the blobs
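
One rough way to get such numbers is to time these three operations against a single blob directory. The following is only a sketch, not part of boar: `time_blob_operations` is a hypothetical helper, and the timings it returns depend heavily on the file system and on whether the directory is already cached by the OS.

```python
import os
import time


def time_blob_operations(blob_dir, blob_name):
    """Time an existence check, an open-for-read, and a full listing."""
    path = os.path.join(blob_dir, blob_name)
    timings = {}

    # 1. Checking the existence of a blob
    start = time.perf_counter()
    exists = os.path.exists(path)
    timings["exists"] = time.perf_counter() - start

    # 2. Opening a blob for reading (skipped if the blob is missing)
    if exists:
        start = time.perf_counter()
        with open(path, "rb") as f:
            f.read(1)
        timings["open"] = time.perf_counter() - start

    # 3. Listing all the blobs in this directory
    start = time.perf_counter()
    count = len(os.listdir(blob_dir))
    timings["list"] = time.perf_counter() - start

    return timings, count
```

Running it on a directory with a few thousand files and on one with far fewer, on the same file system, would show whether the difference is measurable.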

However, another problem is that FAT32 only supports about 20000 file entries 
per directory (for 16-character filenames), which allows a maximum of about 5 
million files in a repository on FAT32. Not good. The easiest way out is of 
course to say that boar doesn't support FAT32...
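
The arithmetic behind that ceiling, and what a second directory level would do to it, can be sketched as follows. The 20000 entries per directory figure is simply the estimate quoted above, and `max_blobs` is a hypothetical helper.

```python
ENTRIES_PER_DIR = 20000  # per-directory estimate quoted above
DIRS_PER_LEVEL = 256     # two hex characters per directory level


def max_blobs(levels):
    """Upper bound on blobs when only the leaf directories hold files."""
    return DIRS_PER_LEVEL ** levels * ENTRIES_PER_DIR


print(max_blobs(1))  # 5120000
print(max_blobs(2))  # 1310720000
```

Under the same assumption, the two-level layout would raise the limit from roughly 5 million to roughly 1.3 billion blobs.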

Original comment by ekb...@gmail.com on 29 Dec 2011 at 2:06

@GoogleCodeExporter

Closing this one for the time being, until someone can show in numbers that 
this is a real problem in some situations.

Original comment by ekb...@gmail.com on 29 Feb 2012 at 12:51

  • Changed state: WontFix
