Commit
Use SHA1(Entry) to distribute entries in sub dirs
Previously, we wrote all entries to $BUCKET/encode($ENTRY). The bucket directory could grow large enough to hit file system limits on the number of files in a directory, or to suffer file system performance problems when looking up entries in a large directory.

The new format takes the following form:

    $BUCKET/$AA/$BB/$CC/$DD/encode($ENTRY)

where $AA, ..., $DD correspond to the first four bytes of SHA1(encode($ENTRY)) in hex, two characters per byte. Each subdirectory level will contain at most 256 items, which keeps the directory size well below file system limits. We chose two-character subdirs so that file systems with linear directory search algorithms would still be able to find entries quickly.

The cost of the subdirectories is use of inodes: for a given set of files we incur a 4x cost in inode use. We chose four levels as an unscientific compromise between inode cost and the size of file set that will avoid large directories. With four levels, we expect to be able to handle ~2^32 files consuming ~2^34 inodes.

By keeping the encoded entry as the basename of the file, we can support bucket listing via recursive file globbing and end up with an on-disk format that is a bit easier for a human to understand and debug.

Also worth noting is that we do not remove empty subdirectories when entries are deleted. At this time we plan to leave this as known behavior: inode usage will increase over time. Inodes can be reclaimed by removing empty directories, and this could be done manually with the bookshelf service stopped. If this becomes a pain point, we could add optional behavior on restart to do the cleaning. Removing the directories on demand introduces a need for locking, which we'd prefer to avoid if possible.

Misc Dialyzer fix ups

* Remove unused fun head for entry_delete
* Add opscoderl_wm to analysis
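The layout described above can be sketched in a few lines. This is an illustration only, not the code from this commit: it is written in Python rather than the project's own language, `encode` is a hypothetical stand-in (percent-encoding) for whatever encoding bookshelf actually applies, and the names `entry_path`, `list_bucket`, and `prune_empty_dirs` are invented for the example. The last function shows one way the manual cleanup of empty directories might look, assuming the service is stopped while it runs.

```python
import glob
import hashlib
import os
from urllib.parse import quote


def encode(entry: str) -> str:
    # Hypothetical stand-in for the commit's encode(): percent-encode
    # every reserved character so the result is a single file name.
    return quote(entry, safe="")


def entry_path(bucket_dir: str, entry: str) -> str:
    # $BUCKET/$AA/$BB/$CC/$DD/encode($ENTRY), where AA..DD are the first
    # four bytes of SHA1(encode($ENTRY)) written as hex pairs.
    encoded = encode(entry)
    digest = hashlib.sha1(encoded.encode("utf-8")).hexdigest()
    levels = [digest[i:i + 2] for i in range(0, 8, 2)]
    return os.path.join(bucket_dir, *levels, encoded)


def list_bucket(bucket_dir: str) -> list:
    # Bucket listing by globbing exactly four directory levels deep.
    # Note: glob's "*" skips names that start with a dot.
    pattern = os.path.join(bucket_dir, "*", "*", "*", "*", "*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


def prune_empty_dirs(bucket_dir: str) -> int:
    # Reclaim inodes by removing empty subdirectories, deepest first.
    # Assumes nothing else is writing to the bucket while this runs.
    removed = 0
    for dirpath, _dirnames, _filenames in os.walk(bucket_dir, topdown=False):
        if dirpath != bucket_dir and not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed += 1
    return removed
```

For example, since the SHA1 digest of `abc` begins with `a9993e36`, `entry_path("bucket", "abc")` yields `bucket/a9/99/3e/36/abc` on a POSIX system. Walking bottom-up in `prune_empty_dirs` lets a parent be removed in the same pass once its last child directory is gone.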
Seth Falcon committed Sep 10, 2013
1 parent c9870a1, commit a4bfaa1
Showing 3 changed files with 51 additions and 41 deletions.