Welcome to the Pomegranate File System (Chinese: 石榴文件系统) wiki!
Some blogs about Pomegranate on Posterous.
IRC: freenode.net #pomegranatefs
Emerging popular Web applications, such as online photo and social network services, need to access large numbers of small files, which are created and read concurrently in the file system. In particular, the NVersion and 1-N Mapping issues of distributed file systems, especially for small files, have not yet been addressed. In this paper, we propose evolving the directory model, interface, and storage structures to address the NVersion and 1-N Mapping issues and to support semantic data access. We implement a novel distributed file system that manages billions of small files, focusing on decreasing metadata service latency and improving small file I/O so that small files are handled efficiently.
Our evaluation demonstrates that the new directory model, interface, and storage structures not only save storage space but also increase small file performance. With 32 servers, we deliver leading metadata and small file I/O performance: 180,000 file creates per second and 1 GB/s aggregate write bandwidth.
- Pomegranate is a distributed file system over distributed tabular storage (xTable), implemented in C;
- A flexible memory caching layer, which is snapshot-able and fault tolerant;
- A reliable storage layer that supports massive tiny file reads and writes;
- A key/value interface as a supplement to the POSIX I/O interface.
1. Pomegranate Key/Value Interface
2. Pomegranate File System Interface
3. User Level File System API
4. How to use hvfs.sh script?
5. Setup HTTP server
6. RESTful Interface
7. Thoughts on Pomegranate
8. FUSE client
9. Semantic Query Support
Want to know how to run the Pomegranate Key/Value interface? Please refer to Getting-Start and Online Node Changing.
Want to know how to run the Pomegranate File System interface unit tests? Please refer to Getting-Start-fs.
For Pomegranate's event delivery and processing framework, please refer to Branch Framework.
A new user level file system API is now provided; try it now! See the document.
We have deployed 39 kernel level clients, 16 MDS, and 16 MDSL on our test cluster. We obtained the following aggregate requests per second (RPS), as reported by mdtest: file creation = 86981.4, file stat = 178925, file remove = 177669. There is still room left for optimization.
Emerging popular Internet applications, such as online photo and micro-blog services, exhibit very different data access and storage requirements from traditional applications. These applications require highly concurrent access to tiny files with both high throughput and low latency. Millions to billions of users issue requests at the same time, so huge amounts of tiny data are generated, analyzed, and returned every second.
We propose Pomegranate, a novel distributed file system built over distributed tabular storage, to manage billions of tiny files and support highly concurrent, low latency data access. Pomegranate uses distributed extendible hashing to index metadata, a log-structured storage format and columnar storage to exploit temporal and spatial locality, and snapshot-able, reconfigurable caching to increase parallelism and tolerate failures.
We have demonstrated that a file system over tabular storage performs well under highly concurrent access. In our test cluster, throughput scaled linearly to more than 100,000 aggregate read and write requests served per second (RPS).
Please refer to Experiment for detailed performance results.
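To make the extendible hashing idea mentioned above more concrete, here is a minimal single-process sketch in Python. It is not Pomegranate's implementation (which is distributed and written in C); the class names `Bucket` and `ExtendibleHash`, the bucket capacity, and the use of the low bits of Python's built-in `hash` are all assumptions chosen for illustration. It only shows the general technique: a directory of bucket pointers that doubles when a full bucket can no longer be split within the current depth.

```python
class Bucket:
    """A fixed-capacity bucket; local_depth counts the hash bits it owns."""

    def __init__(self, local_depth, capacity):
        self.local_depth = local_depth
        self.capacity = capacity
        self.items = {}


class ExtendibleHash:
    """Illustrative extendible hash table (hypothetical, in-memory only)."""

    def __init__(self, bucket_capacity=4):
        self.global_depth = 1
        self.bucket_capacity = bucket_capacity
        # Two directory slots, one bucket each, to start with.
        self.directory = [Bucket(1, bucket_capacity),
                          Bucket(1, bucket_capacity)]

    def _slot(self, key):
        # The low global_depth bits of the hash select the directory slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._slot(key)].items.get(key, default)

    def put(self, key, value):
        # Split the target bucket until the key fits. This assumes keys
        # have distinct hashes; identical hashes would never separate.
        while True:
            bucket = self.directory[self._slot(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; new slots alias the existing buckets.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, bucket.capacity)
        high_bit = 1 << (bucket.local_depth - 1)
        # Slots whose newly significant bit is set now point to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and (i & high_bit):
                self.directory[i] = sibling
        # Redistribute the old entries between the bucket and its sibling.
        old_items = bucket.items
        bucket.items = {}
        for k, v in old_items.items():
            self.directory[self._slot(k)].items[k] = v
```

Only the overflowing bucket is split and rehashed, so growing the index never requires rehashing every entry. That property is what makes this family of schemes attractive for metadata that grows incrementally.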
Follow me on Twitter @nkmacana
Roadmap and Schedule
- Implement and test bringing MDSs online/offline dynamically @ 9/6/2010 – 9/30/2010
- Stabilize the kernel level file system client and the full-functional communication library XNET @ 8/20/2010 – 10/10/2010
- Test Pomegranate on Dawning 6000 (known as Nebulae)
- Finalize the FUSE client @ 10/31/2010
- Release the first beta version @ 11/30/2010 (Implement a new framework: Branch, release date delayed. Sorry!)