Hadoop and other batch processing frameworks run best against a file system designed for their data access pattern: sequential reads and sequential writes of large files, typically at least tens of megabytes and often gigabytes.
We developed the Quantcast File System (QFS), a high-performance distributed file system, to meet this need. It evolved from the Kosmos File System (KFS), which we adopted in 2008 for secondary storage and then began hardening and improving. In 2011, we migrated our primary processing to QFS and stopped using HDFS. Since then we have stored all our data in QFS, and we’re confident it’s ready for other users’ production workloads. We are releasing QFS as open source in the hope that it will serve as a platform for experimental as well as commercial projects.
QFS consists of 3 components: a metaserver, chunk servers, and a client library.
QFS is implemented in C++ using standard system components such as TCP sockets, the STL, and the Boost libraries. The server components have been used in production on 64-bit x86 architectures running CentOS Linux 5 and 6, and the client library has been tested on CentOS 5 and 6, OS X 10.x, Cygwin, and Debian/Ubuntu.
QFS builds upon some of the ideas outlined in the Google File System (GFS) paper (SOSP 2003).