SLASH2 File System
SLASH2 is an open source (GPL v2+) wide area network (WAN)-friendly distributed file system featuring data multi-residency, system-managed data transfer, inline checksum verification, and much more.
Several needs arise concerning the management of large data sets:
- geographical replication for access locality
- replicas for valuable data
- the continuing emergence of cloud computing and the need for universal interfaces
- research collaboration
- data set migration
The burden of solving these issues often falls on users themselves: every researcher who needs access to a data set must learn the tools and environments required to manage data transfers on their own. Replication is often done manually instead of being handled by the system according to policy.
Current Approaches and their Drawbacks
The tools currently available to storage system administrators tend to operate at a level that is either too high or too low, giving researchers either too much or too little control over data management.
High-level data management interfaces (from GridFTP to scp) are built on top of the storage system and completely externalize operations such as replica management. This leaves the user responsible for learning the interfaces that provide these features, as well as for monitoring failed transfers and retrying them. Users also compete for network bandwidth with other activity, so network resources are often under- or overutilized. With very large data sets, the problem only grows worse.
The low-level approach adapts parallel file system techniques to the WAN. If unreliable transports are used during transmission, this leaves the user responsible for verifying the integrity of file replicas.
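To illustrate the kind of manual integrity check that falls to the user in this situation, here is a minimal sketch in Python. It is not SLASH2 code and does not reflect how SLASH2 performs its inline checksum verification; the function names `file_digest` and `replicas_match` and the choice of SHA-256 are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Chunked reads keep memory use bounded for very large data sets.
    SHA-256 is an assumed choice here, not something the source specifies.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def replicas_match(source: Path, replica: Path) -> bool:
    """Report whether a replica has the same content digest as its source."""
    return file_digest(source) == file_digest(replica)
```

A user would have to run a check like this after every transfer and for every replica, and both copies must be read in full each time, which is the sort of bookkeeping a system-level approach is meant to absorb.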
The SLASH2 Solution
SLASH2 handles these tasks at the system level, relieving the user of this burden. It provides a POSIX interface and a small set of additional tools for managing replicas. SLASH2 also imposes no restrictions on the underlying storage systems, which allows administrators to run their systems unaltered. Support for POSIX backend storage systems, as well as a number of non-POSIX backend storage systems, makes SLASH2 well suited to varied and heterogeneous environments.
More information is available in the SLASH2 white paper.
Technical System Architecture
Diagrams highlighting the design of the various SLASH2 components are available.