Arvados is an open source platform for managing, processing, and sharing genomic and other large scientific and biomedical data. With Arvados, bioinformaticians run and scale compute-intensive workflows, developers create biomedical applications, and IT administrators manage large compute and storage resources.
The key components of Arvados are:
Keep: Keep is the Arvados storage system for managing and storing large collections of files. Keep combines content addressing with a distributed storage architecture, resulting in both high reliability and high throughput. Every file stored in Keep can be verified each time it is retrieved. Keep supports the creation of collections as a flexible way to define data sets without having to re-organize or needlessly copy data. Keep works on a wide range of underlying filesystems and object stores.
Crunch: Crunch is the orchestration system for running Common Workflow Language workflows. It is designed to maintain data provenance and workflow reproducibility. Crunch automatically tracks data inputs and outputs through Keep and executes workflow processes in Docker containers. In a cloud environment, Crunch optimizes costs by scaling compute on demand.
Workbench: The Workbench web application allows users to interactively access Arvados functionality. It is especially helpful for querying and browsing data, visualizing provenance, and tracking the progress of workflows.
Command Line tools: The command line interface (CLI) provides convenient access to Arvados functionality from a terminal.
API and SDKs: Arvados is designed to be integrated with existing infrastructure. All the services in Arvados are accessed through a RESTful API. SDKs are available for Python, Go, R, Perl, Ruby, and Java.
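The content addressing idea described for Keep above can be illustrated with a toy sketch. This is not Keep's implementation or the Arvados SDK: the `ToyContentStore` class, its methods, the in-memory dictionary, and the choice of SHA-256 are all assumptions made for illustration. It shows why a block named by the hash of its contents can be verified on every read, and why a collection can reference existing data without copying it.

```python
import hashlib


class ToyContentStore:
    """Hypothetical in-memory content-addressed block store (illustration only)."""

    def __init__(self):
        self._blocks = {}  # locator (hex digest) -> raw bytes

    def put(self, data: bytes) -> str:
        # The locator is derived from the content itself, so identical
        # data always maps to the same locator and is stored once.
        locator = hashlib.sha256(data).hexdigest()
        self._blocks[locator] = data
        return locator

    def get(self, locator: str) -> bytes:
        data = self._blocks[locator]
        # Re-hash on every read: a mismatch means the stored block changed.
        if hashlib.sha256(data).hexdigest() != locator:
            raise ValueError("block failed verification: " + locator)
        return data


store = ToyContentStore()
first = store.put(b"ACGTACGT")
second = store.put(b"ACGTACGT")  # same content, same locator, no extra copy

# A "collection" here is just a mapping of file names to block locators,
# so defining a new data set does not require copying the underlying data.
collection = {"sample.txt": [first]}
```

In this sketch, two collections that list the same locator share a single stored block, which is the property that lets data sets be redefined without duplicating files.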
To try out Arvados on your local workstation, you can use Arvbox, which provides Arvados components pre-installed in a Docker container (requires Docker 1.9+). After cloning the Arvados git repository:
$ cd arvados/tools/arvbox/bin
$ ./arvbox start localdemo
In this mode you will only be able to connect to Arvbox from the same host. To make Arvbox accessible over a network, and for other options, see http://doc.arvados.org/install/arvbox.html.
If you wish to build the Arvados documentation from a local git clone, see doc/README.textile for instructions.
The Arvados user mailing list is used to announce new versions and other news.
All participants are expected to abide by the Arvados Code of Conduct.
Development and Contributing
See CONTRIBUTING for information about Arvados development and how to contribute to the Arvados project.
The development road map outlines some of the project priorities over the next twelve months.
Arvados is Free Software. See COPYING for information about the open source licenses used in Arvados.