An open source platform for managing and analyzing biomedical big data

arvados/arvados

Join the chat at https://gitter.im/arvados/community | Installing Arvados | Installing Client SDKs | Report a bug | Development and Contributing

Arvados is an open source platform for managing, processing, and sharing genomic and other large scientific and biomedical data. With Arvados, bioinformaticians run and scale compute-intensive workflows, developers create biomedical applications, and IT administrators manage large compute and storage resources.

The key components of Arvados are:

  • Keep: Keep is the Arvados storage system for managing and storing large collections of files. Keep combines content addressing with a distributed storage architecture, resulting in both high reliability and high throughput. Every file stored in Keep can be verified each time it is retrieved. Keep supports the creation of collections as a flexible way to define data sets without having to reorganize or needlessly copy data. Keep works on a wide range of underlying filesystems and object stores.

  • Crunch: Crunch is the orchestration system for running Common Workflow Language workflows. It is designed to maintain data provenance and workflow reproducibility. Crunch automatically tracks data inputs and outputs through Keep and executes workflow processes in Docker containers. In a cloud environment, Crunch optimizes costs by scaling compute on demand.

  • Workbench: The Workbench web application allows users to interactively access Arvados functionality. It is especially helpful for querying and browsing data, visualizing provenance, and tracking the progress of workflows.

  • Command Line tools: The command line interface (CLI) provides convenient access to Arvados functionality from the command line.

  • API and SDKs: Arvados is designed to be integrated with existing infrastructure. All the services in Arvados are accessed through a RESTful API. SDKs are available for Python, Go, R, Perl, Ruby, and Java.
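
The content-addressing idea described for Keep above can be illustrated with a toy sketch. This is not Keep's implementation or API: the class `ContentAddressedStore`, its methods, the in-memory dictionary, and the choice of SHA-256 are all assumptions made for illustration. It shows how keying data by a digest of its own bytes makes retrieval verifiable, and how a collection can reference data without copying it.

```python
import hashlib


class ContentAddressedStore:
    """Hypothetical in-memory store: each block is keyed by a digest of its bytes."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself (SHA-256 is an assumption).
        address = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same key, so it is stored once.
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on every read: a mismatch means the stored bytes changed.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored block does not match its address")
        return data


store = ContentAddressedStore()
first = store.put(b"ACGT" * 4)
second = store.put(b"ACGT" * 4)  # same bytes, same address, no second copy

# In this sketch a "collection" is just a mapping from file names to block
# addresses, so two collections can share a block without duplicating data.
collection_a = {"sample.txt": [first]}
collection_b = {"renamed-sample.txt": [second]}
```

Because both collections hold only addresses, defining a new data set in this sketch costs a small mapping rather than another copy of the underlying bytes.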

Quick start

To try out Arvados on your local workstation, you can use Arvbox, which provides Arvados components pre-installed in a Docker container (requires Docker 1.9+). After cloning the Arvados git repository:

$ cd arvados/tools/arvbox/bin
$ ./arvbox start localdemo

In this mode you can connect to Arvbox only from the same host. To make Arvbox accessible over a network, and for other configuration options, see http://doc.arvados.org/install/arvbox.html.

Documentation

Complete documentation, including the User Guide, Installation documentation, Administrator documentation, and API documentation, is available at http://doc.arvados.org/

If you wish to build the Arvados documentation from a local git clone, see doc/README.textile for instructions.

Community

The Arvados community channel at gitter.im is available for live discussion and support.

The Arvados development channel at gitter.im is used to coordinate development.

The Arvados user mailing list is used to announce new versions and other news.

All participants are expected to abide by the Arvados Code of Conduct.

Reporting bugs

Report a bug on dev.arvados.org.

Development and Contributing

See CONTRIBUTING for information about Arvados development and how to contribute to the Arvados project.

The development road map outlines some of the project priorities over the next twelve months.

Licensing

Arvados is Free Software. See COPYING for information about the open source licenses used in Arvados.