SWZ edited this page May 27, 2021 · 10 revisions

Running CWL on Arvados

CWL workflows on Arvados

Arvados is an open source platform for managing, processing, and sharing genomic and other large scientific and biomedical data.

The Arvados architecture provides a modern open source platform for organizing, managing, and processing terabytes to petabytes of data. It lets you track your methods and datasets, share them securely, and easily re-run analyses. A video introducing Arvados and its technical components is also available. The platform's key components are a content-addressable storage system and a containerized workflow engine.

Keep

Keep is the Arvados storage system for managing and storing large collections of files. Keep combines content addressing with a distributed storage architecture, which gives it both high reliability and high throughput. Every file stored in Keep can be verified each time it is retrieved. Keep supports the creation of collections as a flexible way to define data sets without having to reorganize or needlessly copy data. Keep works on a wide range of underlying filesystems and object stores.
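
The idea behind content addressing can be illustrated with a small sketch. This is not the Keep API or its implementation: the `ContentStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration. It only shows why deriving an address from the data itself allows verification on every read and lets collections reference data without copying it.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not the Keep API)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        # SHA-256 is an illustrative choice, not necessarily what Keep uses.
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on every read so corrupted data is detected.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("block content does not match its address")
        return data


store = ContentStore()
first = store.put(b"ACGT" * 4)
second = store.put(b"ACGT" * 4)  # same bytes, same address, no second copy

# A "collection" in this sketch is just a mapping from file names to
# addresses, so the same data can belong to several data sets without
# being copied or moved.
collection_one = {"sample1.fa": first}
collection_two = {"reference/sample1.fa": second}
```

In this sketch both collections point at a single stored block, and any mismatch between a block and its address surfaces as an error when the block is read.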

Crunch

Crunch is the orchestration system for running CWL workflows. It is designed to maintain data provenance and workflow reproducibility. Crunch automatically tracks data inputs and outputs through Keep and executes workflow processes in Docker containers. In a cloud environment, Crunch optimizes costs by scaling compute on demand.

Arvados Playground

If you would like to try Arvados, there is a public Arvados playground: a free-to-use installation of Arvados for evaluation and trial use.

More Information: