# pipelines-api-examples

This repository contains examples for the [Google Genomics Pipelines API](https://cloud.google.com/genomics/reference/rest/v1alpha2/pipelines).

## Alpha

This is an Alpha release of the Google Genomics Pipelines API. The API might change in backward-incompatible ways and is not recommended for production use. It is not subject to any SLA or deprecation policy.

The API provides an easy way to create, run, and monitor command-line tools, packaged in Docker containers, on Google Compute Engine. You can use it much like a job scheduler.

The most common use case is to run an off-the-shelf tool or custom script that reads and writes files. You may want to run such a tool over files in Google Cloud Storage, and to run it independently over hundreds or thousands of files.

The typical flow for a pipeline is:

  1. Create a Compute Engine virtual machine
  2. Copy one or more files from Cloud Storage to a disk
  3. Run the tool on the file(s)
  4. Copy the output to Cloud Storage
  5. Destroy the Compute Engine virtual machine
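All five steps above are described in a single `pipelines.run()` request: the service creates the VM, localizes the inputs, runs the Docker command, delocalizes the outputs, and tears the VM down. The sketch below builds such a request body as a plain Python dict; the pipeline name, image, command, and paths are illustrative placeholders, not part of this repository's examples.

```python
def make_pipeline_request(project_id, input_gcs_path, output_gcs_path):
    """Build an illustrative request body for pipelines().run() (v1alpha2)."""
    return {
        'ephemeralPipeline': {
            'projectId': project_id,
            'name': 'checksum-example',        # placeholder pipeline name
            'docker': {
                'imageName': 'ubuntu',         # any public or project image
                # Command run inside the container against localized files:
                'cmd': 'md5sum /mnt/data/input.bam > /mnt/data/input.bam.md5',
            },
            # Files copied from Cloud Storage onto the data disk before cmd runs:
            'inputParameters': [{
                'name': 'inputFile',
                'localCopy': {'path': 'input.bam', 'disk': 'data'},
            }],
            # Files copied back to Cloud Storage after cmd finishes:
            'outputParameters': [{
                'name': 'outputFile',
                'localCopy': {'path': 'input.bam.md5', 'disk': 'data'},
            }],
            'resources': {
                'disks': [{'name': 'data', 'mountPoint': '/mnt/data'}],
            },
        },
        'pipelineArgs': {
            'projectId': project_id,
            'inputs': {'inputFile': input_gcs_path},
            'outputs': {'outputFile': output_gcs_path},
            'logging': {'gcsPath': output_gcs_path + '/logs'},
        },
    }

# To submit (requires credentials and the APIs listed under Prerequisites):
#   from googleapiclient import discovery
#   service = discovery.build('genomics', 'v1alpha2')
#   body = make_pipeline_request('my-project',
#                                'gs://my-bucket/input.bam',
#                                'gs://my-bucket/output')
#   operation = service.pipelines().run(body=body).execute()
```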

You can submit batch operations from your laptop, and have them run in the cloud. You can do the packaging to Docker yourself, or use existing Docker images.
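Submitting a pipeline returns a long-running operation that you poll until its `done` field is true. A minimal polling helper might look like the sketch below; `wait_for_operation` and its parameters are illustrative, not part of the API, and only the `done`/`error`/`response` operation fields are assumed.

```python
import time

def wait_for_operation(get_operation, poll_interval=10, sleep=time.sleep):
    """Poll a long-running operation until it completes.

    get_operation is any zero-argument callable returning the current
    operation resource as a dict, for example:
        lambda: service.operations().get(name=op_name).execute()
    """
    while True:
        op = get_operation()
        if op.get('done'):
            # A completed operation carries either an error or a response.
            if 'error' in op:
                raise RuntimeError(op['error'].get('message', 'pipeline failed'))
            return op.get('response', {})
        sleep(poll_interval)
```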

## Prerequisites

  1. Clone or fork this repository.
  2. If you plan to create your own Docker images, install Docker: https://docs.docker.com/engine/installation/#installation
  3. Follow the Google Genomics getting started instructions to set up your Google Cloud Project. The Pipelines API requires that the following are enabled in your project:
    1. Genomics API
    2. Cloud Storage API
    3. Compute Engine API
  4. Follow the Google Genomics getting started instructions to install and authorize the Google Cloud SDK.
  5. Install or update the Python client via `pip install --upgrade google-api-python-client`. For more detail, see https://cloud.google.com/genomics/v1/libraries.

## Examples

## See Also