Google Lifesciences Pipelines Tools

This repository contains tools for running pipelines with the Google Lifesciences API.

Quick Start Using Cloud Shell

Enable the Lifesciences API and the Compute Engine API in a new or existing Google Cloud project.
Start a Cloud Shell inside your project.
Inside the Cloud Shell, run the command:
```
 go get github.com/googlegenomics/pipelines-tools/...
```
This command downloads and installs the pipelines tools. To build these tools outside the Cloud Shell, you will need the Go toolchain.

Make a bucket on GCS to store the output from the pipeline:

```
export BUCKET=gs://${GOOGLE_CLOUD_PROJECT}-pipelines
gsutil mb ${BUCKET}
```

Put some test data into the bucket:

```
echo "Hello World" | gsutil cp - ${BUCKET}/input
```

Make a pipeline script that computes the SHA1 sum of a file:
```
 echo 'sha1sum ${INPUT0} > ${OUTPUT0}' > sha1.script
```
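
Before submitting the script, it can help to dry-run it locally. The sketch below assumes that the pipelines tool exposes the first input and output to the script as the `INPUT0` and `OUTPUT0` environment variables, as the script above suggests. The temporary directory and file paths are hypothetical stand-ins for the files the tool would localize, and `sha1sum` is assumed to be installed (it is part of GNU coreutils).

```shell
# Local dry run: stand-in files replace the paths the pipelines tool would provide.
workdir=$(mktemp -d)
export INPUT0="${workdir}/input"
export OUTPUT0="${workdir}/output"

# Same test data and script contents as in the quick start.
echo "Hello World" > "${INPUT0}"
echo 'sha1sum ${INPUT0} > ${OUTPUT0}' > "${workdir}/sha1.script"

# Run the script with a plain shell and print the resulting checksum line.
sh "${workdir}/sha1.script"
cat "${OUTPUT0}"
```

The output file should contain a 40-character hexadecimal digest followed by the input path. This only checks the script logic; it does not exercise the API or bucket access.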

Run the script using the pipelines API:

```
pipelines run --inputs=${BUCKET}/input --outputs=${BUCKET}/output sha1.script
```

Check the generated output file:
```
 gsutil cat ${BUCKET}/output
```

That's it: you've run your first pipeline. For more information about the input formats supported by the pipelines tool, check out the source code. To learn more about the Pipelines API, consult the reference documentation.

Usage

The `pipelines` tool

This tool provides support for running, cancelling and inspecting pipelines.

As a simple example, to run a pipeline that prints 'hello world':

```
$ cat <<EOF > hello.script
echo "hello world"
EOF
$ pipelines --project=my-project run hello.script --output=gs://my-bucket/logs
```

After the pipeline finishes, you can inspect the output using gsutil:

```
$ gsutil cat gs://my-bucket/logs/output
```

The script file format is described in the source code for the command.

Using gcsfuse with the pipelines tool

Use the `--fuse` flag to let the pipelines tool localize input files with gcsfuse instead of copying them one by one with gsutil.

Note: Because the entire bucket is mounted, files other than those directly named by the `--inputs` flag will also be available to the container.

SSH into the worker machine

The `--ssh` flag starts an SSH container in the background on the worker machine, so that you can log in over SSH and view logs in real time.

The `migrate-pipeline` tool

This tool takes a JSON-encoded v1alpha2 run pipeline request and attempts to emit a v2alpha1 request that replicates the same behaviour.

For example, given a file v1.jsonpb that has a request containing a v1alpha2 ephemeral pipeline and arguments, running:

```
$ migrate-pipeline < v1.jsonpb
```

will write to standard output a v2alpha1 request that performs the same action.

Support

Please report problems using the issue tracker.
