Tools for using Picard and GATK with the Google Genomics API.

gatk-tools-java

Tools for using Picard and GATK with the Genomics API.

  • Common classes for getting reads from the GA4GH Genomics API and exposing them as a SAMRecord "Iterable" resource.

  • An implementation of a custom reader that can be plugged into Picard tools to read input data that is specified via a URL and served by the GA4GH API. It works with both the REST and gRPC (faster) implementations of the Google Genomics API.

  • A set of shell scripts that demonstrate how to run Picard tools with the GA4GH custom reader.

  • Requires HTSJDK v.1.128 and Picard v.1.133 or later.

To build this package: mvn compile package

This command produces 3 files:

gatk-tools-java-[ver]-SNAPSHOT.jar

A small JAR with just the classes from this package. It needs to be run with mvn run; see the example.sh script, which demonstrates how to use the classes in this package to get SAMRecords from the GA4GH API. You can run the example like this:

gatk-tools-java$ src/main/scripts/example.sh

gatk-tools-java-[ver]-SNAPSHOT-jar-with-dependencies.jar

A large JAR with all dependencies included. This file is suitable for injecting the custom reader into a regularly built Picard distribution without recompiling it.

You will need to download and build Picard tools; see the instructions here.

See the view_sam_file.sh and run_picard.sh scripts for usage examples. You can run the example like this:

gatk-tools-java$ src/main/scripts/view_sam_file.sh
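
The injection described above goes through the samjdk.custom_reader Java system property. As a rough sketch, and assuming the property value takes the form of a URL prefix, a reader factory class name, and a JAR path separated by commas, the flag could be assembled as shown below. The function name custom_reader_flag, the factory class name, and the target/ path are illustrative placeholders rather than values taken from this repository; view_sam_file.sh remains the authoritative example.

```shell
#!/bin/sh
# Sketch only: build the JVM flag that points HTSJDK at a custom reader factory.
# Assumed value format: <URL prefix>,<factory class>,<JAR path>.
# Confirm the exact format against src/main/scripts/view_sam_file.sh.
custom_reader_flag() {
  url_prefix="$1"     # inputs starting with this prefix go to the custom reader
  factory_class="$2"  # fully qualified reader factory class (placeholder below)
  jar_path="$3"       # the jar-with-dependencies file produced by the build
  printf '%s=%s,%s,%s\n' '-Dsamjdk.custom_reader' \
    "$url_prefix" "$factory_class" "$jar_path"
}

# Illustrative arguments only: the class name and JAR path are placeholders.
custom_reader_flag \
  'https://www.googleapis.com/genomics' \
  'your.package.ReaderFactory' \
  'target/gatk-tools-java-VERSION-SNAPSHOT-jar-with-dependencies.jar'
```

The printed flag would be passed to java ahead of -jar when launching an unmodified Picard build.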

gatk-tools-java-[ver]-SNAPSHOT-minimized.jar

A JAR with dependencies, suitable for compiling together with Picard tools. See the instructions on building Picard with gatk-tools-java.

With Picard tools built this way, you can specify GA4GH URLs as INPUT parameters and do not have to use -Dsamjdk.custom_reader.

You should be able to run a Picard tool like so:

picard$ java -jar dist/picard.jar ViewSam \
INPUT=https://www.googleapis.com/genomics/v1beta2/readgroupsets/CK256frpGBD44IWHwLP22R4/ \
GA4GH_CLIENT_SECRETS=../client_secrets.json
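
The INPUT value in the command above is an API base URL followed by readgroupsets/ and a read group set ID. When running the same tool over several read group sets, a small helper can assemble that value. This is a minimal sketch: the function name ga4gh_input_url is hypothetical, and the base URL and ID are the ones from the example above.

```shell
#!/bin/sh
# Sketch only: compose a read group set URL in the shape used by INPUT above.
ga4gh_input_url() {
  api_base="${1%/}"  # drop one trailing slash, if present
  read_group_set_id="$2"
  printf '%s/readgroupsets/%s/\n' "$api_base" "$read_group_set_id"
}

# Print an INPUT argument for each ID; add more IDs to the list as needed.
for id in CK256frpGBD44IWHwLP22R4; do
  echo "INPUT=$(ga4gh_input_url 'https://www.googleapis.com/genomics/v1beta2' "$id")"
done
```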

Converting more Picard tools

Please read the detailed description of the code changes in HTSJDK and Picard to support GA4GH APIs and how to convert more tools.