gatk-tools-java
Tools for using Picard and GATK with Genomics API.
-
Common classes for getting Reads from GA4GH Genomics API and exposing them as SAMRecord "Iterable" resource.
-
Implementation of a custom reader that can be plugged into Picard tools to handle reading of the input data specified via a url and coming from GA4GH API.
-
Requires htsjdk version 1.128 and greater and Picard latest version (past this commit https://github.com/iliat/picard/commit/ebe987313d799d58b0673351b95d3ca91fed82bf).
-
A set of shell scripts (src/main/scripts) that demonstrate how to run Picard tools with Ga4GH custom reader.
The typical command line would look like:
java -jar \
-Dsamjdk.custom_reader=https://www.googleapis.com/genomics, \
-Dga4gh.client_secrets= \
dist/picard.jar \
INPUT=
E.g
java -jar \
-Dsamjdk.custom_reader=https://www.googleapis.com/genomics,com.google.cloud.genomics.gatk.htsjdk.GA4GHReaderFactory,\
pwd/gatk-tools-java-1.0.jar \
-Dga4gh.client_secrets=client_secrets.json \
dist/picard.jar ViewSam \
INPUT=https://www.googleapis.com/genomics/v1beta2/readgroupsets/CK256frpGBD44IWHwLP22R4/
The test read group set used here is the ex1_sorted.bam that can be found in testdata/ folder.
The data has been uploaded to the cloud project: https://console.developers.google.com/project/genomics-test-data/
-
You can download Picard from: http://broadinstitute.github.io/picard/
-
For Picard tools that have not yet been instrumented to work with a custom reader, you can use Ga4GHPicardRunner. It is a wrapper around Picard tools that allows for INPUTS into Picard tools to be ga4gh:// urls by consuming the data via the API and using pipes to send it to Picard tool.
Build:
To build with Maven: mvn bundle:bundle
To build with ant: ant gatk-tools-java-jar.