These tests pass individually and as part of complete test suite runs, but cause an intermittent NoSuchElementException in Maven when the unit tests are run on their own. Disabling these tests until the cause can be identified.
…ests Move GATKRunReport tests from private to public
-Hide AWS downloader credentials in a private properties file
-Remove references to private ActiveRegion walker
Allows phone home functionality to be tested at release time when we are running tests on the release jar.
Updated package-tests classpath, and allowing javac -cp <package>.jar.
GATK changes to conform to Tribble refactoring as part of improving Tabix s...
…nce_to_rr Remove unused and unnecessary argument
Mark had mis-named this input callset to the knowledgebase. It's the pi...
…pilot2 liftover, not pilot1.
…x support in Tribble (among other things).
1. Enable on-the-fly indexing for vcf.gz.
2. Handle on-the-fly indexing where the file to be indexed is not a regular file, in which case the index should not be created.
3. Add method setProgressLogger to all SAMFileWriter implementations.
4. Revved picard to 1.109.1722.
5. IndelRealigner md5s change because the MC tag is added to records now.
Fixed up and signed off by ebanks.
Bh sor new annotation
Added documentation category for CalculateGenotypePosteriors
…nfo_revise Improved criteria to select the best haplotypes from the assembly graph.
Package tests now hard-code just the gatk-framework tests jar, to include ONLY BaseTest, until the exclusions can be debugged. Removed cofoja's annotation service from the package jars, to allow javac -cp <package>.jar.
Currently the best haplotypes are those that accumulate the largest ABSOLUTE edge *multiplicity* sum along their path in the assembly graph. The edge *multiplicity* is equal to the number of reads that extend through that edge, i.e. that have a kmer that uniquely maps to some vertex upstream of the edge and whose following base calls extend across that edge to vertices downstream of it.
Although higher multiplicities obviously correlate with haplotype probability, this criterion falls short in some regards, of which the most relevant is: because it is evaluated on the condensed seq-graph (as opposed to the uncompressed read-threading graphs), it is biased toward haplotypes that have more short-sequence vertices ( -> ATGC -> CA -> has a worse score than -> A -> T -> G -> C -> C -> A -> ). This is partly a result of how we modify the edge multiplicities when we merge vertices from a linear chain.
This pull request addresses the problem by changing to a new scoring scheme based on likelihood estimates: each haplotype's likelihood is calculated as the product of the likelihoods of "taking" its edges in the assembly graph. The likelihood of "taking" an edge is calculated as its multiplicity divided by the sum of the multiplicities of the edges that share the same source vertex.
This pull request addresses the following stories:
https://www.pivotaltracker.com/story/show/66691418
https://www.pivotaltracker.com/story/show/64319760
Change Summary:
1. Change to the new scoring scheme.
2. Added graph DOT printing code to KBestHaplotypeFinder in order to diagnose scoring.
3. Graph transformations have been modified so that they generate no 0-multiplicity edges. (Nevertheless, the scheme above should work with 0-multiplicity edges, assuming that they are in fact 0.5.)
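The likelihood-based scoring described in this commit message can be illustrated with a minimal sketch. This is not the KBestHaplotypeFinder code: the graph representation (a dict of dicts), the function names, and the toy multiplicities are all assumptions made for illustration. It only shows the two stated rules: an edge's likelihood is its multiplicity divided by the total multiplicity of edges leaving the same source vertex, and a haplotype's likelihood is the product of its edge likelihoods.

```python
from math import prod

def edge_likelihoods(graph):
    """Map each (source, target) edge to multiplicity / total outgoing multiplicity.

    `graph` is a hypothetical representation: {source: {target: multiplicity}}.
    """
    likelihoods = {}
    for source, out_edges in graph.items():
        total = sum(out_edges.values())
        if total == 0:
            # The commit says 0-multiplicity edges are no longer generated;
            # this sketch simply rejects a vertex whose outgoing total is zero.
            raise ValueError(f"vertex {source!r} has no outgoing multiplicity")
        for target, multiplicity in out_edges.items():
            likelihoods[(source, target)] = multiplicity / total
    return likelihoods

def haplotype_likelihood(path, likelihoods):
    """Multiply the likelihoods of the consecutive edges along `path`."""
    return prod(likelihoods[(a, b)] for a, b in zip(path, path[1:]))

def multiplicity_sum(path, graph):
    """The older criterion, for contrast: absolute sum of edge multiplicities."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

# Toy graph with made-up multiplicities.
toy_graph = {
    "src": {"A": 30, "B": 10},
    "A": {"sink": 5},
    "B": {"sink": 40},
}
toy_likelihoods = edge_likelihoods(toy_graph)
```

In this toy graph the path through `B` has the larger absolute multiplicity sum (50 versus 35), while the path through `A` has the larger likelihood (0.75 versus 0.25), so the two criteria can rank the same pair of paths differently.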
… and new sub-class StrandOddsRatio(). The latter is a test based on the symmetric odds ratio, which is more appropriate than Fisher's exact test when the number of samples is large. https://www.pivotaltracker.com/story/show/66087886
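As a rough illustration of what a symmetric odds ratio on a strand table looks like, the sketch below sums the odds ratio R and its reciprocal, so that swapping the two strands leaves the score unchanged. It is not the StrandOddsRatio implementation: the function name, the argument order, and the pseudocount used to avoid division by zero are assumptions, and the annotation may apply further scaling not shown here.

```python
def symmetric_odds_ratio(ref_fwd, ref_rev, alt_fwd, alt_rev, pseudocount=1.0):
    """Return R + 1/R for a 2x2 strand table (hypothetical helper).

    R = (ref_fwd * alt_rev) / (ref_rev * alt_fwd), computed after adding
    `pseudocount` to every cell so that no cell is zero.
    """
    a = ref_fwd + pseudocount
    b = ref_rev + pseudocount
    c = alt_fwd + pseudocount
    d = alt_rev + pseudocount
    ratio = (a * d) / (b * c)
    return ratio + 1.0 / ratio

balanced = symmetric_odds_ratio(10, 10, 10, 10)      # no strand bias
biased = symmetric_odds_ratio(20, 20, 1, 39)         # alt reads mostly reverse
biased_swapped = symmetric_odds_ratio(20, 20, 39, 1) # alt reads mostly forward
```

A balanced table gives the minimum value of 2, and a bias toward either strand raises the score by the same amount.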
…ng_error Unconditionally include all of commons-httpclient in the GATK/Queue jars
The Maven shade plugin was eliminating a necessary class (IgnoreCookiesSpec) when packaging the GATK/Queue jars. Work around this by telling Maven to always package all of commons-httpclient.
Fix for non-determinism in the VQSR with very large data sets
Added new functionality to the FastaAlternateReferenceMaker to have it o...
…t output IUPAC codes for het sites. Enable it with the new --useIUPAC argument. Added both unit and integration tests for the new functionality, and fixed up the existing tests while I was in there.
…or_all_sites_GVCF Added an option to CombineGVCFs to create basepair resolution gVCFs from...
…pairhmm Emit a warning whenever the VectorLoglessPairHMM is used
…rom banded ones. Use the --convertToBasePairResolution argument to enable this functionality.
…nsensus_mode Added the consensus mode used for the 1000 Genomes Project to the Haplot...
…lotypeCaller. -- All the provided alleles are added to the assembly graph as potential haplotypes but they aren't forcibly genotyped like in GGA mode. -- Added integration test for this mode
Rename existing PipelineTests to QueueTests to prepare for upcoming push of new pipeline tests
…ush of new pipeline tests
-These tests are really integration tests for Queue rather than generalized pipeline tests, so it makes sense to call them QueueTests.
-Rename test classes and maven build targets, and update shell scripts to reflect new naming.
Experimental native PairHMM implementation from Intel. Off by default.