CHANGES.md

ADAM

Version 0.17.0

  • ISSUE 691: fix BAM/SAM header setting when writing on cluster
  • ISSUE 688: make adamLoad public
  • ISSUE 694: Fix parent reference in distribution module
  • ISSUE 684: a few region-join nits
  • ISSUE 682: [ADAM-681] Remove menacing error message about reqd .adam extension
  • ISSUE 680: [ADAM-674] Delete Bam2ADAM.
  • ISSUE 678: upgrade to bdg utils 0.2.1
  • ISSUE 668: [ADAM-597] Move correction out of ADAM and into a downstream project.
  • ISSUE 671: Bug fix in ReferenceUtils.unionReferenceSet
  • ISSUE 667: [ADAM-666] Clean up key not found error in partitioner code.
  • ISSUE 656: Update Vcf2ADAM.scala
  • ISSUE 652: added filterByOverlappingRegion in GeneFeatureRDDFunctions
  • ISSUE 650: [ADAM-649] Support transform of all BAM/SAM files in a directory.
  • ISSUE 647: [ADAM-646] Special case reads with '*' quality during BQSR.
  • ISSUE 645: [ADAM-634] Create a local ParquetLister for testing purposes.
  • ISSUE 633: [Adam] Tests for SAMRecordConverter.scala
  • ISSUE 641: [ADAM-640] Fix incorrect exclusion for org.seqdoop.htsjdk.
  • ISSUE 632: [ADAM-631] Allow VCF conversion to sort on output after coalescing.
  • ISSUE 628: [ADAM-627] Makes ReferenceFile trait extend Serializable.
  • ISSUE 637: check for mac brew alternate spark install structure
  • ISSUE 624: Conceptual fix for duplicate marking and sorting stragglers
  • ISSUE 629: [ADAM-604] Remove normalization code.
  • ISSUE 630: Add flatten command.
  • ISSUE 619: [ADAM-540] Move to new HTSJDK release; should support Java 8.
  • ISSUE 626: [ADAM-625] Enable globbing for BAM.
  • ISSUE 621: Removes the predicates package.
  • ISSUE 620: [ADAM-600] Adding RegionJoin trait.
  • ISSUE 616: [ADAM-565] Upgrade to Parquet filter2 API.
  • ISSUE 613: [ADAM-612] Point to proper k-mer counters.
  • ISSUE 588: [ADAM-587] Clean up loading checks.
  • ISSUE 592: [ADAM-513] Remove ReferenceMappable trait.
  • ISSUE 606: [ADAM-605] Remove visualization code.
  • ISSUE 596: [ADAM-595] Delete the 'comparisons' code.
  • ISSUE 590: [ADAM-589] Removed pileup code.
  • ISSUE 586: [ADAM-452] Fixes SM attribute on ADAM to BAM conversion.
  • ISSUE 584: [ADAM-583] Add k-mer counting functionality for nucleotide contig fragments

Version 0.16.0

  • ISSUE 570: A few small conversion fixes
  • ISSUE 579: [ADAM-578] Update end of read when trimming.
  • ISSUE 564: [ADAM-563] Add warning message when saving Parquet files with incorrect extension
  • ISSUE 576: Changed hashCode implementations to improve performance of BQSR
  • ISSUE 569: Typo in the narrowPeak parser
  • ISSUE 568: Moved the Timers object from bdg-utils back to ADAM
  • ISSUE 478: Move non-genomics code
  • ISSUE 550: [ADAM-549] Added documentation for testing and CI for ADAM.
  • ISSUE 555: Makes maybeLoadVCF private.
  • ISSUE 558: Makes Features2ADAMSuite use SparkFunSuite
  • ISSUE 557: Randomize ports and turn off Spark UI to reduce bind exceptions in tests
  • ISSUE 552: Create test suite for FlagStat
  • ISSUE 554: privatize ADAMContext.maybeLoad{Bam,Fastq}
  • ISSUE 551: [ADAM-386] Multiline FASTQ input
  • ISSUE 542: Variants Visualization
  • ISSUE 545: [ADAM-543][ADAM-544] Fix issues with ADAM scripts and classpath
  • ISSUE 535: [ADAM-441] put a check in for Nothing. Throws an IAE if no return type is provided
  • ISSUE 546: [ADAM-532] Fix wigFix intermittent test failure
  • ISSUE 534: [ADAM-528][ADAM-533] Adds new RegionJoin impl that is shuffle-based
  • ISSUE 531: [ADAM-529] Attaching scaladoc to released distribution.
  • ISSUE 413: [ADAM-409][ADAM-520] Added local wigfix2bed tool
  • ISSUE 527: [ADAM-526] VcfAnnotation2ADAM only counts once
  • ISSUE 523: don't open non-.adam-extension files as ADAM files
  • ISSUE 521: quieting wget output
  • ISSUE 482: [ADAM-462] Coverage region calculation
  • ISSUE 515: [ADAM-510] fix for bash syntax error; add ADDL_JARS check to adam-submit

Version 0.15.0

  • ISSUE 509: Add a 'distribution' module to create assemblies
  • ISSUE 508: Upgrade from Parquet 1.4.3 to 1.6.0rc4
  • ISSUE 498: [ADAM-496] Changes VCF to flat ADAM command name and usage
  • ISSUE 500: [ADAM-495] Require SPARK_HOME for adam-submit
  • ISSUE 501: [ADAM-499] Add -onlyvariants option to vcf2adam
  • ISSUE 507: [ADAM-505] Removed adam-local from docs
  • ISSUE 504: [ADAM-502] Add missing Long implicit to ColumnReaderInput
  • ISSUE 503: [ADAM-473] Make RecordCondition and FieldCondition public
  • ISSUE 494: Fix foreach block for vcf ingest
  • ISSUE 492: Documentation cleanup and style improvements
  • ISSUE 481: [ADAM-480] Switch assembly to single goal.
  • ISSUE 487: [ADAM-486] Add port option to viz command.
  • ISSUE 469: [ADAM-461] Fix ReferenceRegion and ReferencePosition impl
  • ISSUE 440: [ADAM-439] Fix ADAM to account for BDG-FORMATS-35: Avro uses Strings
  • ISSUE 470: added ReferenceMapping for Genotype, filterByOverlappingRegion for GenotypeRDDFunctions
  • ISSUE 468: refactor RDD loading; explicitly load alignments
  • ISSUE 474: Consolidate documentation into a single location in source.
  • ISSUE 471: Fixed typo on MAVEN_OPTS quotation mark
  • ISSUE 467: [ADAM-436] Optionally output original qualities to fastq
  • ISSUE 451: add adam view command, analogous to samtools view
  • ISSUE 466: working examples on .sam included in repo
  • ISSUE 458: Remove unused val from Reads2Ref
  • ISSUE 438: Add ability to save paired-FASTQ files
  • ISSUE 457: A few random Predicate-related cleanups
  • ISSUE 459: a few tweaks to scripts/jenkins-test
  • ISSUE 460: Project only the sequence when kmer/qmer counting
  • ISSUE 450: Refactor some file writing and reading logic
  • ISSUE 455: [ADAM-454] Add serializers for Avro objects which don't have serializers
  • ISSUE 447: Update the contribution guidelines
  • ISSUE 453: Better null handling for isSameContig utility
  • ISSUE 417: Stores original position and original cigar during realignment.
  • ISSUE 449: read “OQ” attr from structured SAMRecord field
  • ISSUE 446: Revert "[ADAM-237] Migrate to Chill serialization libraries."
  • ISSUE 437: random nits
  • ISSUE 434: Few transform tweaks
  • ISSUE 435: [ADAM-403] Remove seqDict from RegionJoin
  • ISSUE 431: A few tweaks, typo corrections, and random cleanups
  • ISSUE 430: [ADAM-429] adam-submit now handles args correctly.
  • ISSUE 427: Fixes for indel realigner issues
  • ISSUE 418: [ADAM-416] Removing 'ADAM' prefix
  • ISSUE 404: [ADAM-327] Adding gene, transcript, and exon models.
  • ISSUE 414: Fix error in adam-local alias
  • ISSUE 415: Update README.md to reflect Spark 1.1
  • ISSUE 412: [ADAM-411] Updated usage aliases in README. Fixes #411.
  • ISSUE 408: [ADAM-405] Add FASTQ output.
  • ISSUE 385: [ADAM-384] Adds import from FASTQ.
  • ISSUE 400: [ADAM-399] Fix link to schemas.
  • ISSUE 396: [ADAM-388] Sets Kryo serialization with --conf args
  • ISSUE 394: [ADAM-393] Adds knobs to SparkContext creation in SparkFunSuite
  • ISSUE 391: [ADAM-237] Migrate to Chill serialization libraries.
  • ISSUE 380: Rewrite of MarkDuplicates which seems to improve performance
  • ISSUE 387: fix some deprecation warnings

Version 0.14.0

  • ISSUE 376: [ADAM-375] Upgrade to Hadoop-BAM 7.0.0.
  • ISSUE 378: [ADAM-360] Upgrade to Spark 1.1.0.
  • ISSUE 379: Fix the position of the jar path in the submit.
  • ISSUE 383: Make Mdtags handle '=' and 'X' cigar operators
  • ISSUE 369: [ADAM-369] Improve debug output for indel realigner
  • ISSUE 377: [ADAM-377] Update to Jenkins scripts and README.
  • ISSUE 374: [ADAM-372][ADAM-371][ADAM-365] Refactoring CLI to simplify and integrate with Spark model better
  • ISSUE 370: [ADAM-367] Updated alias in README.md
  • ISSUE 368: erasure, nonexhaustive-match, deprecation warnings
  • ISSUE 354: [ADAM-353] Fixing issue with SAM/BAM/VCF header attachment when running distributed
  • ISSUE 357: [ADAM-357] Added Java Plugin hook for ADAM.
  • ISSUE 352: Fix failing MD tag
  • ISSUE 363: Adding maven assembly plugin configuration to create tarballs
  • ISSUE 364: [ADAM-364] Fixing remaining cs.berkeley.edu URLs.
  • ISSUE 362: Remove mention of uberjar from README

Version 0.13.0

  • ISSUE 343: Allow retrying on failure for HTTPRangedByteAccess
  • ISSUE 349: Fix for a NullPointerException when hostname is null in Task Metrics
  • ISSUE 347: Bug fix for genome browser
  • ISSUE 346: Genome visualization
  • ISSUE 342: [ADAM-309] Update to bdg-formats 0.2.0
  • ISSUE 333: [ADAM-332] Upgrades ADAM to Spark 1.0.1.
  • ISSUE 341: [ADAM-340] Adding the TrackedLayout trait and implementation.
  • ISSUE 337: [ADAM-335] Updated README.md to reflect migration to appassembler.
  • ISSUE 311: Adding several simple normalizations.
  • ISSUE 330: Make mismatch and deletes positions accessible
  • ISSUE 334: Moving code coverage into a profile
  • ISSUE 329: Add count of mismatches to mdtag
  • ISSUE 328: [ADAM-326] Adding a 5-second retry on the HttpRangedByteAccess test.
  • ISSUE 325: Adding documentation for commit/issue nomenclature and rebasing

Version 0.12.1

  • ISSUE 308: Fixing the 'index 0' bug in features2adam
  • ISSUE 306: Adding code for lifting over between sequences and the reference genome.
  • ISSUE 320: Remove extraneous implicit methods in ReferenceMappingContext
  • ISSUE 314: Updates to indel realigner to improve performance and accuracy.
  • ISSUE 319: Adding scripts for publishing scaladoc.
  • ISSUE 315: Added table of (wall-clock) stage durations when print_metrics is used
  • ISSUE 312: Fixing sources jar
  • ISSUE 313: Making the CredentialsProperties file optional
  • ISSUE 267: Parquet and indexed Parquet RDD implementations, and indices.
  • ISSUE 301: Add Beacon's AlleleCount
  • ISSUE 293: Add aggregation and display of metrics obtained from Spark
  • ISSUE 295: Fix broken link to ADAM specification for storing reads.
  • ISSUE 292: Cleaning up scaladoc generation warnings.
  • ISSUE 289: Modifying interleaved fastq format to be hadoop version independent.
  • ISSUE 288: Add ADAMFeature to Kryo registrator
  • ISSUE 286: Removing some debug printout that was left in.
  • ISSUE 287: Cleaning hadoop dependencies
  • ISSUE 285: Refactoring read groups to increase the amount of data stored.
  • ISSUE 284: Cleaning up build warnings.
  • ISSUE 280: Move to bdg-formats
  • ISSUE 283: Fix reference name comment
  • ISSUE 282: Minor cleanup on interleaved FASTQ input format.
  • ISSUE 277: Implemented HTTPRangedByteAccess.
  • ISSUE 274: Added clarifying note to ADAMVariantContext
  • ISSUE 279: Simplify format-source
  • ISSUE 278: Use maven license plugin to ensure source has correct license
  • ISSUE 268: Adding fixed depth prefix trie implementation
  • ISSUE 273: Fixes issue in reference models where strings are not sanitized on collection from avro.
  • ISSUE 272: Created command categories
  • ISSUE 269: Adding k-mer and q-mer counting.
  • ISSUE 271: Consolidate Parquet logging configuration

Version 0.12.0

  • ISSUE 264: Parquet-related Utility Classes
  • ISSUE 259: ADAMFlatGenotype is a smaller, flat version of a genotype schema
  • ISSUE 266: Removed extra command 'BuildInformation'
  • ISSUE 263: Added AdamContext.referenceLengthFromCigar
  • ISSUE 260: Modifying conversion code to resolve #112.
  • ISSUE 258: Adding an 'args' parameter to the plugin framework.
  • ISSUE 262: Adding reference assembly name to ADAMContig.
  • ISSUE 256: Upgrading to Spark 1.0
  • ISSUE 257: Adds toString method for sequence dictionary.
  • ISSUE 255: Add equals, canEqual, and hashCode methods to MdTag class

Version 0.11.0

  • ISSUE 254: Cleanup import statements
  • ISSUE 250: Adding ADAM to SAM conversion.
  • ISSUE 248: Adding utilities for read trimming.
  • ISSUE 252: Added a note about rebasing-off-master to CONTRIBUTING.md
  • ISSUE 249: Cosmetic changes to FastaConverter and FastaConverterSuite.
  • ISSUE 251: CHANGES.md is updated at release instead of per pull request
  • ISSUE 247: For #244, Fragments were incorrect order and incomplete
  • ISSUE 246: Making sample ID field in genotype nullable.
  • ISSUE 245: Adding ADAMContig back to ADAMVariant.
  • ISSUE 243: Rebase PR#238 onto master

Version 0.10.0

  • ISSUE 242: Upgrade to Parquet 1.4.3
  • ISSUE 241: Fixes to FASTA code to properly handle indices.
  • ISSUE 239: Make ADAMVCFOutputFormat public
  • ISSUE 233: Build up reference information during cigar processing
  • ISSUE 234: Predicate to filter conversion
  • ISSUE 235: Remove unused contiglength field
  • ISSUE 232: Add -pretty and -o to the print command
  • ISSUE 230: Remove duplicate mdtag field
  • ISSUE 231: Helper scripts to run an ADAM Console.
  • ISSUE 226: Fix ReferenceRegion from ADAMRecord
  • ISSUE 225: Change Some to Option to check for unmapped reads
  • ISSUE 223: Use SparkConf object to configure SparkContext
  • ISSUE 217: Stop using reference IDs and use reference names instead
  • ISSUE 220: Update SAM to ADAM conversion
  • ISSUE 213: BQSR updates

Version 0.9.0

  • ISSUE 214: Upgrade to Spark 0.9.1
  • ISSUE 211: FastaConverter Refactor
  • ISSUE 212: Cleanup build warnings
  • ISSUE 210: Remove Scalariform from process-sources phase
  • ISSUE 209: Fix Scalariform issues and Maven warnings
  • ISSUE 207: Change from deprecated manifest erasure to runtimeClass
  • ISSUE 206: Add Scalariform settings to pom
  • ISSUE 204: Update Avro code gen to not mark fields as deprecated.

Version 0.8.0

  • ISSUE 203: Move package from edu.berkeley.cs.amplab to org.bdgenomics
  • ISSUE 199: Updating pileup conversion code to convert sequences that use the X and = (EQ) CIGAR operators
  • ISSUE 191: Add repartition parameter
  • ISSUE 183: Fixing Job.getInstance call that breaks hadoop 1 compatibility.
  • ISSUE 192: Add docs and scripts for creating a release
  • ISSUE 193: Issue #137, clarify role of CHANGES.{md,txt}

Version 0.7.2

  • ISSUE 187: Add summarize_genotypes command
  • ISSUE 178: Upgraded to Hadoop-BAM 0.6.2/Picard 1.107.
  • ISSUE 173: Parse annotations out of vcf files
  • ISSUE 162: Refactored SequenceDictionary
  • ISSUE 180: BQSR using vcf loader
  • ISSUE 179: Update maven-surefire-plugin dependency version to 2.17, also create an ...
  • ISSUE 175: VariantContext converter refactor
  • ISSUE 169: Cleaning up mpileup command
  • ISSUE 170: Adding variant field enumerations

Version 0.7.1

Version 0.7.3

Version 0.7.2

  • ISSUE 166: Pair-wise genotype concordance of genotype RDDs, with CLI tool

Version 0.7.0

  • ISSUE 171: Add back in allele dosage for genotypes.

Version 0.7.0

  • ISSUE 167: Fix for Hadoop 1.0.x support
  • ISSUE 165: call PluginExecutor in apply method, fixes issue 164
  • ISSUE 160: Refactoring FASTA work to break contig sizes.
  • ISSUE 78: Upgrade to Spark 0.9 and Scala 2.10
  • ISSUE 138: Display Git commit info on command line
  • ISSUE 161: Added switches to spark context creation code
  • ISSUE 117: Add a "range join" method.
  • ISSUE 151: Vcf work concordance and genotype
  • ISSUE 150: Remaining variant changes for adam2vcf, unit tests, and CLI modifications
  • ISSUE 147: Resurrect VCF conversion code
  • ISSUE 148: Moving createSparkContext into core
  • ISSUE 142: Enforce Maven and Java versions
  • ISSUE 144: Merge of last few days of work on master into this branch
  • ISSUE 124: Vcf work rdd master merge
  • ISSUE 143: Changing package declaration to match test file location and removing un...
  • ISSUE 140: Update README.md
  • ISSUE 139: Update README.md
  • ISSUE 129: Modified pileup transforms to improve performance + to add options
  • ISSUE 116: add fastq interleaver script
  • ISSUE 125: Add design doc to CONTRIBUTING document
  • ISSUE 114: Changes to RDD utility files for new variant schema
  • ISSUE 122: Add IRC Channel to readme
  • ISSUE 100: CLI component changes for new variant schema
  • ISSUE 108: Adding new PluginExecutor command
  • ISSUE 98: Vcf work remove old variant
  • ISSUE 104: Added the port erasure to SparkFunSuite's cleanup.
  • ISSUE 107: Cleaning up change documentation.
  • ISSUE 99: Encoding tag types in the ADAMRecord attributes, adding the 'tags' command
  • ISSUE 105: Add initial documentation on contributing
  • ISSUE 97: New schema, variant context converter changes, and removal of old genoty...
  • ISSUE 79: Adding ability to convert reference FASTA files for nucleotide sequences
  • ISSUE 91: Minor change, increase adam-cli usage width to 150 characters
  • ISSUE 86: Fixes to pileup code
  • ISSUE 88: Added function for building variant context from genotypes.
  • ISSUE 81: Update README and cleanup top-level cli help text
  • ISSUE 76: Changing hadoop fs call to be compatible with Hadoop 1.
  • ISSUE 74: Updated CHANGES.txt to include note about the recursive-load branch.
  • ISSUE 73: Support for loading/combining multiple ADAM files into a single RDD.
  • ISSUE 72: Added ability to create regions from reads, and to merge adjacent regions
  • ISSUE 71: Change RecalTable to use optimized phred calculations
  • ISSUE 68: sonatype-nexus-snapshots repository is already in parent oss-parent-7 pom
  • ISSUE 67: fix for wildcard exclusion maven warnings
  • ISSUE 65: Create a cache for phred -> double values instead of recalculating
  • ISSUE 60: Bugfix for BQSR: Offset into qualityScore list was wrong
  • ISSUE 66: add pluginDependency section and remove versions in plugin sections
  • ISSUE 61: Filter utility for inverse of Projection
  • ISSUE 48: Fix read groups mapping and add Y as base type
  • ISSUE 36: Adding reads to rods transformation.
  • ISSUE 56: Adding Yy as base in MdTag

Version 0.6.0

  • ISSUE 53: Fix Hadoop 2.2.0 support, upgrade to Spark 0.8.1
  • ISSUE 52: Attributes: Use '\t' instead of ',' as the separator, since ',' is a valid character
  • ISSUE 47: Adding containsRefName to SequenceDictionary
  • ISSUE 46: Reduce logging for the actual adamSave job
  • ISSUE 45: Make MdTag immutable
  • ISSUE 38: Small bugfixes and cleanups to BQSR
  • ISSUE 40: Fixing reference position from offset implementation
  • ISSUE 31: Fixing a few issues in the ADAM2VCF2ADAM pipeline.
  • ISSUE 30: Suppress parquet logging in FieldEnumerationSuite
  • ISSUE 28: Fix build warnings
  • ISSUE 24: Add unit tests for marking duplicates
  • ISSUE 26: Fix unmapped reads in sequence dictionary
  • ISSUE 23: Generalizing the Projection class
  • ISSUE 25: Adding support for before, after clauses to SparkFunSuite.
  • ISSUE 22: Add a unit test for sorting reads
  • ISSUE 21: Adding rod functionality: a specialized grouping of pileup data.
  • ISSUE 13: Cleaning up VCF<->ADAM pipeline
  • ISSUE 20: Added Apache License 2.0 boilerplate to tops of all the GB-(c) files
  • ISSUE 19: Allow the Hadoop version to be specified
  • ISSUE 17: Fix transform -sort_reads partitioning. Add -coalesce option to transform.
  • ISSUE 16: Fixing an issue in pileup generation and in the MdTag util.
  • ISSUE 15: Tweaks 1
  • ISSUE 12: Subclass testing bug in AdamContext.adamLoad
  • ISSUE 11: Missing brackets in VcfConverter.getType
  • ISSUE 10: Moved record field name enum over to the projections package.
  • ISSUE 8: Fixes to sorting in ReferencePosition
  • ISSUE 4: New SparkFunSuite test support class, logging util and new BQSR test.
  • ISSUE 1: Fix scalatest configuration and fix unit tests
  • ISSUE 14: Converting some of the Option() calls to Some()
  • ISSUE 9: Adding support for a Sequence Dictionary from BAM files
  • ISSUE 7: ADAM variant and genotype formats; and a VCF->ADAM converter
  • ISSUE 3: Adding in implicit conversion functions for going between Java and Scala...
  • ISSUE 2: Update from Spark 0.7.3 to 0.8.0-incubating