
Picard and log4j upgrades

  • Picard 2.26.10, which includes log4j 2.17.1

Minor changes

  • Reduce memory requirement of MergeDgeSparse by making it multi-pass.

See the Drop-seq alignment cookbook and Census-seq computational protocols for detailed usage instructions, diagrams, and explanations of how these new systems work. For more information about Census-seq, see our manuscript on bioRxiv.


Picard and log4j upgrades

  • picard.jar v2.26.8, which includes log4j 2.16.0
  • remove stand-alone log4j.jar
  • upgrade to biojava 6.0.3
  • remove slf4j-log4j12

Dropulation programs - these programs are in support of our manuscript Natural variation in gene expression and Zika virus susceptibility revealed by villages of neural progenitor cells.

  • AssignCellsToSamples - This program takes as input a BAM file containing sequencing reads from a pool of donors, and a VCF file containing genotypes from those donors. The program emits the most likely donor for each cell.
  • DetectDoublets - This program takes as input a BAM file containing sequencing reads from a pool of donors, a VCF file containing genotypes from those donors, and the output from AssignCellsToSamples. The program emits the probability that each cell barcode is a doublet, in which cells from two different donors have been co-encapsulated in the same droplet.

GTF parsing

  • pass command-line VALIDATION_STRINGENCY to GTFParser
  • Set default command-line VALIDATION_STRINGENCY to LENIENT

Minor changes

  • improvements to sparse matrix I/O
  • allow merging of DGEs with same prefix if there are no cell barcode collisions.

See the Drop-seq alignment cookbook and Census-seq computational protocols for detailed usage instructions, diagrams, and explanations of how these new systems work. For more information about Census-seq, see our manuscript on bioRxiv.


Support for SBARRO experiments 

  • TagReadWithRabiesBarcodes - Tags an unaligned BAM with rabies virus barcode sequences and associated metrics. Barcode sequences are extracted using invariant sequences which flank the barcode regions.
  • FilterValidRabiesBarcodes - Filters rabies virus tags into pass/fail BAMs based on metrics associated with the extracted viral barcodes.
  • BipartiteRabiesVirusCollapse - Collapse rabies virus sequences in which half the barcode matches within some edit distance, within a cell.
     

Other updates

  • TrimStartingSequence update: new algorithm that uses a mismatch rate and no longer requires the sequence to be anchored to the start of the read. Note that a different sequence is recommended for 10X and Drop-seq. The old, buggy behavior can be enabled with LEGACY=true.
  • PolyATrimmer now optionally stores length of polyA and trimmed adapter
  • ConvertTagToReadGroup - improved error messages when a read group is missing from the input data
  • FilterBamByGeneFunction - new program to filter reads to specified gene functions, in the same way DigitalExpression and related programs internally filter data.
  • CensusSeq and related software can now take as input a list of BAMs.
  • Csi analysis now generates a contamination confidence interval.
  • Improvements to make GTF parsing more robust
  • Census-seq tools and documentation (CensusSeq, RollCall, CsiAnalysis)
  • New program SplitBamByCell - partitions a large BAM into many smaller BAMs by cell barcodes, so that each cell barcode is contained in only one output BAM, and BAMs have roughly the same size.
  • More flexible DGE merging
  • New program CountUnmatchedSampleIndices
  • Upgrade to Picard 2.20.5
  • Add option to FilterBam and FilterBamByTag to fail if not enough reads pass filter
  • Many bug fixes -- git log v2.3.0..v2.4.0 to view
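
The balancing idea behind SplitBamByCell can be illustrated with a small Python sketch. This is not the tool's implementation: the function name `assign_barcodes_to_bins`, the per-barcode read counts as input, and the greedy largest-first strategy are all assumptions made for illustration. It only shows one way to keep each cell barcode in exactly one output while keeping outputs roughly the same size.

```python
import heapq

def assign_barcodes_to_bins(read_counts, num_bins):
    """Map each cell barcode to exactly one bin, balancing total reads per bin.

    Hypothetical helper: greedy largest-first assignment to the bin that
    currently holds the fewest reads.
    """
    # Min-heap of (current read load, bin index).
    heap = [(0, i) for i in range(num_bins)]
    heapq.heapify(heap)
    assignment = {}
    # Largest barcodes first; ties broken by barcode for determinism.
    for barcode, count in sorted(read_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        assignment[barcode] = idx
        heapq.heappush(heap, (load + count, idx))
    return assignment

# Made-up read counts per cell barcode.
counts = {"AAAA": 100, "CCCC": 60, "GGGG": 50, "TTTT": 10}
print(assign_barcodes_to_bins(counts, 2))
```

With these made-up counts, both bins end up holding 110 reads, and no barcode appears in more than one bin.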

See the Drop-seq alignment cookbook and Census-seq computational protocols for detailed usage instructions, diagrams, and explanations of how these new systems work. For more information about Census-seq, see our manuscript on bioRxiv.

  • New program ConvertTagToReadGroup.
  • Encode quality tags as strings in TagBamWithReadSequenceExtended.
  • FilterBam bug fix: output file was not being closed properly.
  • In DigitalExpression, allow null STRAND_STRATEGY and/or null LOCUS_FUNCTION_LIST, for BAMs that are not tagged with gene function.
  • In FilterBam and FilterBamByTag, add ability to output a summary file.

See the Drop-seq alignment cookbook for detailed usage instructions, diagrams, and explanations of how these new systems work.

  • Support for SpermSeq

    • New program SpermSeqMarkDuplicates.
    • New program GenotypeSperm.
    • New program CreateSnpIntervalFromVcf.
    • See SPERMSEQ_COOKBOOK_LINK for details about how to use these and other Drop-seq programs to run SpermSeq analysis.
  • Add mutational pathway collapse strategy to CollapseTagWithContext. This strategy uses a graph of edit distances to decide which barcodes to collapse: all collapsed barcodes form a graph in which connected nodes are within the given edit distance of each other. For a cell barcode, find all sibling barcodes up to some edit distance away, and collapse those barcodes into the core. Then find additional barcodes within that edit distance of the children in the graph and collapse those in as well, up to some specified maximum edit distance.

  • New program ComputeUMISharing. If you collapse your barcodes (for example cell barcodes) in some fashion and tag your BAM with these new barcodes, this allows you to compute the UMI sharing between the original cell barcode and all of the barcodes that were collapsed into it. UMI sharing should be high (>80% shared) if your collapse strategy was successful.

  • Handle gzipped TAG_VALUES_FILE in FilterBamByTag.

  • In TagBamWithReadSequenceExtended, added option to store barcode quality scores per base.

  • A number of bug fixes and argument checks to address bug reports since the last release.
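
The mutational pathway collapse described above can be sketched as a breadth-first search over barcodes. The Python below is an illustration only, not the CollapseTagWithContext implementation: the names `pathway_collapse`, `step_dist`, and `max_dist` are hypothetical, and Hamming distance stands in for whatever edit distance the tool actually uses.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length barcodes."""
    return sum(x != y for x, y in zip(a, b))

def pathway_collapse(core, barcodes, step_dist=1, max_dist=3):
    """Return the barcodes that collapse into `core`, in discovery order.

    Each round adds barcodes within `step_dist` of a barcode that has already
    been collapsed, provided they are still within `max_dist` of the core.
    """
    remaining = set(barcodes) - {core}
    collapsed = []
    frontier = [core]
    while frontier:
        next_frontier = []
        for parent in frontier:
            # Children: uncollapsed barcodes one step from this node.
            children = sorted(
                b for b in remaining
                if hamming(parent, b) <= step_dist and hamming(core, b) <= max_dist
            )
            for child in children:
                remaining.discard(child)
                collapsed.append(child)
                next_frontier.append(child)
        frontier = next_frontier
    return collapsed

# Made-up barcodes: a chain of single-base changes away from the core.
print(pathway_collapse("AAAA", ["AAAT", "AATT", "ATTT", "TTTT", "GGGG"]))
```

In this toy input, AAAT, AATT, and ATTT are reached one step at a time and collapsed. TTTT is one step from ATTT but exceeds the maximum distance from the core, and GGGG is never reachable, so neither is collapsed.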

See the Drop-seq alignment cookbook for detailed usage instructions, diagrams, and explanations of how these new systems work.
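
To make the UMI sharing check from the ComputeUMISharing note concrete, here is a minimal Python sketch. It assumes sharing is defined as the fraction of a collapsed barcode's distinct UMIs that are also seen in the original (parent) barcode. That definition and the function name `umi_sharing` are assumptions for illustration; the program itself may compute sharing differently (for example, per gene).

```python
def umi_sharing(parent_umis, child_umis):
    """Fraction of the child's distinct UMIs also observed in the parent."""
    child = set(child_umis)
    if not child:
        return 0.0  # no UMIs to compare
    return len(child & set(parent_umis)) / len(child)

# Made-up UMIs: 4 of the child's 5 UMIs also occur in the parent.
parent = ["AAC", "GGT", "TTA", "CCG", "ACG"]
child = ["AAC", "GGT", "TTA", "CCG", "TTT"]
print(umi_sharing(parent, child))  # 0.8
```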

  • Make GTF parsing more robust
  • Better error reporting of bad gene name in GTF
  • Sample alignment script bug fixes
  • In wrapper scripts, don't set TMPDIR environment variable to a directory that doesn't exist
  • CollapseTagWithContext adaptive edit distance now does a better job of selecting the proper edit distance threshold to use for a few rare edge cases
  • Respect TMPDIR environment variable in unit tests, and better unit test cleanup

Legacy Release v1.12

Pre-release

This version of the software is not tagged in the GitHub repository, because it predates this repository. Don't download this version unless you know what you are doing. It is obsolete.


There are a number of enhancements to the Drop-seq platform that come with version 2.0.0:

  1. New methods to clean up the cell barcodes from bead synthesis errors and PCR errors result in less clutter in the data when trying to decide which cell barcodes are truly cells.
  2. Enhanced DigitalExpression to be more flexible in how it interprets gene annotations, allowing the program to extract intronic DGE data as well as the typical coding+UTR data.
  3. New reference meta data tools and a shell script to generate reference meta data.
  4. Slightly less arbitrary program names and argument names, with a consistent set of rules so future programs don't run into the same problems.
  5. Bug fixes (!)

Please note that the command-line argument format will soon change slightly, because Picard (which we build on top of) is changing its argument syntax. Below is an example command line showing the difference between the old and new syntax, with the old command line shown first:

java -jar build/libs/picard.jar SortSam I=testdata/picard/sam/namesorted.test.sam SORT_ORDER=coordinate O=sorted.sam

java -jar build/libs/picard.jar SortSam -I testdata/picard/sam/namesorted.test.sam -SORT_ORDER coordinate -O sorted.sam
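
For simple KEY=value arguments, the change between the two forms is mechanical. As a rough illustration only, the Python sketch below rewrites old-style tokens into the new style. The helper name `to_new_syntax` is hypothetical and not part of Picard or Drop-seq tools, and the sketch assumes argument names consist of upper-case letters, digits, and underscores and that values contain no whitespace.

```python
import re

# Matches old-style arguments such as I=in.sam or SORT_ORDER=coordinate.
# Assumption: names are upper-case letters, digits, and underscores.
_LEGACY_ARG = re.compile(r"^([A-Z][A-Z0-9_]*)=(.*)$")

def to_new_syntax(args):
    """Rewrite KEY=value tokens as -KEY value; leave all other tokens untouched."""
    converted = []
    for token in args:
        match = _LEGACY_ARG.match(token)
        if match:
            converted.extend(["-" + match.group(1), match.group(2)])
        else:
            converted.append(token)
    return converted

old = ("java -jar build/libs/picard.jar SortSam "
       "I=testdata/picard/sam/namesorted.test.sam SORT_ORDER=coordinate O=sorted.sam")
print(" ".join(to_new_syntax(old.split())))
```

Applied to the old command line above, this prints the new-style command line shown after it.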

See the Drop-seq alignment cookbook for detailed usage instructions, diagrams, and explanations of how these new systems work.


This is the version of Drop-seq tools that was available on http://mccarrolllab.org/dropseq/ prior to the creation of this GitHub repository.

Drop-seqAlignmentCookbookv1.2Jan2016.pdf

Note that the source code zip and tar.gz do not reflect the version of code used to build this release. The code used to build this release can be found in Drop-seq_tools-1.13.zip.

The version before this (1.12) can be found here