@daviesrob daviesrob released this Jul 18, 2018 · 40 commits to develop since this release

The samtools-1.9.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.


  • Samtools mpileup VCF and BCF output is now deprecated. It is still functional, but will warn. Please use bcftools mpileup instead. (#884)

  • Samtools mpileup now handles the '-d' max_depth option differently. There is no longer an enforced minimum, and '-d 0' is interpreted as limitless (no maximum; note that this may be slow). The default per-file depth is now 8000, which matches the value mpileup used to use when processing a single sample. To get the previous default behaviour, set '-d' to the higher of 8000 divided by the number of samples across all input files, or 250. (#859)

  • Samtools stats new features:

    • The '--remove-overlaps' option discounts overlapping portions of templates when computing coverage and mapped base counting. (#855)

    • When a target file is in use, stats now prints the number of bases inside the target and the percentage of target bases with coverage above the threshold given by the '--cov-threshold' option. (#855)

    • Split base composition and length statistics by first and last reads. (#814, #816)

  • Samtools faidx new features:

    • Now takes long options. (#509, thanks to Pierre Lindenbaum)

    • Now warns about zero-length and truncated sequences due to the requested range being beyond the end of the sequence. (#834)

    • Gets a new option (--continue) that allows it to carry on when a requested sequence was not in the index. (#834)

    • It is now possible to supply the list of regions to output in a text file using the new '--region-file' option. (#840)

    • New '-i' option to make faidx return the reverse complement of the regions requested. (#878)

    • faidx now works on FASTQ (returning FASTA) and added a new fqidx command to index and return FASTQ. (#852)

  • Samtools collate now has a fast option '-f' that only operates on primary pairs, dropping secondary and supplementary. It tries to write pairs to the final output file as soon as both reads have been found. (#818)

  • Samtools bedcov gets a new '-j' option to make it ignore deletions (D) and reference skips (N) when computing coverage. (#843)

  • Small speed up to samtools coordinate sort, by converting it to use radix sort. (#835, thanks to Zhuravleva Aleksandra)

  • Samtools idxstats now works on SAM and CRAM files. However, this is not fast, because the indices lack some of the required information. (#832)

  • Compression levels may now be specified with the level=N output-fmt-option. E.g. with -O bam,level=3.

  • Various documentation improvements.

  • Bug-fixes:

    • Improved error reporting in several places. (#827, #834, #877, cd7197)

    • Various test improvements.

    • Fixed failures in the multi-region iterator (view -M) when regions provided via BED files include overlaps (#819, reported by Dave Larson).

    • Samtools stats now counts '=' and 'X' CIGAR operators when counting mapped bases. (#855)

    • Samtools stats has fixes for insert size filtering (-m, -i). (#845; #697 reported by Soumitra Pal)

    • Samtools stats -F no longer negates an earlier -d option. (#830)

    • Fix samtools stats crash when using a target region. (#875, reported by John Marshall)

    • Samtools sort now keeps to a single thread when the -@ option is absent. Previously it would spawn a writer thread, which could cause the CPU usage to go slightly over 100%. (#833, reported by Matthias Bernt)

    • Fixed samtools phase '-A' option which was incorrectly defined to take a parameter. (#850; #846 reported by Dianne Velasco)

    • Fixed compilation problems when using C_INCLUDE_PATH. (#870; #817 reported by Robert Boissy)

    • Fixed --version when built from a Git repository. (#844, thanks to John Marshall)

    • Use noenhanced mode for title in plot-bamstats. Prevents unwanted interpretation of characters like underscore in gnuplot version 5. (#829, thanks to M. Zapukhlyak)

    • blast2sam.pl now reports perfect match hits (no indels or mismatches). (#873, thanks to Nils Homer)

    • Fixed bug in fasta and fastq subcommands where stdout would not be flushed correctly if the -0 option was used.

    • Fixed invalid memory access in mpileup and depth on alignment records where the sequence is absent.
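
The mpileup '-d' note above gives a rule for recovering the previous default depth. As a minimal sketch of that arithmetic (the function name is hypothetical, and the use of integer division is an assumption, since the note does not say how the quotient is rounded):

```python
def legacy_mpileup_max_depth(n_samples: int) -> int:
    """Approximate the pre-1.9 default depth described in the release note.

    n_samples is the number of samples across all input files.
    Integer division is an assumption; the note only says "8000 divided by
    the number of samples".
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # The higher of (8000 divided by the number of samples) and 250.
    return max(8000 // n_samples, 250)


if __name__ == "__main__":
    for n in (1, 4, 100):
        print(n, legacy_mpileup_max_depth(n))
```

The resulting value would then be passed to mpileup's '-d' option by whoever wants the older behaviour.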

1.8

@daviesrob daviesrob released this Apr 3, 2018 · 100 commits to develop since this release

  • samtools calmd now has a quiet mode. This can be enabled by passing -Q to calmd. (Thanks to Colin Davenport)

  • In samtools depth -d 0 will effectively remove the depth limit. (#764)

  • Improvements made to samtools collate's interface and documentation. It is now possible to specify an output file name using -o, instead of deriving it from the prefix used for temporary files. The prefix itself is now optional if -o or -O (to stdout) is used. (#780)

  • Bug-fixes:

    • Make samtools addreplacerg choose output format by file extension. (#767; reported by Argy Megalios)

    • Merge tests now work on ungzipped data, allowing tests to be run against different deflate libraries.

    • samtools markdup error messages about missing tags have been updated with the suggestion that samtools fixmate is run beforehand. (#765; reported by Yudong Cai)

    • Enabled the --reference option for samtools fastq. It now works like other programs when a reference sequence is needed for CRAM files. (#791, reported by Milana Kaljevic)


The samtools-1.8.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.7

@daviesrob daviesrob released this Jan 26, 2018 · 117 commits to develop since this release

  • HTSlib, and so samtools, now support BAMs which include CIGARs with more than 65535 operations as per HTS-Specs 18th November (dab57f4 and 2f915a8).

  • samtools quickcheck will now write a warning to stderr if it finds any problems. These messages can be suppressed with a new -q option.

  • samtools markdup can now mark supplementary alignments of reads where the primary alignment is found to be a duplicate. Supplementary marking can be turned on by passing the -S option to markdup. When this option is enabled, all the alignment data will be written to a temporary file so that supplementary alignments that occur before a duplicated primary can be correctly marked in the final output. The location of this temporary file can be influenced using the new -T option.

  • samtools view now supports HTSlib's new multi-region iterator. This can be enabled by passing the -M option to view. When using this option:

    • The BED filter (-L option) will use the index to skip through the file
    • Reads from overlapping regions will only be output once
  • samtools bedcov will now ignore BED comment and header lines (#571; thanks to Daniel Baker).

  • samtools collate now updates the @HD SO: and GO: tags, and sort will remove a GO: tag if present. (#757; reported by Imran Haque).

  • Bug-fixes:

    • maq2sam now checks for input files that end early. (#751; patch supplied by Alexandre Rebert of the Mayhem team, via Andreas Tille from Debian.)
    • Fixed incorrect check when looking up header tags that could lead to a crash in samtools stats. (#208; thanks to Dave Larson.)
    • Fixed bug in samtools fastq -O option where it would fail if the OQ tag in the input file had an unexpected type. (#758; reported by Taejeong Bae)
    • The MD5 calculations in samtools dict and md5fa did not handle non-alphabetic characters in the same way as the CRAM MD5 function. They have now been updated to match. (#704; reported by Chris Norman).
    • Fix possible infinite loop in samtools targetcut.
    • Building bam_tview_curses should no longer fail if a curses header file cannot be found.
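
The view -M note above says that reads from overlapping regions are output only once. One way to picture that effect is to merge overlapping regions before iterating over them. The sketch below is illustrative only and is not HTSlib's implementation; the `Region` type and `merge_regions` name are hypothetical.

```python
from typing import List, Tuple

# (reference name, start, end), half-open coordinates as in BED
Region = Tuple[str, int, int]


def merge_regions(regions: List[Region]) -> List[Region]:
    """Collapse overlapping or touching regions on the same reference."""
    merged: List[Region] = []
    # Sorting tuples orders by reference name, then start, then end.
    for name, start, end in sorted(regions):
        if merged and merged[-1][0] == name and start <= merged[-1][2]:
            # Overlaps (or touches) the previous region: extend it.
            prev_name, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_name, prev_start, max(prev_end, end))
        else:
            merged.append((name, start, end))
    return merged


if __name__ == "__main__":
    print(merge_regions([("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 0, 50)]))
```

With merged regions, a read spanning the overlap between two requested regions is visited by a single query rather than two.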

The samtools-1.7.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.6

@daviesrob daviesrob released this Sep 28, 2017 · 142 commits to develop since this release

  • Added new markdup sub-command and -m option for fixmate. Used together, they allow duplicates to be marked and optionally removed. This fixes a number of problems with the old rmdup sub-command, for example issue #497. rmdup is kept for backwards compatibility but markdup should be used in preference.

  • Sort is now much better at keeping within the requested memory limit. It should also be slightly faster and need fewer temporary files when the file to be sorted does not fit in memory. (#593; thanks to Nathan Weeks.)

  • Sort no longer rewrites the header when merging from files. It can also now merge from memory, so fewer temporary files need to be written and it is better at sorting in parallel when everything fits in memory.

  • Both sort and merge now resolve ties when merging based on the position in the input file(s). This makes them fully stable for all ordering options. (Previously position sort was stable, but name and by-tag sorts were not.)

  • New --output-qname option for mpileup.

  • Support for building on Windows using msys2/mingw64 or cygwin has been improved.


The samtools-1.6.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.5

@valeriuo valeriuo released this Jun 21, 2017 · 160 commits to develop since this release

  • Samtools fastq now has a -i option to create a fastq file from an index tag, and a -T option (similar to -t) to add user specified aux tags to the fastq header line.

  • Samtools fastq can now create compressed fastq files, by giving the output filenames an extension of .gz, .bgz, or .bgzf.

  • Samtools sort has a -t TAG option, that allows records to be sorted by the value of the specified aux tag, then by position or name. Merge gets a similar option, allowing files sorted this way to be merged. (#675; thanks to Patrick Marks of 10xgenomics).
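
The sort -t TAG note above describes ordering by an aux tag value first and by position second. A minimal sketch of that two-level key follows. The `Read` type, the example tag name, and the choice to place reads lacking the tag first are all assumptions made for illustration; the note does not specify how missing tags are handled.

```python
from typing import Dict, List, NamedTuple


class Read(NamedTuple):
    name: str
    pos: int
    tags: Dict[str, str]


def sort_by_tag(reads: List[Read], tag: str) -> List[Read]:
    """Order reads by the value of `tag`, then by position.

    Reads without the tag get an empty value and therefore sort first;
    that placement is an assumption of this sketch.
    """
    return sorted(reads, key=lambda r: (r.tags.get(tag, ""), r.pos))


if __name__ == "__main__":
    reads = [
        Read("r1", 300, {"BX": "B"}),
        Read("r2", 100, {"BX": "B"}),
        Read("r3", 200, {"BX": "A"}),
    ]
    print([r.name for r in sort_by_tag(reads, "BX")])
```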


The samtools-1.5.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.4.1

@jenniferliddle jenniferliddle released this May 8, 2017 · 135 commits to master since this release

This is primarily a security bug fix update.

  • Added options to fastq to create fastq files from BC (or other)
    tags.

  • Samtools view has gained a -G option to exclude on all bits
    set. For example to discard reads where neither end has been
    mapped use "-G 12".

  • Samtools cat has a -b option to ease concatenation of many
    files.

  • Added misc/samtools_tab_completion for bash auto-completion of
    samtools sub-commands. (#560)

  • Samtools tview now has J and K keys for vertical movement by 20
    lines. (#257)

  • Various compilation / portability improvements.

  • Fixed issue with more than 65536 CIGAR operations and SAM/CRAM files.
    (#667)
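
The view -G note above excludes a read only when all bits in the mask are set. A minimal sketch of that test, using the SAM FLAG bits 0x4 (read unmapped) and 0x8 (mate unmapped), which together give the 12 in the example. The function name is hypothetical, and treating a zero mask as "exclude nothing" is an assumption of this sketch.

```python
READ_UNMAPPED = 0x4  # SAM FLAG bit: the read itself is unmapped
MATE_UNMAPPED = 0x8  # SAM FLAG bit: the mate is unmapped


def excluded_by_all_bits(flag: int, mask: int) -> bool:
    """Return True if every bit in `mask` is set in `flag`.

    A zero mask excludes nothing; this is an assumption of the sketch.
    """
    return mask != 0 and (flag & mask) == mask


if __name__ == "__main__":
    mask = READ_UNMAPPED | MATE_UNMAPPED  # 12
    for flag in (77, 73, 69, 99):
        print(flag, excluded_by_all_bits(flag, mask))
```

Contrast this with an "any bit set" filter, which would also drop pairs where only one end is unmapped.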


The samtools-1.4.1.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.4

@jenniferliddle jenniferliddle released this Mar 13, 2017 · 159 commits to master since this release

Release 1.4 (13 March 2017)


Noteworthy changes in samtools:
  
* Fixed Issue #345 - out-by-one error in insert-size in samtools stats

* bam_split now adds a @PG header to the bam file
  
* Added mate cigar tag support to fixmate
  
* Multi-threading is now supported for decoding BAM and CRAM (as well
  as the previously supported encoding).  Most commands that read BAM
  or CRAM have gained a -@ or --threads argument, providing a
  significant speed bonus.  For commands that both read and write
  files the threads are shared between decoding and encoding tasks.

* Added -a option to samtools mpileup to show all locations, including
  sites with zero depth; repeating the option as -aa or -a -a additionally
  shows reference sequences without any reads mapped to them (#496).

* The mpileup text output no longer contains empty columns at zero coverage
  positions.  Previously it would output "...0\t\t..." in some circumstances
  (zero coverage due to being below a minimum base quality); this has been
  fixed to output as "...0\t*\t*..." with placeholder '*' characters as in
  other zero coverage circumstances (see PR #537).

* To stop it from creating too many temporary files, samtools sort
  will now not run unless its per-thread memory limit (-m) is set to
  at least 1 megabyte (#547).

* The misc/plot-bamstats script now has a -l / --log-y option to change
  various graphs to display their Y axis log-scaled.  Currently this
  affects the Insert Size graph (PR #589; thanks to Anton Kratz).

* Fixmate will now also add and update MC (mate CIGAR) tags.
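
The mpileup note above describes replacing empty columns with '*' placeholders at zero-coverage positions. A minimal sketch of that formatting rule, covering only the depth, bases and qualities columns; the function name is hypothetical and this is not the mpileup source.

```python
def pileup_columns(depth: int, bases: str, quals: str) -> str:
    """Format the depth, bases and qualities columns of one pileup line.

    Empty base or quality strings are replaced by '*' so that no column
    is left empty, as the release note describes.
    """
    if not bases:
        bases = "*"
    if not quals:
        quals = "*"
    return f"{depth}\t{bases}\t{quals}"


if __name__ == "__main__":
    print(repr(pileup_columns(0, "", "")))
    print(repr(pileup_columns(2, "..", "II")))
```

Keeping every column non-empty means downstream parsers that split on tabs always see the same number of fields.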

---

_The **samtools-1.4.tar.bz2** download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files._

1.3.1

@jmarshall jmarshall released this Apr 22, 2016 · 265 commits to develop since this release

  • The sort command creates any needed temporary files alongside the final output file (similarly to the pre-1.3 behaviour), and now aborts when it detects a collision with another sort invocation's temporary files.

    When the -T PREFIX option specified is a directory (or when sorting to standard output), a random component is now added to temporary filenames to try to avoid collisions (#432, #523, #529, #535, PR #530).

  • All samtools commands now check for I/O errors more carefully, especially when writing output files (#111, #253, #470, PR #467).

  • Build fixes for 32-bit systems; be sure to run configure on such systems to enable large file support and access to 2GiB+ files.

  • The fasta/fastq/bam2fq command no longer ignores reads when the -s option is used (#532).

  • The fastq -O option no longer crashes on reads that do not have an OQ tag field (#517).

  • The merge and sort commands now handle (unusual) BAM files that have no textual @SQ headers (#548, #550).

  • Sorting files containing @CO headers no longer duplicates the comment headers, which previously happened on large sorts for which temporary files were needed (#563).

  • The rmdup and view -l commands no longer crash on @RG headers that do not have a LB field (#538).

  • Fixed miscellaneous issues #128, #130, #131, #489, and #514.


The samtools-1.3.1.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.3

@jmarshall jmarshall released this Dec 15, 2015 · 304 commits to develop since this release

  • The obsolete samtools sort in.bam out.prefix usage has been removed. If you are still using -f, -o, or out.prefix, convert to use -T PREFIX and/or -o FILE instead. (#295, #349, #356, #418, PR #441; see also discussions in #171, #213.)
  • The bamshuf command has been renamed to collate (hence the term bamshuf no longer appears in the documentation, though it still works on the command line for compatibility with existing scripts).
  • The mpileup command now outputs the unseen allele in VCF/BCF as <*> rather than X or <X> as previously, and now has AD, ADF, ADR, INFO/AD, INFO/ADF, INFO/ADR --output-tags annotations that largely supersede the existing DV, DP4, DPR annotations.
  • The mpileup command now applies BAQ calculations at all base positions, regardless of which -l or -r options are used (previously with -l it was not applied to the first few tens of bases of each chromosome, leading to different mpileup results with -l vs. -r; #79, #125, #286, #407).
  • Samtools now has a configure script which checks your build environment and facilitates choosing which HTSlib to build against. See INSTALL for details.
  • Samtools's Makefile now fully supports the standard convention of allowing CC/CPPFLAGS/CFLAGS/LDFLAGS/LIBS to be overridden as needed. Previously it listened to $(LDLIBS) instead; if you were overriding that, you should now override LIBS rather than LDLIBS.
  • A new addreplacerg command that adds or alters @RG headers and RG:Z record tags has been added.
  • The rmdup command no longer immediately aborts (previously it always aborted with bam_get_library() not yet implemented), but remains not recommended for most use (#159, #252, #291, #393).
  • Merging files with millions of headers now completes in a reasonable amount of time (#337, #373, #419, #453; thanks to Nathan Weeks, Chris Smowton, Martin Pollard, Rob Davies).
  • Samtools index's optional index output path argument works again (#199).
  • Fixed calmd, targetcut, and potential mpileup segfaults when given broken alignments with POS far beyond the end of their reference sequences.
  • If you have source code using bam_md.c's bam_fillmd1_core(), bam_cap_mapQ(), or bam_prob_realn_core() functions, note that these now take an additional ref_len parameter. (The versions named without _core are unchanged.)
  • The tview command's colour scheme has been altered to be more suitable for users with colour blindness (#457).
  • Samtools depad command now handles CIGAR N operators and accepts CRAM files (#201, #404).
  • Samtools stats now outputs separate "N" and "other" columns in the ACGT content per cycle section (#376).
  • Added -a option to samtools depth to show all locations, including zero depth sites (#374).
  • New samtools dict command, which creates a sequence dictionary (as used by Picard) from a FASTA reference file.
  • Samtools stats --target-regions option works again.
  • Added legacy API sam.h functions sam_index_load() and samfetch() providing bam_fetch()-style iteration over either BAM or CRAM files. (In general we recommend recoding against the htslib API directly, but this addition may help existing libbam-using programs to be CRAM-enabled easily.)
  • Fixed legacy API's samopen() to write headers only with "wh" when writing SAM files. Plain "w" suppresses headers for SAM file output, but this was broken in 1.2.
  • samtools fixmate - - works in pipelines again; with 1.0 to 1.2, this failed with [bam_mating] cannot determine output format.
  • Restored previous samtools calmd -u behaviour of writing compression level 0 BAM files. Samtools 1.0 to 1.2 incorrectly wrote raw non-BGZF BAM files, which cannot be read by most other tools. (Samtools commands other than calmd were unaffected by this bug.)
  • Restored bam_nt16_nt4_table[] to legacy API header bam.h.
  • Fixed bugs #269, #305, #320, #328, #346, #353, #365, #392, #410, #445, #462, #475, and #495.

The samtools-1.3.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.

1.2

@jmarshall jmarshall released this Feb 4, 2015 · 554 commits to develop since this release

Noteworthy changes in samtools:

  • flagstat now works on SAM, BAM, or CRAM files (rather than BAM only)
  • stats calculates mismatches per cycle for unclipped length
  • merge can now merge SAM input files
  • CRAM reference files are now cached by default (see HTSlib release notes and samtools(1) man page)
  • Tested against Intel-optimised zlib (https://github.com/jtkukunas/zlib; see README for details)
  • Fixed bugs #302, #309, #318, and #327 and many other improvements and bugs fixed in HTSlib—see the HTSlib release notes

The samtools-1.2.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.