@arq5x arq5x released this Dec 14, 2017 · 5 commits to master since this release

Fixed a bug in the Makefile that caused a substantial penalty in performance.

@arq5x arq5x released this Dec 6, 2017 · 11 commits to master since this release

Version 2.27.0 (6-Dec-2017)

  1. Fixed a big memory leak and algorithmic flaw in the split option. Thanks to Neil Kindlon!
  2. Resolved compilation errors on OSX High Sierra. Many thanks to @jonchang!
  3. Fixed a bug in the shift tool that caused some intervals to exceed the end of the chromosome. Thanks to @wlholtz.
  4. Fixed major bug in groupby that prevented proper functionality.
  5. Speed improvements to the shuffle tool.
  6. Bug fixes to the p-value calculation in the fisher tool. Thanks to Brent Pedersen.
  7. BED headers may now start with chrom or chr.
  8. Fixes to the "k-closest" functionality in the closest tool. Thanks to Neil Kindlon.
  9. Fixes to the output of the freqasc, freqdesc, distinct_sort_num and distinct_sort, and num_desc operations in the groupby tool. Thanks to @ghuls.
  10. Many minor bug fixes and compilation improvements from Luke Goodsell.
  11. Added the -fullHeader option to the maskfasta tool. Thanks to @ghuls.
  12. Many bug fixes and performance improvements from John Marshall.
  13. Fixed bug in the -N/-f behavior in subtract.
  14. Full support for .fai files as genome (-g) files.
  15. Many other minor bug fixes and functionality improvements.
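
Item 14 can be illustrated with a small sketch. A .fai index written by samtools faidx has five tab-separated columns (name, length, offset, line bases, line width), whereas a genome (-g) file needs only the name and the length. The function name and the sample values below are made up for illustration; this is not bedtools' own parsing code.

```python
import io


def fai_to_genome(fai_lines):
    """Return (chrom, length) pairs from the lines of a .fai index.

    Only the first two of the five .fai columns are used; the offset
    and line-layout columns are ignored.
    """
    sizes = []
    for line in fai_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        sizes.append((fields[0], int(fields[1])))
    return sizes


# Hypothetical two-sequence index (values invented for the example).
example_fai = "chr1\t1000\t6\t60\t61\nchr2\t500\t1029\t60\t61\n"
for chrom, length in fai_to_genome(io.StringIO(example_fai)):
    print(f"{chrom}\t{length}")
```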

@arq5x arq5x released this Jul 6, 2016 · 178 commits to master since this release

  1. Fixed a major memory leak when using -sorted. Thanks to Emily Tsang and Stephen Montgomery.
  2. Fixed a bug for BED files containing a single record with no newline. Thanks to @jmarshall.
  3. The getfasta tool includes name, chromosome and position in fasta headers when the -name option is used. Thanks to @rishavray.
  4. Fixed a bug in the coverage tool so that it now processes every record in the -a file.
  5. Fixed a bug preventing proper processing of BED files with consecutive tabs.
  6. VCF files containing structural variants now infer SV length from either the SVLEN or END INFO fields. Thanks to Zev Kronenberg.
  7. Resolved off-by-one bugs when intersecting GFF or VCF files with BED files.
  8. The shuffle tool now uses roulette wheel sampling to shuffle to -incl regions based upon the size of the interval. Thanks to Zev Kronenberg and Michael Imbeault.
  9. Fixed a bug in coverage that prevented correct calculation of depth when using the -split option.
  10. The shuffle tool warns when an interval exceeds the maximum chromosome length.
  11. The complement tool better checks intervals against the chromosome lengths.
  12. Fixes for stddev, min, and max operations. Thanks to @jmarshall.
  13. Enabled stdev, sstdev, freqasc, and freqdesc options for groupby.
  14. Allow -s and -w to be used in any order for makewindows.
  15. Added new -bedOut option to getfasta.
  16. The -r option forces the -F value for intersect.
  17. Added the -pc option to the genomecov tool, allowing coverage to be calculated based upon paired-end fragments.
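
The roulette wheel sampling mentioned in item 8 can be sketched as follows: each -incl interval is chosen with probability proportional to its length, so larger intervals receive proportionally more shuffled features. This is a minimal illustration of the general technique, not the code used by the shuffle tool; the function name and the (chrom, start, end) tuple layout are assumptions.

```python
import random


def pick_interval(intervals, rng=random):
    """Pick one (chrom, start, end) interval, weighted by its length.

    Assumes half-open BED coordinates, so length = end - start.
    """
    weights = [end - start for _, start, end in intervals]
    return rng.choices(intervals, weights=weights, k=1)[0]


# Hypothetical -incl regions: the second is nine times longer than the first.
incl = [("chr1", 0, 100), ("chr1", 1000, 1900)]
rng = random.Random(42)  # seeded only to make the demo repeatable
counts = {iv: 0 for iv in incl}
for _ in range(10000):
    counts[pick_interval(incl, rng)] += 1
print(counts)  # the longer interval should be drawn far more often
```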

@arq5x arq5x released this May 28, 2015 · 361 commits to master since this release

  1. The coverage tool now takes advantage of pre-sorted intervals via the -sorted option. This allows the coverage tool to be much faster,
    use far less memory, and report coverage for intervals in their original order in the input file.
  2. We have changed the behavior of the coverage tool such that it is consistent with the other tools. Specifically, coverage is now
    computed for the intervals in the A file based on the overlaps with the B file, rather than vice versa.
  3. The subtract tool now supports pre-sorted data via the -sorted option and is therefore much faster and more scalable.
  4. The -nonamecheck option provides greater tolerance for chromosome labeling when using the -sorted option.
  5. Support for multiple SVLEN tags in VCF format, and fixed a bug that failed to process SVLEN tags coming at the end of a VCF INFO field.
  6. Support for reverse complementing IUPAC codes in the getfasta tool.
  7. Provided greater flexibility for "BED+" files, where the first 3 columns are chrom, start, and end, and the remaining columns are free-form.
  8. We now detect stale FAI files and recreate an index thanks to a fix from @gtamazian.
  9. New feature from Pierre Lindenbaum allowing the sort tool to sort files based on the chromosome order in a faidx file.
  10. Eliminated multiple compilation warnings thanks to John Marshall.
  11. Fixed bug in handling INS variants in VCF files.
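
As an illustration of item 6, reverse complementing a sequence that contains IUPAC ambiguity codes means complementing each code as a set of bases (for example R, meaning A or G, becomes Y, meaning C or T) and then reversing the string. The sketch below uses the standard IUPAC complement table; the function name is hypothetical and this is not the getfasta implementation.

```python
# Standard IUPAC nucleotide complements, upper and lower case.
# U is treated like T; S, W, and N are their own complements.
IUPAC_COMPLEMENT = str.maketrans(
    "ACGTURYSWKMBDHVNacgturyswkmbdhvn",
    "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn",
)


def reverse_complement(seq):
    """Complement every base or ambiguity code, then reverse the sequence."""
    return seq.translate(IUPAC_COMPLEMENT)[::-1]


print(reverse_complement("GATNRB"))
```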

@arq5x arq5x released this Feb 22, 2015 · 435 commits to master since this release

New features.

  1. Added -k option to the closest tool to report the k-closest features in one or more -b files.
  2. Added -fd option to the closest tool for reporting downstream features in one or more -b files. Requires -D to dictate how "downstream" should be defined.
  3. Added -fu option to the closest tool for reporting upstream features in one or more -b files. Requires -D to dictate how "upstream" should be defined.
  4. @lindenb added a new split tool that will split an input file into multiple sub files. Unlike UNIX split, it can balance the chunking of the sub files not just by number of lines, but also by total number of base pairs in each sub file.
  5. Added a new spacing tool that reports the distances between features in a file.
  6. @jayhesselberth added a -reverse option to the makewindows tool that reverses the order of the assigned window numbers.
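
To make the idea behind item 4 concrete, here is one simple way to balance sub files by total base pairs rather than by line count: sort the intervals from longest to shortest and always add the next one to the sub file that currently holds the fewest bases. This greedy approach is an assumption chosen for illustration; the split tool may use a different strategy, and the function name is hypothetical.

```python
import heapq


def split_by_bases(intervals, n):
    """Distribute (chrom, start, end) intervals into n bins of similar total length."""
    bins = [[] for _ in range(n)]
    heap = [(0, i) for i in range(n)]  # (bases assigned so far, bin index)
    longest_first = sorted(intervals, key=lambda iv: iv[2] - iv[1], reverse=True)
    for iv in longest_first:
        total, i = heapq.heappop(heap)  # lightest bin so far
        bins[i].append(iv)
        heapq.heappush(heap, (total + iv[2] - iv[1], i))
    return bins


# Hypothetical input: one long interval and two shorter ones.
demo = [("chr1", 0, 100), ("chr1", 200, 260), ("chr2", 0, 40)]
for i, chunk in enumerate(split_by_bases(demo, 2)):
    print(i, sum(end - start for _, start, end in chunk), chunk)
```

Splitting the same three records by line count alone would put 160 bases in one sub file and 40 in the other; balancing by bases gives 100 and 100.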

Bug fixes.

  1. Fixed a bug that caused incorrect reporting of overlap for zero-length BED records. Thanks to @roryk.
  2. Fixed a bug that caused the map tool to not allow -b to be specified before -a. Thanks to @semenko.
  3. Fixed a bug in makewindows that mistakenly required -s with -n.

@arq5x arq5x released this Jan 2, 2015 · 481 commits to master since this release

  1. When using -sorted with intersect, map, and closest, bedtools can now detect and warn you when your input datasets employ different chromosome sorting orders.
  2. Fixed multiple bugs in the new, faster closest tool. Specifically, the -iu, -id, and -D options were not behaving properly with the new "sweeping" algorithm that was implemented for the 2.22.0 release. Many thanks to Sol Katzman for reporting these issues and for providing a detailed analysis and example files.
  3. We FINALLY wrote proper documentation for the closest tool.
    http://bedtools.readthedocs.org/en/latest/content/tools/closest.html
  4. Fixed bug in the tag tool when using -intervals, -names, or -scores. Thanks to Yarden Katz for reporting this.
  5. Fixed issues with chromosome boundaries in the slop tool when using negative distances. Thanks to @acdaugherty!
  6. Multiple improvements to the fisher tool. Added a -m option to the fisher tool to merge overlapping intervals prior to comparing overlaps between two input files. Thanks to @brentp.
  7. Fixed a bug in makewindows tool requiring the use of -b with -s.
  8. Fixed a bug in intersect that prevented -split from detecting complete overlaps with -f 1. Thanks to @tleonardi.
  9. Restored the default decimal precision to the groupby tool.
  10. Added the -prec option to the merge and map tools to specify the decimal precision of the output.

@arq5x arq5x released this Nov 11, 2014 · 547 commits to master since this release

Enhancements

  1. Multiple database support for the closest tool. The closest tool now requires sorted input, but it is between 10 and 60 times faster depending on the use case.

As an example:

➜  cat mq1.bed
chr1    80  100 q1  1   +

➜  cat mdb1.bed
chr1    5   15  d1.1    1   +
chr1    20  60  d1.2    2   -
chr1    200 220 d1.3    3   -

➜  cat mdb2.bed
chr1    15  35  db2.1   1   -
chr1    120 170 db2.2   2   -
chr1    210 230 db3 3   +

➜  cat mdb3.bed
chr1    70  90  d3.1    3   -

Find the closest interval in each B file.

➜  bedtools closest -a mq1.bed \
          -b mdb1.bed mdb2.bed mdb3.bed \
          -names foo bar biz
chr1    80  100 q1  1   +   foo chr1    20  60  d1.2    2   -
chr1    80  100 q1  1   +   bar chr1    120 170 db2.2   2   -
chr1    80  100 q1  1   +   biz chr1    70  90  d3.1    3   -

Find the closest interval among all B files.

➜  bedtools closest -a mq1.bed \
         -b mdb1.bed mdb2.bed mdb3.bed \
         -names foo bar biz \
         -mdb all
chr1    80  100 q1  1   +   biz chr1    70  90  d3.1    3   -

  2. Support for IMPRECISE SVs in VCF format.
  3. Added the -prec option to groupby to allow control over the reported decimal precision.
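
The multi-database closest example above can be reproduced with a brute-force sketch that makes the two modes explicit: one nearest interval per B file, versus one nearest interval across all B files. It assumes half-open BED coordinates and measures the gap as the number of bases between two intervals (0 if they overlap or touch), which may differ from the distance bedtools reports with -d. The real tool uses a sweep over sorted input instead; the function names are hypothetical.

```python
def gap(a, b):
    """Bases separating two (chrom, start, end, name) intervals; 0 if they overlap or touch."""
    return max(0, a[1] - b[2], b[1] - a[2])


def closest(query, db):
    """Return the interval in db nearest to query on the same chromosome."""
    same_chrom = [iv for iv in db if iv[0] == query[0]]
    return min(same_chrom, key=lambda iv: gap(query, iv))


q1 = ("chr1", 80, 100, "q1")
dbs = {
    "foo": [("chr1", 5, 15, "d1.1"), ("chr1", 20, 60, "d1.2"), ("chr1", 200, 220, "d1.3")],
    "bar": [("chr1", 15, 35, "db2.1"), ("chr1", 120, 170, "db2.2"), ("chr1", 210, 230, "db3")],
    "biz": [("chr1", 70, 90, "d3.1")],
}

# Closest interval in each B file.
per_file = {label: closest(q1, db) for label, db in dbs.items()}
for label, hit in per_file.items():
    print(label, hit)

# Closest interval among all B files.
best_label = min(per_file, key=lambda label: gap(q1, per_file[label]))
print(best_label, per_file[best_label])
```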

Bug fixes

  1. Fixed a bug with zero length records.
  2. Fixed a precision bug in the fisher tool. Thanks to @brentp.
  3. Fixed a bug in the bamtofastq tool. Thanks to @ryan-williams.

@arq5x arq5x released this Sep 18, 2014 · 579 commits to master since this release

Version 2.21.0 (18-Sep-2014)

  • Added ability to intersect against multiple -b files in the intersect tool.
  • Fixed a bug causing slowdowns in the -sorted option when using -split with very large split alignments.
  • Added a new fisher tool to report a P-value associated with the significance of the overlaps between two interval sets. Thanks to @brentp!
  • Added a “genome” file for GRCh38. Thanks @martijnvermaat!
  • Fixed a bug in the -pct option of the slop tool. Thanks to @brentp!
  • Tweak to the Makefile to accommodate Intel compilers. Thanks to @jmarshall.
  • Many updates to the docs from the community. Thank you!

@arq5x arq5x released this May 24, 2014 · 620 commits to master since this release

  1. Fixed a float rounding bug causing occasional off-by-one issues in the slop added by the slop tool. Thanks to @slw287r.
  2. Fixed a bug injected in 2.19 arising when files have a single line not ending in a newline. Thanks to @cwarden45.