andypohl edited this page Sep 5, 2014 · 42 revisions

bwtool is a command-line utility for bigWig files. bigWigs are an indexed and compressed form of wig file, a somewhat standard format for storing genome-wide real-valued signal data. Much of the ENCODE processed data is in this form, and it is appearing more often in GEO as well. The purpose of bwtool is to make these files more useful by providing some convenient functions.

Documentation

bwtool's functionality is divided into subprograms that fall roughly into three categories: data extraction, analysis, and data modification. The boundary between data extraction and analysis is not sharp, for example in the case of the matrix and sax programs. The data modification programs all take a bigWig as input and write a new bigWig as output.

Data Extraction   Analysis       Data modification
extract           aggregate      fill
matrix            chromgraph     lift
paste             distribution   remove
random*           find           shift
sax               summary
window

* deleted (6 June 2014)

In-depth usage examples

For an idea of how to use bwtool in practice, see the examples page, which lists a number of real-world scenarios, some of them presented in workshops.

Installation

Installing bwtool should take only a few commands. The only requirement is libbeato, which is built and installed first.

To install bwtool system-wide (i.e. with root/sudo access):

git clone https://github.com/CRG-Barcelona/libbeato.git
git clone https://github.com/CRG-Barcelona/bwtool.git
cd libbeato/
./configure
make
sudo make install
cd ../bwtool/
./configure
make
sudo make install

To install bwtool locally (i.e. under $HOME, without root/sudo access):

git clone https://github.com/CRG-Barcelona/libbeato.git
git clone https://github.com/CRG-Barcelona/bwtool.git
cd libbeato/
./configure --prefix="$HOME" CFLAGS="-g -O0 -I${HOME}/include" LDFLAGS="-L${HOME}/lib"
make
make install
cd ../bwtool/
./configure --prefix="$HOME" CFLAGS="-g -O0 -I${HOME}/include" LDFLAGS="-L${HOME}/lib"
make
make install
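
After a local install, the shell still has to be able to find the new binary. The following is a minimal sketch, assuming the usual autotools layout in which `make install` with `--prefix=$HOME` places the executable in `$HOME/bin` (this page does not state the install path, so treat that as an assumption). `prepend_path_once` is a hypothetical helper name, not part of bwtool.

```shell
# Hypothetical helper: put a directory at the front of PATH unless it is
# already listed there.
prepend_path_once() {
  case ":${PATH}:" in
    *":$1:"*) ;;                  # already present, leave PATH alone
    *) PATH="$1:${PATH}" ;;
  esac
  export PATH
}

# Assumption: the local install placed the binary in $HOME/bin.
prepend_path_once "${HOME}/bin"

# Report whether the shell can now locate bwtool.
if command -v bwtool >/dev/null 2>&1; then
  echo "bwtool found at $(command -v bwtool)"
else
  echo "bwtool not found on PATH; check where 'make install' put it" >&2
fi
```

To make the change permanent, the `prepend_path_once` call can go in a shell startup file such as `~/.profile`.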

FAQ

Q: The program crashed with a "Segmentation fault" or "Abort" at some stage while it was running. Why?

A: I try my best to test the programs thoroughly, but ultimately some bugs will always be there. Here are some common reasons:

  1. Malformed input. Be very sure that your input conforms to expected formats. I'm especially talking about bed files here (bigWigs are typically solid because they won't be created unless they're in the correct format).
  2. Unforeseen issues. A good example was when someone tried giving as input a file that didn't exist. I try to anticipate as much as I can, because it's always more helpful to provide a nice error message that mentions the mistake than to just have the program crash silently.
  3. Logical errors. These are the most sinister because they indicate that the program failed while doing computation. These are the bugs I'm most zealous about seeking and destroying, and I will catch most of them before they're public. Nevertheless, some are still likely to be found... and when they are, hopefully I can fix them quickly after I'm made aware (or we can... this is open-source now I suppose).
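
The first two causes can often be ruled out before running bwtool at all. The following is a rough sketch of a pre-flight check for a bed file; `check_bed` is a hypothetical helper, not part of bwtool, and the rules it applies (at least three fields, integer start and end, end not before start) are an assumed minimal reading of the bed format rather than bwtool's own validation.

```shell
# Hypothetical helper: return 0 only if the bed file is readable and every
# data line has at least 3 fields, with integer start/end and start <= end.
check_bed() {
  bed="$1"
  if [ ! -r "$bed" ]; then
    echo "error: cannot read '$bed'" >&2
    return 1
  fi
  # Skip blank lines, comments, and track/browser header lines; report every
  # other line that breaks the assumed rules. Reports go to stderr.
  awk '
    NF == 0 || /^#/ || /^track/ || /^browser/ { next }
    NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 > $3 + 0 {
      printf "line %d looks malformed: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$bed" >&2
}
```

Calling `check_bed regions.bed || exit 1` at the top of a script (with `regions.bed` standing in for your own file) stops early with a readable message instead of a crash. A side effect of the integer test is that files with Windows line endings are flagged when the end coordinate is the last field.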

Q: Will it work in Windows?

A: Maybe, with Cygwin. I haven't had access to a Windows computer yet to try.

Q: Is my computer powerful enough?

A: Although my Mac laptop is able to run bwtool just fine, its 8 GB of RAM is a limitation, particularly with data modification operations. I tend to run bwtool on industrial-strength servers, where 40 GB of RAM or more is quite normal. But I work with data from human cancer cell lines, so if I worked with yeast or even worms, my laptop would probably be sufficient.

Q: Why is it called "bwtool"?

A: Simply because it's a tool for bigWig files, whose names often end in .bw. In addition, it follows the interface mold set by bedtools and samtools. Perhaps in retrospect, a little more creativity could have been put into the name, because it could erroneously be thought to be associated with Burrows-Wheeler transforms, which are another topic in bioinformatics entirely.

Q: How do I cite bwtool?

A: There is a short article in Bioinformatics.

Q: I don't understand the license. Can I use it?

A: The program may be used for any purpose: commercial or academic. What is restricted is the source code, which, except for the code inside the libjkweb directory, falls under the GNU General Public License v3... which means if the source code is used, it needs to be for open source software. Using the code in a closed-source project is allowed only with permission, but don't expect permission to be granted. The code in the libjkweb directory is a slightly-modified version of the code found in Jim Kent's C library. The licensing structure of Jim Kent's code is a bit complex, but essentially:

  • The code used in bwtool's libjkweb is from kent/src/inc and kent/src/lib. This code is freeware and basically unrestricted in its use to copy, change, and redistribute as desired.
  • The code in kent/src/jkOwnLib is a commercial product: it's code that is copyrighted by Jim Kent and only available through Kent Informatics. This includes the code for the BLAT alignment software.
  • kent/src/hg contains code for the UCSC Genome Browser and is copyrighted by the University of California. Well... some parts of this are freeware also, but because bwtool does not use it, I'm not so concerned with which.

It is a pity that the bwtool paper incorrectly stated that libjkweb was copyrighted in a more restricted way. Perhaps a corrigendum will be submitted to the publisher if this note isn't deemed visible enough.

Contact

Please direct questions to Andy Pohl.