bustools is a program for manipulating BUS files for single-cell RNA-Seq datasets.
The design and motivation for the BUS format and BUStools are described in detail in
P Melsted, V Ntranos, L Pachter, The Barcode, UMI, Set format and BUStools, bioRxiv 2018 pp: 472571.
bustools works with BUS files, which can be generated efficiently from raw sequencing data, e.g. using kallisto.
Download bustools with
git clone https://github.com/BUStools/bustools.git
Move to the bustools directory
Make a build directory and move there:
Build the code:
The bustools executable is now located in build/src. To install bustools into the cmake install prefix path type:
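The commands for the steps above are not reproduced here. As a rough sketch only, a conventional out-of-source CMake build might look like the following. The use of `make` as the build tool and the wrapper function `build_bustools` are assumptions made for illustration, not taken from the project; the function wrapper just keeps anything from running until it is called.

```shell
# Assumed build steps for a typical CMake project; adjust to your setup.
# Nothing runs until build_bustools is called.
build_bustools() (
    set -e
    git clone https://github.com/BUStools/bustools.git
    cd bustools
    mkdir -p build
    cd build
    cmake ..
    make
    # To install into the CMake install prefix (may need elevated permissions):
    # make install
)
```

Calling `build_bustools` from an empty working directory would then carry out the download and build in one go. The body runs in a subshell, so the caller's working directory is left unchanged.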
To see a list of available commands, type bustools in the terminal:
> bustools
Usage: bustools <CMD> [arguments] ..

Where <CMD> can be one of:

sort            Sort bus file by barcodes and UMI
text            Output as tab separated text file

Running bustools <CMD> without arguments prints usage information for <CMD>
Raw BUS output from pseudoalignment programs may be unsorted. To simplify and accelerate downstream processing, BUS files can be sorted using
> bustools sort
Usage: bustools sort [options] bus-files

Options:
-t, --threads         Number of threads to use
-o, --output          File for sorted output
This will create a new BUS file where the BUS records are sorted by barcode first, UMI second, and equivalence class third.
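To make the three-key ordering concrete, the sketch below applies it to a few made-up records in text form using the standard `sort` utility. It does not call bustools, and the four-column layout (barcode, UMI, equivalence class, count) and the record values are assumptions for illustration.

```shell
# Hypothetical records: barcode, UMI, equivalence class, count (tab-separated).
records=$(printf 'TTGA\tACGT\t7\t1\nAACC\tGGTA\t12\t2\nAACC\tGGTA\t3\t1\nAACC\tACGT\t5\t4\n')

# Order by barcode first, UMI second, equivalence class (numeric) third.
tab=$(printf '\t')
sorted=$(printf '%s\n' "$records" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2 -k3,3n)
printf '%s\n' "$sorted"
```

The two records that share barcode AACC and UMI GGTA end up adjacent, ordered by equivalence class, which is what lets downstream tools process one barcode and UMI at a time. An actual invocation assembled from the usage message above would look like `bustools sort -t 4 -o sorted.bus output.bus`, where the file names are placeholders.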
BUS files can be converted to a tab-separated format for easy inspection and processing using shell scripts or high-level languages.
> bustools text
Usage: bustools text [options] bus-files

Options:
-o, --output          File for text output
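As an example of the kind of shell processing the text format allows, the sketch below counts distinct UMIs per barcode. It assumes each line holds four tab-separated fields (barcode, UMI, equivalence class, count) and uses made-up records in place of real bustools output.

```shell
# Hypothetical text output: barcode, UMI, equivalence class, count.
text_output=$(printf 'AACC\tACGT\t5\t4\nAACC\tGGTA\t3\t1\nAACC\tGGTA\t12\t2\nTTGA\tACGT\t7\t1\n')

# Keep barcode and UMI, drop duplicate pairs, then count pairs per barcode.
umis_per_barcode=$(printf '%s\n' "$text_output" |
    cut -f1,2 | LC_ALL=C sort -u | cut -f1 | uniq -c |
    awk '{print $2 "\t" $1}')
printf '%s\n' "$umis_per_barcode"
```

With real data, the variable would be replaced by the file written through the -o option shown above.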