Substring index for paths in a graph

GBWT

Graph BWT is an independent implementation of the graph extension (gPBWT) of the positional Burrows-Wheeler transform (PBWT). Its initial purpose is to embed observed haplotypes in a variation graph. Haplotypes are essentially sequences of nodes in the variation graph, and GBWT is best seen as the multi-string BWT of the node sequences.
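
To illustrate what "multi-string BWT of the node sequences" means, here is a minimal sketch that builds a textbook multi-string BWT by sorting all suffixes of a small set of paths. It is not GBWT's construction algorithm or its internal layout, and it runs in quadratic time or worse. The names node_type, ENDMARKER, and naive_multi_bwt are hypothetical and chosen for this example only, as is the convention that node identifiers are nonzero and that each path ends with an implicit endmarker, with ties between endmarkers broken by path order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; they are not taken from the GBWT headers.
using node_type = std::uint32_t;
using suffix_type = std::pair<std::size_t, std::size_t>; // (path index, offset)

// Assumption: real node identifiers are nonzero, so 0 can mark the end of a path.
constexpr node_type ENDMARKER = 0;

// Returns the BWT of the paths: for every suffix in sorted order, the symbol
// that precedes it in its own path (ENDMARKER for a suffix starting the path).
std::vector<node_type>
naive_multi_bwt(const std::vector<std::vector<node_type>>& paths)
{
  // Enumerate all suffixes, including the one that is just the endmarker.
  std::vector<suffix_type> suffixes;
  for (std::size_t i = 0; i < paths.size(); i++) {
    for (node_type node : paths[i]) { assert(node != ENDMARKER); (void)node; }
    for (std::size_t j = 0; j <= paths[i].size(); j++) {
      suffixes.emplace_back(i, j);
    }
  }

  // Symbol at a given position; positions past the end read as ENDMARKER.
  auto symbol = [&paths](std::size_t path, std::size_t offset) -> node_type {
    return (offset < paths[path].size() ? paths[path][offset] : ENDMARKER);
  };

  // Lexicographic order on suffixes. The endmarker is smaller than any node,
  // and endmarkers of different paths are ordered by path index.
  std::sort(suffixes.begin(), suffixes.end(),
    [&symbol](const suffix_type& a, const suffix_type& b) -> bool {
      std::size_t x = a.second, y = b.second;
      while (true) {
        node_type sa = symbol(a.first, x), sb = symbol(b.first, y);
        if (sa != sb) { return sa < sb; }
        if (sa == ENDMARKER) { return a.first < b.first; }
        x++; y++;
      }
    });

  // The BWT lists the predecessor of each suffix in sorted order.
  std::vector<node_type> bwt;
  bwt.reserve(suffixes.size());
  for (const suffix_type& suffix : suffixes) {
    bwt.push_back(suffix.second > 0
                  ? paths[suffix.first][suffix.second - 1]
                  : ENDMARKER);
  }
  return bwt;
}
```

For example, with two haplotypes that take different routes through a bubble, the paths 1, 2, 4 and 1, 3, 4, the function returns 4 4 0 0 1 1 2 3. Positions whose suffixes start with the same node are adjacent in the result, which is the property that lets a BWT be partitioned by node.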

The implementation uses Succinct Data Structures Library 2.0 (SDSL). To compile, set SDSL_DIR in the Makefile to point to your SDSL directory. As the implementation uses C++11, OpenMP, and libstdc++ parallel mode, you need g++ 4.9 or newer to compile. On Apple systems, GBWT can also be built with Apple Clang 9.1, but libomp must be installed via MacPorts or Homebrew, and the lack of libstdc++'s parallel mode extensions will result in slower index construction.

See the wiki for further documentation.

Citing GBWT

Jouni Sirén, Erik Garrison, Adam M. Novak, Benedict Paten, and Richard Durbin: Haplotype-aware graph indexes. Proc. WABI 2018, LIPIcs 113, pp. 4:1-4:13, Helsinki, Finland, August 20-22, 2018. DOI: 10.4230/LIPIcs.WABI.2018.4

Other references

Richard Durbin: Efficient haplotype matching and storage using the Positional Burrows-Wheeler Transform (PBWT). Bioinformatics 30(9):1266-1272, 2014. DOI: 10.1093/bioinformatics/btu014

Adam M. Novak, Erik Garrison, and Benedict Paten: A graph extension of the positional Burrows-Wheeler transform and its applications. Algorithms for Molecular Biology 12:18, 2017. DOI: 10.1186/s13015-017-0109-9