ntHash is an efficient rolling hash function for k-mers and spaced seeds.

Installation

Make sure Meson is installed on the system.

Download the repo (either from the releases section or by cloning it with git clone https://github.com/bcgsc/ntHash). Set up Meson in an arbitrary directory (e.g. build) by running the following command in the project's root (include --prefix=PREFIX to set the installation prefix to PREFIX):

meson setup --buildtype=release --prefix=PREFIX build

Then, install the project and its dependencies using:

meson install -C build

This will install include/nthash and lib/libnthash.a to the installation prefix.

Usage

To use ntHash in a C++ project:

Import ntHash in the code using #include <nthash/nthash.hpp>
Access ntHash classes from the nthash namespace
Add the include directory (pass -IPREFIX/include to the compiler)
Link the code with libnthash.a (i.e. pass -LPREFIX/lib -lnthash to the compiler, where PREFIX is the installation prefix)
Compile your code with -std=c++17 (and preferably -O3) enabled

Refer to docs for more information.

Examples

Generally, the nthash::NtHash and nthash::SeedNtHash classes are used for hashing sequences:

nthash::NtHash nth("TGACTGATCGAGTCGTACTAG", 1, 5);  // 1 hash per 5-mer
while (nth.roll()) {
    // use nth.hashes() for canonical hashes
    //     nth.get_forward_hash() for forward strand hashes
    //     nth.get_reverse_hash() for reverse strand hashes
}
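
To give a feel for what "rolling" means here, the following is a minimal, self-contained sketch of the forward-strand recurrence described in the ntHash paper: each new hash is derived from the previous one with one rotation and two XORs instead of rehashing all k bases. It does not use the library. The function names (base_hash, rol, hash_from_scratch, roll_forward, rolling_forward_hashes) and the per-base constants are hypothetical placeholders chosen for illustration, and canonical hashing and handling of non-ACGT characters are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Placeholder 64-bit constant per base. These are arbitrary illustration
// values, NOT the constants used inside ntHash.
inline std::uint64_t base_hash(char c) {
    switch (c) {
        case 'A': return 0x9E3779B97F4A7C15ULL;
        case 'C': return 0xC2B2AE3D27D4EB4FULL;
        case 'G': return 0x165667B19E3779F9ULL;
        case 'T': return 0xD6E8FEB86659FD93ULL;
        default:  return 0;  // non-ACGT handling is omitted in this sketch
    }
}

// Rotate a 64-bit word left by r bits (r is reduced modulo 64).
inline std::uint64_t rol(std::uint64_t x, unsigned r) {
    r %= 64;
    return r == 0 ? x : (x << r) | (x >> (64 - r));
}

// Hash the k-mer starting at pos base by base: O(k) work.
inline std::uint64_t hash_from_scratch(const std::string& seq,
                                       std::size_t pos, std::size_t k) {
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < k; ++i) {
        h = rol(h, 1) ^ base_hash(seq[pos + i]);
    }
    return h;
}

// Slide the window by one base: O(1) work.
// H_new = rol(H_prev, 1) ^ rol(h(outgoing base), k) ^ h(incoming base)
inline std::uint64_t roll_forward(std::uint64_t prev, char out_base,
                                  char in_base, std::size_t k) {
    return rol(prev, 1) ^ rol(base_hash(out_base), static_cast<unsigned>(k % 64))
           ^ base_hash(in_base);
}

// Forward-strand hashes of every k-mer in seq, computed by rolling.
inline std::vector<std::uint64_t> rolling_forward_hashes(const std::string& seq,
                                                         std::size_t k) {
    std::vector<std::uint64_t> out;
    if (k == 0 || seq.size() < k) return out;
    std::uint64_t h = hash_from_scratch(seq, 0, k);
    out.push_back(h);
    for (std::size_t i = 1; i + k <= seq.size(); ++i) {
        h = roll_forward(h, seq[i - 1], seq[i + k - 1], k);
        out.push_back(h);
    }
    return out;
}
```

Calling rolling_forward_hashes("TGACTGATCGAGTCGTACTAG", 5) yields one value per 5-mer, and each value matches what hash_from_scratch returns for the same position. The values differ from what nthash::NtHash produces, because the constants above are placeholders.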

std::vector<std::string> seeds = {"10101", "11011"};
nthash::SeedNtHash nth("TGACTGATCGAGTCGTACTAG", seeds, 3, 5);
while (nth.roll()) {
    // nth.hashes()[0] = "T#A#T"'s first hash
    // nth.hashes()[1] = "T#A#T"'s second hash
    // nth.hashes()[2] = "T#A#T"'s third hash
    // nth.hashes()[3] = "TG#CT"'s first hash
}
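
The comments in the spaced seed example use a notation such as "T#A#T" for the k-mer "TGACT" viewed through the seed "10101", and they list the hashes seed by seed. The sketch below spells out both conventions without using the library. The helper names apply_seed and hash_index are hypothetical, and the seed-major index layout is inferred from the comments above rather than taken from the library's documentation.

```cpp
#include <cstddef>
#include <string>

// Show a k-mer through a spaced seed: positions where the seed has '1' are
// kept, positions where it has '0' are ignored and displayed as '#'.
inline std::string apply_seed(const std::string& kmer, const std::string& seed) {
    std::string masked = kmer;
    for (std::size_t i = 0; i < kmer.size() && i < seed.size(); ++i) {
        if (seed[i] == '0') {
            masked[i] = '#';
        }
    }
    return masked;
}

// Position of a hash in a flat, seed-major array: all hashes of seed 0
// come first, then all hashes of seed 1, and so on (assumed layout).
inline std::size_t hash_index(std::size_t seed_idx, std::size_t hash_idx,
                              std::size_t hashes_per_seed) {
    return seed_idx * hashes_per_seed + hash_idx;
}
```

With two seeds and three hashes per seed, hash_index(1, 0, 3) is 3, which corresponds to the "TG#CT" entry in the example above, and apply_seed("TGACT", "11011") returns "TG#CT".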

For developers

If you would like to contribute to the development of ntHash, after forking/cloning the repo, create the build directory without the release flag:

meson setup build

Compile the code, tests, and benchmarking script using:

meson compile -C build

If compilation is successful, libnthash.a will be available in the build folder. The benchmarking script is also compiled as the bench binary file in build.

Before sending a PR, make sure that:

tests pass by running meson test in the project directory
code is formatted properly by running ninja clang-format in the build folder (requires clang-format to be available)
coding standards are met: running ninja clang-tidy-check in build should return no errors (requires clang-tools to be installed)
documentation is up-to-date by running ninja docs in build (requires doxygen)

Publications

Parham Kazemi, Johnathan Wong, Vladimir Nikolić, Hamid Mohamadi, René L Warren, Inanç Birol, ntHash2: recursive spaced seed hashing for nucleotide sequences, Bioinformatics, 2022, btac564, https://doi.org/10.1093/bioinformatics/btac564

Hamid Mohamadi, Justin Chu, Benjamin P Vandervalk, Inanç Birol, ntHash: recursive nucleotide hashing, Bioinformatics, 2016, 32(22): 3492-3494, https://doi.org/10.1093/bioinformatics/btw397
