paulsengroup/hictk

hictk



hictk is a blazing fast toolkit to work with .hic and .cool files.

The toolkit consists of a native CLI application and a C++ library running on Linux, macOS, and Windows.
hictk offers native IO support for Cooler and .hic files, meaning that its implementation is independent of that of cooler, JuicerTools, or straw.

hictk can also be accessed from several programming languages using one of the following libraries:

  • hictkpy - Python bindings for hictk: read and write .cool and .hic files directly from Python
  • hictkR - R bindings for hictk: read .cool and .hic files directly from R
  • libhictk - The native C++ library that underlies hictk

Features

Supported formats

The CLI application and C++ library are capable of reading and writing files in the following formats:

Format    Revision    Read         Write
.cool     v1-3        yes (all)    yes [1]
.mcool    v1-2        yes (all)    yes [2]
.scool    v1          yes (all)    yes [3]
.hic      v6-9        yes          yes [4]

[1] v3 only
[2] v2 only
[3] libhictk only
[4] v9 only

Supported operations

  • Seamless conversion between Cooler and .hic formats (from hic to cool and vice versa)
  • Uniform interface to query interaction matrices
  • High performance and low memory requirements (see benchmarks in the Supplementary Text from our paper)
  • Easy access to file metadata
  • Create files from interaction pairs or pre-binned interaction counts (e.g. 4DN-DCIC pairs or BEDPE/bedGraph2)
  • Merge interactions from multiple files into a single file (also supports merging files in different formats)
  • Detect (and when possible fix) corrupted files
  • Balance interaction matrices using ICE, SCALE, or VC
  • Create multi-resolution files suitable for visualization with JuiceBox and HiGlass

All the above operations can be performed on both Cooler and .hic files and yield identical results.
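
To make the idea of building a matrix from interaction pairs more concrete, here is a minimal conceptual sketch in plain Python. It is not hictk's implementation and does not use hictk or hictkpy. The function name `bin_pairs`, the tuple layout, the 0-based positions, and the sorting of chromosomes by name are all assumptions made for illustration.

```python
from collections import Counter


def bin_pairs(pairs, bin_size):
    """Aggregate (chrom1, pos1, chrom2, pos2) records into per-bin counts.

    Returns a Counter keyed by (chrom1, start1, chrom2, start2), where each
    start is the left edge of the fixed-size bin containing the position.
    Positions are assumed to be 0-based (an assumption of this sketch).
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        start1 = (pos1 // bin_size) * bin_size
        start2 = (pos2 // bin_size) * bin_size
        # Store each interaction once, in upper-triangular order.
        # Chromosomes are compared by name here for simplicity; real files
        # follow the order of their own chromosome table.
        if (chrom1, start1) > (chrom2, start2):
            chrom1, start1, chrom2, start2 = chrom2, start2, chrom1, start1
        counts[(chrom1, start1, chrom2, start2)] += 1
    return counts


pairs = [
    ("chr1", 1_200, "chr1", 15_300),
    ("chr1", 14_900, "chr1", 1_900),  # same two bins as above, reversed order
    ("chr1", 500, "chr1", 800),
    ("chr1", 25_000, "chr2", 3_000),
]
counts = bin_pairs(pairs, bin_size=10_000)
for key, n in sorted(counts.items()):
    print(*key, n, sep="\t")
```

Keeping only the upper triangle means that the two reads falling into the same pair of bins are counted together regardless of the order in which their ends were reported.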

Installation

hictk can be installed using containers, bioconda, Conan, or directly from source.
Refer to the Installation section in the documentation for more information.

Quickstart

hictk (CLI)

hictk provides the following subcommands:

Subcommand            Description
balance               Balance Hi-C files using ICE, SCALE, or VC.
convert               Convert Hi-C files between different formats.
dump                  Read interactions and other kinds of data from .hic and Cooler files and write them to stdout.
fix-mcool             Fix corrupted .mcool files.
load                  Build .cool and .hic files from interactions in various text formats.
merge                 Merge multiple Cooler or .hic files into a single file.
metadata              Print file metadata to stdout.
rename-chromosomes    Rename chromosomes found in a Cooler file.
validate              Validate .hic and Cooler files.
zoomify               Convert single-resolution Cooler and .hic files to multi-resolution by coarsening.

Refer to the Quickstart (CLI) and CLI Reference sections in the documentation for more details.
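
The coarsening mentioned for zoomify can be pictured as summing the counts of neighboring fine-resolution bins into larger bins. The sketch below illustrates that idea in plain Python. It is a conceptual example only, not hictk's implementation, and the function name `coarsen` and the dictionary-based pixel layout are assumptions made for illustration.

```python
from collections import defaultdict


def coarsen(pixels, base_bin_size, factor):
    """Sum fine-resolution pixels into bins that are `factor` times larger.

    `pixels` maps (chrom1, start1, chrom2, start2) to a count, with starts
    aligned to `base_bin_size`. The result uses the same layout at the
    coarser resolution `base_bin_size * factor`.
    """
    new_size = base_bin_size * factor
    coarse = defaultdict(int)
    for (chrom1, start1, chrom2, start2), count in pixels.items():
        key = (
            chrom1, (start1 // new_size) * new_size,
            chrom2, (start2 // new_size) * new_size,
        )
        coarse[key] += count
    return dict(coarse)


fine = {
    ("chr1", 0, "chr1", 0): 4,
    ("chr1", 0, "chr1", 10_000): 2,
    ("chr1", 10_000, "chr1", 10_000): 3,
    ("chr1", 0, "chr1", 20_000): 1,
}
coarse = coarsen(fine, base_bin_size=10_000, factor=2)
for key, n in sorted(coarse.items()):
    print(*key, n, sep="\t")
```

Because coarsening only adds counts together, the total number of interactions is the same at every resolution.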

libhictk

libhictk can be installed in various ways, including with Conan and CMake FetchContent.
The Quickstart (API) section of the hictk documentation explains how to do this.

Quickstart (API) also demonstrates the basic functionality offered by libhictk.
For more complex examples refer to the sample programs under the examples/ folder as well as to the source code of hictk.

The public C++ API of hictk is documented in the C++ API Reference section of the hictk documentation.

Citing

If you use hictk or any of its language bindings in your research, please cite the following publication:

Roberto Rossini, Jonas Paulsen, hictk: blazing fast toolkit to work with .hic and .cool files, Bioinformatics, Volume 40, Issue 7, July 2024, btae408, https://doi.org/10.1093/bioinformatics/btae408

BibTeX
@article{hictk,
    author = {Rossini, Roberto and Paulsen, Jonas},
    title = "{hictk: blazing fast toolkit to work with .hic and .cool files}",
    journal = {Bioinformatics},
    volume = {40},
    number = {7},
    pages = {btae408},
    year = {2024},
    month = {06},
    issn = {1367-4811},
    doi = {10.1093/bioinformatics/btae408},
    url = {https://doi.org/10.1093/bioinformatics/btae408},
    eprint = {https://academic.oup.com/bioinformatics/article-pdf/40/7/btae408/58385157/btae408.pdf},
}
