
.

rcalandrelli committed Mar 28, 2019
1 parent 746fce0 commit 1a71355c96b346896e1749ec1a080a6819cbe131
BIN +0 Bytes (100%) .DS_Store
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,23 @@
# HiCtool compressed format

HiCtool contact map storage is documented in the Supplementary Material of [Calandrelli et al. (2018). GITAR: An open source tool for analysis and visualization of Hi-C data. *Genomics, proteomics & bioinformatics.*](https://www.sciencedirect.com/science/article/pii/S1672022918304339#s0055).

## Intra-chromosomal compressed format

The compression exploits two properties of contact maps: they are symmetric (the contacts between loci i and j are the same as those between loci j and i), and they are usually sparse, since most elements are zero. Sparsity becomes more pronounced as the bin size decreases. Because of these two properties, mirrored data need not be saved, and the runs of zeros within the matrices can be “compressed”. To accomplish this, we first select only the upper-triangular part of the contact matrix (including the diagonal) and reshape the data row by row to form a vector. We then replace each run of consecutive zeros in the vector with a “0” followed by the number of zeros in that run; all non-zero elements are left as they are. Finally, the data are saved in a txt file.
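
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not HiCtool's own code: the function name `compress_upper_triangle` and the example matrix are hypothetical, and the sketch assumes that every run of zeros, including a single isolated zero, is written as the pair `0, count`. Writing the result to a txt file is omitted because the on-disk layout is not specified here.

```python
import numpy as np


def compress_upper_triangle(matrix):
    """Return the zero-run-compressed upper triangle of a symmetric matrix.

    Hypothetical helper for illustration; assumes every zero run
    (even of length 1) is stored as the pair [0, run_length].
    """
    matrix = np.asarray(matrix)
    # Upper triangle including the diagonal, flattened row by row.
    vector = matrix[np.triu_indices(matrix.shape[0])]

    compressed = []
    zero_run = 0
    for value in vector.tolist():
        if value == 0:
            zero_run += 1
            continue
        if zero_run:
            # Close the pending run of zeros before the non-zero value.
            compressed.extend([0, zero_run])
            zero_run = 0
        compressed.append(value)
    if zero_run:
        # The vector ended with a run of zeros.
        compressed.extend([0, zero_run])
    return compressed


# Made-up 4 x 4 symmetric, sparse matrix (not the one in the figure).
example = np.array([
    [5, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 0, 3, 0],
    [2, 0, 0, 1],
])
compressed = compress_upper_triangle(example)
print(compressed)
```

Here the upper triangle holds 10 values, of which 6 are zeros grouped in three runs; the longer the zero runs, the more the vector shrinks.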

![](/figures/HiCtool_compression.png)

The figure shows a simplified example of the compression workflow, where the intra-chromosomal contact matrix is represented by a 4 × 4 symmetric and sparse matrix.

(1) The upper-triangular part of the matrix is selected (including the diagonal);
(2) data are reshaped to form a vector;
(3) all the consecutive zeros are replaced with a “0” followed by the number of zeros that are repeated consecutively;
(4) data are saved into a txt file.
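
Reading the data back amounts to running the four steps in reverse. The sketch below is an illustration under stated assumptions rather than HiCtool's loader: the function name `decompress_to_matrix` is hypothetical, the matrix dimension `n` is assumed to be known by the caller (for example from the chromosome length and the bin size), the values are assumed to be already parsed from the txt file into a list, and every zero run (including a single zero) is assumed to be stored as `0` followed by its length.

```python
import numpy as np


def decompress_to_matrix(compressed, n):
    """Rebuild the full n x n symmetric matrix from the compressed vector.

    Hypothetical helper for illustration; assumes each zero run is
    stored as the pair [0, run_length].
    """
    # Undo step 3: expand each "0, count" pair into a run of zeros.
    vector = []
    i = 0
    while i < len(compressed):
        value = compressed[i]
        if value == 0:
            vector.extend([0] * int(compressed[i + 1]))
            i += 2
        else:
            vector.append(value)
            i += 1

    expected = n * (n + 1) // 2
    if len(vector) != expected:
        raise ValueError(
            f"expected {expected} upper-triangular values, got {len(vector)}"
        )

    # Undo steps 1-2: fill the upper triangle row by row, then mirror it.
    matrix = np.zeros((n, n))
    matrix[np.triu_indices(n)] = vector
    matrix = matrix + np.triu(matrix, k=1).T
    return matrix


# Made-up compressed vector for a 4 x 4 matrix.
restored = decompress_to_matrix([5, 0, 2, 2, 0, 3, 3, 0, 1, 1], 4)
print(restored)
```

The length check guards against a wrong `n` or a truncated file, since an n × n symmetric matrix has exactly n(n + 1)/2 upper-triangular entries.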

## Inter-chromosomal compressed format





@@ -16,6 +16,7 @@ The **explicit-factor correction model of Yaffe and Tanay** is applied to normal
## [3. TAD analysis](/tutorial/tad-analysis.md)

***
## Supplementary information

### HiCtool compressed format information
- ### [HiCtool contact matrix compressed format](/tutorial/HiCtool_compressed_format.md)
