# hicstuff command line interface demo

## Preparing the data

If using bowtie2, the genome must first be indexed with `bowtie2-build`:

```bash
bowtie2-build genome.fa genome
```
The input reads can be in FASTQ format, or in SAM/BAM format if they have already been aligned to the genome.
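
Before launching the pipeline, it can be worth confirming that the index is complete. The sketch below relies on the fact that `bowtie2-build` writes six files next to the prefix for a regular (small) index; `check_bt2_index` is a hypothetical helper name, not part of hicstuff or bowtie2.

```bash
# Return 0 if all six bowtie2 index files exist for the given prefix.
# Note: very large genomes get a ".bt2l" index instead; this sketch only
# covers the regular ".bt2" case.
check_bt2_index() {
    local prefix="$1"
    local suffix
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -f "${prefix}.${suffix}" ]; then
            echo "missing index file: ${prefix}.${suffix}" >&2
            return 1
        fi
    done
}
```

For the index built above, this would be called as `check_bt2_index genome && echo "index looks complete"`.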


## Generating matrices

The `pipeline` command generates the Hi-C contact map from the input reads.

```bash
hicstuff pipeline --no-cleanup --enzyme DpnII --filter --threads 12 --plot --iterative --genome genome --output output/ --prefix demo forward.fq reverse.fq
```
This example creates a directory named `output`, containing the output files with the prefix `demo`. The output directory will contain two subdirectories: `tmp`, containing all temporary files, and `plots`, containing figures generated at different stages of the pipeline. Reads will be truncated to 20 bp and aligned to the genome by iterative extension (`--iterative`). The process is parallelized over 12 threads (`--threads 12`). Hi-C pairs will also be filtered to exclude uninformative religation events (`--filter`).

## Output files
The output files should look like this:
```
output
├── demo.chr.tsv
├── demo.frags.tsv
├── demo.hicstuff_20190423185220.log
├── demo.mat.tsv
├── plots
│   ├── event_distance.pdf
│   ├── event_distribution.pdf
│   └── frags_hist.pdf
└── tmp
    ├── demo.for.sam
    ├── demo.genome.fasta
    ├── demo.rev.sam
    ├── demo.valid_idx_filtered.pairs
    ├── demo.valid_idx.pairs
    └── demo.valid.pairs
```

Besides the log file, there are 3 output files in the base `output` directory: the contact matrix (`demo.mat.tsv`), the info_contigs file (`demo.chr.tsv`) and the fragments_list file (`demo.frags.tsv`). The `tmp` directory contains the FASTA genome extracted from the bowtie2 index, the alignments in SAM format and all temporary files in .pairs format.
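
To get a rough idea of how many pairs the filtering step removed, the two indexed .pairs files in `tmp` can be compared. This is only a sketch under two assumptions: that header lines in these files start with `#` (as in the 4DN pairs convention) and that, judging by their names, `demo.valid_idx_filtered.pairs` is the filtered version of `demo.valid_idx.pairs`. `count_pairs` and `filter_summary` are hypothetical helper names, not hicstuff commands.

```bash
# Count data lines in a .pairs file.
# ASSUMPTION: header lines start with '#', every other non-empty line is one pair.
count_pairs() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Report how many pairs remain after filtering for a given output dir and prefix.
filter_summary() {
    local tmpdir="$1/tmp" prefix="$2"
    local before after
    before=$(count_pairs "${tmpdir}/${prefix}.valid_idx.pairs")
    after=$(count_pairs "${tmpdir}/${prefix}.valid_idx_filtered.pairs")
    echo "kept ${after} of ${before} pairs after filtering"
}
```

With the layout shown above, the call would be `filter_summary output demo`.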


## Visualizing the matrix

The `view` command can be used to visualize the output Hi-C matrix.

```bash
hicstuff view --binning 5kb --normalize --frags output/demo.frags.tsv output/demo.mat.tsv
```

This will show an interactive heatmap using matplotlib. To save the heatmap to a file instead, add `--output output/demo.png`.
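
As a sketch, the same call with the `--output` option can be wrapped in a small function so that it is reusable for other prefixes. `save_heatmap` is a hypothetical helper name; the flags are the ones used in this demo, and the file naming assumes the output layout shown earlier.

```bash
# Save the heatmap for <outdir>/<prefix> to <outdir>/<prefix>.png
# instead of opening the interactive matplotlib window.
save_heatmap() {
    local outdir="$1" prefix="$2"
    hicstuff view --binning 5kb --normalize \
        --frags "${outdir}/${prefix}.frags.tsv" \
        --output "${outdir}/${prefix}.png" \
        "${outdir}/${prefix}.mat.tsv"
}
```

For this demo, `save_heatmap output demo` would write `output/demo.png`.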
