## Converting DeepLoop output to Cooler

We provide the `convert_to_cooler.py` script for converting DeepLoop output (a directory of chromosome anchor .bed files plus a directory of interaction files) to .cool or .mcool files. The script can be used in two main ways: an exact conversion that preserves the data as-is (typically for analysis), or an approximate conversion to uniform bin sizes (typically for visualization in HiGlass).

In [1]:
! python3 ../utils/convert_to_cooler.py -h

usage: convert_to_cooler.py [-h] [--anchor_dir ANCHOR_DIR]
                            [--loop_dir LOOP_DIR] [--out_file OUT_FILE]
                            [--bin_size BIN_SIZE] [--min_val MIN_VAL]
                            [--force_bin_size] [--zoomify]
                            [--multires_outfile MULTIRES_OUTFILE]
                            [--col_names COL_NAMES [COL_NAMES ...]]
                            [--cooler_col COOLER_COL]
                            [--single_chrom SINGLE_CHROM] [--verbose]

optional arguments:
  -h, --help            show this help message and exit
  --anchor_dir ANCHOR_DIR
                        directory containing chromosome .bed files
  --loop_dir LOOP_DIR   directory containing interaction files
  --out_file OUT_FILE   path to output cooler file
  --bin_size BIN_SIZE   manually set bin size, only used for visualization in
                        HiGlass or in tandem with force_bin_size
  --min_val MIN_VAL     drop interacti

### 1. Exact conversion

This option converts all DeepLoop interactions to a cooler file without any loss of information, storing the variable bin sizes defined by the anchor .bed files. The only reason to avoid it is if you want to visualize the output in HiGlass, which supports only uniform bin sizes.
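
To make the idea of variable bin sizes concrete, here is a minimal sketch of the two tables a cooler file is built from: a bin table (`chrom`, `start`, `end`) and a pixel table (`bin1_id`, `bin2_id`, value). It is not the code in `convert_to_cooler.py`. The anchor coordinates, the interaction values, and the `build_tables` helper are hypothetical, and the sketch assumes that interaction files index anchors per chromosome, so a cumulative offset is needed to get genome-wide bin IDs.

```python
from itertools import accumulate

# Hypothetical anchors per chromosome: (start, end) intervals of unequal width.
anchors = {
    "chr1": [(0, 4000), (4000, 11000), (11000, 16500)],
    "chr2": [(0, 6000), (6000, 9000)],
}

# Hypothetical interactions per chromosome: (anchor index, anchor index, value).
loops = {
    "chr1": [(0, 1, 0.0841), (0, 2, 0.0827)],
    "chr2": [(1, 0, 0.5)],
}


def build_tables(anchors, loops):
    """Return (bins, pixels) with genome-wide bin IDs and variable bin widths."""
    chroms = list(anchors)
    counts = [len(anchors[c]) for c in chroms]
    # Offset of each chromosome's first anchor in the genome-wide bin table.
    offsets = dict(zip(chroms, [0] + list(accumulate(counts))[:-1]))

    # One bin per anchor; widths are whatever the anchors define.
    bins = [(c, start, end) for c in chroms for start, end in anchors[c]]

    pixels = []
    for c in chroms:
        for a1, a2, value in loops.get(c, []):
            lo, hi = sorted((a1, a2))  # keep only the upper triangle
            pixels.append((offsets[c] + lo, offsets[c] + hi, value))
    pixels.sort()  # order by bin1_id, then bin2_id
    return bins, pixels


bins, pixels = build_tables(anchors, loops)
```

Because each bin keeps its own `start` and `end`, nothing is resampled, which is why this route is lossless.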

In [36]:
! python3 ../utils/convert_to_cooler.py --anchor_dir ../training_data/anchor_bed/ \
                                        --loop_dir ../training_data/H9_denoised/H9_full/ \
                                        --out_file coolers/H9_denoise.cool \
                                        --col_names a1 a2 denoise \
                                        --cooler_col denoise;

coolers/H9_denoise.cool
100%|███████████████████████████████████████████| 24/24 [01:10<00:00,  2.94s/it]
              a1      a2  denoise   chr  chr_a1  chr_a2
0              0       1   0.0841  chr1       0       1
1              0       2   0.0827  chr1       0       2
2              0       3   0.0324  chr1       0       3
3              0       4   0.0075  chr1       0       4
4              0       7   0.0122  chr1       0       7
...          ...     ...      ...   ...     ...     ...
23008367  334610  334612   0.0789  chrY    2987    2989
23008368  334610  334613   0.1060  chrY    2987    2990
23008369  334611  334612   0.1226  chrY    2988    2989
23008370  334611  334613   0.1777  chrY    2988    2990
23008371  334612  334613   0.1589  chrY    2989    2990

[23008372 rows x 6 columns]
a1           int64
a2           int64
denoise    float64
chr         object
chr_a1       int64
chr_a2       int64
dtype: object
Saving cooler...
334612 334613 334614
     chrom     start       e

### 2. Approximate conversion (uniform bin sizes)

This option sets a uniform bin size for visualization in HiGlass. Because HiGlass does not support variable bin sizes, each non-uniform pixel must be smeared across uniform pixels (e.g., a single non-uniform pixel might become an $n \times m$ box of uniform pixels). This takes much longer than writing a cooler file with variable bin sizes.
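
The smearing step can be illustrated with a short sketch. This is not the implementation in `convert_to_cooler.py`: the `smear_pixel` helper and the example coordinates are hypothetical, and the sketch assumes that every uniform bin overlapping an anchor receives the pixel's value. How the script resolves several non-uniform pixels landing on the same uniform pixel is not covered here.

```python
def smear_pixel(anchor1, anchor2, value, bin_size):
    """Expand one variable-size pixel into the uniform pixels it overlaps.

    anchor1 and anchor2 are half-open (start, end) intervals in base pairs.
    Returns a list of (row_bin, col_bin, value) in uniform-bin coordinates.
    """

    def covered_bins(start, end):
        first = start // bin_size
        last = (end - 1) // bin_size  # end is exclusive
        return range(first, last + 1)

    return [
        (i, j, value)
        for i in covered_bins(*anchor1)
        for j in covered_bins(*anchor2)
    ]


# A 7 kb anchor against a 6 kb anchor at 5 kb resolution:
# the first spans 3 uniform bins, the second spans 2, giving a 3 x 2 box.
box = smear_pixel((4000, 11000), (20000, 26000), 0.25, bin_size=5000)
```

A single input pixel turns into $n \times m$ output pixels, so the pixel count (and the runtime) grows quickly as `bin_size` shrinks relative to the anchor widths.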

In [43]:
! python3 ../utils/convert_to_cooler.py --anchor_dir ../training_data/anchor_bed/ \
                                        --loop_dir ../training_data/H9_denoised/H9_full/ \
                                        --out_file coolers/H9_denoise_10kb_chr1.cool \
                                        --col_names a1 a2 denoise \
                                        --cooler_col denoise \
                                        --single_chrom chr1 \
                                        --bin_size 5000 \
                                        --min_val 1.0 \
                                        --force_bin_size \
                                        --zoomify \
                                        --multires_outfile coolers/H9_denoise_10kb_chr1.mcool;

coolers/H9_denoise_10kb_chr1.cool
100%|███████████████████████████████████████████| 24/24 [06:17<00:00, 15.74s/it]
Saving cooler...
Zoomifying cooler to following resolutions: [10000, 20000, 40000]


Note: if you are running this notebook remotely, you will need to specify the `server_port` argument. It cannot be the port the notebook itself uses (default 8888), and it must be forwarded in the same way you forward the notebook port.

In [44]:
from higlass.client import View, Track
from higlass.tilesets import cooler
import higlass

ts1 = cooler('coolers/H9_denoise_10kb_chr1.mcool')
tr1 = Track('heatmap', tileset=ts1)
view1 = View([tr1])
display, server, viewconf = higlass.display([view1], server_port=8889)

display

HiGlassDisplay(viewconf={'editable': True, 'views': [{'uid': 'Pbq158BBQqK0vNc-mkjNRw', 'tracks': {'top': [], '…