kaistcbfg/pyDIscore

Introduction

pyDIscore is a Python implementation of the published Directionality Index (DI) score calculation script by Dixon et al. [1]. Unlike the original, which is written in Perl (http://renlab.sdsc.edu/yanxiao/download/hic-pip/domaincall_software/perl_scripts/DI_from_matrix.pl), pyDIscore accepts either covNorm (https://github.com/kaistcbfg/covNormRpkg) output or a pickled numpy array as input.
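
For orientation, the DI defined in [1] compares the contacts a bin makes upstream (A) and downstream (B) within a fixed window, with E = (A + B) / 2: DI = sign(B - A) * ((A - E)^2 / E + (B - E)^2 / E). The snippet below is a minimal sketch of that formula on a square numpy matrix, not the pyDIscore source. The function name `directionality_index` and the choice to truncate the window at the chromosome ends are assumptions made for illustration.

```python
import numpy as np

def directionality_index(matrix, window_bins):
    """Return one DI value per bin of a square contact matrix (illustrative sketch)."""
    matrix = np.asarray(matrix, dtype=float)
    n_bins = matrix.shape[0]
    di = np.zeros(n_bins)
    for i in range(n_bins):
        # Assumption: windows are simply truncated at the matrix edges.
        lo = max(0, i - window_bins)
        hi = min(n_bins, i + window_bins + 1)
        a = matrix[i, lo:i].sum()       # A: contacts with upstream bins
        b = matrix[i, i + 1:hi].sum()   # B: contacts with downstream bins
        e = (a + b) / 2.0               # E: expected contacts without bias
        if e > 0:
            di[i] = np.sign(b - a) * ((a - e) ** 2 / e + (b - e) ** 2 / e)
    return di
```

With the defaults listed under Usage (40kb bins, 2Mb window), `window_bins` would be 2000000 // 40000 = 50.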

Installation

pyDIscore depends only on basic Python packages. All of them except numpy are part of the Python standard library, and numpy is included in distributions such as Anaconda. The code was tested on Python 3.8.10 and 2.7.15.

  • numpy
  • argparse
  • pickle
  • gzip
  • sys
git clone https://github.com/kaistcbfg/pyDIscore.git

Usage

$ python pyDIscore.py -h
usage: pyDIscore.py [-h] --input-file INPUT_FILE [--input-format INPUT_FORMAT]
                    --chrname CHRNAME --fai-file FAI_FILE
                    [--resolution RESOLUTION] [--window-size WINDOW_SIZE]
                    [--double-count-flag DOUBLE_COUNT_FLAG]
                    [--chrname-number-flag CHRNAME_NUMBER_FLAG]
                    [--fullbin-output-flag FULLBIN_OUTPUT_FLAG]
                    [--output-file OUTPUT_FILE]

python DI score calc.

optional arguments:
  -h, --help            show this help message and exit
  --input-file INPUT_FILE
                        input *.gz file (required)
  --input-format INPUT_FORMAT
                        default cov, format: cov (covNorm) or pkl (numpy
                        pickled array)
  --chrname CHRNAME     target chromosome (required)
  --fai-file FAI_FILE   FAI file for chr size (required)
  --resolution RESOLUTION
                        bin resolution (default 40kb)
  --window-size WINDOW_SIZE
                        window size (default 2Mb)
  --double-count-flag DOUBLE_COUNT_FLAG
                        default False, if True, apply /2 to matrix
  --chrname-number-flag CHRNAME_NUMBER_FLAG
                        default False, if True chr1 -> 1 (X:23, Y:24, M:25)
  --fullbin-output-flag FULLBIN_OUTPUT_FLAG
                        default True, if False, print from starter bin
  --output-file OUTPUT_FILE
                        output DI score bedgraph file. if None, print
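
To make the roles of --fai-file, --resolution and --output-file concrete, here is a rough sketch of how a chromosome size from a FAI index (tab-separated, sequence name in column 1 and length in column 2) maps to bins and to bedgraph lines (chrom, start, end, value). The helper names `read_chrom_size`, `bin_count` and `bedgraph_lines` are hypothetical, and rounding the last partial bin up is an assumption. pyDIscore itself may handle the final bin differently.

```python
import io

def read_chrom_size(fai_handle, chrname):
    """Return the length of chrname from an open .fai file."""
    for line in fai_handle:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == chrname:
            return int(fields[1])
    raise KeyError(chrname)

def bin_count(chrom_size, resolution=40000):
    """Number of bins, counting a trailing partial bin (assumption)."""
    return -(-chrom_size // resolution)  # ceiling division

def bedgraph_lines(chrname, scores, chrom_size, resolution=40000):
    """Yield one tab-separated bedgraph line per bin score."""
    for i, score in enumerate(scores):
        start = i * resolution
        end = min(start + resolution, chrom_size)  # clip the last bin
        yield "%s\t%d\t%d\t%s" % (chrname, start, end, score)

# Toy FAI content with made-up sizes, for illustration only.
fai = io.StringIO("chr1\t100000\t6\t60\t61\nchr2\t50000\t101680\t60\t61\n")
size = read_chrom_size(fai, "chr1")
lines = list(bedgraph_lines("chr1", [0.5, -1.0, 2.0], size))
```

In this toy case a 100000 bp chromosome at 40kb resolution gives three bins, and the last bedgraph interval is clipped to the chromosome end.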

Input format preparation:

  • covNorm (cov)
write.table(final_df, file=gzfile("covnorm.gz"), row.names=FALSE, col.names=TRUE, sep="\t", quote=FALSE)
  • numpy pickled array (pkl): see the Output section of the covNorm GitHub page for how to generate 'contact_map'.
import pickle
import gzip

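# contact_map is the square numpy array described above.
# The with-block closes the gzip file so the output is fully flushed.
with gzip.open('np_pkl_arr.gz', 'wb') as handle:
    pickle.dump(contact_map, handle)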
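
Before passing a pkl file to pyDIscore, it can be worth reading it back to confirm that it holds a square 2-D array. The following is a small round-trip sketch. The helper names `save_contact_map` and `load_contact_map` and the toy 3x3 matrix are illustrative and not part of pyDIscore.

```python
import gzip
import os
import pickle
import tempfile

import numpy as np

def save_contact_map(contact_map, path):
    """Pickle a numpy array into a gzip-compressed file."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(contact_map, handle)

def load_contact_map(path):
    """Load the array back and check that it is a square 2-D matrix."""
    with gzip.open(path, "rb") as handle:
        contact_map = np.asarray(pickle.load(handle))
    if contact_map.ndim != 2 or contact_map.shape[0] != contact_map.shape[1]:
        raise ValueError("expected a square 2-D contact matrix")
    return contact_map

# Toy data written to a temporary directory, for illustration only.
demo = np.arange(9, dtype=float).reshape(3, 3)
path = os.path.join(tempfile.mkdtemp(), "np_pkl_arr.gz")
save_contact_map(demo, path)
restored = load_contact_map(path)
```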

Citation

[1] Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485, 376–380 (2012). https://doi.org/10.1038/nature11082