This repository contains several utility tools for ChIP-seq and other epigenome analyses. The programs are written in ANSI C and C++11 and use the Boost library.
We recommend using the latest Docker image from DockerHub.
To use the Docker image, type:
docker pull rnakato/ssp_drompa
docker run -it --rm rnakato/ssp_drompa <command>
Note: when using the Docker image, you must mount a host directory with the -v
option so that the container can access the input files, for example:
docker run -it --rm -v $(pwd):/mnt rnakato/ssp_drompa parse2wig+ \
-i /mnt/ChIP.bam -o ChIP --odir /mnt/parse2wigdir+ --gt /mnt/genometable.txt
This command mounts the current directory to the /mnt
directory in the container.
See also the Docker documentation.
Singularity can also be used to execute the docker image:
singularity build ssp_drompa.sif docker://rnakato/ssp_drompa
singularity exec ssp_drompa.sif <command>
Singularity mounts the current directory automatically. To access files in other directories, mount them with the --bind
option, for instance:
singularity exec --bind /work ssp_drompa.sif <command>
This command mounts the /work
directory.
for Ubuntu:
sudo apt install git build-essential libboost-all-dev libcurl4-gnutls-dev libgtkmm-3.0-dev \
libgsl-dev liblzma-dev libz-dev libbz2-dev libgzstream0 libgzstream-dev cmake
for CentOS:
sudo yum -y install git gcc-c++ boost-devel
On Mac:
brew install gsl gtk gtkmm pkgconfig curl xz zlib boost cmake gzstream
git clone --recursive https://github.com/rnakato/ChIPseqTools.git
cd ChIPseqTools
make
export PATH=$PATH:(PATH_TO_ChIPseqTools)/ChIPseqTools/bin
git pull origin master
git submodule foreach git pull origin master
output shared peaks between bed1 and bed2
compare_bs -1 <bed1> -2 <bed2> -and
output bed1-unique peaks
compare_bs -1 <bed1> -2 <bed2> -not
output stats only
compare_bs -1 <bed1> -2 <bed2> -and -nobs
include neighboring peaks within 5 kbp as overlapped peaks
compare_bs -1 <bed1> -2 <bed2> -and -l 5000
consider peak summit (default: whole peak region)
compare_bs -1 <bed1> -2 <bed2> -and -maxposi
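
The -and, -not, and -l options above can be pictured with a small interval-overlap sketch. The Python below only illustrates the idea and is not the actual compare_bs implementation. The function names, the (chrom, start, end) tuple layout, the toy coordinates, and the choice to widen each bed1 peak by the extension length on both sides are all assumptions made for this example.

```python
def overlaps(peak_a, peak_b, extend=0):
    """Return True if two (chrom, start, end) peaks overlap.

    `extend` widens peak_a by that many bp on each side
    (an assumed analogue of the -l option).
    Intervals are treated as closed, so touching peaks count as overlapping.
    """
    chrom_a, start_a, end_a = peak_a
    chrom_b, start_b, end_b = peak_b
    if chrom_a != chrom_b:
        return False
    return start_a - extend <= end_b and start_b <= end_a + extend


def split_peaks(bed1, bed2, extend=0):
    """Split bed1 into peaks shared with bed2 and peaks unique to bed1."""
    shared, unique = [], []
    for peak in bed1:
        if any(overlaps(peak, other, extend) for other in bed2):
            shared.append(peak)   # conceptually what -and reports
        else:
            unique.append(peak)   # conceptually what -not reports
    return shared, unique


# Toy data (hypothetical coordinates)
bed1 = [("chr1", 100, 200), ("chr1", 10000, 10100)]
bed2 = [("chr1", 150, 250), ("chr1", 14000, 14100)]

shared, unique = split_peaks(bed1, bed2)
shared_5k, unique_5k = split_peaks(bed1, bed2, extend=5000)
```

With no extension only the first bed1 peak is shared. With a 5000 bp extension the second peak also reaches its neighbor in bed2 and is counted as overlapping.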
gene (gtf) and peaks
compare_bed2tss -g <gtf> -b <peak> --gt <genome table>
gene (refFlat) and peaks
compare_bed2tss -g <refFlat> --refFlat -b <peak> --gt <genome table>
compare with gene body
compare_bed2tss -g <gtf> -b <peak> --gt <genome table> --mode 1
proportion of peaks against whole genome
compare_bed2tss -g <gtf> -b <peak> --gt <genome table> --mode 2
proportion of peaks against whole genome (distinguish exon and intron)
compare_bed2tss -g <gtf> -b <peak> --gt <genome table> --mode 2 --intron
Usage: compare_bed2loop [option] --bed1 <1st bed> --bed2 <2nd bed> --loop <loop file> -o <output> --gt <genome_table>
Options:
  --bed1 arg                1st bed file
  --bed2 arg                2nd bed file
  --loop arg                Loop file
  -l [ --length ] arg (=0)  Extend length for overlap
  --gt arg                  Genome table (tab-delimited file describing the name
                            and length of each chromosome)
  --nobs                    do not output the overlapped loop list
  --hiccups                 HICCups format as input (default: Mango)
  -h [ --help ]             print this message
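
The genome table passed to --gt is described above as a tab-delimited file with the name and length of each chromosome. As a sketch only, the Python below derives such lines from the text of a samtools FASTA index (.fai), whose first two columns are the sequence name and length. The function name and the inline example values are hypothetical, and this is not a script shipped with this repository.

```python
def fai_to_genome_table(fai_text):
    """Convert .fai index text into 'name<TAB>length' lines."""
    rows = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        # .fai columns: name, length, offset, linebases, linewidth
        name, length = fields[0], int(fields[1])
        rows.append(f"{name}\t{length}")
    return "\n".join(rows) + "\n"


# Toy index content (hypothetical values, not a real genome)
example_fai = "chr1\t1000\t6\t60\t61\nchr2\t500\t1030\t60\t61\n"
genome_table = fai_to_genome_table(example_fai)
```

Writing `genome_table` to a file would give a two-column table of the kind the --gt option expects.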
mergebed2CRM -i <bs file> -name <name> [-i <bs file> -name <name> ...]
-l: extend length (default:0)
-n: number of peaks for clustering (default:3000, setting 0 means use all peaks)
-qnt: quantitative analysis
Repeat analysis.
FRiR [option] -r <repeatfile> -i <inputfile> -o <output> --gt <genome_table>
<repeatfile> is the RepeatMasker file downloaded from the UCSC Genome Browser. FRiR also accepts a gzipped repeat file.