SEM_CPP

C++ implementation of the SEM algorithm

System Requirements

Hardware Requirements

Generating a SEM requires RAM and disk storage that vary with the size of the initial PWM being considered. As a minimum, we recommend a computer with the following specs:

RAM: 64+ GB
CPU: 8+ cores, 3.4+ GHz/core

The runtime on this minimal system is approximately 38 CPU hours. Compile time is approximately 35 seconds.
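
A quick way to compare a machine against the recommended figures above is sketched here. It assumes a Linux system that provides `nproc` and `/proc/meminfo`; the helper name `check_spec` is made up for this example and is not part of SEMpl. Note that `nproc` counts logical CPUs, not physical cores.

```shell
#!/bin/sh
# Sketch only: compare this machine with the recommended 8 cores / 64 GB RAM.
# check_spec is a hypothetical helper, not part of the SEMpl code base.

# check_spec NAME ACTUAL RECOMMENDED -> prints a one-line comparison
check_spec() {
    if [ "$2" -ge "$3" ]; then
        echo "$1: $2 (meets recommended $3)"
    else
        echo "$1: $2 (below recommended $3)"
    fi
}

# Logical CPU count; falls back to 0 if nproc is unavailable.
cores=$(nproc 2>/dev/null || echo 0)
# MemTotal is reported in kB; convert to GB, rounded to the nearest integer.
ram_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576 + 0.5}' /proc/meminfo 2>/dev/null)

check_spec "CPU cores" "$cores" 8
check_spec "RAM (GB)" "${ram_gb:-0}" 64
```

`MemTotal` excludes memory reserved by the firmware and kernel, so a machine with 64 GB installed may report slightly less; treat the RAM line as approximate.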

Software Requirements

The development version of the package has been tested on Linux operating systems, specifically on the following system:

Linux: Ubuntu 18.04
Packages: libcurl4-dev
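
To check whether the libcurl development files are already present, something like the following may help. It relies on the `curl-config` script that libcurl development packages normally install. The package name in the hint, `libcurl4-openssl-dev`, is an assumption: it is one Ubuntu package that provides these files, not necessarily the one the authors used. The helper `have_cmd` is made up for this example.

```shell
#!/bin/sh
# Sketch only: look for libcurl development files via curl-config.
# have_cmd is a hypothetical helper, not part of the SEMpl code base.

# have_cmd NAME -> succeeds if NAME is found on PATH
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd curl-config; then
    echo "libcurl development files found: $(curl-config --version)"
else
    # Assumed package name; any package providing the libcurl headers should do.
    echo "curl-config not found; on Ubuntu try: sudo apt-get install libcurl4-openssl-dev"
fi
```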

Demo

We include a small demo that generates the SEM for HNF4A in HepG2 cells. Execution time of this demo is approximately 6791 seconds on 20 threads. The expected output is:

Running Iterative SEM building..
        PWM: examples/MA0114.1.pwm
        merge_file: examples/wgEncodeOpenChromDnaseHepg2Pk.narrowPeak.gz
        bigwig: examples/wgEncodeHaibTfbsHepg2Hnf4asc8987V0416101RawRep1.bigWig
        TF_name: HNF4A
         output: results/HNF4A/
        cachefile flag: results/HNF4A/HNF4A.cache.db
        verbose
....

Installation

Clone a copy of the SEMpl repository and submodules:

git clone --recurse-submodules https://github.com/Boyle-Lab/SEM_CPP.git

Build external libraries:

cd SEM_CPP/lib/libBigWig
make
cd ..
make
mv */*.so .
cd ..

Symlink to bowtie index location (use your own index location):

ln -s /data/genomes/hg19/bowtie_index/ data
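
A broken symlink is easy to miss until the run fails, so it can be worth confirming that the index is reachable through `data`. The sketch below assumes a bowtie 1 index (files ending in `.ebwt` or `.ebwtl`) with the prefix `hg19`, inferred from the `-genome data/hg19` argument in the usage example; adjust both if your index differs. The helper `index_status` is made up for this example.

```shell
#!/bin/sh
# Sketch only: confirm that the symlinked directory contains index files.
# Assumes bowtie 1 naming (PREFIX.*.ebwt or PREFIX.*.ebwtl).
# index_status is a hypothetical helper, not part of the SEMpl code base.

# index_status DIR PREFIX -> prints what was found under DIR
index_status() {
    dir=$1
    prefix=$2
    if [ ! -d "$dir" ]; then
        # -d follows symlinks, so a dangling link also ends up here.
        echo "missing: $dir is not a directory (is the symlink broken?)"
    elif ls "$dir/$prefix".*.ebwt* >/dev/null 2>&1; then
        echo "found bowtie index files for $dir/$prefix"
    else
        echo "no $prefix.*.ebwt files in $dir"
    fi
}

index_status data hg19
```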

Build SEMpl:

make

Usage information

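SEMpl runs as an iterative process and requires specific input files. As the example below shows, these are a starting PWM (-PWM), a narrowPeak file (-merge_file), and a bigWig signal file (-big_wig), together with a TF name (-TF_name), a genome location (-genome), and an output directory (-output). The following example builds the SEM for HNF4A in HepG2 cells from the example data: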

./iterativeSEM -PWM examples/MA0114.1.pwm -merge_file examples/wgEncodeOpenChromDnaseHepg2Pk.narrowPeak -big_wig examples/wgEncodeHaibTfbsHepg2Hnf4asc8987V0416101RawRep1.bigWig -TF_name HNF4A -genome data/hg19 -output results/HNF4A
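
Because a full run can take many hours, a small pre-flight check on the input files can save a wasted start. The wrapper below is a sketch: the file names and flags are copied from the command above, while the helper `missing_inputs` is made up for this example and is not part of SEMpl.

```shell
#!/bin/sh
# Sketch only: confirm the example inputs exist before starting a long run.
# missing_inputs is a hypothetical helper, not part of the SEMpl code base.

# missing_inputs FILE... -> prints each path that is not a regular file
missing_inputs() {
    for f in "$@"; do
        [ -f "$f" ] || echo "$f"
    done
}

pwm=examples/MA0114.1.pwm
peaks=examples/wgEncodeOpenChromDnaseHepg2Pk.narrowPeak
signal=examples/wgEncodeHaibTfbsHepg2Hnf4asc8987V0416101RawRep1.bigWig

missing=$(missing_inputs "$pwm" "$peaks" "$signal")
if [ -n "$missing" ]; then
    echo "missing input files:"
    echo "$missing"
else
    # Same invocation as the usage example above.
    ./iterativeSEM -PWM "$pwm" -merge_file "$peaks" -big_wig "$signal" \
        -TF_name HNF4A -genome data/hg19 -output results/HNF4A
fi
```

Run it from the repository root so that the relative `examples/` and `data/` paths resolve.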

Testing

Run "make test" to compile and run this input example.
