A hybrid deep convolutional neural network for predicting chromatin accessibility

Deopen

Deopen is a hybrid deep-learning framework that automatically learns the regulatory code of DNA sequences and predicts chromatin accessibility.

Requirements

  • h5py
  • hickle
  • Scikit-learn=0.18.2
  • Theano=0.8.0
  • Lasagne=0.2.dev1
  • nolearn=0.6.0

Installation

Download Deopen by

git clone https://github.com/kimmo1019/Deopen

Installation has been tested on Linux and macOS with Python 2.7.

Instructions

Preprocessing data for model training

python Gen_data.py <options> -pos <positive_bed_file> -neg <negative_bed_file> -out <outputfile>
Arguments:
  positive_bed_file: positive samples (bed format)
  e.g. chr1	9995	10995	
       chr3	564753	565753
       chr7	565935	566935
       
  negative_bed_file: negative samples (bed format)
  e.g. chr1	121471114	121472114	
       chr2	26268350	26269350
       chr5	100783702	100784702
  
  outputfile: preprocessed data for model training (hkl format)
 
Options:
  -l <int> length of sequence (default: 1000)
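Every example region above spans exactly 1000 bp (end minus start), which matches the default of `-l`. The following is a minimal sketch, not part of Deopen, for checking a BED file against the chosen length before preprocessing. The function name `check_bed_lengths` is hypothetical, and whether `Gen_data.py` itself rejects or resizes regions of other lengths is not stated here, so treat the check as a precaution rather than a documented requirement.

```python
def check_bed_lengths(lines, expected_length=1000):
    """Return (line_number, chrom, start, end) for each region whose
    width (end - start) differs from expected_length."""
    mismatches = []
    for line_number, line in enumerate(lines, start=1):
        # Skip common BED header lines and comments.
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        # Skip blank or incomplete lines (fewer than chrom/start/end).
        if len(fields) < 3:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if end - start != expected_length:
            mismatches.append((line_number, chrom, start, end))
    return mismatches


# The positive examples from the argument description above.
example = [
    "chr1\t9995\t10995",
    "chr3\t564753\t565753",
    "chr7\t565935\t566935",
]
print(check_bed_lengths(example))  # prints []
```

To check a real file, pass an open file handle instead of the list, for example `check_bed_lengths(open("positive.bed"))`, where `positive.bed` is a placeholder name.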

Run Deopen classification model

THEANO_FLAGS='device=gpu,floatX=float32' python Deopen_classification.py -in <inputfile> -out <outputfile>
Arguments:
  inputfile: preprocessed data for model training (hkl format)
  outputfile: prediction outcome to be saved (hkl format)
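When the model is launched from another Python script rather than from the shell, the same invocation can be assembled with the standard library. This is a sketch under assumptions: the helper `build_deopen_command` and the file names `train.hkl` and `pred.hkl` are hypothetical, and only the `THEANO_FLAGS` value and the `-in`/`-out` arguments come from the command shown above. Passing `device="cpu"` uses Theano's CPU device on machines without a GPU, although this README only documents the GPU setting.

```python
import os
import subprocess


def build_deopen_command(script, input_file, output_file, device="gpu"):
    """Build the argument list and environment for one Deopen run.

    Returns (argv, env) without launching anything, so the caller
    can inspect both before starting the process.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["THEANO_FLAGS"] = "device={0},floatX=float32".format(device)
    argv = ["python", script, "-in", input_file, "-out", output_file]
    return argv, env


# Hypothetical file names, for illustration only.
argv, env = build_deopen_command(
    "Deopen_classification.py", "train.hkl", "pred.hkl"
)

# Uncomment to actually start training (requires the dependencies above):
# subprocess.check_call(argv, env=env)
```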

Run Deopen regression model

THEANO_FLAGS='device=gpu,floatX=float32' python Deopen_regression.py -in <inputfile> -reads <readsfile> -out <outputfile>
Arguments:
  inputfile: preprocessed file containing different features (hkl format)
  readsfile: reads count for each sample (hkl format)
  outputfile: trained model to be saved (hkl format)

Citation

Liu Q, Xia F, Yin Q, et al. Chromatin accessibility prediction via a hybrid deep convolutional neural network. Bioinformatics, 2017, 1: 7.

License

This project is licensed under the MIT License. See the LICENSE file for details.