An integrative deep learning framework for functional genomic data inference.
A project from the Churchman Lab, Harvard Medical School Department of Genetics.
Based on: [http://biorxiv.org/content/early/2016/10/17/081380.full.pdf]
Ongoing:
- Generalized data preparation pipeline
- GUI interface
Contents:
- Installation and Quick Start
- Input File Details
- HMS Orchestra HPC Instructions
The quick start can be run on a local machine, although an HPC environment is preferable.
NOTE: Requires Python 2.7 and pip. Anaconda can interfere with the setup. Comment out any "export PATH" lines that point to Anaconda in your ~/.bash_profile or ~/.bashrc, then re-source the file (or restart the current terminal session).
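The prerequisites in the note above can be checked before installing anything. The following is a minimal sketch, not part of FIDDLE: the function names are hypothetical, and matching the substring "conda" in PATH entries is only a heuristic assumption for spotting Anaconda or Miniconda.

```python
import os
import sys


def conda_path_entries(path_value):
    """Return the PATH entries whose text contains 'conda' (case-insensitive)."""
    entries = [p for p in path_value.split(os.pathsep) if p]
    return [p for p in entries if "conda" in p.lower()]


def check_environment(version_info=sys.version_info, path_value=None):
    """Collect human-readable warnings about the interpreter version and PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    major_minor = tuple(version_info[:2])
    warnings = []
    if major_minor != (2, 7):
        # The note above states that Python 2.7 is required.
        warnings.append("Python %d.%d found; Python 2.7 is expected" % major_minor)
    for entry in conda_path_entries(path_value):
        warnings.append("conda-related PATH entry: %s" % entry)
    return warnings


if __name__ == "__main__":
    for line in check_environment() or ["No problems detected."]:
        print(line)
```

An empty list from `check_environment()` means neither condition was triggered; it does not guarantee that the rest of the installation will succeed.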
$ sudo easy_install pip
$ sudo pip install virtualenv
$ git clone https://github.com/ueser/FIDDLE.git
$ cd FIDDLE/
$ virtualenv venvFIDDLE
$ source venvFIDDLE/bin/activate
$ pip install -r requirements.txt
$ mkdir -p data/hdf5datasets/
Place the following datasets in /FIDDLE/data/hdf5datasets/
WARNING: the datasets total several GB.
$ cd fiddle
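Once the datasets are in place, a short script can confirm that the files landed in the expected directory and report their sizes. This is a sketch under assumptions: the helper name is hypothetical, and the `.h5`/`.hdf5` extensions are a guess based on the directory name, so adjust them to match the actual files.

```python
import os


def list_dataset_files(directory, extensions=(".h5", ".hdf5")):
    """Return (filename, size_in_GB) pairs for matching files, sorted by name."""
    suffixes = tuple(extensions)  # str.endswith needs a tuple, not a list
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.lower().endswith(suffixes):
            size_gb = os.path.getsize(path) / float(1024 ** 3)
            found.append((name, size_gb))
    return found


if __name__ == "__main__":
    # Path relative to FIDDLE/fiddle, the directory entered in the last step.
    data_dir = os.path.join("..", "data", "hdf5datasets")
    if os.path.isdir(data_dir):
        for name, size_gb in list_dataset_files(data_dir):
            print("%-40s %8.2f GB" % (name, size_gb))
    else:
        print("Directory not found: %s" % data_dir)
```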
Documentation Interlude
There are two (of many) methods to examine FIDDLE's internal documentation and docstrings:
$ python
>>> import main # or any other FIDDLE Python script
>>> help(main)
$ python main.py --help
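A third option is to pull the docstrings out programmatically with the standard `inspect` module. The sketch below is illustrative only: `summarize_docstrings` is a hypothetical helper, and the standard-library `json` module stands in for a FIDDLE script such as `main`, which would need to be imported from inside FIDDLE/fiddle.

```python
import inspect
import json  # stand-in module; inside FIDDLE/fiddle, try `import main` instead


def summarize_docstrings(module):
    """Map each public function in the module's namespace to the first line of its docstring."""
    summary = {}
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers
        doc = inspect.getdoc(obj) or "(no docstring)"
        summary[name] = doc.splitlines()[0]
    return summary


if __name__ == "__main__":
    for name, first_line in sorted(summarize_docstrings(json).items()):
        print("%-16s %s" % (name, first_line))
```

This gives a one-line overview per function, which can be quicker to scan than the full `help()` output.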
$ python main.py
$ python visualization.py
$ python analysis.py
Change directories to FIDDLE/results/<runName>/, where <runName> is the value passed to --runName (default: experiment). The training trajectory visualization files (.png and .gif) are in this directory. The representations and predictions created in step 5 are stored in the HDF5 files "representations.h5" and "predictions.h5".
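The output files described above can also be located and inspected from Python. This is a minimal sketch rather than part of FIDDLE: the helper names are hypothetical, and `h5py` is assumed to be installed (it is only needed for `describe_hdf5`). The internal layout of the two HDF5 files is not documented here, so the code simply lists whatever datasets it finds.

```python
import glob
import os

try:
    import h5py  # assumption: available in the FIDDLE environment
except ImportError:
    h5py = None


def results_dir(run_name="experiment", root="FIDDLE"):
    """Build the results path for a run; 'experiment' mirrors the --runName default."""
    return os.path.join(root, "results", run_name)


def find_outputs(directory):
    """Return the image files and the two named HDF5 files present in a results directory."""
    images = glob.glob(os.path.join(directory, "*.png"))
    images += glob.glob(os.path.join(directory, "*.gif"))
    candidates = [os.path.join(directory, name)
                  for name in ("representations.h5", "predictions.h5")]
    return {
        "images": sorted(images),
        "hdf5": [path for path in candidates if os.path.isfile(path)],
    }


def describe_hdf5(path):
    """Return (dataset_name, shape, dtype) for every dataset in an HDF5 file."""
    if h5py is None:
        raise RuntimeError("h5py is required to read %s" % path)
    rows = []

    def visit(name, obj):
        # Groups are skipped; returning None keeps the traversal going.
        if isinstance(obj, h5py.Dataset):
            rows.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return rows
```

For example, `find_outputs(results_dir())` lists the outputs of a run that used the default name, and `describe_hdf5` can then be called on each path under the `"hdf5"` key.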
Change directories to FIDDLE/fiddle, start a Jupyter Notebook session, open 'predictions_visualization.ipynb', and follow the instructions in its Markdown cells.
To install Jupyter Notebook, start here: http://jupyter.readthedocs.io/en/latest/install.html.
$ jupyter notebook
For more complete instructions on file types and FIDDLE's workflow, open the 'guide.ipynb' Jupyter notebook.
$ cd FIDDLE/fiddle
$ jupyter notebook
$ bsub -Is -q interactive bash
$ cd FIDDLE/
$ module load dev/tensorflow/1.0-GPU
Orchestra's TensorFlow module does not work well with virtual environments, so the module above must be loaded before creating and activating a virtual environment. More here: https://wiki.med.harvard.edu/Orchestra/PersonalPythonPackages
$ virtualenv venvFIDDLE --system-site-packages
$ source venvFIDDLE/bin/activate
$ vim requirements.txt
In requirements.txt, comment out the TensorFlow line: tensorflow==1.0.1 --> #tensorflow==1.0.1
$ pip install -r requirements.txt
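As an alternative to editing requirements.txt by hand, the TensorFlow line can be commented out with a small script run before `pip install`. This is a sketch with hypothetical helper names; it assumes the file uses ordinary `package==version` style lines and leaves differently named packages (for example `tensorflow-gpu`) untouched.

```python
import re


def comment_out_requirement(text, package="tensorflow"):
    """Prefix '#' to every uncommented line that specifies exactly the given package."""
    pattern = re.compile(
        r"^\s*%s\s*(==|>=|<=|~=|!=|<|>|$)" % re.escape(package),
        re.IGNORECASE,
    )
    lines = []
    for line in text.splitlines():
        lines.append("#" + line if pattern.match(line) else line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")


def comment_out_in_file(path="requirements.txt", package="tensorflow"):
    """Apply comment_out_requirement to a file in place."""
    with open(path) as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(comment_out_requirement(original, package))
```

Calling `comment_out_in_file()` from the FIDDLE/ directory rewrites the file in place. Running it twice is harmless, because lines that are already commented no longer match.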
A job submission template is provided in FIDDLE/fiddle/; modify it as needed. More on GPU usage here: https://wiki.med.harvard.edu/Orchestra/OrchestraNvidiaGPUs.
$ vim orchestra_job_submit.sh
$ bash orchestra_job_submit.sh