PopPhy-CNN

PopPhy-CNN is a novel convolutional neural network (CNN) learning architecture that effectively exploits phylogenetic structure in microbial taxa. PopPhy-CNN takes as input a 2D matrix created by embedding the phylogenetic tree, which is populated with the relative abundances of the microbial taxa in a metagenomic sample. This conversion enables CNNs to explore the spatial relationships among the taxonomic annotations on the tree together with their quantitative characteristics in metagenomic data.
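To make the idea of a tree-populated matrix concrete, here is a minimal, simplified sketch in plain Python. It is not the code used in this repository: the function names (`build_tree`, `populate`, `embed`), the toy lineages, and the layout rule (one matrix row per tree depth, nodes placed left to right, zero-padded) are all illustrative assumptions, as is the choice to set each internal node's abundance to the sum of its children.

```python
# Illustrative sketch only; names and layout rule are assumptions,
# not the PopPhy-CNN implementation.

def build_tree(abundances):
    """Build a nested-dict tree from {lineage: relative abundance}."""
    root = {"name": "root", "children": {}, "abundance": 0.0}
    for lineage, value in abundances.items():
        node = root
        for name in lineage.split("|"):
            node = node["children"].setdefault(
                name, {"name": name, "children": {}, "abundance": 0.0})
        node["abundance"] = value  # the leaf taxon carries the observed value
    return root

def populate(node):
    """Assumption: an internal node's abundance is the sum of its children."""
    if node["children"]:
        node["abundance"] = sum(populate(c) for c in node["children"].values())
    return node["abundance"]

def embed(root):
    """Row i holds the abundances at tree depth i, left to right, zero-padded."""
    rows, level = [], [root]
    while level:
        rows.append([n["abundance"] for n in level])
        level = [c for n in level for c in n["children"].values()]
    width = max(len(r) for r in rows)
    return [r + [0.0] * (width - len(r)) for r in rows]

# Toy sample with hypothetical two-level lineages.
sample = {
    "p__A|g__A1": 0.5,
    "p__A|g__A2": 0.25,
    "p__B|g__B1": 0.25,
}
tree = build_tree(sample)
populate(tree)
matrix = embed(tree)
for row in matrix:
    print(row)
```

Each sample yields one such matrix, so related taxa sit near each other and a 2D convolution can pick up patterns that span neighboring branches.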

Publication:

  • Reiman D, Metwally AA, Sun J, Dai Y. PopPhy-CNN: A Phylogenetic Tree Embedded Architecture for Convolutional Neural Networks to Predict Host Phenotype From Metagenomic Data. IEEE J Biomed Health Inform. 2020 Oct;24(10):2993-3001. doi: 10.1109/JBHI.2020.2993761. Epub 2020 May 11. PMID: 32396115. [paper]

Execution:

We provide a Python environment which can be imported using the Conda package manager.

Deep learning models are built using TensorFlow. PopPhy-CNN has been updated to use TensorFlow v1.14.0.

To fully utilize GPUs for faster training of the deep learning models, make sure that both CUDA and cuDNN are properly installed.

Other dependencies should be downloaded upon importing the provided environment.

Clone Repository

git clone https://github.com/YDaiLab/PopPhy-CNN.git
cd PopPhy-CNN

Import Conda Environment

conda env create -f PopPhy.yml
source activate PopPhy
cd src

Set Configuration Parameters:

Edit config.py to customize your PopPhy-CNN execution. Each dataset needs to be placed in its own folder within the data/ directory. The folder must contain an abundance file in which each column is a sample and each row is a taxon, with each taxon named following the example below:

k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Actinomycetaceae|g__Actinomyces|s__Actinomyces_graevenitzii

In this example, the taxon is Actinomyces graevenitzii, which belongs to the Bacteria kingdom, Actinobacteria phylum, Actinobacteria class, Actinomycetales order, Actinomycetaceae family, Actinomyces genus, and graevenitzii species. Note that the 's__' identifier should include both the genus and the species.
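As a quick sanity check before training, the taxon names in an abundance file can be parsed and validated against this convention. The sketch below is a hypothetical helper, not part of the PopPhy-CNN source; `parse_lineage` and `species_includes_genus` are names introduced here for illustration, and it assumes every field uses one of the seven prefixes shown in the example.

```python
# Hypothetical helper for checking taxon names; not part of PopPhy-CNN.
RANK_PREFIXES = {
    "k": "kingdom", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}

def parse_lineage(lineage):
    """Split a '|'-separated lineage into a {rank: name} dictionary."""
    ranks = {}
    for field in lineage.strip().split("|"):
        prefix, sep, name = field.partition("__")
        if not sep or prefix not in RANK_PREFIXES:
            raise ValueError("Unrecognized taxonomy field: {!r}".format(field))
        ranks[RANK_PREFIXES[prefix]] = name
    return ranks

def species_includes_genus(ranks):
    """Return True if the species name is prefixed with the genus name."""
    genus, species = ranks.get("genus"), ranks.get("species")
    if genus is None or species is None:
        return False
    return species.startswith(genus + "_")

example = ("k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|"
           "f__Actinomycetaceae|g__Actinomyces|s__Actinomyces_graevenitzii")
ranks = parse_lineage(example)
print(ranks["family"], ranks["species"])
print(species_includes_genus(ranks))
```

Running a check like this over the first column of the abundance file catches misspelled prefixes and species entries that omit the genus.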

Run PopPhy-CNN:

Once the configuration file is set, PopPhy-CNN is executed with

python train.py

Results are saved in the results directory under a subdirectory with the same name as the dataset's folder.

Visualizing the Results

Cytoscape can be used to visualize the results from PopPhy-CNN's analysis. To do so, install and run Cytoscape. From the timestamped folder in the results directory, load the file 'network.json' into Cytoscape. Then import the Cytoscape style file 'style.xml' found in the 'cytoscape_style' directory. It may also be useful to install the yFiles layouts and visualize the tree using the yFiles radial layout.
