Skip to content

IcarPA-TBlab/MetagenomicDC

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 

Repository files navigation

MetagenomicDC

MetagenomicDC: Metagenomic Data Classifier based on deep learning models

A 16S short-read sequences classification tool based on k-mer representation and deep learning architecture (CNN and DBN). It can be integrated into the most common pipelines for metagenomic analysis. It can be successfully used for classifying both SG and AMP data.

Dependences:

  • Python (2.7.x)
  • Theano (0.8.2)
  • Keras library (2.x)

Datasets

This distribution contains three datasets. They are available in "data" folder.

  • The "1000seq.fasta" dataset has been used for experiments in the manuscript submitted for pubblication at BMC Bioinformatics journal. It is composed by a training dataset with 1000 16S fasta sequences in unaligned fasta format, with 100 genera and 10 species of each genus. All of them are extracted from Rfam database (release 11, update 5 dated September 30, 2016).
  • The "16S-AMP-trimmed.fa.zip" dataset contains simulated amplicon technology short-reads (with the Grinder tool in V3-V4 hypervariable region using the following primers: CCTACGGGAGGCAGCAG and CCGTCAATTCMTTTRAGT), from "1000seq.fasta" dataset.
  • The "16S-SG-reads.fa.zip" dataset contains simulated shotgun short-reads (with the Grinder tool), from "1000seq.fasta" dataset.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages