Skip to content

asleofn/APD

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 

Repository files navigation

APD ( Advanced Probiotics species Detection )

This python script is developed for probiotic species detection from metagenomic sample using whole metagenome shotgun sequencing. This method using the read mapping coverage for species detection and representative genome sequence for each species and sub-species were selected based on all pairwise comparison of all available complete genomes for lactic acid generating bacteria.

[Prerequisites]

- python2	(https://www.python.org/)
- bowtie2	(http://bowtie-bio.sourceforge.net/bowtie2/index.shtml)
- SAMtools	(http://samtools.sourceforge.net/)
- bedtools	(https://bedtools.readthedocs.io/en/latest/)
- TMAP  	(https://github.com/iontorrent/TS/tree/master/Analysis/TMAP) * for Ion Torrent platform

 # Executables of all program have to be added to ystem Path.

[Installation]

  1. Download pre-builted index database.

  2. Extract the database index file.

     mkdir Database && mkdir Database/proportion
     
     (for Illumina platform)
     tar -zxvf 132_reference_database_bowtie2_index.tar.gz -C Database
     tar -zxvf 23_proportion_database_bowtie2_index.tar.gz -C Database/proportion
     
     (for Ion Torrent platform)
     tar -zxvf 132_reference_database_TMAP_index.tar.gz -C Database
     tar -zxvf 23_proportion_database_TMAP_index.tar.gz -C Database/proportion
    

[Basic Usage]

1. Execute the pipeline.

	python APD.py

2. Select analysis. 
	
	"Analysis (identification or proportion):" --> Select type the Analysis. (Example: identification)

3. Select Sequencing Platform (Current support Illumina, Ion torrent)
	
	"platform(illumina or ion_torrent):" --> Select the platform for the sequencing file.(Example: illumina)

4. Prefix of the Result file
	
	"OUTPUT_FILE name:" --> Specify the prefix of output file name. (Example: Result_product_A)

5. Path of rawdata
	
	"fastq_1 dir:" --> Select forward fastq file.(Example: probiotics_product_A_1.fastq)
	"fastq_2 dir:" --> Select reverse fastq file.(Example: probiotics_product_A_2.fastq)

[Output]

1.Identification
	
(example) Result_product_A.txt

52676,Lactobacillus_acidophilus,GCF_000389675.2 	0.998884
38578,Streptococcus_thermophilus,GCF_900094135.1 	0.976236
48963,Enterococcus_faecalis,GCF_001886675.1 		0.940688
54998,Bifidobacterium_bifidum,GCF_000164965.1 		0.940438
52829,Lactobacillus_helveticus,GCF_000525715.1 		0.935566
53113,Lactobacillus_paracasei,GCF_001514415.1 		0.809714
53140,Lactobacillus_reuteri,GCF_001046835.1 		0.804741
49823,Lactococcus_lactis,GCF_000006865.1 		0.350231

...

Column 1 : Representative genome of LAB
Column 2 : mapping coverage

* mapping coverage > 0.7137, it is judged to be identified.
* In the above example, L.lactis is not included in the product (mapping coverage : 0.350231 < 0.7137).

2.Proportion

The proportion can be calculated through the Result_product_A.genomecov file, see published article M&M for details.

About

Advanced Probiotics species Detection

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published