NGSPanPipe

pipeline.pl -- A Perl script for pangenome identification.

ABSTRACT:

We report a one-step pipeline for pan-genome construction. Using a novel approach, the pipeline takes raw reads in FASTQ format as input, collapses them, and generates contigs so as to give maximum genome coverage. It produces a matrix file in binary format that records the presence or absence of the genes in the reference genome, as well as acquired genes. The script also writes a novel.txt file containing reads that can be analyzed further to identify novel genes or pathways.
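To make the idea of "collapsing" reads concrete, the following is a minimal Python sketch that counts identical read sequences in a FASTQ stream. It is an illustration only, not the code used by the pipeline's Perl scripts: the function name `collapse_reads`, the assumption of strict 4-line FASTQ records, and the toy reads are all hypothetical.

```python
from collections import Counter
from io import StringIO


def collapse_reads(fastq_handle):
    """Count identical read sequences in a FASTQ stream.

    Assumes each record occupies exactly 4 lines (header, sequence,
    separator, qualities); multi-line FASTQ is not handled.
    """
    counts = Counter()
    for line_number, line in enumerate(fastq_handle):
        # In a 4-line record, the sequence is the second line.
        if line_number % 4 == 1:
            counts[line.strip()] += 1
    return counts


# Toy data for illustration, not real reads.
example = StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nTTGA\n+\nIIII\n"
)
collapsed = collapse_reads(example)
for sequence, count in collapsed.most_common():
    print(sequence, count)
```

Each distinct sequence is kept once together with its count, which reduces the number of reads passed to later steps.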

PREREQUISITE:

NGSPanPipe is platform independent. Before execution, it requires the following input from the command line: reads in FASTQ format (zipped or unzipped); a reference sequence file (FASTA format); a read-filtering parameter n, the number of strains in which a read must be present to be considered a real read (n=5 was used for the test data); and the reference genome protein translation table (PTT) file (*.ptt format, as provided by NCBI), to which the aligned reads are mapped. Perl and the BWA (Burrows-Wheeler Aligner) tool must be installed on the user's system. The reference file and the PTT file for the desired organism must be placed in the script folder.
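The read-filtering parameter can be illustrated with a small Python sketch that keeps only reads seen in at least n strains. This is a sketch under stated assumptions, not the pipeline's implementation: the function name `filter_reads_by_strain_count`, the in-memory dictionary input, and the inclusive threshold (a read present in exactly n strains is kept) are assumptions made for the example.

```python
from collections import defaultdict


def filter_reads_by_strain_count(reads_by_strain, min_strains=5):
    """Return the set of reads present in at least `min_strains` strains.

    `reads_by_strain` maps a strain name to an iterable of read sequences.
    The inclusive comparison (>=) is an assumption of this sketch.
    """
    strains_per_read = defaultdict(set)
    for strain, reads in reads_by_strain.items():
        for read in reads:
            strains_per_read[read].add(strain)
    return {
        read
        for read, strains in strains_per_read.items()
        if len(strains) >= min_strains
    }


# Toy data: "ACGT" occurs in six strains, "TTGA" in only two.
toy_reads = {
    "strain1": ["ACGT", "TTGA"],
    "strain2": ["ACGT", "TTGA"],
    "strain3": ["ACGT"],
    "strain4": ["ACGT"],
    "strain5": ["ACGT"],
    "strain6": ["ACGT"],
}
kept = filter_reads_by_strain_count(toy_reads, min_strains=5)
print(sorted(kept))
```

With n=5, the read found in only two strains is discarded, while the read shared by six strains is retained.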

IMPLEMENTATION:

The single-script pipeline (pipeline.pl) integrates multiple Perl scripts. Users download NGSPanPipe and run pipeline.pl with the command 'perl pipeline.pl'. This generates panmatrix.txt, the matrix in binary format, and coverage.txt, which reports pangenome coverage. An additional output file, novel.txt, contains unannotated novel reads.
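As an illustration of what a binary presence/absence matrix looks like, the Python sketch below builds one from a mapping of strains to the genes detected in them. The exact layout of panmatrix.txt is not described here, so the tab-separated format with one row per gene and one column per strain is an assumption, as are the function names and the toy gene and strain names.

```python
def build_presence_matrix(genes, strains, genes_found):
    """Return (gene, flags) rows, with flags[i] = 1 if the gene was
    found in strains[i] and 0 otherwise."""
    rows = []
    for gene in genes:
        flags = [
            1 if gene in genes_found.get(strain, set()) else 0
            for strain in strains
        ]
        rows.append((gene, flags))
    return rows


def format_matrix(strains, rows):
    """Render the rows as tab-separated text (layout assumed for this sketch)."""
    lines = ["gene\t" + "\t".join(strains)]
    for gene, flags in rows:
        lines.append(gene + "\t" + "\t".join(str(flag) for flag in flags))
    return "\n".join(lines)


# Toy data for illustration only.
genes = ["geneA", "geneB"]
strains = ["strain1", "strain2"]
genes_found = {"strain1": {"geneA", "geneB"}, "strain2": {"geneA"}}

rows = build_presence_matrix(genes, strains, genes_found)
matrix_text = format_matrix(strains, rows)
print(matrix_text)
```

In this toy example geneA is present in both strains, while geneB is present only in strain1.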

FILES:

README.md
align-nt.pl
align2.pl
collapse1.pl
comaprison.xls
comp.pl
comparison.xls
count1.pl
ext-read.pl
index.pl
mapped.pl
matrix-nt.pl
matrix.pl
matrix1-nt.pl
matrix1.pl
matrix_final-nt.pl
matrix_final.pl
pipeline.pl
prepare.pl
prepare1.pl
ptt.pl
ptt1-nt.pl
ptt1.pl
ptt2.pl
ptt_count.pl
ref.pl
sam.pl
split.pl

Biomedinformatics/NGSPanPipe