FUNpore performs frame-shift correction of nanopore 1D/2D reads (FASTA format) and then carries out taxonomic annotation of the corrected reads with taxator-tk and KRAKEN. Please read this manual carefully to avoid unnecessary errors when running FUNpore.
Author: Yu XIA
Email: shuixia100@gmail.com
version 1.0
ruby 2.3.1p112
python2.7
samtools 1.9
R and library: foreach, doParallel, seqinr, plyr, data.table, Rsamtools
GNU parallel
blast+ 2.8.1+ or higher
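Before installing, it can help to confirm that the tools listed above are reachable on your PATH. The snippet below is only a minimal sketch and is not part of FUNpore: the helper name `missing_tools` is hypothetical, and the executable names (`Rscript`, `parallel`, `blastn`) are assumptions about how R, GNU parallel and blast+ are exposed on your system. It does not check versions or R libraries.

```shell
# Print every command from the argument list that cannot be found on the PATH.
# missing_tools is a hypothetical helper, not shipped with FUNpore.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Executable names are assumptions; adjust them to your installation.
missing_tools ruby python2.7 samtools Rscript parallel blastn
```

An empty output means every listed command was found.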
git clone https://github.com/sustc-xylab/FUNpore.git
cd FUNpore
Once the FUNpore package is downloaded, the whole analysis is wrapped in a single executable named FUNpore.sh.
FUNpore_CONFIG contains the paths of the databases required by FUNpore. This file must always be kept in the same directory as FUNpore.sh.
Before running FUNpore, users should edit FUNpore_CONFIG to point to their own database paths.
lastal: a copy of last-938 is included in the FUNpore package
taxator-tk: a copy of taxator-tk is included in the FUNpore package
KRAKEN: a copy of KRAKEN 0.10.6-unreleased is included in the FUNpore package
Downloading comprehensive databases usually takes a long time. Please be patient :)
** LASTAL database **
We recommend running lastal against the NCBI refseq_protein database. Download the blast-formatted refseq_protein database from NCBI, extract the FASTA sequences from it, and then build a last index from the extracted FASTA file.
1. download the blast-formatted refseq_protein database
perl $PATH_to_FUNpore/bin/update_blastdb.pl --passive --decompress refseq_protein
2. extract the fasta from the blast-formatted refseq_protein database
$PATH_to_FUNpore/bin/blastdbcmd -db refseq_protein -dbtype prot -entry all -outfmt "%f" -out refseq_protein.fa
3. build last index with the fasta file extracted
$PATH_to_FUNpore/bin/last-938/src/lastdb -Q 0 -P 50 -p -cR01 -v refseq_protein_lastdb refseq_protein.fa
4. specify the path and name of the last database in FUNpore_CONFIG
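Building the last index is slow, so a quick check that the extracted FASTA file actually contains sequences may save a wasted run. This is a minimal sketch, assuming a standard FASTA file in which every record starts with `>`; the helper name `count_fasta_records` is hypothetical and not part of FUNpore.

```shell
# Count records in a FASTA file by counting header lines (lines starting with '>').
# grep -c exits non-zero when the count is 0, so "|| true" keeps the helper quiet.
count_fasta_records() {
    grep -c '^>' "$1" || true
}

# Usage (file name taken from step 2 above):
#   count_fasta_records refseq_protein.fa
```

A count of 0 suggests the extraction in step 2 did not work and should be repeated before running lastdb.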
** NCBI nt database for BLAST+ and nodes.dmp of corresponding taxonomy **
1. download NCBI preformatted nt database with:
perl $PATH_to_FUNpore/bin/update_blastdb.pl --passive --decompress nt
Specify the name of the nt database in FUNpore_CONFIG.
2. download NCBI taxonomy with:
wget ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
Extract the archive, place the files names.dmp and nodes.dmp in a folder, and specify that folder's path in FUNpore_CONFIG. Keep in mind that NCBI updates the taxonomy files on a regular basis.
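The two taxonomy files can be pulled out of the downloaded archive without unpacking everything else. The following is a sketch under assumptions: `extract_taxonomy` is a hypothetical helper, the destination folder name is arbitrary, and the archive is assumed to store names.dmp and nodes.dmp at its top level.

```shell
# Extract only names.dmp and nodes.dmp from a taxdump archive into a target folder.
# extract_taxonomy is a hypothetical helper; the folder name is up to you.
extract_taxonomy() {
    local archive="$1" dest="$2"
    mkdir -p "$dest"
    tar -xzf "$archive" -C "$dest" names.dmp nodes.dmp
}

# Usage:
#   extract_taxonomy taxdump.tar.gz ./ncbi_taxonomy
```

The folder passed as the second argument is the one to enter in FUNpore_CONFIG.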
** Default database of KRAKEN **
To create the standard Kraken database, you can use the following command:
$PATH_to_FUNpore/bin/libexec/kraken/kraken-build --standard --threads 24 --db $DBNAME
Replace "$DBNAME" with your preferred database name/location.
Specify the path of the KRAKEN database in FUNpore_CONFIG.
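Since the standard database build can run for a long time, it may be worth confirming that it finished before pointing FUNpore_CONFIG at it. The sketch below assumes the layout described in the Kraken 1 manual (database.kdb, database.idx, taxonomy/nodes.dmp and taxonomy/names.dmp inside the database directory); `kraken_db_missing` is a hypothetical helper, not part of KRAKEN or FUNpore.

```shell
# Print the expected Kraken database files that are absent from the given directory.
# The file list is an assumption based on the Kraken 1 manual.
kraken_db_missing() {
    local db="$1" f
    for f in database.kdb database.idx taxonomy/nodes.dmp taxonomy/names.dmp; do
        [ -e "$db/$f" ] || echo "$f"
    done
    return 0
}

# Usage:
#   kraken_db_missing "$DBNAME"
```

An empty output means all four files are present.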
** Lineage database converted from NCBI taxonomy **
wget https://gitlab.com/zyxue/ncbitax2lin-lineages/blob/master/lineages-2019-02-20.csv.gz
gunzip lineages-2019-02-20.csv.gz
mv lineages-2019-02-20.csv $PATH_to_FUNpore/database
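If gunzip complains that the file is not in gzip format, the download may have returned something other than the archive (for example a web page). A small sketch for testing a downloaded file before decompressing it, using `gzip -t`; the helper name `is_valid_gzip` is hypothetical.

```shell
# Succeed only if the file is a readable gzip archive.
# gzip -t tests the compressed file's integrity without writing any output.
is_valid_gzip() {
    gzip -t "$1" 2>/dev/null
}

# Usage:
#   is_valid_gzip lineages-2019-02-20.csv.gz && echo "archive looks fine"
```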
Running FUNpore takes two steps, described below. Please NOTE: FUNpore overwrites intermediate files from previous runs, so each working directory can be used for only ONE FUNpore run. Also, please use bash instead of sh to launch FUNpore.
Step 1: create a working directory for the FUNpore run and copy test.fa into it
mkdir testdir
cp test.fa testdir
Step 2: run FUNpore inside the directory created; $PATH_to_FUNpore is the FUNpore installation directory
cd testdir
bash $PATH_to_FUNpore/FUNpore.sh -f ./test.fa -t 20 -c 200000 > FUNpore.log
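Because a working directory can hold only one FUNpore run, one option is to create a fresh, uniquely named directory for every input file. This is a sketch rather than part of FUNpore: `new_workdir` is a hypothetical helper and the `funpore_run_` prefix is an arbitrary naming choice. It relies on `mktemp -d` to guarantee a directory that did not exist before.

```shell
# Create a fresh working directory for one FUNpore run and copy the input into it.
# Prints the path of the new directory.
new_workdir() {
    local fasta="$1" dir
    dir=$(mktemp -d "./funpore_run_XXXXXX") || return 1
    cp "$fasta" "$dir/" || return 1
    echo "$dir"
}

# Usage, with the same options as the command above:
#   dir=$(new_workdir test.fa)
#   cd "$dir" && bash "$PATH_to_FUNpore/FUNpore.sh" -f ./test.fa -t 20 -c 200000 > FUNpore.log
```

Calling the helper twice on the same input yields two separate directories, so parallel or repeated runs do not overwrite each other's intermediate files.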
All output files of FUNpore are stored in a directory named $YOUR_TARGET_FASTA_NAME_FUNpore_nowtime. This directory will include:
input_framecorrect.fa: frame-shift corrected nanopore reads
input_framecorrect.summary.csv: summary of the frame-shift correction
input_framecorrect.fa_taxa.tab: phylogenetic assignment of the frame-shift corrected reads by taxator-tk (nt-based homology search) and KRAKEN (k-mer search)
If you use FUNpore in your nanopore dataset analysis, please cite:
Xia, Y., Li, A.-D., Deng, Y., Jiang, X.-T., Li, L.-G., and Zhang, T. (2017) MinION Nanopore Sequencing Enables Correlation between Resistome Phenotype and Genotype of Coliform Bacteria in Municipal Sewage. Front Microbiol 8: 2105.
The tools included in FUNpore should also be cited. These tools include:
last, blast+, taxator-tk, kraken, samtools, GNU parallel, ruby, R