BLAST Database error #321

Closed
aweimann opened this issue Apr 3, 2017 · 1 comment

aweimann commented Apr 3, 2017

Hi Andrew,

I'm running Roary on three genomes that I annotated with Prokka. Apparently the BLAST step fails, although the program still finishes, and the combined genes from the three genomes all end up as the core genome. Do you have any idea what's going on there?

Cheers, Aaron

roary -f roary_out -e -n -v prokka*out/*gff

2017/04/03 15:14:46 Output directory name exists already so adding a timestamp to the end
2017/04/03 15:14:46 Output directory created: roary_out_1491225286
2017/04/03 15:14:46 Fixing input GFF files
2017/04/03 15:14:49 Extracting proteins from GFF files
Extracting proteins from /home/aaron/prokka_out/PROKKA_04032017.gff
Extracting proteins from /home/aaron/prokka_PA14_out/PROKKA_04032017.gff
Extracting proteins from /home/aaron/prokka_PAO_out/PROKKA_04032017.gff
Combine proteins into a single file
Iteratively run cd-hit
Parallel all against all blast
BLAST Database error: No alias or index file found for protein database [/home/aaron/roary_out_1491225286/J3RyMjOrGu/output_contigs] in search path [/home/aaron/roary_out_1491225286::]
Cluster with MCL
2017/04/03 15:15:06 Running command: pan_genome_post_analysis -o clustered_proteins -p pan_genome.fa -s gene_presence_absence.csv -c _clustered.clstr --output_multifasta_files -i /home/aaron/roary_out_1491225286/vhFjmCR9Ey//_gff_files -f /home/aaron/roary_out_1491225286/vhFjmCR9Ey//_fasta_files -t 11 --dont_create_rplots -v --mafft -j Local --processors 1 --group_limit 50000 -cd 99
Use of uninitialized value in require at /usr/lib/perl/5.18/Encode.pm line 60.
2017/04/03 15:15:06 Reinflate clusters
2017/04/03 15:15:06 Split groups with paralogs
2017/04/03 15:15:07 Labelling the groups
2017/04/03 15:15:07 Transfering the annotation to the groups
2017/04/03 15:15:11 Creating accessory binary gene presence and absence fasta
2017/04/03 15:15:11 Creating accessory binary gene presence and absence tree
2017/04/03 15:15:11 The input file is too small so not creating a tree
2017/04/03 15:15:11 Creating accessory gene presence and absence clusters
2017/04/03 15:15:11 Theres no accessory binary file so skipping accessory binary clustering
2017/04/03 15:15:11 Creating the spreadsheet with gene presence and absence
2017/04/03 15:15:19 Creating summary statistics of the spreadsheet
2017/04/03 15:15:24 Creating tab files for R
2017/04/03 15:15:25 Create EMBL files
2017/04/03 15:15:26 Creating files with the nucleotide sequences for every cluster
2017/04/03 15:15:34 Cleaning up files
Aligning each cluster
Use of uninitialized value in require at (eval 2091) line 1.
2017/04/03 15:15:34 Running command: pan_genome_core_alignment -cd 99
2017/04/03 15:15:34 pan_genome_core_alignment -cd 99

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet
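
Because the run above carries on after the BLAST failure, the error is easy to miss. A minimal sketch, assuming the combined stdout and stderr of the run were captured to a file (for example with `2>&1 | tee roary.log`; the file name is just an illustration). The two patterns are taken from the messages shown in this issue and are not an exhaustive list of what Roary can print.

```python
import re
import sys

# Patterns copied from the messages in this issue; this is an assumption
# about what is worth flagging, not a complete list of Roary errors.
ERROR_PATTERNS = (
    re.compile(r"BLAST Database error"),
    re.compile(r"\bError:"),
)


def find_error_lines(log_text):
    """Return (line_number, line) pairs for log lines matching a known error pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in ERROR_PATTERNS):
            hits.append((number, line))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_log.py roary.log
    with open(sys.argv[1]) as handle:
        problems = find_error_lines(handle.read())
    for number, line in problems:
        print(f"line {number}: {line}")
    # Exit non-zero so a wrapper script can stop before using the results.
    sys.exit(1 if problems else 0)
```

A non-zero exit here only means one of the patterns appeared in the log; it does not say anything about Roary's own exit status.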

Output of roary -a is

2017/04/03 15:07:41 Looking for 'Rscript' - found /usr/bin/Rscript
2017/04/03 15:07:41 Determined Rscript version is 3.0
2017/04/03 15:07:41 Looking for 'awk' - found /usr/bin/awk
2017/04/03 15:07:41 Looking for 'bedtools' - found /usr/bin/bedtools
2017/04/03 15:07:41 Determined bedtools version is 2.17
2017/04/03 15:07:41 Looking for 'blastp' - found /usr/bin/blastp
2017/04/03 15:07:41 Determined blastp version is 2.2.28
2017/04/03 15:07:41 Looking for 'grep' - found /bin/grep
2017/04/03 15:07:41 Optional tool 'kraken' not found in your $PATH
2017/04/03 15:07:41 Optional tool 'kraken-report' not found in your $PATH
2017/04/03 15:07:41 Looking for 'mafft' - found /usr/bin/mafft
Use of uninitialized value in concatenation (.) or string at /usr/local/share/perl/5.18.2/Bio/Roary/External/CheckTools.pm line 129.
2017/04/03 15:07:42 Determined mafft version is
2017/04/03 15:07:42 Looking for 'makeblastdb' - found /usr/bin/makeblastdb
2017/04/03 15:07:42 Determined makeblastdb version is 2.2.28
2017/04/03 15:07:42 Looking for 'mcl' - found /usr/bin/mcl
2017/04/03 15:07:42 Determined mcl version is 12-135
2017/04/03 15:07:42 Looking for 'parallel' - found /usr/bin/parallel
2017/04/03 15:07:42 Determined parallel version is 20130922
2017/04/03 15:07:42 Looking for 'prank' - found /usr/bin/prank
2017/04/03 15:07:42 Looking for 'sed' - found /bin/sed
2017/04/03 15:07:42 Looking for 'cdhit' - found /usr/bin/cdhit
2017/04/03 15:07:42 Determined cdhit version is 4.6
2017/04/03 15:07:42 Looking for 'fasttree' - found /usr/bin/fasttree
2017/04/03 15:07:42 Determined fasttree version is 2.1
2017/04/03 15:07:42 Roary version 3.8.0
2017/04/03 15:07:42 Error: You need to provide at least 2 files to build a pan genome

aweimann commented Apr 5, 2017

It appears to work now. Maybe the path to the BLAST database wasn't set properly before.
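
For anyone who hits the same message: a rough way to check whether a protein database is where blastp expects it. This is a sketch, assuming a single-volume protein database as written by `makeblastdb` in BLAST+ 2.2.x (`.phr`, `.pin`, `.psq`) or an alias file (`.pal`). The function name is made up for illustration, and treating an alias file as sufficient is a simplification.

```python
import sys
from pathlib import Path

# Files makeblastdb writes for a single-volume protein database
# (assumption: the older format produced by BLAST+ 2.2.x).
PROTEIN_DB_EXTENSIONS = (".phr", ".pin", ".psq")


def missing_protein_db_files(db_prefix):
    """Return the expected database files that do not exist for db_prefix.

    An empty list means either an alias file (.pal) exists or all three
    single-volume files are present.
    """
    prefix = str(db_prefix)
    # Append the extension instead of using with_suffix(), so any dots
    # already in the prefix are preserved.
    if Path(prefix + ".pal").exists():
        return []
    return [
        prefix + ext
        for ext in PROTEIN_DB_EXTENSIONS
        if not Path(prefix + ext).exists()
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_db.py /path/to/database_prefix
    missing = missing_protein_db_files(sys.argv[1])
    for path in missing:
        print(f"missing: {path}")
    sys.exit(1 if missing else 0)
```

The prefix to pass is the database name without an extension, i.e. the value shown in square brackets in the error message. Roary builds its database in a temporary directory, so this kind of check is mainly useful while a run is in progress or when reproducing the `makeblastdb` step by hand.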

aweimann closed this as completed Apr 5, 2017