mmseqs prefilter error: database has wrong type #34

Closed
ShailNair opened this issue Sep 18, 2023 · 2 comments

Comments

@ShailNair

Hi,
I am trying to annotate virus contigs (5 kb and above) identified with virsorter2 and deepvirfinder. However, the mmseqs prefilter step throws the following error:

[14:07:34] Executing genomad annotate.
[14:07:34] Previous execution detected. Steps will be skipped unless their outputs are not found. Use the --restart option to force the execution of all the steps again.
[14:07:34] final.vcontigs.fixed_proteins.faa was found. Skipping gene prediction with prodigal-gv.
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/mmseqs2.py", line 190, in run_mmseqs2
    subprocess.run(command, stdout=fout, stderr=fout, check=True)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mmseqs', 'prefilter', PosixPath('0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/query_db/query_db'), PosixPath('/home/user/database/genomad-1.5/genomad_db'), PosixPath('0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/search_db/prefilter_db'), '--threads', '30', '-s', '4.2', '--split', '0', '--split-mode', '0', '--max-seqs', '10000000', '--min-ungapped-score', '25', '-k', '5']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/miniconda3/envs/genomad/bin/genomad", line 10, in <module>
    sys.exit(cli())
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/cli.py", line 441, in annotate
    genomad.annotate.main(
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/modules/annotate.py", line 203, in main
    mmseqs2_obj.run_mmseqs2(threads, sensitivity, evalue, splits)
  File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/mmseqs2.py", line 193, in run_mmseqs2
    raise Exception(f"'{command_str}' failed.") from e
Exception: 'mmseqs prefilter 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/query_db/query_db /home/user/database/genomad-1.5/genomad_db 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/search_db/prefilter_db --threads 30 -s 4.2 --split 0 --split-mode 0 --max-seqs 10000000 --min-ungapped-score 25 -k 5' failed.

I checked mmseqs2.log, and it reports that the input database has the wrong type (Generic):

Time for merging to query_db: 0h 0m 0s 8ms
Database type: Aminoacid
Time for processing: 0h 0m 0s 124ms
prefilter 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/query_db/query_db /home/user/database/genomad-1.5/genomad_db 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/search_db/prefilter_db --threads 30 -s 4.2 --split 0 --split-mode 0 --max-seqs 10000000 --min-ungapped-score 25 -k 5 

MMseqs Version:           	14.7e284
Substitution matrix       	aa:blosum62.out,nucl:nucleotide.out
Seed substitution matrix  	aa:VTML80.out,nucl:nucleotide.out
Sensitivity               	4.2
k-mer length              	5
k-score                   	seq:2147483647,prof:2147483647
Alphabet size             	aa:21,nucl:5
Max sequence length       	65535
Max results per query     	10000000
Split database            	0
Split mode                	0
Split memory limit        	0
Coverage threshold        	0
Coverage mode             	0
Compositional bias        	1
Compositional bias        	1
Diagonal scoring          	true
Exact k-mer matching      	0
Mask residues             	1
Mask residues probability 	0.9
Mask lower case residues  	0
Minimum diagonal score    	25
Selected taxa             	
Include identical seq. id.	false
Spaced k-mers             	1
Preload mode              	0
Pseudo count a            	substitution:1.100,context:1.400
Pseudo count b            	substitution:4.100,context:5.800
Spaced k-mer pattern      	
Local temporary path      	
Threads                   	30
Compressed                	0
Verbosity                 	3

Input database "/home/user/database/genomad-1.5/genomad_db" has the wrong type (Generic).

Allowed input:
- Index
- Nucleotide
- Profile
- Aminoacid

I tried re-downloading the database and changing the output directory, but got the same error.
The database files were manually downloaded and extracted to /home/user/database/genomad-1.5.
Environment info

genomad --version
geNomad, version 1.7.0  (installed through conda)

 mmseqs version
14.7e284

database = 1.5

ls /home/user/database/genomad-1.5
genomad_db
genomad_hmm_v1.5  
genomad_metadata_v1.5.tsv  
genomad_msa_v1.5  
mmseqs_vrefseq  
version.txt
@ShailNair
Author

My bad. The database path should be /home/user/database/genomad-1.5/genomad_db.
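
For anyone hitting the same message, here is a minimal sketch of a pre-flight check on the path that ends up as the target argument of `mmseqs prefilter`. It rests on assumptions not confirmed in this thread: that an MMseqs2 database named `X` has a companion file `X.dbtype` next to it, and that handing mmseqs a directory (or a path with no `.dbtype` file) is one plausible reason for the database being reported as Generic. The function name `diagnose_mmseqs_db` and the temporary directory layout are hypothetical, made up for illustration.

```python
import tempfile
from pathlib import Path


def diagnose_mmseqs_db(db_path):
    """Classify a path that is about to be passed to mmseqs as a database.

    Assumption: a database named X ships with a companion file X.dbtype
    in the same directory. This is not part of geNomad or MMseqs2.
    """
    db_path = Path(db_path)
    # A directory cannot itself be the database; the database lives inside it.
    if db_path.is_dir():
        return "directory"
    # Look for the companion type file next to the database path.
    dbtype_file = db_path.with_name(db_path.name + ".dbtype")
    if not dbtype_file.is_file():
        return "no-dbtype"
    return "ok"


if __name__ == "__main__":
    # Hypothetical layout: a folder named genomad_db that contains a
    # database which is also named genomad_db.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "genomad_db"
        folder.mkdir()
        (folder / "genomad_db").write_bytes(b"")
        (folder / "genomad_db.dbtype").write_bytes(b"\x00\x00\x00\x00")

        print(diagnose_mmseqs_db(folder))
        print(diagnose_mmseqs_db(folder / "genomad_db"))
```

If the check returns "directory" for the path in the failing command, the database location given to geNomad is probably one level too high or too low.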

@apcamargo
Owner

No worries! Let me know if you have any other questions.
