This issue was moved to a discussion.

When should we use Metaviral-spades instead of Meta-spades, given that it is designed for complete genomes? #1205

Closed
1 task done
actledge opened this issue Nov 7, 2023 · 0 comments

Comments

actledge commented Nov 7, 2023

Description of bug

Hi,

Perhaps I have misunderstood, but doesn't the paper say that metaviral-spades also outputs the partial contigs from metaspades?
"To give users an option to examine both complete viral sequences (identified based on analyzing small subgraphs of the METASPADES assembly graphs) and partial viral sequences (corresponding to METASPADES contigs), the VIRALASSEMBLY output is combined with the regular METASPADES output."
"we compared VIRALASSEMBLY against METASPADES on 18 real datasets described in Supplementary Table S3. We analyzed only complete (i.e. circular contigs or linear contigs starting in sources and ending in sinks) and high-coverage (>5×) sequences for benchmarking (VIRALASSEMBLY and METASPADES report the same set of partial contigs)"

But another issue mentioned that metaviral-spades only outputs "complete" viral sequences when analyzing a metagenomic dataset. I have also found that with metaviral-spades many datasets yield only a single-digit number of scaffolds, and for 25 of the 100 metagenomic test datasets that I downloaded, the run ended in an error because no complete sequences were found. Compared with metaspades, metaviral-spades assembles several times fewer viral sequences overall and several times fewer high-quality viral sequences (as evaluated by CheckV and geNomad).

In the above-mentioned issue #1106, the author suggested reassembling the metagenome reads with metaspades because metaviral-spades only outputs complete genomes. This has left me puzzled: complete genomes seem to be very rare in metagenomes, so outputting only them may not meet the needs of metagenomic assembly. I would therefore like to ask for what purposes and under what circumstances metaviral-spades is normally used. In other words, when I have a metagenomic dataset and want to assemble viral sequences for my analysis, how should I choose between metaspades and metaviral-spades?

Thanks!

spades.log

Just a question, not a bug

params.txt

spades.py --metaviral -1 test_1.fastq -2 test_2.fastq -t 7 -o spades
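The comparison described in the question (how many scaffolds a `--metaviral` run yields versus a `--meta` run on the same reads) could be tallied with a minimal sketch like the one below. It is not part of SPAdes. It assumes each run wrote a `scaffolds.fasta` into its output directory; `spades` is the output directory from the command above, while `spades_meta` is a hypothetical directory name for a separate `--meta` run. A missing or empty file is counted as zero sequences, which is one way to handle runs that produced no complete sequences.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines.

    Returns 0 when the file does not exist or contains no records.
    """
    fasta = Path(path)
    if not fasta.is_file():
        return 0
    with fasta.open() as handle:
        return sum(1 for line in handle if line.startswith(">"))


def compare_runs(metaviral_dir, meta_dir, filename="scaffolds.fasta"):
    """Return scaffold counts for a --metaviral run and a --meta run.

    Both directory arguments are supplied by the caller; the default
    file name assumes the usual SPAdes scaffold output.
    """
    return {
        "metaviral": count_fasta_records(Path(metaviral_dir) / filename),
        "metaspades": count_fasta_records(Path(meta_dir) / filename),
    }


if __name__ == "__main__":
    # "spades_meta" is a hypothetical output directory of a --meta run.
    counts = compare_runs("spades", "spades_meta")
    for mode, n in counts.items():
        print(f"{mode}: {n} scaffolds")
```

Counting headers only gives raw sequence numbers; judging which sequences are viral or high quality would still need the CheckV and geNomad evaluation mentioned in the question.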

SPAdes version

v3.15.5

Operating System

CentOS 3.10.0-1160.15.2.el7.x86_64

Python Version

No response

Method of SPAdes installation

Downloaded the GitHub release

No errors reported in spades.log

  • Yes
@ablab ablab locked and limited conversation to collaborators Nov 7, 2023
@asl asl converted this issue into discussion #1206 Nov 7, 2023
