Non-SCV2 amplicon run returns consensus genomes with no low-coverage masking #420

Open
ddomman opened this issue Feb 27, 2024 · 4 comments
Labels
bug Something isn't working

Comments

ddomman commented Feb 27, 2024

Description of the bug

Viralrecon has worked perfectly for our SARS-CoV-2 (SCV2) runs and for some hybrid capture protocols (run through the metagenomic side of the pipeline). However, when I ran the pipeline with a custom BED file and FASTA reference for RSV, the default settings produced consensus genomes with no low-coverage masking, i.e. no Ns. The bcftools consensus step IS substituting the variants, but in all low-coverage regions it emits the reference base rather than N.

After switching to the iVar consensus option (--variant_caller ivar --consensus_caller ivar), the pipeline produces correct consensus genomes, with low-coverage regions masked with Ns.
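A quick way to confirm whether a given consensus was masked at all is to count the Ns in each record. The following is a minimal sketch, not part of viralrecon: the helper names `read_fasta` and `n_fraction` are hypothetical, and it only assumes the consensus is a plain FASTA file.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {record_id: sequence}."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


def n_fraction(sequence: str) -> float:
    """Return the fraction of bases that are N (case-insensitive)."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)
```

To use it, open the consensus file in text mode, pass the file handle to `read_fasta`, and call `n_fraction` on each sequence. A value of exactly 0.0 for a sample with known coverage gaps would match the behaviour described above.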

Command used and terminal output

No response

Relevant files

No response

System information

No response

ddomman added the bug (Something isn't working) label Feb 27, 2024
ddomman changed the title from "Non-SCV2 amplicon run returns consensus genomes matching reference genome" to "Non-SCV2 amplicon run returns consensus genomes with no low-coverage masking" Feb 27, 2024
svarona (Contributor) commented May 9, 2024

Hi @ddomman! We are also using viralrecon with RSV amplicon data, and it masks the consensus correctly with ivar as the variant caller and bcftools as the consensus caller. We would need to replicate your specific analysis. Would you mind sending us the files you used and the command you ran viralrecon with?

@chocogangsta

Hello @svarona,

Could you please share the specific commands you use to run the analysis for RSV? I have been trying to run it myself without success. If possible, could you provide the commands for both the Illumina and Nanopore platforms?

svarona (Contributor) commented Jun 6, 2024

Hi @chocogangsta. Viralrecon can't run on RSV Nanopore data, because for Nanopore it uses the ARTIC protocol, which works only for SARS-CoV-2. For Illumina sequencing, this is the command we're using:

nextflow run nf-core-viralrecon-2.6.0/workflow/main.nf \
          --input samplesheet.csv \
          --outdir EPI_ISL_1653999_viralrecon_mapping \
          --fasta RSV/EPI_ISL_1653999.fasta \
          --gff RSV/EPI_ISL_1653999.gff \
          --primer_bed merged_1653999_scheme.bed \
          --primer_fasta merged_primers.fasta \
          --nextclade_dataset_name 'rsv_b' \
          --nextclade_dataset false \
          --nextclade_dataset_tag '2023-10-02T12:00:00Z' \
          --platform illumina \
          --protocol amplicon \
          --variant_caller ivar \
          --consensus_caller bcftools \
          --skip_pangolin \
          -resume

svarona (Contributor) commented Jun 6, 2024

@ddomman We've seen cases where bcftools neither substitutes variants nor masks the consensus genome, producing a consensus identical to the reference. This can be caused by non-Linux newline characters in the reference .fasta (usually because it was edited in Microsoft Word). I assume this is not your case, since you say the variants are being replaced properly.
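For anyone who wants to rule out the newline problem, the check can be sketched as follows. This is an illustrative helper, not something shipped with viralrecon or bcftools: the function names are hypothetical, and it looks only at line endings, not at any other formatting a word processor might have introduced.

```python
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF (Windows), lone CR (classic Mac), and LF (Unix) line endings."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # carriage returns not followed by LF
    lf = data.count(b"\n") - crlf   # line feeds not preceded by CR
    return Counter({"CRLF": crlf, "CR": cr, "LF": lf})


def to_unix_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Read the reference in binary mode (for example `open(path, "rb").read()`) so that Python does not translate the line endings before they are counted. If `CRLF` or `CR` is non-zero, writing the output of `to_unix_newlines` to a new file gives a reference with Unix newlines to retry with.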
