ggplot2 error when generating mosdepth amplicon plot with Swift v2 primers #169

Closed
rrdavis77 opened this issue Apr 1, 2021 · 5 comments
Labels
bug Something isn't working

@rrdavis77

I am running the viralrecon pipeline (version 1.1.0) using Docker. The libraries were made with the Swift v2 (additional coverage) panel. When the pipeline tries to generate the mosdepth amplicon plots, ggplot2 throws an error:

Error: Dimensions exceed 50 inches (height and width are specified in 'cm' not pixels). If you're sure you want a plot that big, use `limitsize = FALSE`.
Backtrace:
  1. └─ggplot2::ggsave(...)
  2. └─ggplot2:::plot_dim(...)
Execution halted

@drpatelh directed me to the correct area to modify:

ggsave(file=outfile, plot, height=3+(0.3*length(unique(sample_dat$region))), width=16, units="cm")

I changed the following line from:
ggsave(file=outfile, plot, height=3+(0.3*length(unique(sample_dat$region))), width=16, units="cm")
to
ggsave(file=outfile, plot, height=3+(0.3*length(unique(sample_dat$region))), width=16, units="cm",limitsize = FALSE)

Adding limitsize = FALSE, as the error suggested, allowed the pipeline to complete. However, it generates a huge plot:
all_samples.trim.amplicon.regions.heatmap.pdf
This amplicon panel has 345 primers versus the ~200 primers for ARTIC v3. For ARTIC, the plot breaks the genome down into 98 amplicons because the primers have clear names. For the Swift primers, the naming convention might be throwing off the "grouping" of amplicons.
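To make the size limit concrete, here is a rough sketch of the arithmetic behind the ggsave call quoted above. It assumes the limit is reached at exactly 50 inches and only looks at height, since the fixed 16 cm width is well under the limit. The function names are made up for illustration, and the thread does not say how many distinct regions the failing run actually had.

```python
CM_PER_INCH = 2.54
GGSAVE_LIMIT_IN = 50.0  # size limit quoted in the ggplot2 error message


def plot_height_cm(n_regions: int) -> float:
    """Height used in the quoted ggsave line: 3 cm plus 0.3 cm per region."""
    return 3 + 0.3 * n_regions


def exceeds_limit(n_regions: int) -> bool:
    """True if the height reaches the 50-inch limit (assumed to be inclusive)."""
    return plot_height_cm(n_regions) / CM_PER_INCH >= GGSAVE_LIMIT_IN


# 98 is the ARTIC amplicon count mentioned above; 413 and 414 bracket the
# point where 3 + 0.3 * n crosses 127 cm (50 inches).
for n in (98, 413, 414):
    print(n, round(plot_height_cm(n), 1), exceeds_limit(n))
```

With this formula the check would only trip above roughly 413 regions, so anything that reduces the number of distinct region names also shrinks the plot.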

@drpatelh
Member

drpatelh commented Apr 1, 2021

Thanks @rrdavis77 ! The left and right primers should really be collapsed before that plot is generated to get the coverage across the entire primer region. Can you try re-running with --amplicon_left_suffix L --amplicon_right_suffix R because those are the bits in the names of the primers that distinguish left from right? Alternatively, you can just replace the suffixes in the Swift v2 primers to match the ARTIC set i.e. _LEFT and _RIGHT and then you will be able to use the default parameters. You will know you have got it right when you only have 1 region per primer pair in the plot.

See parameter docs. Not sure if you will still get the error but worth a shot.
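As an illustration of what collapsing left and right primers by suffix means, here is a minimal sketch. It is not the pipeline's own code. The `panel_*` names are hypothetical stand-ins for a scheme that uses F/R suffixes, and the helper names are invented for this example.

```python
def amplicon_name(primer: str, left_suffix: str, right_suffix: str) -> str:
    """Strip a trailing left or right suffix to get the shared amplicon name."""
    for suffix in (left_suffix, right_suffix):
        if suffix and primer.endswith(suffix):
            return primer[: -len(suffix)]
    return primer  # no suffix matched, so the primer stays its own region


def collapse(primers, left_suffix="_LEFT", right_suffix="_RIGHT"):
    """Group primer names by amplicon name, keeping first-seen order."""
    groups = {}
    for primer in primers:
        key = amplicon_name(primer, left_suffix, right_suffix)
        groups.setdefault(key, []).append(primer)
    return groups


artic_style = ["nCoV-2019_1_LEFT", "nCoV-2019_1_RIGHT",
               "nCoV-2019_2_LEFT", "nCoV-2019_2_RIGHT"]
hypothetical = ["panel_1F", "panel_1R", "panel_2F", "panel_2R"]

print(len(collapse(artic_style)))             # default suffixes
print(len(collapse(hypothetical, "F", "R")))  # suffixes match the names
print(len(collapse(hypothetical, "L", "R")))  # left suffix does not match
```

When the suffixes match the naming scheme, each primer pair ends up under one key. When one suffix does not match, those primers keep their full names and show up as extra regions.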

@cjfields

cjfields commented Apr 1, 2021

@rrdavis77 I set these in my config file as @drpatelh mentions for Swift v2 primers, which seems to work for me:

params {
    amplicon_bed="/home/groups/hpcbio/data/viralrecon/reference/NC_045512.2.v2.primers.bed"
    amplicon_fasta="/home/groups/hpcbio/data/viralrecon/reference/NC_045512.2.v2.primers.fna"
    genome="NC_045512.2"
    amplicon_left_suffix="F"
    amplicon_right_suffix="R"
    // other settings here ...
}

I did a test run comparing trimming before and after, which worked quite well.

@rrdavis77
Author

I'm still getting the error when running it like this:
nextflow run nf-core/viralrecon \
    --input /home/ryan/SWIFT_test/samplelist.csv \
    --genome 'NC_045512.2' \
    --protocol amplicon \
    --amplicon_bed /home/ryan/SWIFT_test/NC_045512.2.v2.primers.bed \
    --amplicon_fasta /home/ryan/SWIFT_test/NC_045512.2.v2.primers.fna \
    --skip_assembly \
    --skip_markduplicates \
    -profile docker \
    --max_memory '8.GB' \
    --max_cpus 8 \
    --amplicon_left_suffix L \
    --amplicon_right_suffix R

Same error, and the primer pairs are still not collapsed.

@rrdavis77
Author

My bad, I just looked at the trim.amplicon.regions.bed.gz files in mosdepth and realized it wasn't scrubbing the forward primers, because I should have written --amplicon_left_suffix F --amplicon_right_suffix R
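A quick way to catch this kind of mismatch before a run is to check which primer names in the BED file end with neither suffix. The sketch below assumes a tab-separated BED with the primer name in the fourth column. The `chrHyp` rows and the function name are hypothetical and only there for illustration.

```python
def unmatched_primers(bed_text: str, left_suffix: str, right_suffix: str):
    """Return primer names (BED column 4) that end with neither suffix."""
    unmatched = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # no name column on this row
        name = fields[3]
        if not name.endswith((left_suffix, right_suffix)):
            unmatched.append(name)
    return unmatched


# Hypothetical BED rows using F/R suffixes.
bed = (
    "chrHyp\t10\t30\tpanel_1F\n"
    "chrHyp\t200\t220\tpanel_1R\n"
    "chrHyp\t180\t200\tpanel_2F\n"
    "chrHyp\t400\t420\tpanel_2R\n"
)

print(unmatched_primers(bed, "L", "R"))  # forward primers are left over
print(unmatched_primers(bed, "F", "R"))  # nothing is left over
```

An empty list suggests every primer name carries one of the two suffixes. Any leftover names are the ones that would not be collapsed.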

drpatelh reopened this Apr 27, 2021
drpatelh added this to the 2.0 milestone Apr 27, 2021
drpatelh added the bug (Something isn't working) label Apr 27, 2021
@drpatelh
Member

I have pushed a proper fix so the pipeline is able to deal with larger amplicon schemes if required (see 9707937)

This now means we shouldn't get this error because I have provided limitsize = FALSE by default. I have also added an extra parameter to the R plotting script called --regions_prefix that can be used to sanitise the amplicon names used in the plot, especially when they have long, common prefixes that make the plotting difficult. I'm looking at you Mr SWIFT 👀

Users can provide their own --regions_prefix value, depending on the primer scheme they are using, via a custom.config and pass it to the pipeline with -c custom.config. We are essentially just appending that argument to the default arguments used by the plotting module as defined here.

params {
  modules {
    'illumina_plot_mosdepth_regions_amplicon' {
      args = '--input_suffix .regions.bed.gz --regions_prefix covid19genome_200-29703_'
    }
  }
}
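To show the effect a prefix option like this has on plot labels, here is a minimal sketch. It is not the R script's implementation. The prefix is the one from the config above, while the name endings after the prefix and the function name are hypothetical.

```python
def strip_regions_prefix(name: str, prefix: str) -> str:
    """Drop a leading prefix from a region name; leave other names untouched."""
    if prefix and name.startswith(prefix):
        return name[len(prefix):]
    return name


prefix = "covid19genome_200-29703_"
# Hypothetical region names: only the prefix comes from the thread.
regions = [prefix + "example_1", prefix + "example_2", "other_region"]

for region in regions:
    print(strip_regions_prefix(region, prefix))
```

Names that do not start with the prefix pass through unchanged, so the same setting is harmless for regions that follow a different naming scheme.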

If the SWIFT protocol is being used, below is an example plot BEFORE the custom.config is supplied to the pipeline:

image

AFTER the custom.config is supplied to the pipeline:

image

drpatelh added a commit that referenced this issue Apr 28, 2021
3 participants