Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error with merging paired ends using mergeAmpliconReads() #18

Open
APepey opened this issue Jan 25, 2022 · 1 comment
Open

Error with merging paired ends using mergeAmpliconReads() #18

APepey opened this issue Jan 25, 2022 · 1 comment

Comments

@APepey
Copy link

APepey commented Jan 25, 2022

Hello! Thank you for all your work!
I have a question about one the HaplotypR functions.

I am working on MiSeq data with the HaplotypR package. I have 35 samples with five P. falciparum markers: ama1-d3, cpmp, csp, cpp and msp7.
I could proceed with the following steps using an R code I received from the WEHI in December 2019:

  • demultiplexing by sample
  • demultiplexing by marker
  • read quality per marker

I am then trying to merge paired ends but I am encountering different issues depending on the syntax I am trying:

Syntax 1:

>   merged_reads <- 
+     marker_deplex %>%
+     select(SampleID, SampleName, MarkerID) %>%
+     bind_cols(mergeAmpliconReads(marker_deplex$FileR1, 
+                                                         marker_deplex$FileR2,
+                                                         outputDir = 'merge_reads'))
Processing file 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz and 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz ...1
Warning in if (method == "vsearch") { :
  the condition has length > 1 and only the first element will be used
Warning in system(call, intern = TRUE) :
  running command '"C:/Users/apepey/Documents/R/R-4.1.2/library/Rvsearch/vsearch" --fastq_mergepairs "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz" --reverse "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz" --fastqout "merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz" --fastq_truncqual 1 --fastq_maxns 0 --fastq_allowmergestagger' had status 1
Error: Input/Output
  no input files found
  dirPath: merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz
  pattern: character(0)

Syntax 2:

>   merged_reads <- 
+     marker_deplex %>%
+     mergeAmpliconReads(fastqFileR1 = marker_deplex$FileR1, 
+                                         fastqFileR2 = marker_deplex$FileR2,
+                                         outputDir = 'merge_reads')
Processing file 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz and 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz ...1
Warning in if (method == "vsearch") { :
  the condition has length > 1 and only the first element will be used
Warning in if (method == "NGmerge") { :
  the condition has length > 1 and only the first element will be used
Error in FUN(X[[i]], ...) : 
  Merge c("3D7-A", "3D7-A", "3D7-A", "3D7-A", "3D7-A", "3D7-B", "3D7-B", "3D7-B", "3D7-B", "3D7-B", "M0-176", "M0-176", "M0-176", "M0-176", "M0-176", "M0-316", "M0-316", "M0-316", "M0-316", "M0-316", "M0-441", "M0-441", "M0-441", "M0-441", "M0-441", "M0-445", "M0-445", "M0-445", "M0-445", "M0-445", "M0-5", "M0-5", "M0-5", "M0-5", "M0-5", "M0-561", "M0-561", "M0-561", "M0-561", "M0-561", "M0-562", "M0-562", "M0-562", "M0-562", "M0-562", "M0-687", "M0-687", "M0-687", "M0-687", "M0-687", "M0-801", "M0-801", 
"M0-801", "M0-801", "M0-801", "M0-96", "M0-96", "M0-96", "M0-96", "M0-96", "M1-316", "M1-316", "M1-316", "M1-316", "M1-316", "M1-561", "M1-561", "M1-561", "M1-561", "M1-561", "M1-687", "M1-687", "M1-687", "M1-687", "M1-687", "M1-801", "M1-801", "M1-801", "M1-801", "M1-801", "M1-831", "M1-831", "M1-831", "M1-831", "M1-831", "M11-441", "M11-441", "M11-441", "M11-441", "M11-441", "M11-792", "M11-792", "M11-792", "M11-792", "M11-792", "M2-445", "M2-445", "M2-445", "M2-445", "M2-445

Syntax 3:

>   merged_reads <-
+     marker_deplex %>%
+     select(SampleID, SampleName, MarkerID) %>% 
+     bind_cols(mergeAmpliconReads(fastqFileR1 = as.character(marker_deplex$FileR1), 
+                                                         fastqFileR2 = as.character(marker_deplex$FileR2),
+                                                         outputDir = './merge_reads'))
Processing file 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz and 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz ...1
Warning in if (method == "vsearch") { :
  the condition has length > 1 and only the first element will be used
Warning in system(call, intern = TRUE) :
  running command '"C:/Users/apepey/Documents/R/R-4.1.2/library/Rvsearch/vsearch" --fastq_mergepairs "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz" --reverse "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz" --fastqout "./merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz" --fastq_truncqual 1 --fastq_maxns 0 --fastq_allowmergestagger' had status 1
Error: Input/Output
  no input files found
  dirPath: ./merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz
  pattern: character(0)

Syntax 4:

>   merged_reads <-
+     marker_deplex %>%
+     mergeAmpliconReads(as.character(marker_deplex$FileR1), 
+                                         as.character(marker_deplex$FileR2),
+                                         outputDir = './merge_reads')
Error in mergeAmpliconReads(., as.character(marker_deplex$FileR1), as.character(marker_deplex$FileR2),  : 
  Vector length of fastqFileR1 and fastqFileR2 not identical

What could I try to merge paired ends?
Any help is appreciated, thank you.
My marker_table is available here: https://gist.github.com/APepey/054e40b9a10f4b25987dc73b3421e2f5

Here is my session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rvsearch_0.99.1             cowplot_1.1.1               dada2_1.21.0                Rcpp_1.0.7                  HaplotypR_0.3.3            
 [6] devtools_2.4.3              usethis_2.1.5               ShortRead_1.52.0            GenomicAlignments_1.30.0    SummarizedExperiment_1.24.0
[11] Biobase_2.54.0              MatrixGenerics_1.6.0        matrixStats_0.61.0          Rsamtools_2.10.0            GenomicRanges_1.46.1       
[16] Biostrings_2.62.0           GenomeInfoDb_1.30.0         XVector_0.34.0              IRanges_2.28.0              S4Vectors_0.32.3           
[21] BiocParallel_1.28.3         BiocGenerics_0.40.0         pheatmap_1.0.12             forcats_0.5.1               stringr_1.4.0              
[26] dplyr_1.0.7                 purrr_0.3.4                 tidyr_1.1.4                 tibble_3.1.6                ggplot2_3.3.5              
[31] tidyverse_1.3.1             readr_2.1.1                

loaded via a namespace (and not attached):
 [1] colorspace_2.0-2       hwriter_1.3.2          ellipsis_0.3.2         rprojroot_2.0.2        fs_1.5.2               rstudioapi_0.13       
 [7] farver_2.1.0           remotes_2.4.2          bit64_4.0.5            fansi_0.5.0            lubridate_1.8.0        xml2_1.3.3            
[13] cachem_1.0.6           knitr_1.37             pkgload_1.2.4          jsonlite_1.7.3         broom_0.7.11           dbplyr_2.1.1          
[19] png_0.1-7              compiler_4.1.2         httr_1.4.2             backports_1.4.1        assertthat_0.2.1       Matrix_1.4-0          
[25] fastmap_1.1.0          cli_3.1.0              htmltools_0.5.2        prettyunits_1.1.1      tools_4.1.2            gtable_0.3.0          
[31] glue_1.6.0             GenomeInfoDbData_1.2.7 reshape2_1.4.4         cellranger_1.1.0       vctrs_0.3.8            xfun_0.29             
[37] ps_1.6.0               testthat_3.1.1         rvest_1.0.2            lifecycle_1.0.1        zlibbioc_1.40.0        scales_1.1.1          
[43] vroom_1.5.7            hms_1.1.1              parallel_4.1.2         RColorBrewer_1.1-2     yaml_2.2.1             memoise_2.0.1         
[49] latticeExtra_0.6-29    stringi_1.7.6          desc_1.4.0             pkgbuild_1.3.1         rlang_0.4.12           pkgconfig_2.0.3       
[55] bitops_1.0-7           evaluate_0.14          lattice_0.20-45        labeling_0.4.2         bit_4.0.4              tidyselect_1.1.1      
[61] processx_3.5.2         plyr_1.8.6             magrittr_2.0.1         R6_2.5.1               snow_0.4-4             generics_0.1.1        
[67] DelayedArray_0.20.0    DBI_1.1.2              pillar_1.6.4           haven_2.4.3            withr_2.4.3            RCurl_1.98-1.5        
[73] modelr_0.1.8           crayon_1.4.2           utf8_1.2.2             tzdb_0.2.0             rmarkdown_2.11         jpeg_0.1-9            
[79] grid_4.1.2             readxl_1.3.1           callr_3.7.0            reprex_2.0.1           digest_0.6.29          RcppParallel_5.1.5    
[85] munsell_0.5.0          sessioninfo_1.2.2  

Thank you again!

@lerch-a
Copy link
Owner

lerch-a commented Jan 27, 2022

Thanks for the information.
First, I can not give you support for the code you got from WEHI (code using %>%), but I can help you with the merging reads.
The code from WEHI combines multiple step into one step and makes it very difficult understand. It is far better to split up the code if you have issues. So just try to run the following lines individually and also read my instructions to Run HaplotypR on https://github.com/lerch-a/HaplotypR.

procReadsMerge <- mergeAmpliconReads(fastqFileR1 = as.character(marker_deplex$FileR1), fastqFileR2 = as.character(marker_deplex$FileR2), outputDir = 'merge_reads')

merged_reads <- cbind(marker_deplex[,c("SampleID", "SampleName","BarcodePair", "MarkerID")], procReadsMerge)

Make sure that the folder 'merge_reads' exists in getwd() !

If you still have issues, then please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants